Vik Paruchuri
2,027 posts
Vik Paruchuri
@VikParuchuri
Open source AI. Founder of @datalabto. Past: founded @dataquestio.
Brooklyn, NY
vikas.sh
Joined June 2012
186
Following
16.4K
Followers
  • Vik Paruchuri
    @VikParuchuri
    Aug 12, 2025
    Parsing PDFs has slowly driven me insane over the last year. Here are 8 weird edge cases to show you why PDF parsing isn't an easy problem. 🧵
    627K
  • Vik Paruchuri
    @VikParuchuri
    Sep 10, 2025
    The PDF format is hard to parse - by design. Let's explore the internals of the PDF format to figure out how Adobe did this to us.
    240K
  • Vik Paruchuri
    @VikParuchuri
    Jan 12, 2024
    Announcing surya - a multilingual text line detection model for documents. It gives you accurate line-level bboxes and column breaks. Find it here - github.com/VikParuchuri/s… .
    588K
  • Vik Paruchuri
    @VikParuchuri
    Feb 12, 2024
    Announcing surya OCR - text recognition in 93 languages. It outperforms tesseract in almost all languages, often by large margins. Find it here - github.com/VikParuchuri/s… .
    184K
  • Vik Paruchuri
    @VikParuchuri
    Oct 26, 2025
    Best OCR ever, huh?
    Harveen Singh Chadha
    @HarveenChadha
    Oct 26, 2025
    No, it's not the best OCR ever. Here is the result from olmoOCR2 on the same, and it does have a frightening degree of accuracy.
    334K
  • Vik Paruchuri
    @VikParuchuri
    Oct 21, 2025
    I'm excited to announce that Chandra OCR is open source!
    - Full layout information
    - Extracts and captions images and diagrams
    - Strong handwriting, form, table support
    - Works with transformers and vLLM
    128K
  • Vik Paruchuri
    @VikParuchuri
    Jul 16, 2024
    I'm starting a company, Datalab:
    - Task-specific models that outperform frontier LLMs and existing tools
    - Examples: my projects marker and surya (25k GH stars) with task-specific arch
    - Goal: Train models, open source as much as possible, do hosted inference and on-prem
    152K
  • Vik Paruchuri
    @VikParuchuri
    Apr 11, 2024
    I wrote a blog post on going from not knowing anything about deep learning last year to training state of the art OSS models - vikas.sh/post/how-i-got… . Hope it helps you. tl;dr: read the deep learning book, implemented papers + taught, built open source tools.
    How I got into deep learning
    From vikas.sh
    220K
  • Vik Paruchuri
    @VikParuchuri
    Oct 15, 2024
    I made a library to detect tables and extract to markdown or csv. It uses a new table recognition model I trained.
    79K
  • Vik Paruchuri
    @VikParuchuri
    Aug 16, 2024
    Announcing Surya OCR 2! It uses a new architecture and improves on v1 in every way:
    - OCR with automatic language detection for 93 languages (no more specifying languages!)
    - More accurate on old/noisy documents
    - 20% faster
    - Basic English handwriting support
    72K
  • Vik Paruchuri
    @VikParuchuri
    Nov 30, 2023
    I'm excited to ship marker - a pdf to markdown converter that is 10x faster than nougat, more accurate outside arXiv, and has low hallucination risk. Marker is optimized for throughput, like converting LLM pretrain data. Find it here - github.com/VikParuchuri/m… .
    146K
  • Vik Paruchuri
    @VikParuchuri
    Feb 19, 2025
    We've improved marker (PDF -> markdown) a lot in 3 months - accuracy and speed now beat llamaparse, mathpix, and docling. We shipped:
    - llm mode that augments marker with models like gemini flash
    - improved math, w/inline math
    - links and references
    - better tables and forms
    78K
  • Vik Paruchuri
    @VikParuchuri
    May 5, 2024
    Cool to see a 500M param model I trained myself do better than Google cloud vision, Claude, and GPT-4V on this task. (look at the thread for the results) It's a relatively narrow one (OCR), but feels nice to see that small open source models still have a place.
    Brendan Dolan-Gavitt
    @moyix
    May 4, 2024
    It's weird how we live in an age of miracles with respect to AI/ML, and yet when I want to extract some text from a screenshot the best (very bad) option is tesseract, last updated ~7 years ago.
    172K
  • Vik Paruchuri
    @VikParuchuri
    Oct 8, 2024
    Announcing Surya Table Recognition! It uses a new architecture to outperform table transformer, the current SoTA open source model.
    - Recognizes table rows, columns, and cells
    - Works with complex layouts and rotated tables
    - Supports any language
    - Runs locally
    61K
