Omar Khattab (@lateinteraction) / X

Omar Khattab

@lateinteraction

asst professor @MIT CSAIL @nlp_mit. ColBERT.ai, DSPy.ai (@DSPyOSS), GEPA, RLMs, Pedagogical RL

Cambridge, MA

Joined December 2022

Omar Khattab
@lateinteraction
Sep 9, 2025
that sounds plausible
Acre
@Acre108
Sep 9, 2025
There is no word in English that means someone is 30% sure something will happen
560K
Omar Khattab
@lateinteraction
Sep 23, 2025
crazy that they called it context window when attention span was right there
333K
Omar Khattab
@lateinteraction
May 31, 2024
I'm excited to share that I will be joining MIT EECS as an assistant professor in Fall 2025! I'll be recruiting PhD students from the December 2024 application pool. Indicate interest if you'd like to work with me on NLP, IR, or ML Systems! Stay tuned for more about my new lab.
324K
Omar Khattab
@lateinteraction
Sep 12, 2025
It just hit me that sub 1B-parameter models that are way better than 175B GPT-3 are a dime a dozen today. Kinda cool.
508K
Omar Khattab
@lateinteraction
Jun 13, 2025
After ~6 years of building these types of architectures (starting with BERT, eg see Baleen), I think calling these multi-agent systems is a distraction. This is just software. Happens to be AI software. It doesn’t seem so complicated once you internalize it’s just a program.
Anthropic
@AnthropicAI
Jun 13, 2025
New on the Anthropic Engineering blog: how we built Claude’s research capabilities using multiple agents working in parallel. We share what worked, what didn't, and the engineering challenges along the way. anthropic.com/engineering/bu…
229K
Omar Khattab
@lateinteraction
Oct 15, 2025
btw Alex is a second-month PhD student; he did this work in 4 weeks. i have my suspicions that Alex has secret recursive Alexes that do his work for him, but i haven't been able to confirm that haha. really fun post on recursive LMs with interesting trace examples, check it out!
alex zhang
@a1zhang
Oct 15, 2025
What if scaling the context windows of frontier LLMs is much easier than it sounds? We’re excited to share our work on Recursive Language Models (RLMs). A new inference strategy where LLMs can decompose and recursively interact with input prompts of seemingly unbounded length,
205K
Omar Khattab
@lateinteraction
Sep 4, 2024
🔗 Thoughts on Research Impact in AI. Grad students often ask: how do I do research that makes a difference in the current, crowded AI space? This is a blogpost that summarizes my perspective in six guidelines for making research impact via open-source artifacts. Link below.
248K
Omar Khattab
@lateinteraction
Sep 9, 2025
Replying to @impossibilium
That’s implausible
14K
Omar Khattab
@lateinteraction
Aug 17, 2025
Every once in a while, it hits you that word2vec and attention were one year apart.
109K
Omar Khattab
@lateinteraction
Jan 24, 2023
Introducing Demonstrate–Search–Predict (𝗗𝗦𝗣), a framework for composing search and LMs w/ up to 120% gains over GPT-3.5. No more prompt engineering.❌ Describe a high-level strategy as imperative code and let 𝗗𝗦𝗣 deal with prompts and queries.🧵 arxiv.org/abs/2212.14024
226K
Omar Khattab
@lateinteraction
May 11, 2025
DSPy's biggest strength is also the reason it can admittedly be hard to wrap your head around it. It's basically saying: LLMs & their methods will continue to improve but not equally in every axis, so: - What's the smallest set of fundamental abstractions that allow you to build
DSPy
@DSPyOSS
May 11, 2025
Is this guy talking about DSPy?
337K
Omar Khattab
@lateinteraction
Oct 11, 2023
A cool thread yesterday used GPT4 ($50), a 500-word ReAct prompt, and ~400 lines of code to finetune Llama2-7B to get 26% HotPotQA EM. Let's use 30 lines of DSPy—without any hand-written prompts or any calls to OpenAI ($0)—to teach a 9x smaller T5 (770M) model to get 39% EM! 🧵
251K
Omar Khattab
@lateinteraction
Jan 23, 2024
We started this project thinking LMs can’t be prompted to do classification tasks with over 10,000 classes — especially when documents are long! But the incredible @KarelDoostrlnck found this elegant DSPy program that, once optimized on ~50 examples, sets the state of the art.
Karel
@KarelDoostrlnck
Jan 23, 2024
📢Tasks with > 10k classes (e.g. information extraction) are hard for in-context learning: typically a tuned retriever or many in-context calls per input are used ($$$). Infer-Retrieve-Rank (IReRa) is a SotA program using 1 frozen retriever with a query predictor and reranker.
1M
Omar Khattab
@lateinteraction
Oct 13, 2025
Everyone talks about LLM as a judge. But what about LLM as a witness.
80K