Keyon Vafa (@keyonV) / X

1,406 posts

Keyon Vafa

@keyonV

Postdoctoral fellow at @Harvard_Data | Former computer science PhD with @Blei_Lab at @Columbia University | Researching AI + implicit world models

Joined August 2011

Pinned
Keyon Vafa
@keyonV
Jul 11, 2025
Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions. One result tells the story: A transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws. 🧵
1.4M
Keyon Vafa
@keyonV
Jun 20, 2024
New paper: How can you tell if a transformer has the right world model? We trained a transformer to predict directions for NYC taxi rides. The model was good. It could find shortest paths between new points. But had it built a map of NYC? We reconstructed its map and found this:
841K
Keyon Vafa
@keyonV
Jul 11, 2025
Replying to @keyonV
Our paper aims to answer two questions:
1. What's the difference between prediction and world models?
2. Are there straightforward metrics that can test this distinction?
Our paper is about AI. But it's helpful to go back 400 years to answer these questions.
71K
Keyon Vafa
@keyonV
Jul 11, 2025
Replying to @keyonV
If you only care about orbits, Newton didn't add much. His laws give the same predictions. But Newton's laws went beyond orbits: the same laws explain pendula, cannonballs, and rockets. This motivates our framework: Predictions apply to one task. World models generalize to many.
42K
Keyon Vafa
@keyonV
Jun 20, 2024
Replying to @keyonV
The map let us visually inspect the incoherent world model. But how should we evaluate world models in non-map settings? Our paper proposes new evaluation metrics for world model recovery.
71K
Keyon Vafa
@keyonV
Jul 11, 2025
Replying to @keyonV
Perhaps the most influential world model had its start as a predictive model. Before we had Newton's laws of gravity, we had Kepler's predictions of planetary orbits. Kepler's predictions led to Newton's laws. So what did Newton add?
43K
Keyon Vafa
@keyonV
Jul 11, 2025
Replying to @keyonV
Newton's laws are a kind of foundation model. They provide a place to start when working on new problems. A good foundation model should do the same. The No Free Lunch Theorem motivates a test: Every foundation model has an inductive bias. This bias reveals its world model.
39K
Keyon Vafa
@keyonV
Jul 11, 2025
Replying to @keyonV
We propose a method to measure these inductive biases. We call it an inductive bias probe. Two steps:
1. Fit a foundation model to many new, very small synthetic datasets.
2. Analyze patterns in the functions it learns to find the model's inductive bias.
37K
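The two steps in the post above can be illustrated with a toy sketch. This is not the paper's implementation: a ridge regression on a fixed feature map stands in for the foundation model, the "true state" is a made-up scalar, and every name here (`inductive_bias_probe`, `aligned_features`, `misaligned_features`) is hypothetical. The idea it shows is only the general one: fit the learner to many tiny synthetic datasets whose targets depend on the true state, then check how well the learned functions extrapolate.

```python
import numpy as np

def fit_ridge(features, targets, lam=1e-6):
    """Closed-form ridge regression (stand-in for fine-tuning a model)."""
    d = features.shape[1]
    return np.linalg.solve(features.T @ features + lam * np.eye(d),
                           features.T @ targets)

def inductive_bias_probe(featurize, inputs, state, n_datasets=200,
                         n_train=5, seed=0):
    """Mean held-out error over many tiny synthetic tasks.

    Step 1: each task's target is a random linear function of the true state,
            and the learner sees only n_train examples.
    Step 2: summarize how the learned functions behave away from those
            examples. A low error suggests the learner extrapolates along
            the true state.
    """
    rng = np.random.default_rng(seed)
    features = featurize(inputs)
    s = state(inputs)
    errors = []
    for _ in range(n_datasets):
        a, b = rng.normal(size=2)
        targets = a * s + b
        train = rng.choice(len(inputs), size=n_train, replace=False)
        held_out = np.setdiff1d(np.arange(len(inputs)), train)
        w = fit_ridge(features[train], targets[train])
        residual = features[held_out] @ w - targets[held_out]
        errors.append(np.mean(residual ** 2))
    return float(np.mean(errors))

def true_state(x):
    # Assumed toy state: the sum of the two coordinates.
    return x[:, 0] + x[:, 1]

def aligned_features(x):
    # A representation that exposes the true state.
    return np.column_stack([x[:, 0] + x[:, 1], np.ones(len(x))])

def misaligned_features(x):
    # A representation built around a different quantity.
    return np.column_stack([x[:, 0] - x[:, 1], np.ones(len(x))])

inputs = np.random.default_rng(1).uniform(-1.0, 1.0, size=(500, 2))
aligned_error = inductive_bias_probe(aligned_features, inputs, true_state)
misaligned_error = inductive_bias_probe(misaligned_features, inputs, true_state)
print(f"aligned: {aligned_error:.2e}  misaligned: {misaligned_error:.2e}")
```

In this toy setup both learners can fit the five training points, but only the one whose representation matches the true state extrapolates well, which is the kind of gap a probe of this sort is meant to expose.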
Keyon Vafa
@keyonV
Jul 11, 2025
Replying to @keyonV
We then fine-tuned the model on a larger scale, to predict forces across 10K solar systems. We used a symbolic regression to compare the recovered force law to Newton's law. It not only recovered a nonsensical law—it recovered different laws for different galaxies.
90K
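For reference, Newton's law of universal gravitation is F = G * m1 * m2 / r^2. The sketch below shows, under loud assumptions, what comparing a recovered force law to that form can look like. It swaps in an ordinary log-log least-squares fit for symbolic regression (which searches over expression forms rather than fitting fixed exponents), and the forces are synthetic values generated from Newton's law, not outputs of the model in the thread. The function name `fit_power_law` and the sampling ranges are hypothetical.

```python
import numpy as np

G = 6.674e-11  # gravitational constant in m^3 kg^-1 s^-2

def fit_power_law(m1, m2, r, force):
    """Least-squares fit of log F = c + a*log(m1) + b*log(m2) + p*log(r).

    Returns [c, a, b, p]; Newton's law corresponds to a = b = 1, p = -2,
    and exp(c) = G.
    """
    design = np.column_stack([np.ones_like(r), np.log(m1), np.log(m2), np.log(r)])
    coef, *_ = np.linalg.lstsq(design, np.log(force), rcond=None)
    return coef

# Synthetic two-body data (assumed ranges, sampled log-uniformly).
rng = np.random.default_rng(0)
m1 = 10.0 ** rng.uniform(24, 30, size=200)   # kg
m2 = 10.0 ** rng.uniform(22, 26, size=200)   # kg
r = 10.0 ** rng.uniform(10, 12, size=200)    # m
force = G * m1 * m2 / r**2

log_c, a, b, p = fit_power_law(m1, m2, r, force)
print(f"F ~ {np.exp(log_c):.3e} * m1^{a:.3f} * m2^{b:.3f} * r^{p:.3f}")
```

A model with a Newtonian world model should yield exponents close to these when its predicted forces are fed to such a fit; exponents that differ, or that change from one sample of systems to another, would signal the kind of inconsistency the post describes.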
Keyon Vafa
@keyonV
Jun 20, 2024
Replying to @keyonV
Why does it matter that the transformer has an incoherent world model? It still manages to find shortest paths. Incoherence implies fragility: we show the transformer's traversal capabilities break down when we add detours to the underlying map.
32K
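To make the detour idea concrete, here is a minimal sketch on a toy street grid. It is not the paper's detour procedure or its NYC data: the 4 x 4 grid, the breadth-first search, and the helper names (`grid_graph`, `shortest_path`, `close_street`) are all assumptions chosen for illustration. It shows what a planner with a coherent map can do when a street on its route is closed: re-plan and still arrive.

```python
from collections import deque

def grid_graph(n):
    """Undirected n x n grid; nodes are (row, col) intersections."""
    graph = {}
    for r in range(n):
        for c in range(n):
            candidates = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
            graph[(r, c)] = {(i, j) for i, j in candidates
                             if 0 <= i < n and 0 <= j < n}
    return graph

def shortest_path(graph, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in graph[node]:
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None

def close_street(graph, a, b):
    """Return a copy of the graph with the street between a and b removed."""
    new_graph = {node: set(neighbors) for node, neighbors in graph.items()}
    new_graph[a].discard(b)
    new_graph[b].discard(a)
    return new_graph

city = grid_graph(4)
route = shortest_path(city, (0, 0), (0, 3))

# Force a detour by closing the first street on the planned route.
detoured_city = close_street(city, route[0], route[1])
new_route = shortest_path(detoured_city, (0, 0), (0, 3))
print(len(route) - 1, "streets before the closure,", len(new_route) - 1, "after")
```

A model that has only memorized good routes, without a consistent map behind them, has no such fallback, which is one way to read the fragility the post describes.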
Keyon Vafa
@keyonV
Jul 11, 2025
Replying to @keyonV
Summary:
1. We propose inductive bias probes: a model's inductive bias reveals its world model.
2. Foundation models can have great predictions with poor world models.
3. One reason world models are poor: models group together distinct states that have similar allowed next-tokens.
20K
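The third point in the summary above can be pictured with a toy example. This is an illustration only, not an analysis from the paper: the grid world, the move names, and the helpers (`allowed_moves`, `group_by_allowed_moves`) are hypothetical. Each position on a 4 x 4 grid is a distinct state, but many positions permit exactly the same set of next moves, so a learner that keys only on which next tokens are legal cannot tell them apart.

```python
from collections import defaultdict

MOVES = {"N": (-1, 0), "S": (1, 0), "E": (0, 1), "W": (0, -1)}

def allowed_moves(n, row, col):
    """Set of moves that keep a walker inside an n x n grid."""
    return frozenset(
        name for name, (dr, dc) in MOVES.items()
        if 0 <= row + dr < n and 0 <= col + dc < n
    )

def group_by_allowed_moves(n):
    """Map each allowed-move set to the list of positions that share it."""
    groups = defaultdict(list)
    for row in range(n):
        for col in range(n):
            groups[allowed_moves(n, row, col)].append((row, col))
    return dict(groups)

groups = group_by_allowed_moves(4)
interior = groups[frozenset("NSEW")]
print(f"{sum(len(v) for v in groups.values())} states, "
      f"{len(groups)} distinct allowed-move sets")
print("states sharing all four moves:", interior)
```

All interior positions fall into one group here even though they are different places on the grid, which is the kind of collapse the summary refers to.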
Keyon Vafa
@keyonV
Jul 11, 2025
Replying to @keyonV
Paper: arxiv.org/abs/2507.06952
Co-authors: Peter Chang (@petergchang), Ashesh Rambachan (@asheshrambachan), Sendhil Mullainathan (@m_sendhil)
What Has a Foundation Model Found? Using Inductive Bias to Probe...
Foundation models are premised on the idea that sequence prediction can uncover deeper domain understanding, much like how Kepler's predictions of planetary motion later led to the discovery of...
23K
Keyon Vafa
@keyonV
Jun 20, 2024
Replying to @keyonV
Paper: arxiv.org/abs/2406.03689
Code: github.com/keyonvafa/worl…
Co-authors: Justin Chen (@justinychen), Jon Kleinberg, Sendhil Mullainathan (@m_sendhil), Ashesh Rambachan (@asheshrambachan)
Evaluating the World Model Implicit in a Generative Model
Recent work suggests that large language models may implicitly learn world models. How should we assess this possibility? We formalize this question for the case where the underlying reality is...
20K
Keyon Vafa
@keyonV
Jul 11, 2025
Replying to @keyonV
We apply these probes to orbital, lattice, and Othello problems. Starting with orbits: we encode solar systems as sequences and train a transformer on 10M solar systems (20B tokens). The model makes accurate predictions many timesteps ahead. Predictions for our solar system:
33K