Keyon Vafa
1,406 posts
Keyon Vafa
@keyonV
Postdoctoral fellow at @Harvard_Data | Former computer science PhD with @Blei_Lab at @Columbia University | Researching AI + implicit world models
keyonvafa.com
Joined August 2011
888
Following
4,718
Followers
  • Pinned
    Keyon Vafa
    @keyonV
    Jul 11, 2025
    Can an AI model predict perfectly and still have a terrible world model? What would that even mean? Our new ICML paper formalizes these questions. One result tells the story: a transformer trained on 10M solar systems nails planetary orbits. But it botches gravitational laws 🧵
    1.4M
  • Keyon Vafa
    @keyonV
    Jun 20, 2024
    New paper: How can you tell if a transformer has the right world model? We trained a transformer to predict directions for NYC taxi rides. The model was good. It could find shortest paths between new points. But had it built a map of NYC? We reconstructed its map and found this:
    Image
    841K
  • Keyon Vafa
    @keyonV
    Jul 11, 2025
    Replying to @keyonV
    Our paper aims to answer two questions: 1. What's the difference between prediction and world models? 2. Are there straightforward metrics that can test this distinction? Our paper is about AI. But it's helpful to go back 400 years to answer these questions.
    71K
  • Keyon Vafa
    @keyonV
    Jul 11, 2025
    Replying to @keyonV
    If you only care about orbits, Newton didn't add much. His laws give the same predictions. But Newton's laws went beyond orbits: the same laws explain pendula, cannonballs, and rockets. This motivates our framework: Predictions apply to one task. World models generalize to many
    42K
  • Keyon Vafa
    @keyonV
    Jun 20, 2024
    Replying to @keyonV
    The map let us visually inspect the incoherent world model. But how should we evaluate world models in non-map settings? Our paper proposes new evaluation metrics for world model recovery.
    71K
  • Keyon Vafa
    @keyonV
    Jul 11, 2025
    Replying to @keyonV
    Perhaps the most influential world model had its start as a predictive model. Before we had Newton's laws of gravity, we had Kepler's predictions of planetary orbits. Kepler's predictions led to Newton's laws. So what did Newton add?
    43K
  • Keyon Vafa
    @keyonV
    Jul 11, 2025
    Replying to @keyonV
    Newton's laws are a kind of foundation model. They provide a place to start when working on new problems. A good foundation model should do the same. The No Free Lunch Theorem motivates a test: Every foundation model has an inductive bias. This bias reveals its world model.
    39K
  • Keyon Vafa
    @keyonV
    Jul 11, 2025
    Replying to @keyonV
    We propose a method to measure these inductive biases. We call it an inductive bias probe. Two steps: 1. Fit a foundation model to many new, very small synthetic datasets 2. Analyze patterns in the functions it learns to find the model's inductive bias
    37K
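The two steps above can be illustrated with a toy sketch. This is not the paper's implementation: the learner `nearest_neighbor_on`, the function `probe_inductive_bias`, and the synthetic data are all hypothetical stand-ins, and a simple nearest-neighbor rule takes the place of a fine-tuned foundation model. The idea shown is only the general one: fit a learner to many tiny synthetic tasks whose targets depend on the true state, then check whether its extrapolations agree more for inputs that share a state than for inputs that do not.

```python
import numpy as np


def nearest_neighbor_on(column):
    """Hypothetical stand-in learner: 1-nearest-neighbor on one input column."""
    def fit(x_train, y_train):
        def predict(x):
            # Distance from every query point to every training point.
            dist = np.abs(x[:, None, column] - x_train[None, :, column])
            return y_train[dist.argmin(axis=1)]
        return predict
    return fit


def probe_inductive_bias(fit_fn, inputs, states, n_datasets=200, n_train=4, seed=0):
    """Toy probe: returns mean prediction correlation for same-state and
    different-state input pairs across many small synthetic tasks."""
    rng = np.random.default_rng(seed)
    n = len(inputs)
    n_states = int(states.max()) + 1
    preds = np.empty((n_datasets, n))
    for d in range(n_datasets):
        # Step 1: a new synthetic task whose target is a random function of state,
        # observed on only a handful of training points.
        state_values = rng.normal(size=n_states)
        targets = state_values[states]
        train_idx = rng.choice(n, size=n_train, replace=False)
        predict = fit_fn(inputs[train_idx], targets[train_idx])
        preds[d] = predict(inputs)
    # Step 2: look for patterns in how the learner extrapolates.
    corr = np.corrcoef(preds.T)
    same = states[:, None] == states[None, :]
    off_diagonal = ~np.eye(n, dtype=bool)
    return corr[same & off_diagonal].mean(), corr[~same].mean()


# Assumed toy data: column 0 tracks the true state, column 1 is unrelated noise.
rng = np.random.default_rng(1)
states = np.repeat(np.arange(3), 10)
inputs = np.column_stack([states + 0.01 * rng.normal(size=30), rng.uniform(size=30)])

same_a, diff_a = probe_inductive_bias(nearest_neighbor_on(0), inputs, states)
same_b, diff_b = probe_inductive_bias(nearest_neighbor_on(1), inputs, states)
gap_a = same_a - diff_a  # learner whose bias follows the true state
gap_b = same_b - diff_b  # learner whose bias follows the nuisance feature
print(f"state-aligned learner gap: {gap_a:.2f}, nuisance learner gap: {gap_b:.2f}")
```

A learner whose inductive bias tracks the true state should show a much larger gap between same-state and different-state agreement than one that extrapolates along an unrelated feature.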
  • Keyon Vafa
    @keyonV
    Jul 11, 2025
    Replying to @keyonV
    We then fine-tuned the model on a larger scale, to predict forces across 10K solar systems. We used a symbolic regression to compare the recovered force law to Newton's law. It not only recovered a nonsensical law—it recovered different laws for different galaxies.
    90K
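For reference, the benchmark in that comparison is Newton's law of universal gravitation. In its standard textbook form, with symbols chosen here for illustration, the magnitude of the force between two bodies of masses $m_1$ and $m_2$ separated by a distance $r$ is:

$$F = G \, \frac{m_1 m_2}{r^2}$$

where $G$ is the gravitational constant. A model with the right world model would be expected to yield a recovered law of this single form across all solar systems.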
  • Keyon Vafa
    @keyonV
    Jun 20, 2024
    Replying to @keyonV
    Why does it matter that the transformer has an incoherent world model? It still manages to find shortest paths. Incoherence implies fragility: we show the transformer's traversal capabilities break down when we add detours to the underlying map.
    32K
  • Keyon Vafa
    @keyonV
    Jul 11, 2025
    Replying to @keyonV
    Summary: 1. We propose inductive bias probes: a model's inductive bias reveals its world model 2. Foundation models can have great predictions with poor world models 3. One reason world models are poor: models group together distinct states that have similar allowed next-tokens
    20K
  • Keyon Vafa
    @keyonV
    Jul 11, 2025
    Replying to @keyonV
    Paper: arxiv.org/abs/2507.06952 Co-authors: Peter Chang (@petergchang), Ashesh Rambachan (@asheshrambachan), Sendhil Mullainathan (@m_sendhil)
    arxiv.org
    What Has a Foundation Model Found? Using Inductive Bias to Probe...
    Foundation models are premised on the idea that sequence prediction can uncover deeper domain understanding, much like how Kepler's predictions of planetary motion later led to the discovery of...
    23K
  • Keyon Vafa
    @keyonV
    Jun 20, 2024
    Replying to @keyonV
    Paper: arxiv.org/abs/2406.03689 Code: github.com/keyonvafa/worl… Co-authors: Justin Chen (@justinychen), Jon Kleinberg, Sendhil Mullainathan (@m_sendhil), Ashesh Rambachan (@asheshrambachan)
    arxiv.org
    Evaluating the World Model Implicit in a Generative Model
    Recent work suggests that large language models may implicitly learn world models. How should we assess this possibility? We formalize this question for the case where the underlying reality is...
    20K
  • Keyon Vafa
    @keyonV
    Jul 11, 2025
    Replying to @keyonV
    We apply these probes to orbital, lattice, and Othello problems. Starting with orbits: we encode solar systems as sequences and train a transformer on 10M solar systems (20B tokens). The model makes accurate predictions many timesteps ahead. Predictions for our solar system:
    Image
    33K