Are world models necessary to achieve human-level agents, or is there a model-free short-cut?
Our new #ICML2025 paper tackles this question from first principles and finds a surprising answer: agents _are_ world models… 🧵
Turns out there’s a neat answer to this question. We prove that any agent capable of generalizing to a broad range of simple goal-directed tasks must have learned a predictive model capable of simulating its environment. And this model can always be recovered from the agent.
Our paper, 'robust agents learn causal world models', got an honourable mention in the outstanding paper awards at #ICLR2024. Check out our talk in 20 mins in hall A3, or come chat with @tom4everitt and me at the poster session after!
Specifically, we show it’s possible to recover a bounded-error approximation of the environment transition function from any goal-conditioned policy that satisfies a regret bound across a wide enough set of simple goals, like steering the environment into a desired state.
No model-free path. If you want to train an agent capable of a wide range of goal-directed tasks, you can’t avoid the challenge of learning a world model. And to improve performance or generality, agents need to learn increasingly accurate and detailed world models.
And to achieve lower regret, or more complex goals, agents must learn increasingly accurate world models. Goal-conditioned policies are informationally equivalent to world models! But only for goals over multi-step horizons: myopic agents do not need to learn world models.
World models are foundational to goal-directedness in humans, but are hard to learn in messy open worlds. We're now seeing generalist, model-free agents (Gato, PaLM-E, Pi-0…). Do these agents learn implicit world models, or have they found another way to generalize to new tasks?
Nice to see our work 'counterfactual harm' is a highlighted @DeepMind paper at #NeurIPS2022 this year. Interesting omen that 3 of the 9 highlighted papers use causality (all of them in the responsible AI category).
Going to @NeurIPSConf?
We’ll be presenting our latest research including:
🔵 Language models Chinchilla and Flamingo
🔵 New papers on algorithmic advances and optimising #RL
🔵 How we’re developing ethical and fair AI systems
And much more: dpmd.ai/neurips-tw #NeurIPS2022
Fundamental limitations on agency. In environments where the dynamics are provably hard to learn, or where long-horizon prediction is infeasible, the capabilities of agents are fundamentally bounded.
Causality. In previous work we showed a causal world model is needed for robustness. It turns out you don’t need as much causal knowledge of the environment for task generalization. There is a causal hierarchy, but for agency and agent capabilities, rather than inference!
Extracting world knowledge from agents. We derive algorithms that recover a world model given the agent’s policy and goal (policy + goal -> world model). These algorithms complete the triptych of planning (world model + goal -> policy) and IRL (world model + policy -> goal).
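To give a feel for the policy + goal -> world model direction, here is a toy sketch in Python. It is an illustration under simplifying assumptions, not the algorithm from the paper: the agent is assumed to be perfectly optimal, the environment is reduced to a single hidden transition probability, and every name (`make_optimal_agent`, `recover_probability`, the "at most k successes in n trials" goal family) is hypothetical. The idea is to offer the agent a choice between two complementary multi-step goals and read the hidden probability off the point where its choice flips.

```python
from math import comb


def prob_at_most(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def make_optimal_agent(p_true):
    """Hypothetical optimal agent: it knows p_true, we only see its choices."""

    def policy(k, n):
        # Composite goal: "at most k successes in n trials" OR "more than k".
        # An optimal agent commits to whichever sub-goal is more likely.
        return "at_most" if prob_at_most(k, n, p_true) >= 0.5 else "more_than"

    return policy


def recover_probability(policy, n=200):
    """Estimate the hidden success probability by querying the policy only."""
    lo, hi = 0, n  # the smallest k where the agent picks "at_most" is in [lo, hi]
    while lo < hi:
        mid = (lo + hi) // 2
        if policy(mid, n) == "at_most":
            hi = mid
        else:
            lo = mid + 1
    # lo is a median of Binomial(n, p_true), which lies within 1 of n * p_true.
    return lo / n


if __name__ == "__main__":
    agent = make_optimal_agent(p_true=0.3)  # 0.3 is an arbitrary illustrative value
    print(recover_probability(agent, n=200))
```

In this toy setting the estimate is within 1/n of the hidden probability, so longer-horizon goals give a tighter recovered model. Handling agents that are only approximately optimal (a regret bound rather than exact optimality) is where the real analysis is needed, and this sketch does not attempt it.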
Emergent capabilities. To minimize training loss across many goals, agents must learn a world model, which can solve tasks the agent was not explicitly trained on. Simple goal-directedness gives rise to many capabilities (social cognition, reasoning about uncertainty, intent…).