TRI's latest Large Behavior Model (LBM) paper landed on arxiv last night! Check out our project website: toyotaresearchinstitute.github.io/lbm1/
One of our main goals for this paper was to put out a very careful and thorough study on the topic to help people understand the state of the
Russ Tedrake
30 posts
Professor at MIT, studying robotics. Founder of a stealth Physical AI startup.
Joined July 2022
- I'm super excited to start a great new collaboration with the fantastic team at Boston Dynamics. Scott Kuindersma and I chatted with Evan Ackerman about it earlier today.
- Very proud of Nicholas, who recently shared scalable-real2sim.github.io (for physics-quality assets from a small amount of interaction with a robot) and is now following up with his work on scene-level generation.
  Want to scale robot data with simulation, but don’t know how to get large numbers of realistic, diverse, and task-relevant scenes? Our solution: ➊ Pretrain on broad procedural scene data ➋ Steer generation toward downstream objectives 🌐 steerable-scene-generation.github.io 🧵1/8
  scalable-real2sim.github.io: Scalable Real2Sim: Physics-Aware Asset Generation Via Robotic Pick-and-Place Setups. A fully automated pipeline that generates simulation-ready assets for real-world objects, with no manual intervention required!
- This work really sharpened my thinking about sim+real cotraining.
  Learning from both sim+real data could scale robot imitation learning. But what are the scaling laws & principles of sim+real cotraining? We study this in the first focused analysis of sim+real cotraining spanning 250+ policies & 40k+ evals arxiv.org/abs/2503.22634 (1/6)
- Replying to @RussTedrake: Probably my favorite plot from the paper, which sums it all up, is this one. The plot compares performance using different amounts of pretraining data used before training a new task: 0% (aka single task), 25, 50, or 100% of TRI’s data, then 100% of TRI’s data + all of the
- Replying to @RussTedrake: The short version is: LBMs work! We see consistent and statistically significant improvements as we increase the amount of pretraining data. But doing the science is still hard; as a field we have more work to do to improve the statistical power of our experiments.
- Replying to @RussTedrake: Side note: I'm proud of the title of this paper, which we intentionally made pretty narrow/specific. I think that some of the most important work that we have to do as a field right now is careful empirical work to interrogate the properties of these models that we're creating.
- Replying to @RussTedrake: This was a massive effort by the entire team, with a number of individuals really pouring their hearts into this paper. The paper is packed full of (too many?) details. Your comments and feedback would be very welcome.
- Replying to @RussTedrake: In my mind, it's a bit like a biology paper that is focused on a particular animal model. I hope we'll learn more quickly from each other if we can make precise, substantiated claims about particular setups, so that as a field we can assemble those claims into a coherent picture.
- Replying to @anwesha_ac: Yes. Of course the distribution and quality of the data matter.
- Replying to @RussTedrake: One of the most interesting take-aways for me is that "high-performing policies need to know whether they are executing in sim or in real." A number of implications flow from that, including that sim+real cotraining can decrease performance if the visual gap is too small.
- Probably too late, but here's a notebook showing how to visualize it with graphviz: deepnote.com/workspace/Mani…





