π0.5 driving autonomous anastomosis!
Physical Intelligence
Physical Intelligence (Pi), bringing AI into the physical world.
- Replying to @physical_int: To find out more, check out our blog post, videos, and full-length research paper:
- Replying to @physical_int: We are still discovering what π0.7 can do. It's fun to play with and the results so far have been quite surprising!
- Replying to @physical_int: π0.7 handles diverse prompts that say not just what to do but also how to do it, using rich language and multimodal information such as visual subgoal images. At test time, these images can be produced by a lightweight world model.
- Replying to @physical_int: Compositional generalization is a key capability of large models like LLMs, but it has been elusive in robotics. Another emergent ability we found is controlling a new robot (UR5e) to fold t-shirts, even though we didn't have any laundry-folding data on this robot.
- Replying to @physical_int: We are especially excited about how π0.7 seems to exhibit emergent compositional generalization: based on the prompt, it can combine skills it has learned in new ways, for example to figure out how to use an air fryer to cook a sweet potato.
- Our newest model, π0.7, has some interesting emergent capabilities: it can fold shirts on a new robot for which we had no shirt-folding data, figure out how to use an appliance with language-based coaching, and perform a wide range of dexterous tasks, all in one model!
- Physical Intelligence reposted: π, But Make It Fly ✈️ We fine-tuned π0, a VLA model pretrained entirely on manipulators, to fly a drone that picks up objects, navigates through gates, and composes both skills from language commands.
- Replying to @physical_int: To learn more about RLT, check out our blog post:
- Replying to @physical_int: With RL, the robot can learn very precise tasks, like fastening a zip tie, and can actually do them more consistently and more quickly than even human teleoperation.
- Replying to @physical_int: While the whole model takes a long time to train, with RLT we can adapt individual precise stages with as little as 15 minutes of robot data.
- Replying to @physical_int: We use RLT to fine-tune the most precise and critical stage of delicate tasks, such as using a screwdriver to attach a cover to one of our robot arms.
- Replying to @physical_int: The key idea with RL tokens (RLT) is to compress our model’s (e.g., π-0.6) internal representations into a concise feature vector, which can be used by very small actor and critic networks that train in real time even as the robot is practicing the task.
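
The data flow described in the RL token post can be illustrated with a rough sketch. Everything here is an assumption made for illustration, not the actual RLT implementation: the dimensions, the fixed random projection standing in for the learned compression, the `SmallMLP` class, and the random placeholder embedding are all hypothetical. The sketch shows only the forward pass (large-model features, then a compact token, then a tiny actor and critic); no training loop is included.

```python
import math
import random

random.seed(0)

# All sizes below are illustrative assumptions, not values from the posts.
EMBED_DIM = 64    # stand-in for the large model's internal representation size
TOKEN_DIM = 8     # size of the compressed "RL token" feature vector
ACTION_DIM = 7    # e.g. a 7-DoF arm command
HIDDEN = 16       # hidden width of the small actor and critic


def random_matrix(rows, cols):
    """Random Gaussian matrix, scaled so outputs stay in a reasonable range."""
    scale = 1.0 / math.sqrt(rows)
    return [[random.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(x, w):
    """Compute x @ w for a vector x (length rows) and a rows-by-cols matrix w."""
    return [sum(xi * row[j] for xi, row in zip(x, w)) for j in range(len(w[0]))]


def tanh_vec(v):
    return [math.tanh(a) for a in v]


# Hypothetical compressor: a fixed random projection used as a placeholder.
W_compress = random_matrix(EMBED_DIM, TOKEN_DIM)


def rl_token(embedding):
    """Compress a large embedding into a small feature vector."""
    return tanh_vec(matvec(embedding, W_compress))


class SmallMLP:
    """Two-layer network without biases, small enough to update quickly."""

    def __init__(self, in_dim, hidden, out_dim):
        self.w1 = random_matrix(in_dim, hidden)
        self.w2 = random_matrix(hidden, out_dim)

    def __call__(self, x):
        return matvec(tanh_vec(matvec(x, self.w1)), self.w2)


actor = SmallMLP(TOKEN_DIM, HIDDEN, ACTION_DIM)        # token -> action
critic = SmallMLP(TOKEN_DIM + ACTION_DIM, HIDDEN, 1)   # (token, action) -> value

# Placeholder for features that would come from the frozen large model.
embedding = [random.gauss(0.0, 1.0) for _ in range(EMBED_DIM)]

token = rl_token(embedding)
action = tanh_vec(actor(token))          # squash actions into [-1, 1]
q_value = critic(token + action)[0]      # list concatenation, then scalar value
```

Because the actor and critic only ever see the small token, they have far fewer parameters than the large model, which is consistent with the post's point that they can train while the robot practices.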





