Introducing RT-1, a robotics model that can execute over 700 instructions in the real world at a 97% success rate!
Generalizes to new tasks✅
Robust to new environments and objects✅
Fast inference for real time control✅
Can absorb multi-robot data✅
Powers SayCan✅
🧵👇
Super excited to introduce SayCan (say-can.github.io): 1st publication of a large effort we've been working on for 1+ years
Robots ground large language models in reality by acting as their eyes and hands while LLMs help robots execute long, abstract language instructions
Have you ever “heard” yourself talk in your head? Turns out it's a useful tool for robots too!
Introducing Inner Monologue: feeding continual textual feedback into LLMs allows robots to articulate a grounded “thought process” to execute long, abstract instructions 🧵👇
Very excited to announce our largest deep RL deployment to date: robots sorting trash end-to-end in real offices!
rl-at-scale.github.io (aka RLS)
This project took a long time (started before SayCan/RT-1/other newer works) but the learnings from it have been really valuable.🧵
PaLM-E or GPT-4 can speak in many languages and understand images. What if they could speak robot actions?
Introducing RT-2 (robotics-transformer2.github.io): our new model that uses a VLM backbone (up to 55B params) and fine-tunes it to directly output robot actions!
Super excited to announce that I've started as an Adjunct Professor @Stanford!
I'll continue to work @GoogleAI but I'll also be spending some time at Stanford, where I'll be co-advising a few students and continuing to co-teach CS 330 (cs330.stanford.edu) 🧑‍🏫