Kyle Stachowicz (@KyleStachowicz) / X

102 posts

Kyle Stachowicz

@KyleStachowicz

Robot learning @berkeley_ai @physical_int

Berkeley, CA

Joined August 2018

Kyle Stachowicz
@KyleStachowicz
Jan 30, 2025
R1's RL findings are great news for reasoning but grim for robotics. All the major takeaways (ground-truth reward, great base models, grouped rollouts from same initial state, sample-inefficient on-policy algos) are really hard to translate to the physical world.
74K
Kyle Stachowicz
@KyleStachowicz
Jan 16, 2025
How can we train high-frequency generalist robot policies with next-token prediction? In new work with @KarlPertsch/@physical_int we achieve SoTA generalist VLAs with way less training! Highlight: we train a VLA on DROID that can do zero-shot language tasks in the wild!
19K
Kyle Stachowicz
@KyleStachowicz
May 15, 2024
Sample-efficient RL makes it possible to train robots in the real world, but letting a robot loose in the wild can damage both the robot and the world. How can we mitigate this? My new work with @svlevine tackles this problem from the perspective of epistemic uncertainty. 1/🧵
57K
Kyle Stachowicz
@KyleStachowicz
Jan 31, 2025
Replying to @ChenTessler
Yup, that’s the subtext here. But sim has its own challenges: IMO getting a sim with diversity + fidelity mirroring the real world is way harder than people think (the most obvious way around this is real2sim-type approaches).
3.8K
Kyle Stachowicz
@KyleStachowicz
Feb 4, 2025
The π0-FAST model is also included in this release (both base and a model finetuned for DROID), check it out if you've got access to a DROID setup!
Physical Intelligence
@physical_int
Feb 4, 2025
Many of you asked for code & weights for π₀, we are happy to announce that we are releasing π₀ and pre-trained checkpoints in our new openpi repository! We tested the model on a few public robots, and we include code for you to fine-tune it yourself.
1.9K
Kyle Stachowicz
@KyleStachowicz
Jan 13, 2025
Check out this work w/@carlo_sferrazza @oier_mees @Joshua_W_Jones @svlevine @pabbeel: fine-tuning VLA policies with multimodal inputs (vision, touch, audio) to generate sensory descriptions + robot actions! I’ve been pretty excited about VLA models lately, more to come soon :)
7.3K
Kyle Stachowicz
@KyleStachowicz
Feb 1, 2025
Replying to @harshit_sikchi
When you want funding
409
Kyle Stachowicz
@KyleStachowicz
Jan 28, 2025
Excited to announce that we'll be presenting FuSe at ICRA 2025!
2.1K
Kyle Stachowicz
@KyleStachowicz
Jul 15, 2024
I'm at #RSS2024 in Delft this week - I'll be presenting my work on risk-sensitive RL (x.com/KyleStachowicz…) Wednesday and some new work on online RL+robot foundation models @ the lifelong learning workshop Friday! If you're at RSS and want to chat about RL, let me know!
1.3K
Kyle Stachowicz
@KyleStachowicz
May 15, 2024
Replying to @KyleStachowicz
Learning in the real world requires dealing with both aleatoric uncertainty (inherent stochasticity) & epistemic uncertainty (insufficient data/underfitting). Distributional RL already handles aleatoric uncertainty in returns, but it ignores epistemic uncertainty. 2/🧵
1.5K
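The aleatoric/epistemic split in the post above can be illustrated with a small sketch. It is not taken from the thread or the paper: it assumes a hypothetical ensemble whose members each predict a Gaussian return distribution, and it uses the law of total variance to separate the average within-member spread (aleatoric) from the disagreement between member means (epistemic). The function name `decompose_uncertainty` and the numbers are made up for illustration.

```python
import numpy as np

def decompose_uncertainty(means, stds):
    """Split the variance of an equal-weight Gaussian mixture into two parts.

    Law of total variance: Var[Z] = E[Var[Z | member]] + Var[E[Z | member]].
    """
    means = np.asarray(means, dtype=float)
    stds = np.asarray(stds, dtype=float)
    aleatoric = float(np.mean(stds ** 2))  # average within-member variance
    epistemic = float(np.var(means))       # variance of the member means
    return aleatoric, epistemic

# Hypothetical ensemble of three return predictions for one state-action pair.
agree = decompose_uncertainty(means=[1.0, 1.0, 1.0], stds=[0.5, 0.5, 0.5])
disagree = decompose_uncertainty(means=[0.0, 1.0, 2.0], stds=[0.5, 0.5, 0.5])

print("members agree:    aleatoric=%.3f epistemic=%.3f" % agree)
print("members disagree: aleatoric=%.3f epistemic=%.3f" % disagree)
```

A single distributional critic only reports the first term; the second term shows up only when several members are compared.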
Kyle Stachowicz
@KyleStachowicz
Jan 30, 2025
Replying to @lukas_m_ziegler and @AmbiRobotics
P.P.S., XYZ+yaw = 4 axes 😉
131
Kyle Stachowicz
@KyleStachowicz
Jan 31, 2025
Replying to @simon_zhai and @deepseek_ai
Proposal: raise $5m and give it to Schmidhuber, if he can’t match R1 performance he has to stop making these threads
430
Kyle Stachowicz
@KyleStachowicz
May 15, 2024
Replying to @KyleStachowicz
One nice property of CVaR is convexity: the CVaR of the mixture of two distributions is pessimistic when they disagree (vs. the average of the CVaRs taken independently). This means that distributional ensembles+CVaR penalize both aleatoric & epistemic uncertainty! 5/🧵
1.3K
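The convexity claim in the post above can be checked numerically. The sketch below is an illustration under assumptions, not the implementation from the thread: it takes CVaR to be the mean of the worst alpha-fraction of sampled returns, models two ensemble members as Gaussians with made-up parameters, and compares the CVaR of their equal-weight mixture with the average of their individual CVaRs.

```python
import numpy as np

def cvar(samples, alpha):
    """Lower-tail CVaR: mean of the worst alpha-fraction of sampled returns."""
    x = np.sort(np.asarray(samples, dtype=float))
    k = max(1, int(np.ceil(alpha * x.size)))
    return float(x[:k].mean())

def mixture_gap(a, b, alpha):
    """Average of the two CVaRs minus the CVaR of the equal-weight mixture."""
    cvar_average = 0.5 * (cvar(a, alpha) + cvar(b, alpha))
    cvar_mixture = cvar(np.concatenate([a, b]), alpha)
    return cvar_average - cvar_mixture

rng = np.random.default_rng(0)
n, alpha = 100_000, 0.1  # hypothetical sample count and risk level

# Two members that agree: same return distribution.
gap_agree = mixture_gap(rng.normal(0.0, 1.0, n), rng.normal(0.0, 1.0, n), alpha)

# Two members that disagree on the mean return (epistemic uncertainty).
gap_disagree = mixture_gap(rng.normal(0.0, 1.0, n), rng.normal(3.0, 1.0, n), alpha)

print(f"gap when members agree:    {gap_agree:.3f}")
print(f"gap when members disagree: {gap_disagree:.3f}")
```

When the members agree, the gap is close to zero. When they disagree, the mixture's lower tail is dominated by the more pessimistic member, so the mixture CVaR falls well below the average of the individual CVaRs.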
Kyle Stachowicz
@KyleStachowicz
May 15, 2024
Replying to @KyleStachowicz
Also, I'm excited to announce that we will be presenting RACER at RSS 2024 in Delft, Netherlands!
1.3K