I'm a senior research scientist on the core Gemini team at Google DeepMind.
My research focuses on the intersection of scaling and imitation/reinforcement learning for long-horizon & open-ended agent tasks. I'm especially interested in training sandboxes and modeling recipes that can enable LMs & VLMs to solve, and/or provide assistance on, tasks that take humans tens to thousands of hours to complete.
Previously, I was a PhD student at NYU CILVR, where I was advised by Rob Fergus and Lerrel Pinto. During my PhD, I worked on agent post-training for production MoE LLMs (Meta Llama Team, Meta FAIR), efficient distillation algorithms for Phi models (Microsoft), and neural nets for solving PDEs (Google Research). Before that, I did my undergrad in mathematics and computer science at MIT, during which I was exceptionally lucky to be mentored by Kelsey R. Allen and Josh Tenenbaum.
Once upon a time, I was a design assistant to the director of the Exhibitions Lab of the American Museum of Natural History.
LLMs are becoming increasingly capable agents. I'm interested in the data, algorithms, and environments that will enable models to autonomously complete and/or collaborate with humans on tasks that lie at the "edge of simulation," e.g. solving open problems in mathematics, developing maintainable & reliable software, or ascending in NetHack.
Unlike question-answering and short-horizon tool use, such tasks make it difficult (and in some cases impossible) to collect demonstration data or to train models with "vanilla" online RL.
During my PhD, I worked towards this setting by: