Roozbeh Mottaghi


Senior Tech Lead
Meta Fundamental AI Research (FAIR) - Robotics

Email: roozbehm [at] gmail.com
 

Project Highlights


Ask-to-Act (2025)
Using RL to train MLLMs in settings where an agent both takes navigation actions and asks natural-language questions.
PARTNR (2025)
A benchmark and a suite of models for planning and reasoning in human-robot collaborative tasks. (Mark Zuckerberg announcement, AI at Meta channels, YouTube).
Habitat 3.0 (2024)
A simulator for humans and robots used to train and evaluate models performing collaborative tasks. (GitHub, TechCrunch)
World Models
ObjectForesight (2026) and ODIN (2024): world models for objects and scenes in 3D space.
Track2Act (2024)
Learning robot manipulation from large-scale internet videos.
GOAT (2024)
Go to AnyThing (GOAT) is a robot navigation model that accepts goals specified as images, object categories, or natural-language descriptions.
Unified-IO (2023)
A multi-modal model that unifies tasks with different types of inputs and outputs.
ProcTHOR (2022)
Large-scale Embodied AI using procedural generation. Won the NeurIPS 2022 Outstanding Paper Award.
OK-VQA (2022)
OK-VQA and A-OKVQA are popular benchmarks for visual question answering that requires reasoning and world knowledge.
Self-adaptation
SAVN (2018) and Interactron (2020) are self-adaptive models that use meta-learning to continue training during test-time interactions.
DECADE (2018)
Learning representations from a dataset we collected of dog behavior, captured with cameras and motion sensors mounted on the dog's body. (TechCrunch, MIT Technology Review, NBC News)
AI2-THOR (2017)
A robot simulator developed for training and evaluating models designed for navigation, manipulation, and other robotics tasks. (GitHub, IEEE Spectrum, CBC News)
RL for Navigation
Reinforcement Learning (RL) for training a robot navigation model.
ForScene (2016)
Predicting the future movements of objects purely from visual observations. (MIT Technology Review)

Highlights and News


Nov, 2025:
Giving invited talks at the Holistic Video Understanding and Vision Language Models: Challenges of Real-World Deployment workshops at NeurIPS 2025. I will talk about generating and understanding the 3D world and human-agent collaborative planning.
Nov, 2025:
Giving an invited talk at the CSE Colloquium at UCSC. I will talk about mitigating data scarcity in robotics via simulation.
Nov, 2025:
Serving as the lead Area Chair for CVPR 2026 and Area Chair for ICLR 2026.
Oct, 2025:
Giving invited talks at the Human-aware Embodied AI and AI Meets Autonomy: Vision, Language, and Autonomous Systems workshops at IROS 2025, and at the Human-Robot-Scene Interaction and Collaboration workshop at ICCV 2025.
Jun, 2025:
Giving an invited talk at the Generative Modeling Meets Human-Robot Interaction workshop at RSS 2025.
Jun, 2025:
Giving invited talks at the 3D-LLM/VLA: Bridging Language, Vision and Action in 3D Environments and Embodied Humans: Symbiotic Intelligence between Virtual Humans and Humanoid Robots workshops at CVPR 2025.
Apr, 2025:
Giving an invited talk in VCR/AI seminars at Simon Fraser University.
Older items...

About Me

I am a Senior AI Research Scientist Manager at FAIR and an Affiliate Associate Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington.

Prior to joining FAIR, I was the Research Manager of the PRIOR team at the Allen Institute for AI. Before that, I was a Postdoctoral Researcher in the Computer Science Department at Stanford University. I received a Ph.D. in Computer Science from UCLA, advised by Alan Yuille. I obtained my Master's degrees from Simon Fraser University and the Georgia Institute of Technology and my Bachelor's degree from Sharif University of Technology.

Students and Interns

I have had the pleasure of working with the following students, Pre-doctoral Young Investigators (PYIs, also known as residents), and interns.

Interns

Publications

Press Coverage