
Large-Scale Simulation Enables Zero-Shot Manipulation

Code and website for "MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation".

Getting started

MolmoBot policies demonstrate strong sim-to-real transfer to a wide variety of novel scenes, objects, and camera viewpoints. Try it out for yourself on your DROID platform with MolmoBot-DROID!

MolmoBot-DROID uses only the wrist camera and one exo camera. Don't worry about exact camera placement; MolmoBot policies are robust to arbitrary camera viewpoints!

Trying it out in simulation

See here to try out MolmoBot interactively! Modify the scene and task to test policy behavior.

Set up and run MolmoBot-DROID

  1. Set up MolmoBot-DROID by following the installation instructions.

  2. See these instructions for details on setting up and running the policy on your DROID! Any existing DROID or Polymetis setup will work.

    Briefly, after starting the Polymetis robot and gripper servers:

    # In one terminal
    cd MolmoBot/MolmoBot
    source .venv/bin/activate
    PYTHONPATH=. python launch_scripts/serve_molmo.py --hf-repo allenai/MolmoBot-DROID --action-type joint_pos
    # In another terminal
    cd MolmoBot/robot_eval
    conda activate molmobot
    python scripts/droid/run_policy.py robot.robot_host=<nuc_ip> robot.cameras.wrist_camera.id=<wrist_id> robot.cameras.exo_camera_1.id=<exo_id> task="put the red mug in the black bowl"

Using MolmoBot Data

To use MolmoBot-Data for training experiments, you will need to download it from Hugging Face using bulk_download.py.

Data postprocessing

Before using any dataset implementations in this repo, you will need to run a postprocessing script. It filters out corrupted trajectories and can optionally check that certain objects are visible from a given camera. Example usage:

python validate_trajectories.py RBY1OpenDataGenConfig/part0/train --check-visibility head_camera door_handle

python validate_trajectories.py RBY1PickAndPlaceDataGenConfig/part0/train --check-visibility head_camera pickup_obj --check-visibility head_camera place_receptacle

python validate_trajectories.py FrankaPickAndPlaceOmniCamConfig/part0/train --check-visibility droid_shoulder_light_randomization pickup_obj --check-visibility droid_shoulder_light_randomization place_receptacle
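The filtering that validate_trajectories.py performs can be sketched as follows. This is an illustrative, hypothetical implementation, not the repo's actual code: the function name, the trajectory dict layout, and the `corrupted`/`visible` fields are all assumptions made for the example. It mirrors the repeatable `--check-visibility <camera> <object>` flag, where a trajectory is kept only if it is uncorrupted and every required object is visible from its camera.

```python
# Hypothetical sketch of trajectory postprocessing: drop corrupted
# trajectories, then drop any that fail a (camera, object) visibility
# check. The real logic lives in validate_trajectories.py.

def validate_trajectories(trajectories, visibility_checks=()):
    """Keep uncorrupted trajectories that pass all visibility checks.

    trajectories: list of dicts with hypothetical keys
        "corrupted" (bool) and "visible" (camera -> set of visible objects).
    visibility_checks: iterable of (camera, object) pairs, one per
        --check-visibility flag on the CLI.
    """
    valid = []
    for traj in trajectories:
        if traj.get("corrupted", False):
            continue  # filter out corrupted trajectories
        visible = traj.get("visible", {})
        # Every requested object must be visible from its camera.
        if all(obj in visible.get(cam, set()) for cam, obj in visibility_checks):
            valid.append(traj)
    return valid


if __name__ == "__main__":
    trajs = [
        {"id": 0, "corrupted": True},
        {"id": 1, "visible": {"head_camera": {"door_handle"}}},
        {"id": 2, "visible": {"head_camera": set()}},
    ]
    kept = validate_trajectories(trajs, [("head_camera", "door_handle")])
    print([t["id"] for t in kept])  # only trajectory 1 survives both checks
```

Passing multiple `--check-visibility` flags, as in the pick-and-place examples above, corresponds to multiple (camera, object) pairs here, all of which must pass.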

Data statistics

Before training (and after data postprocessing), you should also calculate aggregate statistics with calculate_stats.py. Example usage:

python calculate_stats.py FrankaPickAndPlaceOmniCamConfig/part0/train --keys actions obs/agent/qpos

python calculate_stats.py RBY1OpenDataGenConfig/part0/train --keys actions obs/agent/qpos

python calculate_stats.py RBY1PickAndPlaceDataGenConfig/part0/train --keys actions obs/agent/qpos
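The statistics computed by calculate_stats.py are commonly per-dimension means and standard deviations for each requested key (e.g. `actions`, `obs/agent/qpos`), used to normalize inputs and targets during training. The sketch below is an assumption about what such a script computes, with a hypothetical in-memory dataset layout; the real script's output format may differ.

```python
# Hypothetical sketch of aggregate statistics over a dataset: for each
# requested key, compute a per-dimension mean and standard deviation
# across all timesteps of all trajectories.
import math

def calculate_stats(dataset, keys):
    """dataset: list of trajectory dicts mapping key -> list of
    fixed-length per-step vectors. Returns {key: {"mean": [...], "std": [...]}}.
    """
    stats = {}
    for key in keys:
        # Flatten every timestep from every trajectory for this key.
        rows = [step for traj in dataset for step in traj[key]]
        dims = len(rows[0])
        mean = [sum(r[d] for r in rows) / len(rows) for d in range(dims)]
        # Population standard deviation per dimension.
        std = [
            math.sqrt(sum((r[d] - mean[d]) ** 2 for r in rows) / len(rows))
            for d in range(dims)
        ]
        stats[key] = {"mean": mean, "std": std}
    return stats


if __name__ == "__main__":
    dataset = [{"actions": [[0.0, 2.0], [2.0, 2.0]]}]
    print(calculate_stats(dataset, ["actions"]))
```

A policy would then standardize each vector as `(x - mean) / std` before training, which is why these statistics must be recomputed whenever the postprocessing step changes which trajectories survive.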

BibTeX

@misc{deshpande2026molmobot,
      title={MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation},
      author={Abhay Deshpande and Maya Guru and Rose Hendrix and Snehal Jauhri and Ainaz Eftekhar and Rohun Tripathi and Max Argus and Jordi Salvador and Haoquan Fang and Matthew Wallingford and Wilbert Pumacay and Yejin Kim and Quinn Pfeifer and Ying-Chun Lee and Piper Wolters and Omar Rayyan and Mingtong Zhang and Jiafei Duan and Karen Farley and Winson Han and Eli Vanderbilt and Dieter Fox and Ali Farhadi and Georgia Chalvatzaki and Dhruv Shah and Ranjay Krishna},
      year={2026},
      eprint={2603.16861},
      archivePrefix={arXiv},
      primaryClass={cs.RO},
      url={https://arxiv.org/abs/2603.16861},
}
