A scalable automated generation method for real-world robotic manipulation.
Manipulate-Anything is a scalable automated generation method for real-world robotic manipulation. Unlike prior work, this method operates in real-world environments without privileged state information or hand-designed skills, enabling manipulation of any static object.
Jiafei Duan*, Wentao Yuan*, Wilbert Pumacay, Yi Ru Wang, Kiana Ehsani, Dieter Fox, Ranjay Krishna
To set up the Manipulate-Anything environment, you will need four repositories, including this one.
conda create -n manip_any python=3.11
conda activate manip_any
conda install cuda -c nvidia/label/cuda-11.7.0
- Setup and install Manipulate-Anything-QWenVL. Go into QWen-VL-MA and follow the steps there.
- Install PyRep. PyRep requires version 4.1 of CoppeliaSim, so download that version first.
Once you have downloaded CoppeliaSim, you can pull PyRep from git:
cd <install_dir>
git clone https://github.com/stepjam/PyRep.git
cd PyRep
Add the following to your ~/.bashrc file (NOTE: edit the 'EDIT ME' in the first line):
export COPPELIASIM_ROOT=<EDIT ME>/PATH/TO/COPPELIASIM/INSTALL/DIR
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$COPPELIASIM_ROOT
export QT_QPA_PLATFORM_PLUGIN_PATH=$COPPELIASIM_ROOT
Remember to source your bashrc (source ~/.bashrc) or zshrc (source ~/.zshrc) after this.
Warning: CoppeliaSim might cause conflicts with ROS workspaces.
- Install YARR. Manipulate-Anything uses the 'peract' branch of Mohit Shridhar's YARR fork.
cd <install_dir>
git clone -b peract https://github.com/MohitShridhar/YARR.git # note: 'peract' branch
cd YARR
pip install -r requirements.txt
python setup.py develop
- Install current repo
pip install pointnet2_ops/
cd pointnet2_ops
pip install -r requirements.txt
pip install .
git clone https://github.com/Robot-MA/manipulate-anything.git
cd RLBench
pip install -r requirements.txt
python setup.py develop
- Download checkpoint.
- Setup GPT4V API-key.
export OPENAI_API_KEY="your_api_key_here"
- Run meshcat server.
meshcat-server
- Zero-shot data generation. Example task (play_jenga):
python dataset_generator.py \
eval.checkpoint=<PATH_TO_M2T2_CHECKPOINT> \
eval.mask_thresh=0.0 \
eval.retract=0.20 \
rlbench.task_name=<TASK_NAME>
- Open http://127.0.0.1:7000/static to see the visualization. Press Enter in the terminal to see the next generated pose.
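The steps above rely on several exported variables and one long command, so a small launcher can catch a missing variable before a run starts. The following is a minimal sketch, not part of the repository: the helper names `missing_env_vars` and `build_command` and the checkpoint path are hypothetical. It only checks that the variables are non-empty and assembles the same `key=value` overrides shown above.

```python
import os
import shlex

# Variables the setup steps above ask you to export.
REQUIRED_VARS = (
    "COPPELIASIM_ROOT",
    "LD_LIBRARY_PATH",
    "QT_QPA_PLATFORM_PLUGIN_PATH",
    "OPENAI_API_KEY",
)


def missing_env_vars(env):
    """Return the required variables that are unset or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


def build_command(checkpoint, task_name, mask_thresh=0.0, retract=0.20):
    """Assemble the dataset_generator.py call shown above as an argv list."""
    return [
        "python",
        "dataset_generator.py",
        f"eval.checkpoint={checkpoint}",
        f"eval.mask_thresh={mask_thresh}",
        f"eval.retract={retract}",
        f"rlbench.task_name={task_name}",
    ]


if __name__ == "__main__":
    missing = missing_env_vars(os.environ)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    # "/path/to/m2t2.pth" is a placeholder, not a real checkpoint location.
    cmd = build_command("/path/to/m2t2.pth", "play_jenga")
    print(shlex.join(cmd))
```

The argv list can be passed to `subprocess.run(cmd, check=True)` from the directory that contains `dataset_generator.py`. The meshcat server still has to be started separately beforehand.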
To reproduce the tasks from the paper, replace the .py and .ttm task files in your RLBench task environment with those in eval_tasks.
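One way to do this swap without losing the original task files is sketched below. It assumes that eval_tasks is a flat folder holding both .py and .ttm files, and that the two destination folders are passed in explicitly (in a stock RLBench checkout these are usually `rlbench/tasks` and `rlbench/task_ttms`, but verify this against your install). The helper name `swap_tasks` is hypothetical.

```python
import shutil
from pathlib import Path


def swap_tasks(eval_dir, py_dest, ttm_dest):
    """Copy .py and .ttm files from eval_dir into the RLBench task folders.

    An existing file with the same name is first copied to `<name>.bak`.
    Returns the names of the copied files in sorted order.
    """
    destinations = {".py": Path(py_dest), ".ttm": Path(ttm_dest)}
    copied = []
    for src in sorted(Path(eval_dir).iterdir()):
        dest_dir = destinations.get(src.suffix)
        # Skip sub-folders and any file that is not a task script or scene.
        if dest_dir is None or not src.is_file():
            continue
        dest_dir.mkdir(parents=True, exist_ok=True)
        target = dest_dir / src.name
        if target.exists():
            # Keep a backup of the file that is about to be overwritten.
            shutil.copy2(target, target.with_name(target.name + ".bak"))
        shutil.copy2(src, target)
        copied.append(src.name)
    return copied
```

Calling `swap_tasks("eval_tasks", "RLBench/rlbench/tasks", "RLBench/rlbench/task_ttms")` would apply it to the assumed layout. The `.bak` copies make it possible to restore the stock tasks afterwards.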
- Include multi-process functionality for searching the best MA plans.
- Set up interactive mode on Gradio.
- Include policy training code
If you find Manipulate-Anything useful for your research and applications, please consider citing our paper:
@article{duan2024manipulate,
title={Manipulate-anything: Automating real-world robots using vision-language models},
author={Duan, Jiafei and Yuan, Wentao and Pumacay, Wilbert and Wang, Yi Ru and Ehsani, Kiana and Fox, Dieter and Krishna, Ranjay},
journal={arXiv preprint arXiv:2406.18915},
year={2024}
}
