This repository is an official implementation for:
PartRM: Modeling Part-Level Dynamics with Large Cross-State Reconstruction Model [CVPR 2025]
Authors: Mingju Gao*, Yike Pan*, Huan-ang Gao*, Zongzheng Zhang, Wenyi Li, Hao Dong, Hao Tang, Li Yi, Hao Zhao
As interest grows in world models that predict future states from current observations and actions, accurately modeling part-level dynamics has become increasingly relevant for various applications. Existing approaches, such as Puppet-Master, rely on fine-tuning large-scale pre-trained video diffusion models, which are impractical for real-world use due to the limitations of 2D video representation and slow processing times. To overcome these challenges, we present PartRM, a novel 4D reconstruction framework that simultaneously models appearance, geometry, and part-level motion from multi-view images of a static object. PartRM builds upon large 3D Gaussian reconstruction models, leveraging their extensive knowledge of appearance and geometry in static objects. To address data scarcity in 4D, we introduce the PartDrag-4D dataset, providing multi-view observations of part-level dynamics across over 20,000 states. We enhance the model’s understanding of interaction conditions with a multi-scale drag embedding module that captures dynamics at varying granularities. To prevent catastrophic forgetting during fine-tuning, we implement a two-stage training process that focuses sequentially on motion and appearance learning. Experimental results show that PartRM establishes a new state-of-the-art in part-level motion learning and can be applied in manipulation tasks in robotics. Project page: https://partrm.c7w.tech/
Use conda to create a new virtual environment. We use torch==2.1.0+cu121.

```bash
conda env create -f environment.yaml
conda activate partrm
```

Also install the Gaussian splatting renderer:
```bash
# a modified gaussian splatting (+ depth, alpha rendering)
git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization
pip install ./diff-gaussian-rasterization
```

You can download the PartDrag-4D dataset from here. Unzip pardrag_4d/partdrag_rendered.zip to PartDrag4D/data/render_PartDrag4D and processed_data_partdrag4d.zip to ./PartDrag4D/data/processed_data_partdrag4d.
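If you prefer to script the extraction, a minimal Python sketch could look like the following (it assumes the two archives were downloaded to the repository root with the names above; adjust the paths to your setup):

```python
# Minimal extraction sketch; archive names and target directories follow the
# instructions above, but the download location is an assumption.
import zipfile

archives = {
    "pardrag_4d/partdrag_rendered.zip": "PartDrag4D/data/render_PartDrag4D",
    "processed_data_partdrag4d.zip": "PartDrag4D/data/processed_data_partdrag4d",
}
for archive, target in archives.items():
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)  # note: the zip may already contain a top-level folder
```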
Below is how to render the PartDrag-4D dataset from scratch.
You first need to obtain the PartNet-Mobility dataset and put it in the PartDrag4D/data directory of this repo. Then:

```bash
cd PartDrag4D
```

For mesh preprocessing and animating:
```bash
cd preprocess
python process_data_textured_uv.py
python animated_data.py
```

For rendering, first download Blender and extract it into the ../rendering/blender directory:
```bash
cd ../rendering/blender
wget https://download.blender.org/release/Blender3.5/blender-3.5.0-linux-x64.tar.xz
tar -xf blender-3.5.0-linux-x64.tar.xz
```

Then generate the rendering filelist and render the generated meshes using Blender:
```bash
cd ..
python gen_filelist.py
bash render.sh
```

You can modify num_gpus and CUDA_VISIBLE_DEVICES in the bash script to adjust the degree of parallelism.
For surface drag extraction:

```bash
cd ..
python z_buffer_al.py
```

The animated meshes and extracted surface drags are stored in ./PartDrag4D/data/processed_data_partdrag4d. The rendering results are stored in ./PartDrag4D/data/render_PartDrag4D.
We split the PartDrag-4D dataset into training and evaluation sets. You can refer to ./filelist/train_filelist_partdrag4d.txt and ./filelist/val_filelist_partdrag4d.txt for details.
You can get the Zero123++ and SAM checkpoints from here. Then put them into preprocess/zero123_ckpt and preprocess/sam_ckpt, respectively.
To generate multi-view images for evaluation data:
```bash
cd ../preprocess
python gen_mv_partdrag4d.py --src_filelist /path/to/src/rendering/filelist --output_dir /path/to/save/dir  # For PartDrag-4D
python gen_mv_objaverse_hq.py --src_filelist /path/to/src/rendering/filelist --output_dir /path/to/save/dir  # For Objaverse-Animation-HQ
```

The src_filelist is the path to the rendering filelist. You can refer to this filelist for PartDrag4D and this filelist for Objaverse-Animation-HQ as examples.
To generate RGBA-format images as input for PartRM:

```bash
python gen_rgba.py --filelist /path/to/zero123/filelist --dataset [dataset_name]
```

You can refer to this filelist for PartDrag4D and this filelist for Objaverse-Animation-HQ as examples.
To generate propagated drags for the PartDrag-4D dataset (you can download our preprocessed propagated drags from here):

```bash
python gen_propagate_drags.py --val_filelist /path/to/src/rendering/filelist --sample_num [number of propagated drags] --save_dir /path/to/save/drags
```

The val_filelist is the same as the src_filelist used for multi-view image generation on PartDrag-4D above.
We provide training scripts for the PartDrag-4D and Objaverse-Animation-HQ datasets. You can select the dataset for training in train.py and eval.py (partdrag4d or objaverse_hq). Then run:

```bash
CUDA_VISIBLE_DEVICES=0,1,2,3 accelerate launch --config_file acc_configs/gpu4.yaml train.py big --workspace [your workspace]
```

You should specify `train_filelist`, `val_filelist`, `zero123_val_filelist`, `propagated_drags_base` and `mesh_base` in core/options.py and core/options_pm.py (see the sketch below the list):
- For `train_filelist`, you can refer to `filelist/train_filelist.txt` and `filelist/train_objavser_hq.txt`.
- For `val_filelist`, you can refer to `filelist/val_filelist.txt` and `filelist/eval_objaverse_hq.txt`.
- For `zero123_val_filelist`, you can refer to `filelist/zero123_val_filelist.txt` and `filelist/zero123_val_filelist_objavser_hq.txt`.
For the two-stage training proposed in the paper, you should first set `stage1` to True in core/options.py and core/options_pm.py. After the motion-learning training, set `stage1` to False to conduct the appearance-learning training.
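For reference, the fields mentioned above might look roughly like this once filled in (a hedged sketch only; the actual attribute layout of core/options.py may differ, and the base paths are placeholders):

```python
# Hypothetical illustration of the values to set in core/options.py and
# core/options_pm.py -- treat it as a checklist, not the repo's real layout.
train_filelist = "filelist/train_filelist.txt"              # training split
val_filelist = "filelist/val_filelist.txt"                  # evaluation split
zero123_val_filelist = "filelist/zero123_val_filelist.txt"  # Zero123++ multi-view filelist
propagated_drags_base = "/path/to/save/drags"               # directory of propagated drags
mesh_base = "./PartDrag4D/data/processed_data_partdrag4d"   # presumably the animated-mesh directory produced above

# Two-stage training: True for the motion-learning stage,
# False for the appearance-learning stage and for evaluation.
stage1 = True
```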
For evaluation, first run:

```bash
CUDA_VISIBLE_DEVICES=0 accelerate launch --config_file acc_configs/gpu1.yaml eval.py big --workspace [your workspace]
```

Note that you should set `stage1` to False in core/options.py and core/options_pm.py.
Then generate your eval filelist, with each line in the form:

```
gt_image_path,pred_image_path,source_image_path
```
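A hedged Python sketch for writing such a filelist is shown below; the three directories and the matching-by-filename assumption are hypothetical, so adapt them to wherever your ground-truth, predicted, and source images live:

```python
# Hypothetical helper: pair ground-truth, predicted, and source images by
# filename and emit one comma-separated triple per line.
import os

gt_dir = "/path/to/gt_images"       # placeholder directories
pred_dir = "/path/to/pred_images"
src_dir = "/path/to/source_images"

with open("eval_filelist.txt", "w") as f:
    for name in sorted(os.listdir(gt_dir)):
        line = ",".join([
            os.path.join(gt_dir, name),
            os.path.join(pred_dir, name),
            os.path.join(src_dir, name),
        ])
        f.write(line + "\n")
```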
Then specify VAL_FILELIST (the path to the generated eval filelist) in compute_metrics.py and run:
```bash
python compute_metrics.py
```

This reports the PSNR, LPIPS and SSIM metrics.
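If you want to sanity-check the numbers outside of compute_metrics.py, PSNR, SSIM, and LPIPS for a single image pair can be computed with common libraries roughly as follows (a sketch assuming scikit-image, lpips, torch, and imageio are installed; this is not the repo's implementation):

```python
# Minimal sketch of computing PSNR / SSIM (scikit-image) and LPIPS (lpips
# package) for one ground-truth / prediction pair.
import imageio.v3 as iio
import lpips
import numpy as np
import torch
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Load images, drop any alpha channel, and scale to [0, 1].
gt = iio.imread("gt.png")[..., :3].astype(np.float32) / 255.0
pred = iio.imread("pred.png")[..., :3].astype(np.float32) / 255.0

psnr = peak_signal_noise_ratio(gt, pred, data_range=1.0)
ssim = structural_similarity(gt, pred, channel_axis=-1, data_range=1.0)

# LPIPS expects NCHW tensors scaled to [-1, 1].
loss_fn = lpips.LPIPS(net="vgg")
to_tensor = lambda x: torch.from_numpy(x).permute(2, 0, 1).unsqueeze(0) * 2 - 1
lpips_val = loss_fn(to_tensor(gt), to_tensor(pred)).item()

print(f"PSNR: {psnr:.2f}  SSIM: {ssim:.4f}  LPIPS: {lpips_val:.4f}")
```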
We build our work on LGM, Zero123++ and 3D Gaussian Splatting.
