Authors: Zhuochen Miao, Jun Lv, Hongjie Fang, Yang Jin, Cewu Lu
To set up the environment, run:
```bash
conda env create -f environment.yaml
conda activate Knowledge
```

For real-world evaluation, you also need to run:

```bash
pip install pyrealsense2 pynput
```

Then place `flexiv_rdk/` into the root folder of this project.
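Before running the real-world pipeline, it can help to confirm that `pyrealsense2` detects your cameras. A minimal sketch (not part of the repository) that just lists connected devices and their serial numbers, which are the values referenced later under `camera_serial` in `config.yaml`:

```python
import pyrealsense2 as rs

# Print every connected RealSense device with its serial number.
ctx = rs.context()
if len(ctx.devices) == 0:
    print("No RealSense device found.")
for dev in ctx.devices:
    print(dev.get_info(rs.camera_info.name),
          dev.get_info(rs.camera_info.serial_number))
```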
To facilitate understanding of the data format, example data is available on Google Drive and Baidu Netdisk. Please extract it into the `data/` folder.
- Capture an RGBD image of the target object.
- Save the files `color.png`, `depth.png`, and `intr.npy` into the directory `data/templates/[object_name]` (a capture sketch follows below).
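If you capture templates with a RealSense camera, a sketch along the following lines can write the three files in the expected layout. This is only a sketch under stated assumptions: it assumes `intr.npy` stores the 3x3 color intrinsics matrix and that depth should be aligned to color; check the provided example templates before trusting either assumption. The `mug` output path is just an example.

```python
import os

import cv2
import numpy as np
import pyrealsense2 as rs

out_dir = "data/templates/mug"  # replace with your [object_name]
os.makedirs(out_dir, exist_ok=True)

# Stream color + depth and align depth to the color frame.
pipeline = rs.pipeline()
cfg = rs.config()
cfg.enable_stream(rs.stream.color, 1280, 720, rs.format.bgr8, 30)
cfg.enable_stream(rs.stream.depth, 1280, 720, rs.format.z16, 30)
profile = pipeline.start(cfg)
align = rs.align(rs.stream.color)

try:
    frames = align.process(pipeline.wait_for_frames())
    color = np.asanyarray(frames.get_color_frame().get_data())
    depth = np.asanyarray(frames.get_depth_frame().get_data())  # uint16, raw depth units

    # Color intrinsics as a 3x3 matrix -- assumed intr.npy layout.
    i = profile.get_stream(rs.stream.color).as_video_stream_profile().get_intrinsics()
    K = np.array([[i.fx, 0.0, i.ppx],
                  [0.0, i.fy, i.ppy],
                  [0.0, 0.0, 1.0]])

    cv2.imwrite(os.path.join(out_dir, "color.png"), color)
    cv2.imwrite(os.path.join(out_dir, "depth.png"), depth)  # 16-bit PNG
    np.save(os.path.join(out_dir, "intr.npy"), K)
finally:
    pipeline.stop()
```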
To manually select keypoints, run:
```bash
python -m scripts.select --template_path data/templates/[object_name]
```

The selected keypoints will be saved as `points.npy` in the same directory.
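To double-check a selection, you can overlay the saved points on the template image. A minimal sketch, assuming `points.npy` holds pixel coordinates of shape `(K, 2)` in `(x, y)` order (verify this convention against `scripts.select` before relying on it):

```python
import cv2
import numpy as np

template = "data/templates/mug"
image = cv2.imread(f"{template}/color.png")
points = np.load(f"{template}/points.npy")  # assumed shape (K, 2), pixel (x, y)

# Draw each keypoint with its index so the ordering can be inspected.
for idx, (x, y) in enumerate(points):
    x, y = int(x), int(y)
    cv2.circle(image, (x, y), 4, (0, 0, 255), -1)
    cv2.putText(image, str(idx), (x + 6, y - 6),
                cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)

cv2.imwrite("keypoints_check.png", image)
```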
Currently, templates for mug and coaster are provided.
To prepare a dataset (e.g., `data/dataset/mug`), modify `config.yaml` as follows:
```yaml
anchor_name: "anchors.npy"
num_anchors: 12  # total number of keypoints
objects:
  - "data/templates/coaster"
  - "data/templates/mug"
average:
  - []
  - [0, 1, 2, 3]  # For the mug rim, DINOv2 features may vary slightly; average the first four keypoints to reduce this variation.
calib:
  relative_path: "calib/0212"
  camera_serial:
    - "135122075425"
    - "104422070042"
```
To perform template matching on all training data, run:

```bash
python -m scripts.anchor --dataset_path data/dataset/mug
```

Add `--vis` to visualize the generated anchors.
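After matching, a quick load confirms the output file exists and has a plausible shape. Where `scripts.anchor` writes `anchors.npy` inside the dataset is not documented here, so the path below is purely illustrative:

```python
import numpy as np

# Illustrative path -- locate the anchors.npy that scripts.anchor actually wrote.
anchors = np.load("data/dataset/mug/anchors.npy")
print(anchors.shape, anchors.dtype)
```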
To train the model, run:
```bash
torchrun --standalone --nproc_per_node=1 train.py \
    --data_path data/dataset/mug \
    --num_action 20 \
    --ckpt_dir ./logs/mug \
    --batch_size 60 \
    --num_epochs 1000 \
    --save_epochs 100 \
    --num_workers 24 \
    --seed 233 \
    --policy_type anchor \
    --aug
```

Training may take several hours.
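Checkpoints land in `--ckpt_dir` every `--save_epochs` epochs. A quick way to confirm one loads; whether `train.py` saves a plain state dict or a wrapper dict is not specified here, so the key handling below is an assumption:

```python
import torch

ckpt = torch.load("logs/mug/policy_epoch_1000_seed_233.ckpt", map_location="cpu")

# A plain state dict has parameter names as keys; a wrapper dict may hold
# entries such as "model" or "optimizer" instead.
if isinstance(ckpt, dict):
    for key in list(ckpt)[:10]:
        print(key)
```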
To evaluate the trained model on real-world data, run:
```bash
python eval.py \
    --ckpt logs/mug/policy_epoch_1000_seed_233.ckpt \
    --calib data/calib/eval \
    --policy_type anchor \
    --cfg data/dataset/mug/config.yaml \
    --vis
```

- Our codebase is adapted from RISE, released under the CC BY-NC-SA 4.0 License.
- The diffusion module is from Diffusion Policy, released under the MIT License.
- The feature extractor DINOv2 is provided under the Apache License 2.0.
- We also refer to DP3 and P3PO for parts of the implementation.
If you find our work helpful, please cite:

```bibtex
@misc{miao2025knowledgedrivenimitationlearningenabling,
  title={Knowledge-Driven Imitation Learning: Enabling Generalization Across Diverse Conditions},
  author={Zhuochen Miao and Jun Lv and Hongjie Fang and Yang Jin and Cewu Lu},
  year={2025},
  eprint={2506.21057},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2506.21057},
}
```