Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision,
Tomoya Yoshida*, Shuhei Kurita, Taichi Nishimura, Shinsuke Mori, CVPR 2025, Paper at arXiv
- Release code and dataset.
- 05.04.2025: Selected as Highlight Paper!
- 27.02.2025: Accepted at CVPR 2025!
Follow the steps below to set up the environment.
# Clone with submodules
git clone --recursive https://github.com/your-username/EgoScaler.git
cd EgoScaler
# If you already cloned without --recursive, run this to fetch submodules
git submodule update --init --recursive
# Python version 3.8 or higher is required
conda create -n egoscaler python=3.8.17
conda activate egoscaler
# Run this under the root directory of the EgoScaler repository
pip install -e .
# Experiments were conducted using CUDA 11.8
conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=11.8 -c pytorch -c nvidia
We release both the dataset and the processing code to facilitate reproducibility.
- 📄 Instructions: Please follow the guide in egoscaler/data/README.md.
- 📦 Dataset: Google Drive.
- 📦 HOT3D Test Data: Google Drive
(Note: The test set differs from the one used in the paper, since some original data were unavailable at that time. We later reconstructed it, resulting in a larger evaluation set.)
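Both the data processing and the model code rely on the environment from the setup steps, so it can be worth confirming the installed versions before going further. The following is a minimal sketch and not part of the repository: the function name `check_environment` and the report keys are hypothetical, and the expected versions are simply the ones listed in the setup commands above.

```python
import importlib
import sys

# Versions taken from the setup commands in this README.
MIN_PYTHON = (3, 8)
EXPECTED_TORCH = "2.3.1"
EXPECTED_CUDA = "11.8"


def check_environment():
    """Collect basic facts about the current Python / PyTorch / CUDA setup.

    Hypothetical helper for illustration; it only reads version information.
    """
    report = {
        "python_ok": sys.version_info[:2] >= MIN_PYTHON,
        "torch_version": None,
        "torch_matches": False,
        "cuda_build": None,
        "cuda_matches": False,
        "cuda_available": False,
    }
    try:
        torch = importlib.import_module("torch")
    except ImportError:
        # PyTorch is not installed in the active environment.
        return report

    report["torch_version"] = torch.__version__
    # Builds may carry a local suffix such as "+cu118"; compare the base version.
    report["torch_matches"] = torch.__version__.split("+")[0] == EXPECTED_TORCH
    # torch.version.cuda is None for CPU-only builds.
    report["cuda_build"] = torch.version.cuda
    report["cuda_matches"] = torch.version.cuda == EXPECTED_CUDA
    report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A mismatch here does not necessarily mean the code will fail; it only indicates that the environment differs from the one the experiments were reported with.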
We provide both training/evaluation code and pretrained checkpoints for reproducibility.
- 🧠 Code: Please follow the guide in egoscaler/models/README.md.
- 📍 Checkpoints: Google Drive
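For readers new to the term, a 6DoF trajectory is a sequence of poses, each with three translational and three rotational degrees of freedom. The sketch below illustrates that idea only. It assumes a roll/pitch/yaw parameterization in radians and a hypothetical `Pose6DoF` class; the format actually used by this repository is described in egoscaler/data/README.md and may differ (for example, it may use rotation matrices or another rotation representation).

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Pose6DoF:
    """Hypothetical pose record: translation plus roll/pitch/yaw in radians."""
    x: float
    y: float
    z: float
    roll: float
    pitch: float
    yaw: float

    def to_matrix(self) -> List[List[float]]:
        """Return the 4x4 homogeneous transform with R = Rz(yaw) Ry(pitch) Rx(roll)."""
        cr, sr = math.cos(self.roll), math.sin(self.roll)
        cp, sp = math.cos(self.pitch), math.sin(self.pitch)
        cy, sy = math.cos(self.yaw), math.sin(self.yaw)
        return [
            [cy * cp, cy * sp * sr - sy * cr, cy * sp * cr + sy * sr, self.x],
            [sy * cp, sy * sp * sr + cy * cr, sy * sp * cr - cy * sr, self.y],
            [-sp, cp * sr, cp * cr, self.z],
            [0.0, 0.0, 0.0, 1.0],
        ]


def path_length(trajectory: List[Pose6DoF]) -> float:
    """Sum of straight-line distances between consecutive positions."""
    return sum(
        math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))
        for a, b in zip(trajectory, trajectory[1:])
    )


if __name__ == "__main__":
    # Two made-up poses: the object moves and rotates 90 degrees about z.
    trajectory = [
        Pose6DoF(0.0, 0.0, 0.0, 0.0, 0.0, 0.0),
        Pose6DoF(3.0, 4.0, 0.0, 0.0, 0.0, math.pi / 2),
    ]
    print(path_length(trajectory))
    for row in trajectory[1].to_matrix():
        print(row)
```

Storing each pose as a 4x4 matrix makes it straightforward to apply the pose to object points or to compose it with a camera transform, which is why that conversion is shown here.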
- Visualize extracted trajectories.
Output trajectory with point cloud as video.
python vis/video.py
Here's the expected output.
- Generate object trajectories.
Draw trajectory and point cloud using Open3D.
python vis/interactive.py
If you find our work useful in your research, please consider citing:
@inproceedings{egoscaler,
title={Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision},
author={Yoshida, Tomoya and Kurita, Shuhei and Nishimura, Taichi and Mori, Shinsuke},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
year={2025}
}
