Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision

Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision,
Tomoya Yoshida*, Shuhei Kurita, Taichi Nishimura, Shinsuke Mori, CVPR 2025, Paper at arxiv

News

Released the code and dataset.
05.04.2025: Selected as Highlight Paper!
27.02.2025: Accepted at CVPR 2025!

🛠️ Installation

Follow the steps below to set up the environment.

1. Clone this repository

# Clone with submodules
git clone --recursive https://github.com/Biscue5/EgoScaler.git
cd EgoScaler

# If you already cloned without --recursive, run this to fetch submodules
git submodule update --init --recursive

2. Create a Conda Environment

# Python version 3.8 or higher is required
conda create -n egoscaler python=3.8.17
conda activate egoscaler

# Run this under the root directory of the EgoScaler repository
pip install -e .

3. Install PyTorch, torchvision, and torchaudio

# Experiments were conducted using CUDA 11.8
conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=11.8 -c pytorch -c nvidia
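
After the three installation steps, a quick sanity check can confirm that the interpreter and packages are visible from the active environment. The script below is a minimal sketch and not part of the repository: `check_environment` is a hypothetical helper name, and it only tests that the packages can be located and whether PyTorch reports a usable CUDA device.

```python
import importlib.util
import sys


def check_environment(min_python=(3, 8), packages=("torch", "torchvision")):
    """Summarise whether the active environment looks usable.

    Hypothetical helper for illustration; not shipped with EgoScaler.
    """
    report = {"python_ok": sys.version_info[:2] >= tuple(min_python)}

    # find_spec returns None when a top-level package cannot be located.
    for name in packages:
        report[name] = importlib.util.find_spec(name) is not None

    # Only query CUDA when torch is actually present and importable.
    report["cuda"] = None
    if report.get("torch"):
        try:
            import torch

            report["cuda"] = torch.cuda.is_available()
        except Exception:
            report["cuda"] = False
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```

A `cuda` value of `False` usually points to a CPU-only build or a driver mismatch rather than a problem with the repository itself.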

Dataset Construction

We release both the dataset and the processing code to facilitate reproducibility.

📄 Instructions: Please follow the guide in egoscaler/data/README.md.
📦 Dataset: Google Drive.
📦 HOT3D Test Data: Google Drive.

(Note: The test set differs from the one used in the paper, since some original data were unavailable at that time. We later reconstructed it, resulting in a larger evaluation set.)

Train/Eval Models

We provide both training/evaluation code and pretrained checkpoints for reproducibility.

🧠 Code: Please follow the guide in egoscaler/models/README.md.
📍 Checkpoints: Google Drive

Demo

Visualize extracted trajectories.

Render the trajectory together with the point cloud as a video.

python vis/video.py

Here's the expected output.

Generate object trajectories.

Draw the trajectory and point cloud interactively using Open3D.

python vis/interactive.py
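
For readers who want to draw their own trajectories, the sketch below shows one way to turn a sequence of positions into a polyline. It is an illustration only and does not reproduce what `vis/interactive.py` does: `polyline_segments` is a hypothetical helper, and it assumes the trajectory is available as a list of `(x, y, z)` positions, ignoring the rotational part of each 6DoF pose. The returned points and index pairs have the shape that Open3D's `LineSet` geometry expects for its `points` and `lines` fields.

```python
def polyline_segments(positions):
    """Build (points, lines) for a polyline through 3D positions.

    Hypothetical helper for illustration; the input format is an assumption.
    """
    points = [tuple(float(c) for c in p) for p in positions]
    for p in points:
        if len(p) != 3:
            raise ValueError("each position must have exactly 3 coordinates")

    # Connect every waypoint to its successor: (0, 1), (1, 2), ...
    lines = [(i, i + 1) for i in range(len(points) - 1)]
    return points, lines


if __name__ == "__main__":
    demo = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.05), (0.2, 0.1, 0.1)]
    pts, segs = polyline_segments(demo)
    print(pts)
    print(segs)
```

Keeping the geometry as plain tuples makes the helper independent of any visualization library, so the same output can be passed to Open3D or to another plotting tool.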

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{egoscaler,
    title={Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision},
    author={Yoshida, Tomoya and Kurita, Shuhei and Nishimura, Taichi and Mori, Shinsuke},
    booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    year={2025}
}



Biscue5/EgoScaler
