Rong Li*,
Yuhao Dong*,
Tianshuai Hu*,
Ao Liang*,
Youquan Liu*,
Dongyue Lu*,
Liang Pan,
Lingdong Kong†,
Junwei Liang‡,
Ziwei Liu‡
*Equal contribution †Project lead ‡Corresponding authors
- Cross-Platform: First 3D grounding dataset spanning vehicle, drone, and quadruped platforms
- Large-Scale: Extensive annotated samples covering diverse real-world scenarios
- Multi-Modal: Synchronized RGB, LiDAR, and language annotations
- Challenging: Complex outdoor environments with varying object densities and viewpoints
- Reproducible: Unified evaluation protocols and baseline implementations
If you find our work helpful, please consider citing:
@inproceedings{li2025_3eed,
title = {{3EED}: Ground Everything Everywhere in {3D}},
author = {Rong Li and Yuhao Dong and Tianshuai Hu and Ao Liang and Youquan Liu and Dongyue Lu and Liang Pan and Lingdong Kong and Junwei Liang and Ziwei Liu},
booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
volume = {38},
year = {2025}
}

📄 For detailed dataset statistics and analysis, please refer to our paper.
- [2025.10] Dataset and code are now publicly available on HuggingFace and GitHub! 📦
- [2025.09] 3EED has been accepted to the NeurIPS 2025 Datasets and Benchmarks Track! 🎉
- Highlights
- Statistics
- News
- Table of Contents
- Installation
- Pretrained Models
- Dataset
- Quick Start
- License
- Acknowledgements
We support both CUDA 11 and CUDA 12 environments. Choose the one that matches your system:
Option 1: CUDA 11.1 Environment
| Component | Version |
|---|---|
| CUDA | 11.1 |
| cuDNN | 8.0.5 |
| PyTorch | 1.9.1+cu111 |
| torchvision | 0.10.1+cu111 |
| Python | 3.8 / 3.9 |
Option 2: CUDA 12.4 Environment
| Component | Version |
|---|---|
| CUDA | 12.4 |
| cuDNN | 9.x (bundled with the PyTorch cu124 wheels) |
| PyTorch | 2.5.1+cu124 |
| torchvision | 0.20.1+cu124 |
| Python | 3.10 / 3.11 |
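A typical setup for either option (a sketch; versions follow the tables above, and the PyTorch index URLs may change over time):

```bash
# Option 1: CUDA 11.1 wheels
pip install torch==1.9.1+cu111 torchvision==0.10.1+cu111 \
    -f https://download.pytorch.org/whl/torch_stable.html

# Option 2: CUDA 12.4 wheels
pip install torch==2.5.1 torchvision==0.20.1 \
    --index-url https://download.pytorch.org/whl/cu124
```

With PyTorch in place, build the custom point-cloud ops: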
cd ops/teed_pointnet/pointnet2_batch
python setup.py develop
cd ../roiaware_pool3d
python setup.py develop

Download the RoBERTa-base checkpoint from HuggingFace and move it to data/roberta_base.
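For example, the checkpoint can be fetched with the huggingface_hub CLI (a sketch; it assumes huggingface-cli is installed and uses the upstream repo id roberta-base):

```bash
# Download the RoBERTa-base weights into the expected directory.
huggingface-cli download roberta-base --local-dir data/roberta_base
```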
Download the 3EED dataset from HuggingFace:
🔗 Dataset Link: https://huggingface.co/datasets/RRRong/3EED
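One way to fetch it (a sketch, assuming the huggingface_hub CLI is installed; any released archives may still need manual extraction):

```bash
# Download the dataset repository into data/3eed.
huggingface-cli download RRRong/3EED --repo-type dataset --local-dir data/3eed
```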
After extraction, organize your dataset as follows:
data/3eed/
├── drone/ # Drone platform data
│ ├── scene-0001/
│ │ ├── 0000_0/
│ │ │ ├── image.jpg
│ │ │ ├── lidar.bin
│ │ │ └── meta_info.json
│ │ └── ...
│ └── ...
├── quad/ # Quadruped platform data
│ ├── scene-0001/
│ └── ...
├── waymo/ # Vehicle platform data
│ ├── scene-0001/
│ └── ...
├── roberta_base/ # Language model weights
└── splits/ # Train/val split files
├── drone_train.txt
├── drone_val.txt
├── quad_train.txt
├── quad_val.txt
├── waymo_train.txt
└── waymo_val.txt
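Each frame directory can be read with plain NumPy and JSON. The sketch below assumes lidar.bin stores float32 points with 4 channels (x, y, z, intensity) and that meta_info.json holds the language annotation; verify both against the dataset documentation:

```python
import json
import numpy as np

def load_frame(frame_dir: str, channels: int = 4):
    """Load one 3EED frame (the channel count is an assumption, not confirmed)."""
    points = np.fromfile(f"{frame_dir}/lidar.bin", dtype=np.float32)
    points = points.reshape(-1, channels)  # fails loudly if channels is wrong
    with open(f"{frame_dir}/meta_info.json") as f:
        meta = json.load(f)  # referring expression, boxes, calibration (assumed)
    return points, meta

points, meta = load_frame("data/3eed/drone/scene-0001/0000_0")
print(points.shape, list(meta.keys()))
```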
Train the baseline model on different platform combinations:
# Train on all platforms (recommended for best performance)
bash scripts/train_3eed.sh
# Train on single platform
bash scripts/train_waymo.sh # Vehicle only
bash scripts/train_drone.sh # Drone only
bash scripts/train_quad.sh   # Quadruped only

Output:
- Checkpoints: logs/Train_<datasets>_Val_<datasets>/<timestamp>/
- Training logs: logs/Train_<datasets>_Val_<datasets>/<timestamp>/log.txt
- TensorBoard logs: logs/Train_<datasets>_Val_<datasets>/<timestamp>/tensorboard/
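Training curves can be inspected by pointing TensorBoard at the log root:

```bash
tensorboard --logdir logs/
```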
Evaluate trained models on validation sets:
Quick Evaluation:
# Evaluate on all platforms
bash scripts/val_3eed.sh
# Evaluate on single platform
bash scripts/val_waymo.sh # Vehicle
bash scripts/val_drone.sh # Drone
bash scripts/val_quad.sh     # Quadruped

Notes:
- Update --checkpoint_path in the script to point to your trained model
- Ensure the validation dataset is downloaded and properly structured
Output:
- Results saved to:
<checkpoint_dir>/evaluation/Val_<dataset>/<timestamp>/
Visualize predictions with 3D bounding boxes overlaid on point clouds:
# Visualize prediction results
python utils/visualize_pred.py

Visualization Output:
- 🟢 Ground Truth: Green bounding box
- 🔴 Prediction: Red bounding box
Output Structure:
visualizations/
├── waymo/
│ ├── scene-0001_frame-0000/
│ │ ├── pointcloud.ply
│ │ ├── pred/gt_bbox.ply
│ │ └── info.txt
│ └── ...
├── drone/
└── quad/
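The exported .ply files can also be inspected offline, for instance with Open3D. This is a minimal sketch: it assumes open3d is installed, and the exact bounding-box file names and geometry type (mesh vs. line set) depend on the export script:

```python
import open3d as o3d

scene = "visualizations/waymo/scene-0001_frame-0000"
cloud = o3d.io.read_point_cloud(f"{scene}/pointcloud.ply")
# The box file names below are assumptions based on the structure above.
gt_box = o3d.io.read_triangle_mesh(f"{scene}/gt_bbox.ply")
pred_box = o3d.io.read_triangle_mesh(f"{scene}/pred_bbox.ply")
o3d.visualization.draw_geometries([cloud, gt_box, pred_box])
```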
Baseline models and predictions are available on HuggingFace.
This repository is released under the Apache 2.0 License (see LICENSE).
We sincerely thank the following projects and teams that made this work possible:
- BUTD-DETR - Bottom-Up Top-Down DETR for visual grounding
- WildRefer - 3D object localization in large-scale dynamic scenes with multi-modal visual data and natural language
- Waymo Open Dataset - Vehicle platform data
- M3ED - Drone and quadruped platform data
😎 Awesome Projects

- 3D and 4D World Modeling: A Survey ([GitHub Repo] - [Project Page] - [Paper])
- WorldLens: Full-Spectrum Evaluations of Driving World Models in Real World ([GitHub Repo] - [Project Page] - [Paper])
- LiDARCrafter: Dynamic 4D World Modeling from LiDAR Sequences ([GitHub Repo] - [Project Page] - [Paper])
- Are VLMs Ready for Autonomous Driving? A Study from Reliability, Data & Metric Perspectives ([GitHub Repo] - [Project Page] - [Paper])
- Perspective-Invariant 3D Object Detection ([GitHub Repo] - [Project Page] - [Paper])
- DynamicCity: Large-Scale 4D Occupancy Generation from Dynamic Scenes ([GitHub Repo] - [Project Page] - [Paper])
Made with ❤️ by the 3EED Team







