Ruijie Zhu,
Chuxin Wang,
Ziyang Song,
Li Liu,
Tianzhu Zhang,
Yongdong Zhang,
University of Science and Technology of China
arXiv 2024
Within a unified framework, our method ScaleDepth achieves accurate metric depth estimation both indoors and outdoors, without preset depth ranges or model finetuning. Left: the input RGB image and corresponding depth prediction. Right: comparison of model parameters and performance. With fewer parameters overall, our model ScaleDepth-NK significantly outperforms state-of-the-art methods under the same experimental settings.
Without any finetuning, our model can generalize to scenes with different scales and accurately estimate depth from indoors to outdoors.
The overall architecture of the proposed ScaleDepth. We design bin queries to predict the relative depth distribution and scale queries to predict the scene scale. During training, we preset text prompts containing 28 scene categories as input to the frozen CLIP text encoder. We then calculate the similarity between the updated scale queries and the text embeddings, and use the scene category as auxiliary supervision. During inference, only a single image is required to obtain the relative depth and scene scale, from which a metric depth map is synthesized.
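The final synthesis step described above can be illustrated with a minimal numpy sketch (this is not the authors' code; the function name, bin layout, and shapes are illustrative assumptions): the per-pixel softmax over relative-depth bins is reduced to an expected relative depth in [0, 1], which is then multiplied by the predicted scene scale to obtain metric depth.

```python
import numpy as np

def synthesize_metric_depth(bin_probs, scale):
    """Hedged sketch of combining a relative depth distribution with a scene scale.

    bin_probs: (n_bins, H, W) softmax over relative-depth bins in [0, 1].
    scale: scalar scene scale in meters (e.g. ~10 m indoors, ~80 m outdoors).
    """
    n_bins = bin_probs.shape[0]
    # Assume bin centers uniformly spaced over the normalized [0, 1] depth range.
    centers = (np.arange(n_bins) + 0.5) / n_bins           # (n_bins,)
    # Expectation over bins gives the per-pixel relative depth.
    relative = np.tensordot(centers, bin_probs, axes=1)    # (H, W)
    # Scaling by the predicted scene scale yields the metric depth map.
    return scale * relative

# Toy example: random logits for 64 bins on a 2x2 image.
rng = np.random.default_rng(0)
logits = rng.standard_normal((64, 2, 2))
probs = np.exp(logits) / np.exp(logits).sum(axis=0, keepdims=True)
depth = synthesize_metric_depth(probs, scale=10.0)
print(depth.shape)  # (2, 2)
```

Because the relative depth is bounded in [0, 1], the predicted scene scale directly bounds the metric depth range, which is what lets one model cover both indoor and outdoor scenes.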
Please refer to get_started.md for installation and dataset_prepare.md for dataset preparation.
You may also need to install these packages:
pip install "mmdet>=3.0.0rc4"
pip install open_clip_torch
pip install future tensorboard
pip install -r requirements/albu.txt
Then download the checkpoint of text embeddings from Google Drive and place it in the projects/ScaleDepth/pretrained_weights folder.
We provide train.md and inference.md with instructions for training and inference.
# ScaleDepth-N
bash tools/dist_train.sh projects/ScaleDepth/configs/ScaleDepth/scaledepth_clip_NYU_480x480.py 4
# ScaleDepth-K
bash tools/dist_train.sh projects/ScaleDepth/configs/ScaleDepth/scaledepth_clip_KITTI_352x1120.py 4
# ScaleDepth-NK
bash tools/dist_train.sh projects/ScaleDepth/configs/ScaleDepth/scaledepth_clip_NYU_KITTI_352x512.py 4

# ScaleDepth-N
python tools/test.py projects/ScaleDepth/configs/ScaleDepth/scaledepth_clip_NYU_480x480.py work_dirs/scaledepth_clip_NYU_KITTI_352x512/iter_40000.pth
# ScaleDepth-K
python tools/test.py projects/ScaleDepth/configs/ScaleDepth/scaledepth_clip_KITTI_352x1120.py work_dirs/scaledepth_clip_NYU_KITTI_352x512/iter_40000.pth
# ScaleDepth-NK
python tools/test.py projects/ScaleDepth/configs/ScaleDepth/scaledepth_clip_NYU_KITTI_352x512.py work_dirs/scaledepth_clip_NYU_KITTI_352x512/iter_40000.pth

| Method | Backbone | Train Iters | Results | Config | Checkpoint | GPUs |
|---|---|---|---|---|---|---|
| ScaleDepth-NK | CLIP(ConvNext-Large) | 40000 | log | config | iter_40000.pth | 4 RTX 3090 |
If you find our work useful and use the codebase or models in your research, please cite it as follows.
@ARTICLE{zhu2024scale,
title={ScaleDepth: Decomposing Metric Depth Estimation into Scale Prediction and Relative Depth Estimation},
author={Zhu, Ruijie and Wang, Chuxin and Song, Ziyang and Liu, Li and Zhang, Tianzhu and Zhang, Yongdong},
journal={arXiv preprint arXiv:2407.08187},
year={2024}
}
We thank Jianfeng He and Jiacheng Deng for their thoughtful and valuable suggestions. We thank the authors of BinsFormer and ZoeDepth for their code.


