MKJia/DINO-Tok

DINO-Tok: Adapting DINO for Visual Tokenizers

arXiv

Mingkai Jia1,2, Mingxiao Li2, Zhijian Shu2,3, Anlin Zheng4, Liaoyuan Fan2, Jiaxin Guo5, Tianxing Shi3, Dongyue Lu2, Zeming Li1, Xiaoyang Guo2, Xiaojuan Qi4, Xiao-Xiao Long3, Qian Zhang2, Ping Tan1*, Wei Yin2*§

HKUST1, Horizon Robotics2, NJU3, HKU4, CUHK5
* Corresponding Author, § Project Leader

Image

🚀News

  • [April 2026] Released inference code.
  • [April 2026] Released models & stats.
  • [Nov 2025] Released paper.

🔨TO DO LIST

  • Training code.
  • Models & Evaluation code.
  • Huggingface models & stats.

🔑 Quick Start

Installation

git clone https://github.com/MKJia/DINO-Tok.git
cd DINO-Tok

Prepare env

conda create -n dinotok python=3.10
conda activate dinotok
pip3 install -r requirements.txt

Download models

Download the pretrained models and stats from our model & stat link and place them in /path/to/your/ckpt.

Data Preparation

We use the ImageNet-1k dataset by default. Alternatively, you can try our UHDBench dataset on Hugging Face; download it to /path/to/your/dataset.
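Before running evaluation, it can help to confirm that the dataset directory actually contains images. The sketch below is not part of this repository: `count_images` is a hypothetical helper that counts JPEG and PNG files anywhere beneath a directory, and it assumes the images are stored as regular files (not inside an archive). For reference, the ImageNet-1k validation split contains 50,000 images.

```shell
# Hypothetical helper, not shipped with DINO-Tok.
# Counts image files (JPEG/PNG, any letter case) under the given directory.
count_images() {
  find "$1" -type f \( -iname '*.jpeg' -o -iname '*.jpg' -o -iname '*.png' \) \
    | wc -l | tr -d ' '
}
```

For example, `count_images /path/to/your/dataset` prints the number of image files found; a result of 0 usually means the path is wrong or the download is still packed in an archive.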

Evaluation

Remember to update the checkpoint and dataset paths in the scripts before running them.

bash scripts/test_aetok.bash
bash scripts/test_aegen.bash
bash scripts/test_vqtok.bash
bash scripts/test_vqgen.bash
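Because the scripts depend on paths you edit by hand, a small preflight check can catch a mistyped path before a long run starts. The following is a minimal sketch under stated assumptions, not something the repository provides: `missing_paths` and `run_all_evals` are hypothetical helper names, and `run_all_evals` assumes it is called from the repository root so that the four `scripts/test_*.bash` files listed above resolve.

```shell
# Hypothetical helpers, not shipped with DINO-Tok.

# Print every argument that is not an existing file or directory.
missing_paths() {
  for p in "$@"; do
    [ -e "$p" ] || printf '%s\n' "$p"
  done
}

# Run the four evaluation scripts from this README in order,
# stopping at the first one that exits with a non-zero status.
# Assumes the current directory is the repository root.
run_all_evals() {
  for name in aetok aegen vqtok vqgen; do
    script="scripts/test_${name}.bash"
    echo "==> $script"
    bash "$script" || return 1
  done
}
```

A possible workflow is to call `missing_paths /path/to/your/ckpt /path/to/your/dataset` with the same paths you wrote into the scripts; if it prints nothing, run `run_all_evals`. The helpers do not read the scripts themselves, so they only verify the paths you pass in.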

🗄️Demos

  • 🔥 Qualitative reconstruction images.

Image

  • 🔥 Qualitative class-to-image generation on ImageNet.

Image

  • 🔥 Evaluation of dino-tok-ae on the 256×256 ImageNet benchmark.

Image

  • 🔥 Evaluation of dino-tok-vq on the 256×256 ImageNet benchmark.

Image

📌 Citation

If the DINO-Tok paper or code helps your research, please consider citing our paper ❤️. If you find this repository useful, a star ⭐️ is also a great way to support our work. Thank you very much.

@article{jia2025dinotok,
  title={DINO-Tok: Adapting DINO for Visual Tokenizers},
  author={Jia, Mingkai and Li, Mingxiao and Fan, Liaoyuan and Shi, Tianxing and Guo, Jiaxin and Li, Zeming and Guo, Xiaoyang and Long, Xiao-Xiao and Zhang, Qian and Tan, Ping and others},
  journal={arXiv preprint arXiv:2511.20565},
  year={2025}
}

License

This repository is released under the MIT License. For licensing questions, please contact Mingkai Jia (mjiaab@connect.ust.hk) and Wei Yin (yvanwy@outlook.com).
