Tong He
Email: tonghe90[at]gmail[dot]com
I am now a Research Fellow at Shanghai AI Lab, working with Prof. Yu Qiao. I was previously a Research Fellow at the Australian Institute for Machine Learning (AIML), the University of Adelaide, working with Prof. Chunhua Shen and Prof. Anton van den Hengel.

(Google Scholar)

I received my PhD in computer science from the University of Adelaide, supervised by Prof. Chunhua Shen. I was a visiting student at MMLAB of the Chinese University of Hong Kong at Shenzhen under the supervision of Dr. Weilin Huang and Prof. Yu Qiao. We are looking for self-motivated PhD students (joint PhD programs with SJTU, FDU, ZJU, USTC, etc.) and interns. If you are interested in joining us, please feel free to contact me with your CV!

News

  • Sep, 2025: Our paper Aether has been accepted by ICCV 2025 and selected as an Outstanding Paper at the RIWM workshop.
  • Mar, 2025: Three papers have been accepted by ICCV 2025.
  • Mar, 2025: We released our world model AETHER. Try it here.
  • Feb, 2025: Two papers have been accepted by CVPR 2025.
  • Jan, 2025: Six papers have been accepted by ICLR 2025.
  • Oct, 2024: Four papers have been accepted by NIPS 2024.
  • Oct, 2024: One paper has been accepted by T-PAMI.
  • Ranked as Worldwide Top 2% Scientists by Stanford University (2024.10)
  • June, 2024: Five papers have been accepted by ECCV 2024.
  • Ranked as Worldwide Top 2% Scientists by Stanford University (2023.10)

Selected Papers on World Models

Image
VInO: A Unified Visual Generator with Interleaved OmniModal Context
J. Chen, T. He, Z. Fu, P. Wan, K. Gai, W. Ye.
arXiv 2025, [PDF] [code] [project]
Image
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling
Y. Zhou, Y. Wang, J. Zhou, W. Chang, H. Guo, Z. Li, K. Ma, X. Li, Y. Wang, H. Zhu, M. Liu, D. Liu, J. Yang, Z. Fu, J. Chen, C. Shen, J. Pang, K. Zhang and T. He*.
arXiv 2025, [PDF] [code] [project]
Image
DeepVerse: 4D Autoregressive Video Generation as a World Model
J. Chen, H. Zhu, X. He, Y. Wang, J. Zhou, W. Chang, Y. Zhou, Z. Li, Z. Fu, J. Pang and T. He*.
arXiv 2025, [PDF] [code] [project]
Image
Aether: Geometric-Aware Unified World Modeling
H. Zhu*, Y. Wang*, J. Zhou*, W. Chang*, Y. Zhou*, Z. Li*, J. Chen*, C. Shen, J. Pang and T. He**.
ICCV 2025 & [Best Paper] at the ICCV 2025 Workshop on Reliable and Interactive World Models (RIWM), [PDF] [code] [project]
Image
Sekai: A Video Dataset towards World Exploration
Z. Li, C. Li, ...T. He, J. Pang, Y. Qiao, Y. Jia, K. Zhang.
arXiv 2025, [PDF] [code] [project]
Image
Yume1.5: A Text-Controlled Interactive World Generation Model
X. Mao, Z. Li, C. Li, X. Xu, K. Ying, T. He, J. Pang, Y. Qiao and K. Zhang.
arXiv 2025, [PDF] [code] [project]
Image
Yume: An Interactive World Generation Model
X. Mao, S. Lin, Z. Li, C. Li, W. Peng, T. He, J. Pang, M. Chi, Y. Qiao and K. Zhang.
arXiv 2025, [PDF] [code] [project]

Selected Papers on Embodied AI

Image
VQ-VLA: Improving Vision-Language-Action Models via Scaling Vector-Quantized Action Tokenizers
Y. Wang, H. Zhu, M. Liu, J. Yang, H. Fang and T. He*.
ICCV 2025, [PDF] [code] [project]
Image
Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning
J. Yang, H. Zhu, Y. Wang, G. Wu, T. He, L. Wang.
CVPR 2025, [PDF] [code]
Image
Point Cloud Matters: Rethinking the Impact of Different Observation Spaces on Robot Learning
H. Zhu, Y. Wang, D. Huang, W. Ye, W. Ouyang and T. He*.
NIPS 2024, [PDF] [code]
Image
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
H. Zhu, H. Yang, Y. Wang, J. Yang, L. Wang and T. He*.
ICLR 2025, [PDF] [code]

Selected Papers on 3D Vision

Image
π³: Scalable Permutation-Equivariant Visual Geometry Learning
Y. Wang, J. Zhou, H. Zhu, W. Chang, Y. Zhou, Z. Li, J. Chen, J. Pang, C. Shen and T. He*.
arXiv 2025, [PDF] [code] [project]
Image
NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction
Y. Wang, D. Huang, W. Ye, G. Zhang, W. Ouyang and T. He*.
NIPS 2024, [PDF] [code]
Image
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
Y. Chen, T. He*, D. Huang, W. Ye, S. Chen, J. Tang, Z. Cai, L. Yang, G. Yu, G. Lin and C. Zhang.
ICLR 2025, [PDF] [code]
Image
Point Transformer V3: Simpler, Faster, Stronger
X. Wu, L. Jiang, P. Wang, Z. Liu, X. Liu, Y. Qiao, W. Ouyang, T. He* and H. Zhao*.
CVPR 2024, [PDF] [code]

Professional activities

    Journals

    Transactions on Pattern Analysis and Machine Intelligence (T-PAMI)

    International Journal of Computer Vision (IJCV)

    IEEE Transactions on Image Processing (TIP)

    Pattern Recognition (PR)

    IEEE Transactions on Circuits and Systems for Video Technology (TCSVT)

    Conferences

    CVPR, ICCV, ECCV, NIPS, ICLR, AAAI, etc.

Last updated on 26 July 2025
