Before starting my doctoral studies, I received my B.S. degree with honours in Electronic Engineering from Tsinghua University in 2022.
My research focuses on the intersection of multimodal world models and unified models for multimodal understanding and generation.
I am interested in developing new methods that advance the state of the art in this area.
Astra: General Interactive World Model with Autoregressive Denoising
Yixuan Zhu, Jiaqi Feng, Wenzhao Zheng, Yuan Gao, Xin Tao, Pengfei Wan, Jie Zhou, Jiwen Lu
The Fourteenth International Conference on Learning Representations (ICLR), 2026
[Paper]
[Code]
[Project Page]
We introduce Astra, an interactive world model that delivers realistic long-horizon video rollouts under a wide range of scenarios and action inputs.
VARestorer: One-Step VAR Distillation for Real-World Image Super-Resolution
Yixuan Zhu, Shilin Ma, Haolin Wang, Ao Li, Yanzhe Jing, Yansong Tang, Lei Chen, Jiwen Lu, Jie Zhou
The Fourteenth International Conference on Learning Representations (ICLR), 2026
[Coming soon]
We introduce VARestorer, a one-step VAR distillation framework for real-world image super-resolution that mitigates error accumulation.
FADE: Frequency-Aware Diffusion Model Factorization for Video Editing
Yixuan Zhu, Haolin Wang, Shilin Ma, Wenliang Zhao, Yansong Tang, Lei Chen#, Jie Zhou
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[Paper]
[Code]
We introduce FADE, a training-free yet highly effective video editing approach that leverages the inherent priors of pre-trained video diffusion models via frequency-aware factorization.
InstaRevive: One-Step Image Enhancement via Dynamic Score Matching
Yixuan Zhu*, Haolin Wang*, Ao Li, Wenliang Zhao, Yansong Tang, Jingxuan Niu, Lei Chen#, Jie Zhou, Jiwen Lu
The Thirteenth International Conference on Learning Representations (ICLR), 2025
[Paper]
[Code]
We propose InstaRevive, a straightforward yet powerful image enhancement framework that uses score-based diffusion distillation to harness strong generative capability while minimizing the number of sampling steps.
FlowIE: Efficient Image Enhancement via Rectified Flow
Yixuan Zhu*, Wenliang Zhao*, Ao Li, Yansong Tang#, Jie Zhou, Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024 (Oral)
[Paper]
[Code]
We propose a unified framework for efficient image enhancement across various tasks, built on generative diffusion priors.
DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery
Yixuan Zhu*, Ao Li*, Yansong Tang#, Wenliang Zhao, Jie Zhou, Jiwen Lu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
[Paper]
[Code]
[Project Page]
We propose a new method that exploits diffusion priors for human mesh recovery (HMR) in occluded and crowded scenarios.