I am interested in generative models for vision. More specifically, I work on leveraging diffusion models and other generative approaches for tasks such as depth estimation, image restoration, video synthesis, and 3D content creation. I am also interested in the intersection of generative AI with graphics, computational photography, and novel imaging systems.
I am actively looking for internship opportunities in the summer of 2026.
If you are interested in my profile, please feel free to reach out to me via email.
A camera-controlled video diffusion model that performs novel video trajectory synthesis by conditioning on 4D scene latents from a large 4D reconstruction model.
A novel diffusion-based approach for dense prediction, providing foundation models and pipelines for a range of computer vision and image analysis tasks, including monocular depth estimation, surface normal prediction, and intrinsic image decomposition.
Fine-tuning a pre-trained video diffusion model on event camera data turns it into a state-of-the-art event-based video interpolator and extrapolator with high temporal resolution.
We present an accurate and fast gaze tracking method using single-shot phase-measuring deflectometry (PMD), taking advantage of dense 3D surface measurements of the eye.
Using a differentiable renderer to simulate reflection light transport allows us to exploit dense structured screen illumination and accurately optimize the eye gaze direction.
We specialize a pre-trained latent diffusion model into an efficient data generation pipeline with style control and multi-resolution semantic adherence, improving downstream domain generalization performance.
Our method lifts the power of generative 2D image models, such as Stable Diffusion, into 3D: it uses them to synthesize unseen content for novel views, and NeRF to reconcile all generated views, producing vivid painting of the input 3D assets across a variety of shape categories.
We present a first study on the trade-offs between projection- and reflection-based 3D imaging systems under mixtures of diffuse and specular reflection. Experiments are conducted in our projection/reflection simulator framework built on the Mitsuba2 renderer.
Aside from research, I delight in the symphony of flavors that dance on the tongue (good food) and the harmonious melodies that serenade the ears (music). A few of my musical heroes include 竇唯 (Dou Wei), Radiohead, Brian Eno, and Fleet Foxes.