Hi! I am a Ph.D. student at Shanghai Jiao Tong University (SJTU), under the supervision of Prof. Ran Yi in the Digital Media & Computer Vision Laboratory (DMCV).
My research interests lie mainly in Computer Vision and Generative Models.
My work is supported by the National Natural Science Foundation of China for Young Ph.D. Students (国家自然科学基金青年学生基础研究项目(博士生)) and the CIE-Tencent Doctoral Research Incentive Project (首届中国电子学会—腾讯博士生科研激励计划(混元大模型专项)).
Controllable Image and Video Generation:
how to generate high-quality images and videos that follow the given conditions. (main research area)
Few-shot Generative Model Adaptation:
how to adapt a generative model to produce high-quality and diverse images in a new domain from only a small amount of training data.
Stroke-based Neural Painting and Image Vectorization:
how to recreate a pixel-based image with a set of brushstrokes (or scalable vector graphics paths), as a human painter would, while achieving both faithful reconstruction and stroke style.
Anomaly Generation and Detection:
how to generate anomaly images from few-shot data to help anomaly detection.
Internship. We are looking for self-motivated students to join our research group! If you are interested in our research, feel free to contact me!
News
[2025.11.8]: Our paper UltraGen has been accepted by AAAI 2026!
[2025.9.30]: Our paper AttnPainter has been accepted by TVCG 2025!
[2023.12.05]: Our paper Curve-Stroke-Based Neural Painting and Stylization with Thin Plate Spline Interpolation (《基于薄板样条插值的弯曲笔触神经绘画与风格化方法》) has been accepted by SCIENTIA SINICA Informationis (中国科学:信息科学)!
We propose MotionMaster, a novel training-free video motion transfer model, which disentangles camera motions and object motions in source videos, and transfers the extracted camera motions to new videos.
We introduce a one-shot and few-shot camera motion disentanglement method, and design a camera motion combination method, giving our model more controllable and flexible camera control.
We propose AesStyler, a novel Aesthetic Guided Universal Style Transfer method, which utilizes a pre-trained aesthetic assessment model, a novel Universal Aesthetic Codebook and a novel Universal and Specific Aesthetic-Guided Attention (USAesA) module. Extensive experiments and user studies demonstrate that our approach generates more aesthetically harmonious and pleasing results than the state-of-the-art methods.
We propose SuperSVG, a superpixel-based vectorization model that achieves fast
and high-precision image vectorization. We decompose the input image into superpixels
to help the model focus on areas with similar colors and textures.
Then, we propose a two-stage self-training framework, where a coarse-stage model
is employed to reconstruct the main structure and a refinement-stage model is used
for enriching the details.
We propose SAMVG, a multi-stage model to vectorize raster images into SVG (Scalable Vector Graphics). Through a series of extensive experiments, we demonstrate that SAMVG can produce high-quality SVGs in any domain while requiring less computation time and complexity than previous state-of-the-art methods.
We propose AnomalyDiffusion, a novel diffusion-based few-shot anomaly generation model, which utilizes the strong prior information of a latent diffusion model learned from large-scale datasets to enhance generation authenticity under few-shot training data.
We propose a new curved brushstroke parameter model based on thin-plate spline interpolation. By curving and affine-transforming real brushstroke templates in succession, we can generate more realistic and varied brushstroke images. Furthermore, we propose a hierarchical brushstroke optimization method that decomposes the entire image into multiple brushstrokes, from large to small, effectively improving the model’s painting ability for both the overall structure and local details of the image.
We propose Compositional Neural Painter, a novel stroke-based rendering framework which dynamically predicts the next painting region based on the current canvas, instead of dividing the image plane uniformly into painting regions. Extensive experiments show our model outperforms the existing models in stroke-based neural painting.
We propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss, which targets different learning objectives at distinct training stages of the diffusion model. Theoretical analysis and experiments demonstrate the superiority of our approach in few-shot generative model adaptation tasks.
Honors & Awards
National Natural Science Foundation of China for Young Ph.D. students (国家自然科学基金青年学生基础研究项目(博士生)), 2025
National Scholarship for Graduate students, 2025
CIE-Tencent Doctoral Research Incentive Project (首届中国电子学会—腾讯博士生科研激励计划(混元大模型专项)), 2024
Zhiyuan Outstanding Student Scholarship of Shanghai Jiao Tong University, 2022
Zhiyuan Honors Bachelor's Degree of Shanghai Jiao Tong University, 2022
Outstanding Graduate of Shanghai Jiao Tong University, 2022
Zhiyuan Honors Scholarship of Shanghai Jiao Tong University, 2018-2021