I am a Ph.D. student at Shanghai Jiao Tong University, advised by Prof. Zhipeng Zhang.
I completed my master's degree in Computer Science at Shanghai Jiao Tong University, advised by Prof. Li Niu and Prof. Linfeng Zhang, and my bachelor's degree in Instrument Science and Control Technology at Southeast University.
I'm interested in LLM post-training, multimodal learning, and multi-agent systems.
I am actively seeking collaborations on exploring the reasoning ability of LLMs and multi-agent systems. Please feel free to contact me!
To address the questions "How to create a webpage from an academic paper?" and "How to evaluate the project webpage?", we propose AutoPage and PageBench. AutoPage transforms academic papers into polished, publication-ready project webpages through a human-in-the-loop multi-agent pipeline, while PageBench provides automatic evaluation along two dimensions: content quality and visual design quality.
Decouple-Then-Merge: Finetune Diffusion Models as Multi-Task Learning
Qianli Ma, Xuefei Ning, Dongrui Liu, Li Niu†, Linfeng Zhang† IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
arXiv / CVF Open Access / BibTeX / Project Page / Code
This paper proposes a new finetuning method for diffusion models, which decouples the diffusion process into multiple denoising tasks and then merges them. We show that this method can effectively finetune diffusion models for various tasks, including text-to-image generation and unconditional image generation.
Efficient Diffusion as Low Light Enhancer
Guanzhou Lan*, Qianli Ma*, Yuqi Yang, Zhigang Wang, Dong Wang, Xuelong Li†, Bin Zhao† IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
arXiv / CVF Open Access / BibTeX / Project Page / Code
This paper proposes an efficient diffusion model for low-light enhancement, which can be applied to various low-light enhancement tasks.
LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint
Qianli Ma*, Dongrui Liu*, Qian Chen, Linfeng Zhang, Jing Shao† The 63rd Annual Meeting of the Association for Computational Linguistics (ACL main), 2025
arXiv / ACL Anthology / BibTeX / Code
This paper proposes a method to mitigate safety-utility conflicts in model merging for LLMs, which can be applied to various safety-utility tasks.
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models
We present RAPO++, a cross-stage prompt optimization framework that unifies training-data-aligned refinement, test-time iterative scaling, and large language model (LLM) fine-tuning to substantially improve text-to-video (T2V) generation without modifying the underlying generative backbone.
Survey of General End-to-End Autonomous Driving: A Unified Perspective
We present a comprehensive survey of general end-to-end autonomous driving. The survey collects and organizes key papers in the area, classifying them into Conventional (e.g., UniAD), VLM-centric (e.g., DriveLM), and Hybrid (e.g., Senna) approaches. It also curates both Normal and Vision-Language datasets relevant to the field. Based on this taxonomy and dataset collection, our analysis outlines the main research branches and emerging trends that are shaping the field.