Publications (*, †, and ‡ indicate equal contribution, corresponding author, and project leader, respectively.)
Currently, my interest lies in Embodied Agents,
which sit at the intersection of Multimodal Large Language Models and Embodied AI.
I am particularly interested in high-level planning and low-level control with spatio-temporal intelligence,
working towards a generalist agent in complex real-world environments.
Representative works are highlighted.
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics
Enshen Zhou * ,
Jingkun An * ,
Cheng Chi *‡ ,
Yi Han ,
Shanyu Rong ,
Chi Zhang ,
Pengwei Wang ,
Zhongyuan Wang ,
Tiejun Huang ,
Lu Sheng† ,
Shanghang Zhang†
Paper /
Project /
Code
TL;DR: From words to exactly where you mean using RoboRefer!
NeurIPS 2025
AGFSync: Leveraging AI-Generated Feedback for Preference Optimization in Text-to-Image Generation
Jingkun An * ,
Yinghao Zhu * ,
Zongjian Li * ,
Enshen Zhou ,
Haoran Feng ,
Xijie Huang ,
Bohua Chen ,
Yemin Shi ,
Chengwei Pan†
Paper /
Project /
Code
TL;DR: Train T2I Diffusion model with AI-Generated Feedback for DPO!
AAAI 2025
RoboTracer: Mastering Spatial Trace with Reasoning in Vision-Language Models for Robotics
Enshen Zhou * ,
Cheng Chi *‡ ,
Yibo Li * ,
Jingkun An * ,
Jiayuan Zhang ,
Shanyu Rong ,
Yi Han ,
Yuheng Ji ,
Mengzhen Liu ,
Pengwei Wang ,
Zhongyuan Wang ,
Tiejun Huang ,
Lu Sheng† ,
Shanghang Zhang†
Paper /
Project /
Code
TL;DR: From what you say to where it moves using RoboTracer!
arXiv 2025
TIGeR: Tool-Integrated Geometric Reasoning in Vision-Language Models for Robotics
Yi Han * ,
Cheng Chi *‡ ,
Enshen Zhou * ,
Shanyu Rong ,
Jingkun An * ,
Pengwei Wang ,
Zhongyuan Wang ,
Lu Sheng† ,
Shanghang Zhang†
Paper /
Project /
Code
TL;DR: Equipping VLMs to perform accurate geometric reasoning for robotics!
Submitted to ICRA 2026
Medical MLLM is Vulnerable: Cross-Modality Jailbreak and Mismatched Attacks on Medical Multimodal Large Language Models
Xijie Huang * ,
Xinyuan Wang * ,
Haotao Zhang * ,
Yinghao Zhu * ,
Jiawen Xi ,
Jingkun An ,
Hao Wang ,
Hao Liang ,
Chengwei Pan†
Paper /
Project /
Code
TL;DR: Medical MLLM is Vulnerable!
AAAI 2025
M3Fair: Mitigating Bias in Healthcare Data through Multi-Level and Multi-Sensitive Attribute Reweighting Method
Yinghao Zhu * ,
Jingkun An * ,
Enshen Zhou ,
Hao Li ,
Haoran Feng ,
Paper /
Project /
Code
TL;DR: BeFair (A Bias Detection and Mitigation Tool)!
2023 NIH Bias Detection Third Prize (Top 5)
Selected Awards and Honors
2024: Outstanding Graduate of Beihang University.
2023: Grand Prize (Top 1) in "Challenge Cup" Competition of Science Achievement in China.