My research interests are primarily in machine learning and AI safety. I'm enthusiastic about advancing the scientific understanding of frontier models and algorithms (e.g., large language models, multimodal models, AI alignment, diffusion models) as a foundation for developing safe, reliable, and trustworthy AI/AGI systems. I am looking for highly motivated PhD students and research interns interested in AI theory, algorithms, and safety.
News
[Nov. 2025] I served as an Area Chair for NeurIPS 2025, AAAI 2026, and ICLR 2026.
[Jun. 2024] We open-sourced MultiTrust, a comprehensive benchmark to evaluate the trustworthiness of multimodal large language models (MLLMs). Please see more details in our paper and code.
[Oct. 2023] I was named in the list of the world's top 2% scientists for single-year impact, released by Stanford/Elsevier.
[Sep. 2023] Our recent paper shows that multimodal large language models are also vulnerable to adversarial perturbations of images. Our attack achieves a 45% success rate against GPT-4V, 22% against Google's Bard, 26% against Bing Chat, and 86% against ERNIE Bot. Code is available.
MultiTrust: A Comprehensive Benchmark Towards Trustworthy Multimodal Large Language Models
Yichi Zhang, Yao Huang, Yitong Sun, Chang Liu, Zhe Zhao, Zhengwei Fang, Yifan Wang, Huanran Chen, Xiao Yang, Xingxing Wei, Hang Su, Yinpeng Dong#, and Jun Zhu
Advances in Neural Information Processing Systems (NeurIPS) Datasets and Benchmarks Track, Vancouver, Canada, 2024
Diffusion Models are Certifiably Robust Classifiers
Huanran Chen, Yinpeng Dong, Shitong Shao, Zhongkai Hao, Xiao Yang, Hang Su, and Jun Zhu
Advances in Neural Information Processing Systems (NeurIPS), Vancouver, Canada, 2024
Natural Language Induced Adversarial Images
Xiaopei Zhu, Peiyang Xu, Guanning Zeng, Yinpeng Dong, and Xiaolin Hu
ACM International Conference on Multimedia (MM), Melbourne, Australia, 2024
Robust Classification via a Single Diffusion Model
Huanran Chen, Yinpeng Dong, Zhengyi Wang, Xiao Yang, Chengqi Duan, Hang Su, and Jun Zhu
International Conference on Machine Learning (ICML), Vienna, Austria, 2024
How Robust is Google’s Bard to Adversarial Image Attacks?
Yinpeng Dong, Huanran Chen, Jiawei Chen, Zhengwei Fang, Xiao Yang, Yichi Zhang, Yu Tian, Hang Su, and Jun Zhu
NeurIPS 2023 Workshop on Robustness of Few-shot and Zero-shot Learning in Foundation Models, New Orleans, USA, 2023
The Art of Defense: Letting Networks Fool the Attacker
Jinlai Zhang, Yinpeng Dong, Binbin Liu, Bo Ouyang, Jihong Zhu, Minchi Kuang, Houqing Wang, and Yanmei Meng
IEEE Transactions on Information Forensics and Security (TIFS), 2023
BadDet: Backdoor Attacks on Object Detection (Best Paper Award)
Shih-Han Chan, Yinpeng Dong, Jun Zhu, Xiaolu Zhang, and Jun Zhou
ECCV 2022 Workshop on Adversarial Robustness in the Real World, Tel Aviv, Israel, 2022
Two Coupled Rejection Metrics Can Tell Adversarial Examples Apart
Tianyu Pang, Huishuai Zhang, Di He, Yinpeng Dong, Hang Su, Wei Chen, Jun Zhu, and Tie-Yan Liu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, Louisiana, USA, 2022
Bag of Tricks for Adversarial Training
Tianyu Pang, Xiao Yang, Yinpeng Dong, Hang Su, and Jun Zhu
International Conference on Learning Representations (ICLR), Vienna, Austria, 2021
Composite Binary Decomposition Networks
You Qiaoben, Zheng Wang, Jianguo Li, Yinpeng Dong, Yu-Gang Jiang, and Jun Zhu
The Thirty-Third AAAI Conference on Artificial Intelligence (AAAI), Honolulu, Hawaii, USA, 2019
2018
Adversarial Vision Challenge
Wieland Brendel, Jonas Rauber, Alexey Kurakin, Nicolas Papernot, Behar Veliqi, Sharada P. Mohanty, Florian Laurent, Marcel Salathé, Matthias Bethge, Yaodong Yu, Hongyang Zhang, Susu Xu, Hongbao Zhang, Pengtao Xie, Eric P. Xing, Thomas Brunner, Frederik Diehl, Jérôme Rony, Luiz Gustavo Hafemann, Shuyu Cheng, Yinpeng Dong, Xuefei Ning, Wenshuo Li, and Yu Wang
NeurIPS 2018 Competition Chapter
Towards Robust Detection of Adversarial Examples (Spotlight)
Tianyu Pang, Chao Du, Yinpeng Dong, and Jun Zhu
Advances in Neural Information Processing Systems (NeurIPS), Montreal, Canada, 2018
Adversarial Attacks and Defences Competition
Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, Jianyu Wang, Zhishuai Zhang, Zhou Ren, Alan Yuille, Sangxia Huang, Yao Zhao, Yuzhe Zhao, Zhonglin Han, Junjiajia Long, Yerkebulan Berdibekov, Takuya Akiba, Seiya Tokui, and Motoki Abe
NeurIPS 2017 Competition Chapter
Forecast the Plausible Paths in Crowd Scenes
Hang Su, Jun Zhu, Yinpeng Dong, and Bo Zhang
International Joint Conference on Artificial Intelligence (IJCAI), Melbourne, Australia, 2017
Our team (Yinpeng Dong, Chang Liu, Wenzhao Xiang, Yichi Zhang, Haoxing Ye) won first place in the Adversarial Robustness of Deep Learning track of the 2022 International Algorithm Case Competition.
Our team (Xiao Yang, Dingcheng Yang, Shilong Liu, Zihao Xiao, Yinpeng Dong) won first place in the GeekPwn DeepFake competition (October 24th, 2020).
Our team (Shuyu Cheng, Xiao Yang, Dingcheng Yang, Yinpeng Dong) won first place in both the GeekPwn CAAD CTF and Adversarial Patch competitions (October 24th, 2019).
Our team (Shuyu Cheng and Yinpeng Dong) won second place in the Untargeted Attack track of the NeurIPS 2018 Adversarial Vision Challenge.
Our team (Yinpeng Dong, Tianyu Pang, Chao Du) won second place in both the Targeted Adversarial Attack track and the Defense Against Adversarial Attack track, as well as third place in the Non-targeted Adversarial Attack track, of GeekPwn CAAD (Competition on Adversarial Attacks and Defenses).
'84' Future Innovation Scholarship, CST Department of Tsinghua University, 2019.12
This award was given to Tianyu Pang and me for our research on adversarial robustness.
Microsoft Research Asia (MSRA) Fellowship, 2019.11
China National Scholarship, Tsinghua University, 2019.10
VALSE Annual Outstanding Student Paper Award, 2019.04
This award was given to "Boosting Adversarial Attacks with Momentum" (CVPR 2018).
CCF-CV Academic Emerging Award (CCF-CV 学术新锐奖), 2018.11
Awarded to only 3 students in China for computer vision research conducted during the first three years of their Ph.D. studies.
China National Scholarship, Tsinghua University, 2018.10
Tsinghua University Future PhD Fellowship, Tsinghua University, 2017.09
This fellowship was given to only 2 students in our department.
Teaching
2023.06, Lecturer in CCF ADL140: Robust Machine Learning
2019 spring, Head TA in Statistical Machine Learning, instructed by Prof. Jun Zhu
Last update: Nov. 2025 by Yinpeng Dong