About Me

I’m a fourth-year Ph.D. student in Computer Science at Shanghai Jiao Tong University (SJTU), advised by Prof. Quanshi Zhang. I’m a member of the Lab for Interpretable Machine Learning. Previously, I was enrolled in the Dual Degree Program at the University of Michigan - Shanghai Jiao Tong University Joint Institute (UM-SJTU JI), through which I obtained a B.S.Eng. degree in Computer Science from the University of Michigan, Ann Arbor, and a B.S.Eng. degree in Electrical and Computer Engineering (ECE) from Shanghai Jiao Tong University.

My research spans Explainable AI (XAI), LLM and agent safety, LLM reasoning, and agentic RL:

Explaining deep models with game-theoretic interactions
Frontier risks of LLMs and agents
Understanding post-training techniques (e.g., SFT, RL) of large reasoning models (LRMs)
Agentic RL for long-horizon tasks

News

[2026.01] Four papers accepted by ICLR 2026!
[2025.07] The SafeWork-R1 model series is released at WAIC 2025!
[2024.11-12] Talks at University of California, Los Angeles (UCLA) / University of Southern California (USC) / University of California, Berkeley (UCB) / Johns Hopkins University (JHU) / University of Pennsylvania (UPenn). An unforgettable journey in the U.S.!
[2024.10] Remote talk at Carnegie Mellon University (CMU) with Prof. Quanshi Zhang.
[2024.09] One paper (see Project page) accepted by NeurIPS 2024!
[2024.01] One paper (see Project page) accepted by ICLR 2024!

Projects

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law
Core contributor, responsible for efficient & safety-aware reasoning training. Work done during an internship at Shanghai AI Lab.
Technical Report / Huggingface
AgentDoG: A Diagnostic Guardrail Framework for AI Agent Safety and Security
Core contributor. Work done during an internship at Shanghai AI Lab.
Technical Report / GitHub / Huggingface

Publications

(* indicates equal contribution)

Preprints

Rethinking Generalization in Reasoning SFT: A Conditional Analysis on Optimization, Data, and Model Capability
Qihan Ren*, Peng Wang*, Ruikun Cai, Shuai Shao, Dadi Guo, Yuejin Xie, Yafu Li, Quanshi Zhang, Xia Hu, Jing Shao, Dongrui Liu
arXiv 2026 / Paper / GitHub / Huggingface
Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-evolution
Jiahao Qiu et al. (Co-author)
arXiv 2025 / Paper / GitHub
Revisiting Generalization Power of a DNN in Terms of Symbolic Interactions
Lei Cheng, Junpeng Zhang, Qihan Ren, Quanshi Zhang
arXiv 2025 / Paper

Conference papers

Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents
Shuai Shao*, Qihan Ren*, Chen Qian, Boyi Wei, Dadi Guo, Jingyi Yang, Xinhao Song, Linfeng Zhang, Weinan Zhang, Dongrui Liu, Jing Shao
ICLR 2026 / Paper / GitHub
Conditional Advantage Estimation for Reinforcement Learning in Large Reasoning Models
Guanxu Chen*, Yafu Li*, Yuxian Jiang, Chen Qian, Qihan Ren, Jingyi Yang, Yu Cheng, Dongrui Liu, Jing Shao
ICLR 2026 / Paper / GitHub
Towards Self-Evolving Agent Benchmarks: Validatable Agent Trajectory via Test-Time Exploration
Dadi Guo*, Tianyi Zhou*, Dongrui Liu*, Chen Qian, Qihan Ren, Shuai Shao, Zhiyuan Fan, Yi-R. Fung, Kun Wang, Linfeng Zhang, Jing Shao
ICLR 2026 / Paper
Towards the Dynamics of a DNN Learning Symbolic Interactions
Qihan Ren*, Junpeng Zhang*, Yang Xu, Yue Xin, Dongrui Liu, and Quanshi Zhang
NeurIPS 2024 / Paper / Zhihu / Project page
Where We Have Arrived in Proving the Emergence of Sparse Interaction Primitives in DNNs
Qihan Ren, Jiayang Gao, Wen Shen, and Quanshi Zhang
ICLR 2024 / Paper / GitHub / Zhihu / Project page
Towards the Difficulty for a Deep Neural Network to Learn Concepts of Different Complexities
Dongrui Liu*, Huiqi Deng*, Xu Cheng, Qihan Ren, Kangrui Wang, and Quanshi Zhang
NeurIPS 2023 / Paper / GitHub
Bayesian Neural Networks Avoid Encoding Perturbation-sensitive and Complex Concepts
Qihan Ren*, Huiqi Deng*, Yunuo Chen, Siyu Lou, and Quanshi Zhang
ICML 2023 / Paper / GitHub / Video
Discovering and Explaining the Representation Bottleneck of DNNs
Huiqi Deng*, Qihan Ren*, Hao Zhang, and Quanshi Zhang
ICLR 2022 (Oral) / Paper / GitHub / Video / Zhihu
Interpreting Representation Quality of DNNs for 3D Point Cloud Processing
Wen Shen, Qihan Ren, Dongrui Liu, and Quanshi Zhang
NeurIPS 2021 / Paper / GitHub / Video

Journal papers

A Survey of Self-evolving Agents: On Path to Artificial Super Intelligence
Huan-ang Gao*, Jiayi Geng*, Wenyue Hua*, Mengkang Hu*, Xinzhe Juan*, Hongzhang Liu*, Shilong Liu*, Jiahao Qiu*, Xuan Qi*, Qihan Ren*, Yiran Wu*, Hongru Wang*, Han Xiao*, Yuhang Zhou*, Shaokun Zhang*, Jiayi Zhang, Jinyu Xiang, Yixiong Fang, Qiwen Zhao, Dongrui Liu, Cheng Qian, Zhenhailong Wang, Minda Hu, Huazheng Wang, Qingyun Wu, Heng Ji, Mengdi Wang
* Equal contribution; the order is determined alphabetically
TMLR 2025 / Paper / GitHub
Interpretable Rotation-Equivariant Multiary-Valued Network for Attribute Obfuscation
Quanshi Zhang, Hao Zhang, Yiting Chen, Qihan Ren, Jie Ren, Xu Cheng, Liyao Xiang
IEEE T-PAMI 2025 / Paper
Rotation-Equivariant Quaternion Neural Networks for 3D Point Cloud Processing
Wen Shen, Zhihua Wei, Qihan Ren, Binbin Zhang, Shikun Huang, Jiaqi Fan, and Quanshi Zhang
IEEE T-PAMI 2024 / Paper

Book chapters

Chapter co-author of the book “Introduction to Explainable Artificial Intelligence” (in Chinese: 可解释人工智能导论).
Book link

Experience

[2026.01 - Present] Intern at MiniMax, Shanghai. Focus on agentic RL for long-horizon, repository-level coding tasks.
[2025.03 - 2026.01] Research intern at Shanghai AI Lab. Focus on LLM reasoning & LLM/Agent safety, mentored by Dongrui Liu.

Presentations and Invited Talks

[2024.11-12] Can inference logic of a neural network be faithfully explained as symbolic concepts? Talks at University of California, Los Angeles (UCLA) / University of Southern California (USC) / University of California, Berkeley (UCB) / Johns Hopkins University (JHU) / University of Pennsylvania (UPenn).
[2024.10] Can inference logic of a neural network be faithfully explained as symbolic concepts? Remote talk at Carnegie Mellon University (CMU) with Prof. Quanshi Zhang.
[2024.09] Theory and dynamical analysis of symbolic concepts encoded by deep neural networks. “AI+X” National Excellent PhD Forum (“AI+X”全国优秀博士生论坛) at Peking University.
[2022.04] Discovering and explaining the representation bottleneck of DNNs. At TechBeat with Huiqi Deng. See recording link.
[2022.03] Discovering and explaining the representation bottleneck of DNNs. At BAIYULAN OPEN AI, with Huiqi Deng. See recording link.

Selected Honors and Awards

[2023.01] SJTU Wenjun Wu AI class 吴文俊班 (16 selected)
[2022.06] Outstanding graduate of Shanghai Jiao Tong University
[2022.01] James B. Angell Scholar
[2021.12] Dean’s List of University of Michigan
[2020.10] National Scholarship (ranking 1/244)
[2019.10] National Scholarship (ranking 1/244)

Teaching

Machine Learning (CS3308 & CS3612), SJTU. Spring 2023 / Spring 2024.
Teaching Assistant
Instructor: Quanshi Zhang
Academic Writing (VY100 & VY200), SJTU. Fall 2022 / Fall 2023.
Teaching Assistant
Instructor: Andrew Yang

Qihan Ren