I am a first-year Ph.D. student at Electrical and Computer Engineering Department, Princeton University, advised by Prof. Chi Jin.
Previously, I did my undergraduate at Yuanpei College, Peking University.
I am interested in the intersection of RL and LLMs, especially on certifiable reasoning.
I view exploration as the core challenge in RL, and test-time search being necessary to achieve it for LLM agents.
The key technical problem is how search procedures and expert decision-making systems can be internalized as reasoning ability, rather than remaining external scaffolding.
I see two tightly coupled aspects: backfilling expert search behavior into the model through learning, and on-the-fly calibration that lets the model assess uncertainty and decide when to search, explore, or trust its own prediction.
Publications
Goedel-Prover-V2: The Strongest Open-Source Theorem Prover to Date Yong Lin*,
Shange Tang*,
Bohan Lyu*,
Ziran Yang*,
Jui-Hui Chung*,
Haoyu Zhao*,
Lai Jiang*,
Yihan Geng*,
Jiawei Ge,
Jingruo Sun,
Jiayun Wu,
Jiri Gesi,
David Acuna,
Kaiyu Yang,
Hongzhou Lin*,
Yejin Choi,
Danqi Chen,
Sanjeev Arora,
Chi Jin*,
AI4MATH@ICML 2025 (Oral), ICLR 2026