My research aims to understand and improve foundation models, making them more robust and trustworthy and enabling them to collaborate more effectively with humans. Specifically, I am interested in:
(1) Evaluation and oversight: How can we rigorously evaluate models or agents to quantify both emergent capabilities and reliability risks? How can we systematically monitor and audit model behaviors in user-facing, interactive, or open-ended environments?
(2) Alignment through collaboration: How can we mitigate reliability risks by leveraging diverse human experiences and insights distilled from simulated or real-world interaction dynamics? Can such interaction-centric approaches also enhance personalization and diversity to support effective Human-AI collaboration?
(3) Understanding the principles of reasoning and their connection to reliability: How do models reason, and how can we make reasoning more reliable? Can we better understand the underlying mechanisms that lead to various reliability failures by analyzing data influence, training dynamics, or internal representations?
During Summer 2025, I was a Research Engineer Intern at the Center for AI Safety. Previously, I was a research assistant at Dartmouth College where I had the chance to work with Dr. Ruibo Liu and Prof. Soroush Vosoughi on Value Alignment for LLMs.
A benchmark and set of analyses for evaluating whether vision-language models respect contextual integrity in location disclosure for image geolocation, revealing that violations of contextual norms can lead to contextual harm, characterized by over-disclosure of sensitive locations, poor privacy-utility tradeoffs, and misalignment with human privacy expectations.
Competed in Shared Task 0: Generalization and Typologically Diverse Morphological Inflection and achieved the highest performance among all submissions in both the small and large training conditions.
Misc
I come from Nanjing, a beautiful and historical city that served as the capital of six ancient Chinese dynasties over the past two thousand years.
I like listening to Rock 'n' Roll, ranging from Progressive Rock to Britpop and Pop Rock.
I've also been known to (awkwardly) hoop, smash, and stroke. (Style borrowed here from Prof. Schmidt)