I am a second-year PhD student in Computer Science at the Johns Hopkins University Whiting School of Engineering. I am advised by Dr. Eric Nalisnick and Dr. Anqi Liu, and I work closely with Dr. Gillian Hadfield. My research interests span AI safety and robustness, AI alignment, and human-AI collaboration. My work brings together ideas from machine learning and cognitive science to create more reliable, trustworthy, and human-centered AI systems.
January 2025: I was named a Junior Member of the Future of Life Institute's AI Existential Safety Community for my work on human-centric AI!
October 2024: I received the Jun Wu and Yan Zhang Graduate Student Fellowship, as well as the Louis M. Brown Engineering Fellowship, from Johns Hopkins University.
Large language models (LLMs) demonstrate a remarkable ability to learn new tasks from a few in-context examples.
However, this flexibility introduces safety concerns: LLMs can be influenced by incorrect or malicious demonstrations.
This motivates principled system designs with built-in mechanisms that guard against such attacks.
We propose a novel approach to limit the degree to which harmful demonstrations can degrade model performance.
We present both theoretical and empirical results showing that our approach effectively controls the risk posed by harmful in-context demonstrations while simultaneously achieving substantial performance and efficiency gains from helpful ones.
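To make the setting concrete, the sketch below shows generic few-shot in-context learning, where demonstrations (helpful or harmful) are prepended to the query. It is an illustration of the setup, not of our defense, and the example task and labels are hypothetical.

```python
# Minimal sketch of few-shot in-context learning (generic setup, not our defense).
# Each demonstration is an (input, label) pair prepended to the query; a mislabeled
# pair in `demos` is exactly the kind of harmful demonstration to guard against.

def build_few_shot_prompt(demos, query):
    """Format demonstrations and a query into a single prompt string."""
    lines = [f"Input: {x}\nOutput: {y}" for x, y in demos]
    lines.append(f"Input: {query}\nOutput:")
    return "\n\n".join(lines)

demos = [
    ("The movie was wonderful.", "positive"),
    ("I hated every minute.", "negative"),
    ("Service was slow and rude.", "positive"),  # a malicious (mislabeled) demonstration
]
print(build_few_shot_prompt(demos, "A delightful surprise."))
```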
Multi-agent debate is often proposed to improve AI reasoning, yet it can sometimes harm performance. Prior work has studied only homogeneous agents; this work examines how diverse tasks and model capabilities affect debate dynamics. Experiments show that accuracy can decrease over successive debate rounds, even when stronger models outnumber weaker ones, because agents tend to adopt peers' incorrect reasoning for the sake of agreement. This reveals key failure modes of multi-agent debate and suggests that naively applying debate risks degrading performance when agents cannot resist persuasive but flawed arguments.
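To illustrate the protocol, here is a minimal, self-contained toy simulation of a multi-agent debate loop. The accuracy and conformity parameters are hypothetical, and this is a sketch of the general dynamics rather than the paper's experimental setup.

```python
import random

# Toy simulation of a multi-agent debate loop (an illustration of the protocol,
# not this paper's setup). Each agent is an (accuracy, conformity) pair:
# `accuracy` is its chance of answering correctly alone, `conformity` its
# chance of switching to the majority peer answer in a debate round.

def debate(agents, correct_answer, wrong_answer, num_rounds=3, seed=0):
    rng = random.Random(seed)
    # Round 0: every agent answers independently.
    answers = [correct_answer if rng.random() < acc else wrong_answer
               for acc, _ in agents]
    for _ in range(num_rounds):
        new_answers = []
        for i, (_, conformity) in enumerate(agents):
            peers = answers[:i] + answers[i + 1:]
            majority = max(set(peers), key=peers.count)
            # Failure mode: an agent may abandon its own answer to agree
            # with the majority, even when the majority is wrong.
            new_answers.append(majority if rng.random() < conformity else answers[i])
        answers = new_answers
    return answers

# Two strong agents and one weak agent; high conformity lets the weak
# agent's error spread, so accuracy can drop over rounds.
agents = [(0.9, 0.7), (0.9, 0.7), (0.4, 0.7)]
print(debate(agents, correct_answer="A", wrong_answer="B"))
```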
Drawing on insights from cognitive science, we study a previously overlooked factor that influences an ML agent's ability to learn human values: representational alignment.
We demonstrate that aligning an AI agent's representations with humans' can improve safety, sample efficiency, and generalization when learning a wide range of human values in personalization tasks.
This opens a new avenue toward scalable, robust, and personalized alignment of AI agents with human values.
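As background, one common way to quantify representational alignment, borrowed from cognitive science, is representational similarity analysis: correlate the pairwise similarity structure of human judgments with that of a model's embeddings. The sketch below assumes hypothetical inputs (`human_sim`, `model_embeddings`) and is a generic illustration, not the paper's exact metric.

```python
import numpy as np
from scipy.stats import spearmanr

# Generic sketch of representational similarity analysis (RSA): compare the
# pairwise similarity structure of human judgments with that of model
# embeddings. Illustrative only; not the metric used in the paper.

def rsa_alignment(human_sim, model_embeddings):
    """human_sim: (n, n) matrix of human pairwise similarity judgments.
    model_embeddings: (n, d) matrix of model representations of the same items."""
    # Cosine similarity between every pair of model embeddings.
    normed = model_embeddings / np.linalg.norm(model_embeddings, axis=1, keepdims=True)
    model_sim = normed @ normed.T
    # Correlate the off-diagonal (upper-triangle) entries of the two matrices.
    iu = np.triu_indices_from(human_sim, k=1)
    return spearmanr(human_sim[iu], model_sim[iu]).correlation

# Toy example with hypothetical data: 4 items, 8-dimensional embeddings.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
human = np.corrcoef(emb + 0.1 * rng.normal(size=emb.shape))  # noisy "human" similarities
print(rsa_alignment(human, emb))
```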
I prove the existence of Dirac conical points in multiple 2D materials under certain conditions on the electric potential, a spectral feature conjectured to be related to the unique properties of graphene. I also discover and prove the existence of a new type of spectral touching point, which I name the mesa touching point.
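For context, a Dirac conical point is a quasimomentum at which two dispersion surfaces touch linearly. The generic local form below is a standard illustration of the concept (k_*, E_*, and α are placeholder symbols for the touching point, shared energy, and slope), not a statement of the paper's theorems.

```latex
% Generic local behavior of two band surfaces near a Dirac conical point
% (standard illustration; k_*, E_*, and \alpha are placeholder symbols).
E_\pm(\mathbf{k}) = E_* \pm \alpha\,\lvert \mathbf{k} - \mathbf{k}_* \rvert
  + O\!\left(\lvert \mathbf{k} - \mathbf{k}_* \rvert^{2}\right), \qquad \alpha > 0.
```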
Work Experience
LinkedIn Core AI Team, PhD Research Intern, May - August 2025.
Expedia Group Vacation Rental Dynamic Pricing Team, Machine Learning Science Intern, May - July 2023.
Amazon Web Services Pool Balancing & Demand Forecasting Team, Software Development Engineer Intern, June - August 2022.