Maarten Sap

I am an assistant professor at CMU's LTI department with a courtesy appointment in HCII, and a part-time research scientist and AI safety lead at the Allen Institute for AI (AI2). My research focuses on (1) measuring and improving AI systems' social and interactional intelligence, (2) assessing and combatting social inequality, safety risks, and socio-cultural biases in human- or AI-generated language, and (3) building narrative language technologies for prosocial outcomes. I was named a 2025 Packard Fellow and a recipient of the 2025 Okawa Research Award.

I received my PhD from the University of Washington where I was advised by Noah Smith and Yejin Choi.
[bio for talks]

Recent updates:

December 2025 πŸ…πŸ“ƒ: Very excited to have our paper Artificial Hivemind: The Open-Ended Homogeneity of Language Models (and Beyond) selected for a Best Paper Award at NeurIPS 2025 (Datasets and Benchmarks Track)!! Huge congrats to the first author Liwei Jiang!!!

November 2025 πŸ’ŽπŸš€: Honored to be a Spring 2025 recipient of the Amazon Research Award for our project on measuring AI agentic safety!

October 2025 πŸ…β­: I’m super excited and grateful to announce that I'm part of the 2025 class of Packard Fellows. The Packard Foundation and this fellowship will allow me to explore exciting research directions towards culturally responsible and safe AI 🌍🌈

October 2025 πŸ”πŸ§‘β€πŸŽ“: Due to my lab being quite full already, I'm not taking looking for any new students in this upcoming PhD application cycle 😟.

October 2025 πŸ‡¨πŸ‡¦πŸŽ‰: Excited to be attending COLM 2025 in Montreal this October! I'll be giving a talk at the Social Sim Workshop on Unlocking Social Intelligence in AI agents. I'm also thrilled that five papers I co-authored will be presented by my amazing collaborators at COLM: HAICOSYSTEM: An Ecosystem for Sandboxing Safety Risks in Human-AI Interactions (led by Xuhui Zhou et al.), ALFA: Aligning LLMs to Ask Good Questions: A Case Study in Clinical Reasoning (co-led by Jimin Mun et al.), PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages, Fluid Language Model Benchmarking, and The Delta Learning Hypothesis: Preference Tuning on Weak Data can Yield Strong Gains.

August 2025 🌟: Incredibly honored to be one of 7 US recipients of the 2025 Okawa Research Grant from the Okawa Foundation!

August 2025 πŸ§‘β€πŸŽ“: Welcoming my first postdoc, Vasudha Varadarajan, to the lab!

[older news]


My research group:

Dan Chechelnitsky
CMU Portugal LTI PhD student, co-advised with Chrysoula Zerva

Joel Mire
LTI PhD student

Karina Halevy
LTI PhD student, co-advised with Mona Diab

Jimin Mun
LTI PhD student

Jocelyn Shen
MIT PhD student, co-advised with Cynthia Breazeal

Kynnedy Smith
HCII PhD student, co-advised with Motahhare Eslami

Vasudha Varadarajan
LTI Postdoc

Akhila Yerukola
LTI PhD student

Mingqian Zheng
LTI PhD student, co-advised with Carolyn RosΓ©

Xuhui Zhou
LTI PhD student


Overarching Research Themes

Themes extracted and images generated with the OpenAI API; there may be inconsistencies.

AI Agent Safety Frameworks

My research group explores frameworks for ensuring the safety and reliability of AI agents in real-world applications. One key paper is [OpenAgentSafety: A Comprehensive Framework for Evaluating Real-World AI Agent Safety](https://arxiv.org/abs/2507.06134), which offers a structured approach to assessing AI agent behaviors and safety risks. Our paper [TOM-SWE: User Mental Modeling For Software Engineering Agents](https://arxiv.org/abs/2510.21903) examines user interactions with software engineering agents, showing how modeling users' mental states can improve agent performance. Together, these works contribute to a deeper understanding of the complexities surrounding AI agent safety.

Ethics in AI Development

My research group explores the ethical implications of AI technologies, particularly in the context of decision-making and user trust. We highlight the findings from our study, [PluriHarms: Benchmarking the Full Spectrum of Human Judgments on AI Harm](https://arxiv.org/abs/2601.08951), which evaluates diverse perspectives on AI-induced harm and promotes responsible AI adoption. Another crucial paper, [Synthetic Socratic Debates: Examining Persona Effects on Moral Decision and Persuasion Dynamics](https://arxiv.org/abs/2506.12657), emphasizes the moral nuances involved in AI interactions through simulated debates. These studies help inform the ongoing dialogue about ethical standards and responsible behavior in AI systems.

Narrative Analysis and Empathy

My research group explores how narrative frameworks can enhance understanding and emotional engagement in human-computer interactions. Our recent work, [Social Story Frames: Contextual Reasoning about Narrative Intent and Reception](https://arxiv.org/abs/2512.15925), investigates how story structures impact user perceptions and emotional responses. In another significant paper, [HEART-felt Narratives: Tracing Empathy and Narrative Style in Personal Stories with LLMs](https://arxiv.org/abs/2405.17633), we analyze the role of personalized storytelling in fostering empathy between users and AI systems. These explorations underscore the importance of stories in shaping AI experiences and user interactions.

Evaluating Social Intelligence in AI

My research group explores methods for assessing social intelligence in AI systems, focusing on how these agents perceive and interact within human social contexts. Our paper, [SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents](https://arxiv.org/abs/2310.11667), introduces a framework for evaluating the social cognitive capabilities of language agents. Additionally, [SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions](https://arxiv.org/abs/2506.23046) investigates the ability of AI to understand and simulate social dynamics from multiple perspectives, raising questions about the complexity of human-like interaction capabilities. These studies contribute to a better understanding of how AI agents behave, and should behave, in social contexts.