🎓

Bingbing Wen

PhD Student

👋 About Me

I am a final-year Ph.D. candidate at the University of Washington, where I am advised by Prof. Bill Howe and Prof. Lucy Lu Wang. I am also a member of the UW RAISE Center and collaborate closely with Prof. Yulia Tsvetkov.

My research focuses on the efficiency and reliability of foundation models and agentic systems, aiming to reduce computational overhead while enhancing model trustworthiness. My work is structured around three core pillars:

Data-Centric Optimization: I develop methods for optimal data mixture selection and curation, designing fine-grained preference signals that align models beyond simple correctness. (ICLR 2026 DATA-FM, COLM 2025)
Agent Workflows & Modular Architectures: I design adaptive systems that optimize tool use and dynamic routing mechanisms. My work explores how reinforcement learning can orchestrate collaboration among specialized experts to streamline complex agent workflows. (TACL 2025, ICML 2025, IUI 2026, CoA)
Reliability-Aware Evaluation: I design frameworks for selective prediction and abstention, enabling models to quantify uncertainty and conserve resources by avoiding unnecessary computation on low-confidence samples. (ACL 2025, EMNLP 2024, NeurIPS 2025)

Education

PhD in Information Science (Natural Language Processing)

University of Washington

MS in Computational Science & Engineering (Artificial Intelligence)

University of Hong Kong

BS in Control Science & Engineering (Robotics)

Zhejiang University

Research Interests

Developing data‑ and compute‑efficient methods that enable foundation models to learn, adapt, and allocate resources optimally across tasks and data sources, from training through inference.

Featured Publications

Multimodal

MixAtlas: Uncertainty-aware Data Mixture Optimization for Multimodal LLM Midtraining

Uncertainty-aware data mixture optimization for multimodal LLM midtraining via interpretable domain decomposition.

Mar 2, 2026

Agentic Systems

Clarify or Answer: Reinforcement Learning for Agentic VQA with Context Under-specification

Reinforcement learning for agentic VQA that balances clarification and answering under underspecified context.

Jan 20, 2026

Large Language Models

MARVEL: Modular Abstention for Reliable and Versatile Expert LLMs

A modular abstention framework for reliable expert LLMs that enables selective abstention from uncertain questions.

Jul 1, 2025

Large Language Models

AutoScale: Automatic Prediction of Compute-optimal Data Composition for Training LLMs

Automatic prediction of compute-optimal data composition for efficient LLM training.

May 1, 2025

Large Language Models

Do Language Models Mirror Human Confidence? Exploring Psychological Insights to Address Overconfidence in LLMs

Exploring psychological insights to address overconfidence in LLMs by comparing with human confidence patterns.

May 1, 2025

Recent Publications

Bingbing Wen, Sirajul Salekin, Feiyang Kang, Bill Howe, Lucy Lu Wang, Javier Movellan, Manjot Bilkhu (2026). MixAtlas: Uncertainty-aware Data Mixture Optimization for Multimodal LLM Midtraining. ICLR 2026 DATA-FM.

Zhen Cao, Bingbing Wen, Lucy Lu Wang (2026). Clarify or Answer: Reinforcement Learning for Agentic VQA with Context Under-specification. arXiv.

Lin Guo, Chenhao Yuan, Meng Zhong, Robert Wolfe, Rui Zhong, Yuxuan Xu, Bingbing Wen, Hao Shen, et al. (2026). SusBench: An Online Benchmark for Evaluating Dark Pattern Susceptibility of Computer-Use Agents. IUI 2026.

Zhen Cao, Bingbing Wen, Lucy Lu Wang (2025). Asking the Missing Piece: Context-Driven Clarification for Ambiguous VQA. NeurIPS 2025 FoRLM.

Yifei Yang, Changping Lee, Sheng Shen Feng, Dongxu Zhao, Bingbing Wen, Andrew Z. Liu, Yulia Tsvetkov, Bill Howe (2025). Escaping the SpuriVerse: Can Large Vision-Language Models Generalize Beyond Seen Spurious Correlations? NeurIPS 2025 D&B.

📰 News

3/2026 Our paper on uncertainty-aware data mixture optimization for MLLM midtraining has been accepted to the ICLR 2026 DATA-FM Workshop!

1/2026 Our paper on reinforcement learning for agentic VQA has been released on arXiv.

1/2026 Our benchmark on dark pattern susceptibility of computer-use agents has been accepted by IUI 2026.

9/2025 Our paper on spurious correlations in MLLMs has been accepted by NeurIPS 2025!

7/2025 I presented our survey on abstention in LLMs (oral presentation) and our confidence calibration paper (poster) at ACL 2025!

7/2025 Our paper on modular abstention has been accepted by ICML 2025!

6/2025 I will start a summer research internship at Apple!

5/2025 Our paper on optimal data mixing in pretraining has been accepted by COLM 2025!

5/2025 Our paper on confidence calibration has been accepted by ACL 2025!

2/2025 Our survey on abstention in LLMs has been accepted by TACL 2025!