Ph.D. Candidate, MIT Computer Science

I am Morris, a Ph.D. candidate in EECS at MIT, advised by Prof. Jacob Andreas, Prof. Stefanie Jegelka, and Prof. Ankur Moitra. My research lies broadly in natural language processing and machine learning, with a particular focus on the algorithmic foundations of language and intelligence. To that end, I am currently building the architectures, algorithms, and theoretical machinery needed to design neural sequence models with both strong empirical performance and provable guarantees on correctness, efficiency, and sample complexity. This requires fundamental innovation on three fronts.
Next-Generation Neural Sequence Models: Is there an "optimal" neural architecture, both in its computational cost during training and its memory footprint during inference? Is autoregression even the right paradigm for language modeling?
Algorithmic Foundations: Can we design learning algorithms for a broad class of neural sequence models that converge to a global optimum and guarantee generalization? Is there a single algorithm that adapts a model to changing data continuously?
Generator-Verifier Gap, or P vs. NP: How do we train models to search exponentially large solution spaces, and can they deliver optimal or certifiably near-optimal approximations to intractable problems?
I received my master's degree in computer science from UC Berkeley under the wonderful supervision of Prof. Prasad Raghavendra, studying the foundations of approximation. Before that, I was an undergraduate at Harvard University, where I had the great fortune to work with Prof. Madhu Sudan on aspects of algebraic complexity.
Manuscripts
Author order is alphabetical; * indicates first authorship
- Sequential Parallel Duality in Prefix Scannable Models
Authors: Morris Yau*, Sharut Gupta, Valerie Engelmayer, Kazuki Irie, Jacob Andreas, Stefanie Jegelka
arXiv Link: https://arxiv.org/abs/2506.10918
Publications
- Learning Linear Attention in Polynomial Time
Authors: Morris Yau*, Ekin Akyürek, Jiayuan Mao, Joshua B. Tenenbaum, Stefanie Jegelka, Jacob Andreas
Venue: NeurIPS 2025 (Oral, Top 1.5% of Accepted Papers)
arXiv Link: https://arxiv.org/abs/2410.10101
- Are Graph Neural Networks Optimal Approximation Algorithms?
Authors: Morris Yau*, Eric Lu, Nikolaos Karalias, Jessica Xu, Stefanie Jegelka
Venue: NeurIPS 2024 (Spotlight)
arXiv Link: https://arxiv.org/abs/2310.00526
- Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
Authors: Kazuki Irie, Morris Yau, Sam Gershman
Venue: NeurIPS 2025
arXiv Link: https://arxiv.org/abs/2506.00744
- Tensor Decompositions Meet Control Theory: Learning General Mixtures of Linear Dynamical Systems
Authors: Ainesh Bakshi, Allen Liu, Ankur Moitra, Morris Yau
Venue: 40th International Conference on Machine Learning (ICML 2023)
- A New Approach To Learning Linear Dynamical Systems
Authors: Ainesh Bakshi, Allen Liu, Ankur Moitra, Morris Yau
Venue: 55th Annual ACM Symposium on Theory of Computing (STOC 2023)
arXiv Link: https://arxiv.org/abs/2301.09519
- Kalman Filtering with Adversarial Corruptions
Authors: Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau
Venue: 54th Annual ACM Symposium on Theory of Computing (STOC 2022)
arXiv Link: https://arxiv.org/abs/2111.06395
- Online and Distribution Free Robustness: Regression and Contextual Bandits with Huber Contamination
Authors: Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau
Venue: Proceedings of the 62nd Annual IEEE Symposium on Foundations of Computer Science (FOCS 2021)
arXiv Link: https://arxiv.org/abs/2010.04157
- Classification under Misspecification: Halfspaces, Generalized Linear Models, and Connections to Evolvability
Authors: Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau
Venue: Advances in Neural Information Processing Systems 33 (NeurIPS 2020), Spotlight
arXiv Link: https://arxiv.org/abs/2006.04787
- List Decodable Mean Estimation in Nearly Linear Time
Authors: Yeshwanth Cherapanamjeri, Sidhanth Mohanty, Morris Yau
Venue: Proceedings of the 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS 2020)
arXiv Link: https://arxiv.org/abs/2005.09796
- List Decodable Subspace Recovery
Authors: Prasad Raghavendra and Morris Yau
Venue: The 33rd Annual Conference on Learning Theory (COLT 2020)
arXiv Link: https://arxiv.org/abs/2002.03004
- List Decodable Learning via Sum of Squares
Authors: Prasad Raghavendra and Morris Yau
Venue: Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2020)
arXiv Link: https://arxiv.org/abs/1905.04660