Morris Yau


Ph.D. Candidate, MIT Computer Science

I am Morris, a Ph.D. candidate in EECS at MIT, advised by Prof. Jacob Andreas, Prof. Stefanie Jegelka, and Prof. Ankur Moitra. My research is broadly in Natural Language Processing and Machine Learning, with a particular focus on the algorithmic foundations of language and intelligence. To that end, I am currently focused on building the architectures, algorithms, and theoretical machinery required to design neural sequence models with both strong empirical performance and provable guarantees on correctness, efficiency, and sample complexity. This requires fundamental innovation on three fronts.

Next-Generation Neural Sequence Models: Is there an "optimal" neural architecture, both in its computational cost during training and its memory footprint during inference? Is autoregression even the right paradigm for language modeling?

Algorithmic Foundations: Can we design learning algorithms for a broad class of neural sequence models that converge to a global optimum and guarantee generalization? Is there a single algorithm that continuously adapts a model to changing data?

Generator-Verifier Gap, or P vs. NP: How do we train models to search exponentially large solution spaces, and can they deliver optimal or certifiably near-optimal approximations to intractable problems?

I received my master's degree in computer science from UC Berkeley under the wonderful supervision of Prof. Prasad Raghavendra, studying the foundations of approximation. Before that, I was an undergraduate at Harvard University, where I had the great fortune to work with Prof. Madhu Sudan on aspects of algebraic complexity.

Manuscripts

Author order is alphabetical; * indicates first authorship.

  • Sequential Parallel Duality in Prefix Scannable Models
    Authors: Morris Yau*, Sharut Gupta, Valerie Engelmayer, Kazuki Irie, Jacob Andreas, Stefanie Jegelka
    Arxiv Link: https://arxiv.org/abs/2506.10918

Publications

  • Learning Linear Attention in Polynomial Time
    Authors: Morris Yau*, Ekin Akyürek, Jiayuan Mao, Joshua B. Tenenbaum, Stefanie Jegelka, Jacob Andreas
    Venue: NeurIPS 2025 (Oral, top 1.5% of accepted papers)
    Arxiv Link: https://arxiv.org/abs/2410.10101
  • Are Graph Neural Networks Optimal Approximation Algorithms?
    Authors: Morris Yau*, Eric Lu, Nikolaos Karalias, Jessica Xu, Stefanie Jegelka
    Venue: NeurIPS 2024 (Spotlight)
    Arxiv Link: https://arxiv.org/abs/2310.00526

  • Blending Complementary Memory Systems in Hybrid Quadratic-Linear Transformers
    Authors: Kazuki Irie, Morris Yau, Sam Gershman
    Venue: NeurIPS 2025
    Arxiv Link: https://arxiv.org/abs/2506.00744

  • Tensor Decompositions Meet Control Theory: Learning General Mixtures of Linear Dynamical Systems
    Authors: Ainesh Bakshi, Allen Liu, Ankur Moitra, Morris Yau
    Venue: 40th International Conference on Machine Learning (ICML 2023)
  • A New Approach To Learning Linear Dynamical Systems
    Authors: Ainesh Bakshi, Allen Liu, Ankur Moitra, Morris Yau
    Venue: 55th Annual ACM Symposium on Theory of Computing (STOC 2023)
    Arxiv Link: https://arxiv.org/abs/2301.09519
  • Kalman Filtering with Adversarial Corruptions
    Authors: Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau
    Venue: 54th Annual ACM Symposium on Theory of Computing (STOC 2022)
    Arxiv Link: https://arxiv.org/abs/2111.06395
  • Online and Distribution-Free Robustness: Regression and Contextual Bandits with Huber Contamination
    Authors: Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau
    Venue: Proceedings of the 62nd Annual IEEE Symposium on Foundations of Computer Science (FOCS 2021)
    Arxiv Link: https://arxiv.org/abs/2010.04157

  • Classification under Misspecification: Halfspaces, Generalized Linear Models, and Connections to Evolvability
    Authors: Sitan Chen, Frederic Koehler, Ankur Moitra, Morris Yau
    Venue: Advances in Neural Information Processing Systems 33 (NeurIPS 2020), Spotlight
    Arxiv Link: https://arxiv.org/abs/2006.04787

  • List Decodable Mean Estimation in Nearly Linear Time
    Authors: Yeshwanth Cherapanamjeri, Sidhanth Mohanty, Morris Yau
    Venue: Proceedings of the 61st Annual IEEE Symposium on Foundations of Computer Science (FOCS 2020)
    Arxiv Link: https://arxiv.org/abs/2005.09796

  • List Decodable Subspace Recovery
    Authors: Prasad Raghavendra and Morris Yau
    Venue: The 33rd Annual Conference on Learning Theory (COLT 2020)
    Arxiv Link: https://arxiv.org/abs/2002.03004

  • List Decodable Learning via Sum of Squares
    Authors: Prasad Raghavendra and Morris Yau
    Venue: Proceedings of the Thirty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2020)
    Arxiv Link: https://arxiv.org/abs/1905.04660