Mayee Chen
Engram
631 posts
@MayeeChen
CS PhD student @StanfordAILab @HazyResearch, undergrad @princeton. Working on all things data! she/her 🎃
Stanford, CA
mayeechen.github.io
Joined February 2020
730 Following · 2,353 Followers
  • Pinned
    Mayee Chen
    @MayeeChen
    Feb 13
    Data mixing - determining ratios across your training datasets - matters a lot for model quality. While building Olmo 3, we learned it’s hard to set up a method that finds a strong mix, and hard to maintain that mix as datasets change throughout development. Introducing Olmix👇
  • Mayee Chen
    @MayeeChen
    Jul 29, 2023
    Large language models (LMs) rely heavily on training data quality. How do we best select training data for good downstream model performance across tasks? Introducing 🍳Skill-It: a data-driven framework for understanding and training LMs! Paper: arxiv.org/abs/2307.14430 1/13
  • Mayee Chen
    @MayeeChen
    Oct 15, 2023
    Stanford CS graduate students are running an application support program for diversity students (broadly defined); if you are applying to our CS PhD program, we'll give one round of feedback on your application. SPACE IS LIMITED. APPLY BY OCTOBER 27: Link: tinyurl.com/3x5ke5uz
  • Mayee Chen
    @MayeeChen
    Apr 19, 2022
    New preprint alert! 📣 How do we produce transferable and robust representations with supervised contrastive learning? We need *geometric spread* and an inductive bias towards *latent subclass clustering* in representation space. 📜 arxiv.org/abs/2204.07596 👇 (1/n)
  • Mayee Chen
    @MayeeChen
    Jun 24, 2025
    LLMs often generate correct answers but struggle to select them. Weaver tackles this by combining many weak verifiers (reward models, LM judges) into a stronger signal using statistical tools from Weak Supervision—matching o3-mini-level accuracy with much cheaper models! 📊
  • Mayee Chen
    @MayeeChen
    Nov 12, 2024
    There are many algorithms for constructing pre-training data mixtures—which one should we use? Turns out: many of them fall under one framework, have similar issues, and can be improved with a straightforward modification. Introducing Aioli! 🧄 1/9
  • Mayee Chen
    @MayeeChen
    Apr 22, 2025
    !!! I'm at #ICLR2025 to present 🧄Aioli🧄 a unified framework for data mixing on Thursday afternoon! 🔗 arxiv.org/abs/2411.05735 Message me to chat about pre/post training data (mixing, curriculum, understanding); test-time compute/verification; or to try new food 🇸🇬
  • Mayee Chen
    @MayeeChen
    Jul 2, 2021
    New paper appearing in #ICML2021! Mandoline: Model Evaluation under Distribution Shift: Paper: arxiv.org/abs/2107.00643 Code: github.com/HazyResearch/m… work done w/ equal contribution from @krandiash and @nimit_sohoni , as well as @faitpoms, @kayvonf, and @HazyResearch 1/6
  • Mayee Chen
    @MayeeChen
    Oct 31, 2023
    Skill-It has been accepted to #NeurIPS2023 as a spotlight! Want to understand how skills in your training data can give way to better data selection methods? Our code is available here: github.com/HazyResearch/s…
    Logo of a skillet with ingredients resembling data. Beneath the skillet are the words, "Skill-It!"
  • Mayee Chen
    @MayeeChen
    Dec 10, 2024
    Given open-ended generations from K different LLMs for an input, can we learn to select the best generation, without needing any labeled data? Introducing our #NeurIPS2024 paper, Smoothie, a label-free test-time LLM routing algorithm! 🥤 1/4
  • Mayee Chen
    @MayeeChen
    Oct 3, 2024
    honored to have contributed to this incredibly exciting work! Archon shows how intelligently blending concepts like ensembling, fusing, and repeated sampling can create very strong LLM inference systems, using 70B+ open models to outperform GPT-4o and Claude 3.5 sonnet!
    Jon Saad-Falcon
    @JonSaadFalcon
    Sep 30, 2024
    What is the best way to spend your inference compute budget to create LLM systems greater than the sum of their parts? In our latest paper, we present Archon, an architecture search framework for inference-time techniques! Archon is enabled by inference-time architecture search
  • Mayee Chen
    @MayeeChen
    Aug 15, 2023
    Embroid's ability to improve LLM performance across 95 classification tasks is quite impressive, and it does so just by exploiting smoothness of other pre-trained model embeddings via knn (with theoretical guarantees too). Amazing work by @NeelGuha and honored to have helped out!
    Neel Guha
    @NeelGuha
    Aug 14, 2023
    We’re excited to share Embroid: a method for “stitching” together an LLM with embedding information from multiple smaller models (e.g., BERT), allowing us to automatically correct LLM predictions without supervision. ✍️: hazyresearch.stanford.edu/blog/2023-08-1… 📜: arxiv.org/abs/2307.11031
  • Mayee Chen
    @MayeeChen
    Dec 10, 2024
    Omw to #NeurIPS2024! I have 2 papers: - Smoothie (w @NeelGuha): a test-time LLM routing approach that requires no labels/training (Thu 4:30 East) - DCLM: a benchmark for LLM data curation (Fri 4:30 West) Excited to meet new ppl & chat about data-centric AI/ test-time approaches!
  • Mayee Chen
    @MayeeChen
    Dec 12, 2023
    I'm at #NeurIPS2023 from now to Sat and will be around at these posters: - Embroid Tues 5:15 - Skill-It Wed 10:45 - Segmentation for classification Wed 5:00 Let's chat about training data and data-centric frameworks for understanding/aligning LLMs! DM/email me 🍤 Paper details 👇