Rylan Schaeffer
1,490 posts
Rylan Schaeffer
@RylanSchaeffer
AI RS @ Meta TBD. On-Leave from Stanford w/ @sanmikoyejo. Prev @ Gemini, Meta, MIT, Harvard, Uber, UCL, UC Davis
Mountain View, CA
rylanschaeffer.github.io
Joined October 2011
1,851 Following
6,248 Followers
  • Rylan Schaeffer
    @RylanSchaeffer
    Sep 14, 2023
    Excited to announce my newest breakthrough project!! 🔥🔥 State-of-the-art results (100%!!) on widely used academic benchmarks (MMLU, GSM8K, HumanEval, OpenbookQA, ARC Challenge, etc.) 🔥🔥 1M param LLM trained on 100k tokens 🤯 How?? Introducing **phi-CTNL** 🧵👇 1/6
    233K
  • Rylan Schaeffer
    @RylanSchaeffer
    Jul 26, 2024
    Yesterday, I tweeted that model collapse appears when researchers intentionally induce it in ways that don't match what is done in practice Let me explain using the Shumailov et al. @Nature 2024 paper's methodology as an example Paper: nature.com/articles/s4158… 🧵⬇️ 1/N
    Rylan Schaeffer
    @RylanSchaeffer
    Jul 25, 2024
    For anyone interested in model collapse, I strongly urge people to look at our COLM 2024 paper arxiv.org/abs/2404.01413 Model collapse appears when researchers intentionally induce it in ways that simply don't match what is actually done in practice. @alexandr_wang is wrong
    AI models collapse when trained on recursively generated data
    From nature.com
    258K
  • Rylan Schaeffer
    @RylanSchaeffer
    Jul 25, 2024
    For anyone interested in model collapse, I strongly urge people to look at our COLM 2024 paper arxiv.org/abs/2404.01413 Model collapse appears when researchers intentionally induce it in ways that simply don't match what is actually done in practice. @alexandr_wang is wrong
    Alexandr Wang
    Meta
    @alexandr_wang
    Jul 25, 2024
    1/ New paper in Nature shows model collapse as successive model generations are recursively trained on synthetic data. This is an important result. While many researchers today view synthetic data as AI's philosopher’s stone, there is no free lunch. Read more 👇
    arxiv.org
    Is Model Collapse Inevitable? Breaking the Curse of Recursion by...
    The proliferation of generative models, combined with pretraining on web-scale data, raises a timely question: what happens when these models are trained on their own generated outputs? Recent...
    321K
  • Rylan Schaeffer
    @RylanSchaeffer
    Dec 18, 2023
    I had a wonderful but exhausting #NeurIPS2023 . On the flight home, I watched @srush_nlp 's lecture on linear RNNs & state space models youtube.com/watch?v=dKJEpO… Can't recommend highly enough. Dense with information plus references to know where to dig deeper💯💯
    63K
  • Rylan Schaeffer
    @RylanSchaeffer
    Nov 27, 2023
    Excited to begin announcing our #NeurIPS2023 workshop & conference papers (1/10)! 🔥🚀An Information-Theoretic Understanding of Maximum Manifold Capacity Representations🚀🔥 w/ amazing cast @vclecomte @BerivanISIK @sanmikoyejo @ziv_ravid @Andr3yGR @KhonaMikail @ylecun 1/7
    467K
  • Rylan Schaeffer
    @RylanSchaeffer
    Jul 3, 2025
    New position paper! Machine Learning Conferences Should Establish a “Refutations and Critiques” Track Joint w/ @sanmikoyejo @JoshuaK92829 @yegordb @bremen79 @koustuvsinha @in4dmatics @JesseDodge @suchenzang @BrandoHablando @MGerstgrasser @is_h_a @ObbadElyas 1/6
    94K
  • Rylan Schaeffer
    @RylanSchaeffer
    May 1, 2024
    What happens when generative models are trained on their own outputs? Prior works foretold of a catastrophic feedback loop, a curse of recursion, descending into madness as models consume their own outputs. Are we poisoning the very data necessary to train future models? 1/N
    133K
  • Rylan Schaeffer
    @RylanSchaeffer
    Nov 1, 2022
    Very excited to announce our #NeurIPS2022 paper No Free Lunch from Deep Learning in Neuroscience: A Case Study through Models of the Entorhinal-Hippocampal Circuit. It's a story about NeuroAI, told through a story about grid & place cells. Joint w/ @KhonaMikail @FieteGroup 1/15
  • Rylan Schaeffer
    @RylanSchaeffer
    Oct 14, 2024
    My 2nd to last #neuroscience paper will appear @unireps !! 🧠🧠 Maximizing Neural Regression Scores May Not Identify Good Models of the Brain 🧠🧠 w/ @KhonaMikail @neurostrow @BrandoHablando @sanmikoyejo Answering a puzzle 2 years in the making openreview.net/forum?id=vbtj0… 1/12
    openreview.net
    Position: Maximizing Neural Regression Scores May Not Identify Good...
    A prominent methodology in computational neuroscience posits that the brain can be understood by identifying which artificial neural network models most accurately predict biological neural...
    172K
  • Rylan Schaeffer
    @RylanSchaeffer
    Apr 11, 2025
    I'm going to catch hell for posting but to summarize: 1. This paper misled its way to an #ICLR2025 Oral 2. I pointed this out 3. AC rejected the paper 4. Authors complained & somehow persuaded ICLR to overrule the AC and award a Spotlight 5. AC made clear they were overruled
    120K
  • Rylan Schaeffer
    @RylanSchaeffer
    Dec 1, 2023
    Excited to announce: 🔥🧠 Associative Memory Under the Probabilistic Lens 🧠🔥 w @KhonaMikail @Andr3yGR @sanmikoyejo @neurostrow @FieteGroup & Nika Zahedi Appearing @ #NeurIPS2023 Associative Memory & Hopfield Networks Workshop ! 🧵👇 1/7
    63K
  • Rylan Schaeffer
    @RylanSchaeffer
    Jun 17, 2025
    🚨New preprint 🚨 Turning Down the Heat: A Critical Analysis of Min-p Sampling in Language Models We examine min-p sampling (ICLR 2025 oral) & find significant problems in all 4 lines of evidence: human eval, NLP evals, LLM-as-judge evals, community adoption claims 1/8
    75K
  • Rylan Schaeffer
    @RylanSchaeffer
    May 1, 2023
    We had meant to keep this under wraps for a few weeks, but it seems that the cat is out of the bag. Excited to announce our newest preprint!! **Are Emergent Abilities of Large Language Models a Mirage?** Joint w/ @sanmikoyejo & @BrandoHablando arxiv.org/abs/2304.15004 1/12
    Aran Komatsuzaki
    @arankomatsuzaki
    May 1, 2023
    Are Emergent Abilities of Large Language Models a Mirage? Presents an alternative explanation for emergent abilities: one can choose a metric which leads to the inference of an emergent ability or another metric which does not. arxiv.org/abs/2304.15004
    arxiv.org
    Are Emergent Abilities of Large Language Models a Mirage?
    Recent work claims that large language models display emergent abilities, abilities not present in smaller-scale models that are present in larger-scale models. What makes emergent abilities...
    95K
  • Rylan Schaeffer
    @RylanSchaeffer
    Jun 10, 2024
    ❤️‍🔥❤️‍🔥Excited to share our new paper ❤️‍🔥❤️‍🔥 **Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?** w/ @haileysch__ @BrandoHablando @gabemukobi @varunrmadan @herbiebradley @ai_phd @BlancheMinerva @sanmikoyejo arxiv.org/abs/2406.04391 1/N
    69K