Rylan Schaeffer (@RylanSchaeffer) / X

Rylan Schaeffer

@RylanSchaeffer

AI RS @ Meta TBD. On-Leave from Stanford w/ @sanmikoyejo. Prev @ Gemini, Meta, MIT, Harvard, Uber, UCL, UC Davis

Mountain View, CA

rylanschaeffer.github.io

Joined October 2011

Rylan Schaeffer
@RylanSchaeffer
Sep 14, 2023
Excited to announce my newest breakthrough project!! 🔥🔥 State-of-the-art results (100%!!) on widely used academic benchmarks (MMLU, GSM8K, HumanEval, OpenbookQA, ARC Challenge, etc.) 🔥🔥 1M param LLM trained on 100k tokens 🤯 How?? Introducing **phi-CTNL** 🧵👇 1/6
Rylan Schaeffer
@RylanSchaeffer
Jul 26, 2024
Yesterday, I tweeted that model collapse appears when researchers intentionally induce it in ways that don't match what is done in practice. Let me explain using the Shumailov et al. @Nature 2024 paper's methodology as an example. Paper: nature.com/articles/s4158… 🧵⬇️ 1/N
Rylan Schaeffer
@RylanSchaeffer
Jul 25, 2024
For anyone interested in model collapse, I strongly urge you to look at our COLM 2024 paper: arxiv.org/abs/2404.01413 Model collapse appears when researchers intentionally induce it in ways that simply don't match what is actually done in practice. @alexandr_wang is wrong.
AI models collapse when trained on recursively generated data
From nature.com
Alexandr Wang
@alexandr_wang
Jul 25, 2024
1/ New paper in Nature shows model collapse as successive model generations are recursively trained on synthetic data. This is an important result. While many researchers today view synthetic data as an AI philosopher’s stone, there is no free lunch. Read more 👇
arxiv.org
Is Model Collapse Inevitable? Breaking the Curse of Recursion by...
The proliferation of generative models, combined with pretraining on web-scale data, raises a timely question: what happens when these models are trained on their own generated outputs? Recent...
Rylan Schaeffer
@RylanSchaeffer
Dec 18, 2023
I had a wonderful but exhausting #NeurIPS2023. On the flight home, I watched @srush_nlp's lecture on linear RNNs & state space models: youtube.com/watch?v=dKJEpO… Can't recommend it highly enough. Dense with information, plus references to know where to dig deeper 💯💯
Rylan Schaeffer
@RylanSchaeffer
Nov 27, 2023
Excited to begin announcing our #NeurIPS2023 workshop & conference papers (1/10)! 🔥🚀An Information-Theoretic Understanding of Maximum Manifold Capacity Representations🚀🔥 w/ amazing cast @vclecomte @BerivanISIK @sanmikoyejo @ziv_ravid @Andr3yGR @KhonaMikail @ylecun 1/7
Rylan Schaeffer
@RylanSchaeffer
Jul 3, 2025
New position paper! Machine Learning Conferences Should Establish a “Refutations and Critiques” Track Joint w/ @sanmikoyejo @JoshuaK92829 @yegordb @bremen79 @koustuvsinha @in4dmatics @JesseDodge @suchenzang @BrandoHablando @MGerstgrasser @is_h_a @ObbadElyas 1/6
Rylan Schaeffer
@RylanSchaeffer
May 1, 2024
What happens when generative models are trained on their own outputs? Prior works foretold of a catastrophic feedback loop, a curse of recursion, descending into madness as models consume their own outputs. Are we poisoning the very data necessary to train future models? 1/N
Rylan Schaeffer
@RylanSchaeffer
Nov 1, 2022
Very excited to announce our #NeurIPS2022 paper No Free Lunch from Deep Learning in Neuroscience: A Case Study through Models of the Entorhinal-Hippocampal Circuit. It's a story about NeuroAI, told through a story about grid & place cells. Joint w/ @KhonaMikail @FieteGroup 1/15
Rylan Schaeffer
@RylanSchaeffer
Oct 14, 2024
My 2nd to last #neuroscience paper will appear @unireps !! 🧠🧠 Maximizing Neural Regression Scores May Not Identify Good Models of the Brain 🧠🧠 w/ @KhonaMikail @neurostrow @BrandoHablando @sanmikoyejo Answering a puzzle 2 years in the making openreview.net/forum?id=vbtj0… 1/12
openreview.net
Position: Maximizing Neural Regression Scores May Not Identify Good...
A prominent methodology in computational neuroscience posits that the brain can be understood by identifying which artificial neural network models most accurately predict biological neural...
Rylan Schaeffer
@RylanSchaeffer
Apr 11, 2025
I'm going to catch hell for posting but to summarize:
1. This paper misled its way to an #ICLR2025 Oral
2. I pointed this out
3. AC rejected the paper
4. Authors complained & somehow persuaded ICLR to overrule the AC and award a Spotlight
5. AC made clear they were overruled
Rylan Schaeffer
@RylanSchaeffer
Dec 1, 2023
Excited to announce: 🔥🧠 Associative Memory Under the Probabilistic Lens 🧠🔥 w/ @KhonaMikail @Andr3yGR @sanmikoyejo @neurostrow @FieteGroup & Nika Zahedi. Appearing @ #NeurIPS2023 Associative Memory & Hopfield Networks Workshop! 🧵👇 1/7
Rylan Schaeffer
@RylanSchaeffer
Jun 17, 2025
🚨New preprint 🚨 Turning Down the Heat: A Critical Analysis of Min-p Sampling in Language Models We examine min-p sampling (ICLR 2025 oral) & find significant problems in all 4 lines of evidence: human eval, NLP evals, LLM-as-judge evals, community adoption claims 1/8
Rylan Schaeffer
@RylanSchaeffer
May 1, 2023
We had meant to keep this under wraps for a few weeks, but it seems that the cat is out of the bag. Excited to announce our newest preprint!! **Are Emergent Abilities of Large Language Models a Mirage?** Joint w/ @sanmikoyejo & @BrandoHablando arxiv.org/abs/2304.15004 1/12
Aran Komatsuzaki
@arankomatsuzaki
May 1, 2023
Are Emergent Abilities of Large Language Models a Mirage? Presents an alternative explanation for emergent abilities: one can choose a metric which leads to the inference of an emergent ability or another metric which does not. arxiv.org/abs/2304.15004
arxiv.org
Are Emergent Abilities of Large Language Models a Mirage?
Recent work claims that large language models display emergent abilities, abilities not present in smaller-scale models that are present in larger-scale models. What makes emergent abilities...
Rylan Schaeffer
@RylanSchaeffer
Jun 10, 2024
❤️‍🔥❤️‍🔥Excited to share our new paper ❤️‍🔥❤️‍🔥 **Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?** w/ @haileysch__ @BrandoHablando @gabemukobi @varunrmadan @herbiebradley @ai_phd @BlancheMinerva @sanmikoyejo arxiv.org/abs/2406.04391 1/N