Eric J. Michaud (@ericjmichaud_)

315 posts

Eric J. Michaud

@ericjmichaud_

Trying to make deep neural networks among the best understood objects in the universe. 💻🤖🧠👽🔭🚀

Berkeley

Joined February 2015

Pinned
Eric J. Michaud
@ericjmichaud_
Jan 13
How does scaling up neural networks change what they learn? Despite its importance, our understanding of this question remains nascent. I've written a long post reflecting on my model of neural scaling and its relationship to interpretability, etc.: ericjmichaud.com/quanta
361K
Eric J. Michaud
@ericjmichaud_
Nov 21, 2023
Replying to @dwarkesh_sp
tl;dr: Maybe learning simple things (basic knowledge, heuristics, etc.) actually lowers the loss more than learning sophisticated things (algorithms associated with higher cognition that we really care about), and the sophisticated things will eventually be learned as scaling
xuan (ɕɥɛn / sh-yen)
@xuanalogue
Jun 8, 2023
I respect Jacob a lot but I find it really difficult to engage with predictions of LLM capabilities that presume some version of the scaling hypothesis will continue to hold - it just seems highly implausible given everything we already know about the limits of transformers!
149K
Eric J. Michaud
@ericjmichaud_
Feb 10, 2024
Our group has a new preprint out, in which we make some very tentative steps towards translating trained neural networks into code: arxiv.org/abs/2402.05110 Quick summary/thoughts 🧵:
106K
Eric J. Michaud
@ericjmichaud_
May 22, 2025
Today, the most competent AI systems in almost *any* domain (math, coding, etc.) are broadly knowledgeable across almost *every* domain. Does it have to be this way, or can we create truly narrow AI systems? In a new preprint, we explore some questions relevant to this goal...
61K
Eric J. Michaud
@ericjmichaud_
Oct 31, 2024
If you cluster language model features (from SAEs) into two groups based on whether they tend to fire together in the same document, you find two "lobes" of features that also turn out to be geometrically distinct! Math vs. prose features separate in this t-SNE plot...
David D. Baek
@dbaek__
Oct 29, 2024
1/6 New paper! “The Geometry of Concepts: Sparse Autoencoder Feature Structure.” We find that the concept universe of SAE features has interesting structure at three levels: 1) “atomic” small-scale, 2) “brain” intermediate-scale, and 3) “galaxy” large-scale!
157K
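A minimal sketch of the clustering idea in the Oct 31, 2024 post above: split features into two groups according to whether they tend to fire in the same documents. The post does not say which affinity measure or clustering algorithm was used, so the cosine-similarity affinity and the two-way spectral split below are assumptions, and the function names and toy data are hypothetical.

```python
import numpy as np


def cluster_features_by_cooccurrence(fires: np.ndarray) -> np.ndarray:
    """Split features into two groups from document-level co-occurrence.

    fires: boolean array of shape (n_docs, n_features); fires[d, f] is True
    if feature f is active anywhere in document d.
    Returns an array of 0/1 group labels, one per feature.
    """
    x = fires.astype(float)
    co = x.T @ x  # co[i, j] = number of documents where i and j both fire
    counts = np.diag(co).copy()  # number of documents where each feature fires

    # Assumed affinity: cosine similarity between document-occurrence vectors.
    denom = np.sqrt(np.outer(counts, counts))
    affinity = np.divide(co, denom, out=np.zeros_like(co), where=denom > 0)
    np.fill_diagonal(affinity, 0.0)

    # Symmetric normalized graph Laplacian: I - D^{-1/2} A D^{-1/2}.
    degree = affinity.sum(axis=1)
    inv_sqrt = np.zeros_like(degree)
    np.divide(1.0, np.sqrt(degree), out=inv_sqrt, where=degree > 0)
    laplacian = np.eye(len(degree)) - inv_sqrt[:, None] * affinity * inv_sqrt[None, :]

    # Assumed clustering step: two-way spectral split on the sign of the
    # eigenvector with the second-smallest eigenvalue (eigh sorts ascending).
    _, eigvecs = np.linalg.eigh(laplacian)
    return (eigvecs[:, 1] > 0).astype(int)


def make_toy_activations(n_docs_per_topic=100, n_features_per_group=3, seed=0):
    """Hypothetical toy data: two document types, each favoring one feature group."""
    rng = np.random.default_rng(seed)
    n = n_features_per_group
    probs_a = np.r_[np.full(n, 0.8), np.full(n, 0.1)]  # first group fires often
    probs_b = probs_a[::-1]                            # second group fires often
    docs_a = rng.random((n_docs_per_topic, 2 * n)) < probs_a
    docs_b = rng.random((n_docs_per_topic, 2 * n)) < probs_b
    return np.vstack([docs_a, docs_b])


if __name__ == "__main__":
    labels = cluster_features_by_cooccurrence(make_toy_activations())
    print(labels)
```

Which group receives label 0 is arbitrary, since an eigenvector's sign is not fixed. If the co-occurrence graph splits into fully disconnected pieces, the second eigenvector is not unique and this split is not meaningful.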
Eric J. Michaud
@ericjmichaud_
Jun 2, 2025
I've moved to SF and am working at @GoodfireAI this summer! Excited to be here and to spend time with many friends, old and new.
28K
Eric J. Michaud
@ericjmichaud_
Oct 12, 2024
Earlier this year I tried writing a doc to encourage the MIT Department of Physics to create a new PhD specialization in ML. The doc ended up being a little vague and so I didn't push it. But in light of the Physics Nobel this week, I thought I'd share it here:
31K
Eric J. Michaud
@ericjmichaud_
Mar 24, 2023
Understanding the origin of neural scaling laws and the emergence of new capabilities with scale is key to understanding what deep neural networks are learning. In our new paper, @tegmark, @ZimingLiu11, @uzpg_ and I develop a theory of neural scaling. 🧵:
arxiv.org
The Quantization Model of Neural Scaling
We propose the Quantization Model of neural scaling laws, explaining both the observed power law dropoff of loss with model and data size, and also the sudden emergence of new capabilities with...
81K
Eric J. Michaud
@ericjmichaud_
Nov 29, 2023
*The Space of LLM Learning Curves* The mean loss improves smoothly over LLM training. But this averages over very many loss curves on individual tokens. I've made some interactive visualizations for exploring the per-token curves: ericjmichaud.com/llm-curve-visu… Demo & observations:
12K
Eric J. Michaud
@ericjmichaud_
Apr 5, 2025
These discussions about AI timelines & risk often seem to hinge on a basic difference in intuition for what intelligence is and how powerful it is...
Dwarkesh Patel
@dwarkesh_sp
Apr 3, 2025
The @slatestarcodex & @DKokotajlo episode. Scott and Daniel break down every month from now until the 2027 intelligence explosion. Misaligned hive minds, Xi and Trump waking up, automated Ilyas accelerating AI progress. I went in quite skeptical. But I learned a tremendous
23K
Eric J. Michaud
@ericjmichaud_
Oct 26, 2022
New preprint out with @ZimingLiu11 and @tegmark on "Precision Machine Learning". In this paper, we consider what becomes involved when you care about the difference between approximating a function with error 0.001 vs. error 0.00000000000001. arxiv.org/abs/2210.13447
Eric J. Michaud
@ericjmichaud_
May 24, 2024
Our group has a new preprint out with some observations about the structure of language model features. I'll share some additional reflections on what motivated this work, how this relates to some phenomena in SAEs, some uncertainties, and its implications for interpretability:
Josh Engels
@JoshAEngels
May 24, 2024
1/10 New paper! "Not All Language Model Features Are Linear." Prior work says language model features are linear, but we find some that are multi-dimensional! How can we auto-find these multi-d features? Do models really use them? What even is a multi-d feature? Answers below🧵
7.8K
Eric J. Michaud
@ericjmichaud_
Jun 18, 2022
Last month, we put out a preprint led by @ZimingLiu11 on the phenomenon of "grokking" in deep learning. Here's a blog post with some videos and additional discussion to accompany the paper: ericjmichaud.com/grokking-squar…
Eric J. Michaud
@ericjmichaud_
Nov 19, 2024
Since the internal structure of neural networks, through training, comes to reflect the structure of the external world, advances in interpretability and our understanding of neural computation more generally could have a huge impact across science over the coming years...
Elana Simon
@ElanaPearl
Nov 18, 2024
🧬What are protein language models (PLMs) actually learning about biology? Our paper introduces InterPLM - a framework that reveals interpretable features in PLMs using sparse autoencoders, giving us a window into how these models represent protein structure and function. 🧵(1/9)
3.6K