Aaron Mueller
326 posts
Aaron Mueller
@amuuueller
Asst. Prof. in CS at @BU_Tweets ≡ {Mechanistic, causal} {interpretability, NLP}
Boston
aaronmueller.github.io
Joined September 2015
733
Following
1,936
Followers
  • Aaron Mueller
    @amuuueller
    Oct 30, 2024
    I'm recruiting PhD students for our new lab, coming to Boston University in Fall 2025! Our lab aims to understand, improve, and precisely control how language is learned and used in natural language systems (such as language models). Details below!
    Boston University's CDS building. (Official photo from BU CDS—not mine!)
    63K
  • Aaron Mueller
    @amuuueller
    Jan 30, 2023
    Announcing the BabyLM 👶 Challenge, the shared task at @conll_conf and CMCL'23! We’re calling on researchers to pre-train language models on (relatively) small datasets inspired by the input given to children learning language. babylm.github.io arxiv.org/abs/2301.11796
    BabyLM logo
    47K
  • Aaron Mueller
    @amuuueller
    May 24, 2024
    ⭐️🏛️ Very excited to announce that in fall 2025, I’ll be starting as an Assistant Professor of Computer Science at Boston University @BU_Tweets! Looking forward to joining a wonderful group of colleagues at @BUCompSci!
    16K
  • Aaron Mueller
    @amuuueller
    Jul 22, 2024
    What is cause and effect? What is a “mechanism”? And how do answers to these questions affect interpretability research? 📜 New preprint! 📜 Two key challenges for causal/mechanistic interpretability, and ways forward. To be presented at the mech interp workshop at #ICML2024:
    Missed Causes and Ambiguous Effects: Counterfactuals Pose Challenges for Interpreting Neural Networks. Aaron Mueller.
    11K
  • Aaron Mueller
    @amuuueller
    Mar 21, 2022
    We know that pre-trained seq2seq models such as T5 perform well on many downstream NLP tasks. Turns out, pre-training also teaches them about the hierarchical structure of language! 📜arxiv.org/abs/2203.09397 👨🏻‍💻github.com/sebschu/multil… w/ @bob_frank, @tallinzen, Wes Wang, @sebschu
  • Aaron Mueller
    @amuuueller
    Jun 28, 2023
    Scaling LMs works well. Is more parameters and data all it takes, or do certain architectural features or language styles bring out emergent abilities sooner? Let’s investigate by seeing what it takes for syntax 🌳 to emerge! At ACL! w/ @tallinzen 📜 arxiv.org/abs/2305.19905
    How to Plant Trees 🌳 in Language Models: Data and Architectural Effects on the Emergence of Syntactic Inductive Biases. By Aaron Mueller and Tal Linzen
    18K
  • Aaron Mueller
    @amuuueller
    Aug 22, 2024
    Thanks Tal! 📜 In this paper, we provide a theoretically grounded review of causal (which, imo, ⊇ mechanistic) interpretability. We argue that this gives a more cohesive narrative of the field, and makes it easier to see actionable open directions for future work! 🧵
    Tal Linzen
    @tallinzen
    Aug 19, 2024
    I very much enjoyed this survey of causal interpretability methods for neural networks from @amuuueller and many others - succinct, well organized, just opinionated enough. please write more reviews everyone arxiv.org/abs/2408.01416
    12K
  • Aaron Mueller
    @amuuueller
    Jun 24, 2021
    Language models are good at subject-verb agreement, even across center-embedded structures. Which neurons are responsible for this? Depends on the syntactic structure! arxiv.org/pdf/2106.06087… w/ @mattf1n, me, @sebgehr, @shieber, @tallinzen, & @boknilev. To appear at ACL'21!
  • Aaron Mueller
    @amuuueller
    Nov 15, 2023
    To those using in-context learning: LLMs behave differently on in-distribution vs. out-of-distribution examples—and chain-of-thought prompting has different effects on them! New preprint w/ @albertwebson, @jowenpetty, @tallinzen 📜 arxiv.org/abs/2311.07811
    8.9K
  • Aaron Mueller
    @amuuueller
    Dec 19, 2024
    What can mechanistic interpretability do for computational psycholinguists? @michaelwhanna and I took a stab at this question! We investigate garden path sentence processing in LMs at the feature (circuit) level.
    Michael Hanna
    @michaelwhanna
    Dec 19, 2024
    Sentences are partially understood before they're fully read. How do LMs incrementally interpret their inputs? In a new paper @amuuueller and I use mech interp to study how LMs process structurally ambiguous sentences. We show LMs rely on both syntactic & spurious features! 1/10
    5.7K
  • Aaron Mueller
    @amuuueller
    Apr 3, 2024
    Excited this project is out! Using sparse feature circuits, we can explain and modify how LMs arrive at a behavior. In this thread, I want to highlight open directions where computational linguists can use sparse feature circuits. 🧵
    Samuel Marks
    @saprmarks
    Apr 3, 2024
    Can we understand & edit unanticipated mechanisms in LMs? We introduce sparse feature circuits, & use them to explain LM behaviors, discover & fix LM bugs, & build an automated interpretability pipeline! Preprint w/ @can_rager, @ericjmichaud_, @boknilev, @davidbau, @amuuueller
    13K
  • Aaron Mueller
    @amuuueller
    Jun 14, 2024
    On my way to #NAACL24! 🇲🇽 Friends and folks interested in evaluation, (mechanistic) interpretability, causality, robustness, psycholinguistics, and/or coffee, let’s meet up? (And if you’re interested in doing a PhD in any/all of these topics, I would love to chat!)
    3.7K
  • Aaron Mueller
    @amuuueller
    Mar 23, 2023
    The evaluation pipeline for the BabyLM 👶 Challenge is out! We’re evaluating on BLiMP and a selection of (Super)GLUE tasks. Code 💻:
    GitHub - babylm/evaluation-pipeline-2023: Evaluation pipeline for the BabyLM Challenge 2023.
    From github.com
    9.6K
  • Aaron Mueller
    @amuuueller
    Dec 6, 2022
    I’ll be in Abu Dhabi starting tomorrow for EMNLP! Come see my CoNLL talk about causally probing for syntax in multilingual LMs! I’ll be around to chat about interpretability, multilingual NLP, robustness, and syntax. Get in touch via DMs or email! arxiv.org/abs/2210.14328