Daniel Johnson (@_ddjohnson) / X

Daniel Johnson

@_ddjohnson

Member of Technical Staff at @TransluceAI. Building tools to study neural nets and their behaviors. He/him.

San Francisco

Joined May 2010

Daniel Johnson
@_ddjohnson
Apr 19, 2024
Excited to share Penzai, a JAX research toolkit from @GoogleDeepMind for building, editing, and visualizing neural networks! Penzai makes it easy to see model internals and lets you inject custom logic anywhere. Check it out on GitHub: github.com/google-deepmin…
Daniel Johnson
@_ddjohnson
Oct 7, 2022
When can you expect to learn a good representation with contrastive learning? In recent work, we show that multiple existing techniques can produce provably *minimax-optimal* representations, based on a surprising connection to kernel methods. 🧵 arxiv.org/abs/2210.01883
Daniel Johnson
@_ddjohnson
Apr 15, 2021
Life update: Excited to say I'll be starting a PhD this fall at the University of Toronto / Vector Institute!
Daniel Johnson
@_ddjohnson
Mar 27, 2023
Why do language models hallucinate? Here, I argue that they are "uncertain simulators": they divide probability across possible outcomes instead of acting conservatively when uncertain. I also give five high-level strategies for avoiding this mismatch.
danieldjohnson.com
Uncertain Simulators Don't Always Simulate Uncertain Agents
I argue that hallucinations are a natural consequence of the language modeling objective, which focuses on simulating confident behavior even when that behavior is hard to predict, rather than predict
Daniel Johnson
@_ddjohnson
Feb 15, 2024
New paper: How can you tell when a model is hallucinating? Let it cheat! An expert doesn't need to cheat, so if your model learns to cheat, there must be something it doesn't know. Our general new approach for measuring uncertainty: arxiv.org/abs/2402.08733
Daniel Johnson
@_ddjohnson
Dec 2, 2024
Personal news: I've left Google DeepMind to work on tools for understanding AI systems at @TransluceAI! I'm excited to build open tech for understanding and anticipating new AI behaviors, and to figure out what questions we should ask to make sure they are safe to deploy.
Daniel Johnson
@_ddjohnson
Jul 10, 2020
We are excited to present the Graph Finite-State Automaton (GFSA) layer, which learns to add long-distance edges to graphs end-to-end based on a downstream objective! arxiv.org/abs/2007.04929 (With @numbercrunching and @hugo_larochelle. 1/9)
Daniel Johnson
@_ddjohnson
Apr 16, 2025
Pretty striking follow-up finding from our o3 investigations: in the chain of thought summary, o3 plans to tell the truth — but then it makes something up anyway!
Transluce
@TransluceAI
Apr 16, 2025
Replying to @TransluceAI
Interestingly, when o3 is asked for details about its laptop, the reasoning summary suggests the model knows it doesn’t have a real laptop, and intends to clarify to the user that it’s “just simulating this setup.” (2/)
Daniel Johnson
@_ddjohnson
Aug 7, 2024
By popular demand, the Treescope pretty-printer from the Penzai neural net library can now be installed separately, and supports both JAX and PyTorch! And that's not all: Penzai itself now has less boilerplate and includes more pretrained Transformer models!
Daniel Johnson
@_ddjohnson
Sep 26, 2020
Happy to announce that our paper "Learning Graph Structure With A Finite-State Automaton Layer" has been accepted to NeurIPS as a spotlight!
Daniel Johnson
@_ddjohnson
Jul 22, 2024
I'm at ICML this week, presenting our recent work on quantifying model uncertainty (arxiv.org/abs/2402.08733)! Come check out our poster on Wednesday July 24, from 1:30-3pm (Hall C #1005).
Daniel Johnson
@_ddjohnson
Apr 19, 2024
Replying to @_ddjohnson
Penzai integrates seamlessly with @GoogleColab and the JAX ecosystem. It represents models as legible, editable data structures, to help researchers understand and modify them after they are trained. Built with support from @DougalMaclaurin, @dtarlow2, and @hugo_larochelle!
Daniel Johnson
@_ddjohnson
Mar 3, 2023
LLM-based assistants can speed up software development, but what should they do when they aren't sure what code to write? We're excited to share R-U-SURE, a drop-in system for adding uncertainty annotations to code suggestions! Read our paper here: arxiv.org/abs/2303.00732
Daniel Johnson
@_ddjohnson
May 3, 2024
I'll be at ICLR in Vienna next week, demo-ing Penzai (Tues @ Google DeepMind booth) and presenting recent work on measuring model uncertainty (Sat @ R2-FM workshop)! Want to chat about what models know, how they work, or tools to help us understand them? Please reach out!