Thomas Ahle (@thomasahle) / X

Thomas Ahle

@thomasahle

Head of AI @NormalComputing - building OpenClaw for EDA. Ex @Meta, @BARCdk, SupWiz, @OxfordQuantum. Tweets on Math, AI, #dspy, Probability, ML. Also tensors.

Copenhagen

thomasahle.com

Joined September 2010

Thomas Ahle
@thomasahle
Sep 8, 2024
In thermodynamics, heat flows from the hotter block to the cooler one until both blocks reach the same temperature. But what if you want to transfer all the heat from block A to block B, not just equalize temperatures? Just chop up the blocks like in this animation:
391K
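The chopping idea in the post above can be sketched numerically. This is only a toy model, and it assumes (the animation is not part of this text) that each block is cut into n equal pieces and that every piece of A touches the pieces of B one after another, each contact equalizing the two pieces. The name `transfer_fraction` is hypothetical.

```python
def transfer_fraction(n):
    """Fraction of A's heat that ends up in B when both blocks are cut into n pieces.

    Assumed model: equal heat capacities, A starts at temperature 1, B at 0,
    and two pieces in contact settle at their mean temperature.
    """
    a = [1.0] * n  # pieces of the hot block A
    b = [0.0] * n  # pieces of the cold block B
    for i in range(n):          # slide each piece of A past ...
        for j in range(n):      # ... every piece of B, in the same order
            a[i] = b[j] = (a[i] + b[j]) / 2
    return sum(b) / n           # B's final mean temperature


for n in (1, 2, 10, 100):
    print(n, transfer_fraction(n))
```

With n = 1 this is plain equalization (half the heat moves). Under these assumptions, finer chopping pushes the fraction toward 1, though slowly.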
Thomas Ahle
@thomasahle
Jun 19, 2024
Sam Altman: Ilya is leaving to work on a small project that's personally meaningful to him. Ilya: Single Shot Superintelligence
Ilya Sutskever
@ilyasut
Jun 19, 2024
I am starting a new company:
121K
Thomas Ahle
@thomasahle
May 6, 2025
DeepMind won the moment LLMs became about RL.
Oriol Vinyals
@OriolVinyalsML
May 6, 2025
Ahead of I/O, we’re releasing an updated Gemini 2.5 Pro! It’s now #1 on WebDevArena leaderboard, breaking the 1400 ELO barrier! 🥇 Our most advanced coding model yet, with stronger performance on code transformation & editing. Excited to build drastic agents on top of this!
108K
Thomas Ahle
@thomasahle
Jun 1, 2024
KANs (NNs with learned functions on the edges) have a quite elegant representation using Tensor Diagrams. This chart of MLP layers also shows some neat relationships between things like ReGLUs and MoEs.
112K
Thomas Ahle
@thomasahle
Nov 9, 2022
I've been laid off from Meta. Our entire research/infra org "Probability" was cut. I deeply appreciate the people who helped me get there, and the great people I worked with for a year and a half. I hope to stay in the Bay Area a while longer, if anyone needs some algorithms.
Thomas Ahle
@thomasahle
Jun 11, 2023
JAX has a different definition of the identity matrix than I'm used to...
269K
Thomas Ahle
@thomasahle
May 18, 2024
Doing Matrix Calculus can be messy, especially when we need higher-order derivatives. Writing them out using Tensor Diagrams makes even the Hessian Chain Rule relatively simple:
102K
Thomas Ahle
@thomasahle
May 25, 2024
I always found the tensor notation in Fast Matrix Multiplication algorithms confusing. But using tensor diagrams it's pretty easy to see what's going on:
91K
Thomas Ahle
@thomasahle
Oct 19, 2023
It is a common misconception that LLMs are just trained to "predict the next token". No. They are trained to predict an entire context window's worth of tokens, like 4k+. The gradients go end to end and the model is allowed to plan what it will say next.
229K
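The objective the post above describes can be sketched as one loss averaged over every position in the window, not just the last one. This is a minimal illustration, not any particular model's training code; `sequence_loss`, `logits` and `tokens` are hypothetical names, and the logits are assumed to come from a causal model, so `logits[t]` depends only on `tokens[:t + 1]`.

```python
import math


def sequence_loss(logits, tokens):
    """Mean cross-entropy over the whole window.

    logits[t] is a list of vocabulary scores produced after reading
    tokens[0..t]; it is scored against the following token, tokens[t + 1].
    """
    total = 0.0
    steps = len(tokens) - 1
    for t in range(steps):
        row = logits[t]
        m = max(row)  # subtract the max for a numerically stable log-sum-exp
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[tokens[t + 1]]  # -log p(next token | prefix)
    return total / steps


# Uniform scores over a 4-token vocabulary give a loss of log(4) at every step.
print(sequence_loss([[0.0] * 4] * 3, [0, 1, 2, 3]))
```

Because all positions contribute to the same scalar, a gradient step on it updates the shared parameters using every prediction in the window at once.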
Thomas Ahle
@thomasahle
Sep 9, 2024
Each interaction increases entropy, but only a tiny amount—a nearly thermodynamic perfect loop.
77K
Thomas Ahle
@thomasahle
Nov 4, 2024
The Matrix Cookbook contains the following formula, but it contains a flaw. Let's use Penrose's tensor diagrams to derive it from scratch, and correct the mistake!
180K
Thomas Ahle
@thomasahle
Jul 20, 2025
To all the people saying OpenAI's math proofs are in a "weird terse language": you might remember this by @karpathy
Andrej Karpathy
@karpathy
Sep 16, 2024
You can tell the RL is done properly when the models cease to speak English in their chain of thought
68K
Thomas Ahle
@thomasahle
Nov 20, 2023
Lovely table of ways to compute e^A, the exponential of a matrix, by @nhigham
76K
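The table itself is an image and is not reproduced here. As a rough illustration of one standard way to compute e^A, the sketch below uses scaling and squaring with a truncated Taylor series. It is not claimed to match any entry in the table; `expm`, `squarings` and `terms` are hypothetical names with arbitrarily chosen defaults.

```python
def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def expm(a, squarings=10, terms=12):
    """Approximate e^A as (e^(A / 2^s))^(2^s), with a Taylor series for the inner factor."""
    n = len(a)
    scale = 2.0 ** squarings
    small = [[v / scale for v in row] for row in a]  # A / 2^s has a small norm
    result = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        # term becomes small^k / k!
        term = [[v / k for v in row] for row in matmul(term, small)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):  # undo the scaling by repeated squaring
        result = matmul(result, result)
    return result


print(expm([[0.0, 1.0], [0.0, 0.0]]))  # nilpotent A, so e^A = I + A
```

For real work, a library routine such as `scipy.linalg.expm` is a better choice than a hand-rolled series.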
Thomas Ahle
@thomasahle
Sep 3, 2024
I love retro websites—so I tried to design the Tensor Cookbook (tensorcookbook.com) based on the original Matrix Cookbook aesthetics from 2006. I think it works pretty well down to the purple text, Verdana font, and everything.
53K