My first PhD paper! 🎉 We train *diffusion* models for code generation that learn to directly *edit* the syntax trees of programs. The result is a system that can incrementally write code, see the execution output, and debug it. 🧵1/n
We develop an analogous version of “noise” for syntax trees inspired by the computer security literature on fuzzing🎲. And we teach our model to reverse this noise⏪. 2/n
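To make the "noise" idea concrete, here is a minimal sketch of fuzzing-style mutations on a toy syntax tree. Everything in it is an assumption for illustration: the tiny shape grammar (`LEAVES`, `OPS`) and the helper names (`random_tree`, `mutate`, `add_noise`) are hypothetical, not the paper's actual language or code. Each noise step picks a random node and swaps it for a small random subtree, so the program stays syntactically valid at every step.

```python
import random

# Hypothetical toy grammar (an assumption, not the paper's language):
# a tree is either a leaf (a shape name) or a tuple (op, left, right).
LEAVES = ["circle", "square", "triangle"]
OPS = ["union", "subtract"]


def random_tree(rng, depth):
    """Sample a small random tree with at most `depth` levels of operators."""
    if depth == 0 or rng.random() < 0.5:
        return rng.choice(LEAVES)
    return (rng.choice(OPS),
            random_tree(rng, depth - 1),
            random_tree(rng, depth - 1))


def paths(tree, prefix=()):
    """Return the path to every node; a path is a tuple of child indices."""
    result = [prefix]
    if isinstance(tree, tuple):
        for i in (1, 2):
            result.extend(paths(tree[i], prefix + (i,)))
    return result


def replace_at(tree, path, new_subtree):
    """Return a copy of `tree` with the node at `path` replaced."""
    if not path:
        return new_subtree
    children = list(tree)
    children[path[0]] = replace_at(tree[path[0]], path[1:], new_subtree)
    return tuple(children)


def mutate(tree, rng, max_depth=1):
    """One noise step: swap a random node for a small random subtree."""
    path = rng.choice(paths(tree))
    return replace_at(tree, path, random_tree(rng, max_depth))


def add_noise(tree, steps, seed=0):
    """Apply `steps` random mutations; more steps means a noisier program."""
    rng = random.Random(seed)
    for _ in range(steps):
        tree = mutate(tree, rng)
    return tree


clean = ("union", "circle", ("subtract", "square", "triangle"))
noisy = add_noise(clean, steps=3)
print(clean)
print(noisy)
```

In this sketch, a denoising model would be trained on pairs like (`noisy`, `clean`) to predict an edit that moves the noisy tree back toward the clean one.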
I had a lot of fun working on this. I didn't believe that a chess-playing neural net could learn to do look-ahead purely in its weights, so I was definitely the non-believer on this project.
♟️Do chess-playing neural nets rely purely on simple heuristics? Or do they implement algorithms involving *look-ahead* in a single forward pass?
We find clear evidence of 2-turn look-ahead in a chess-playing network, using techniques from mechanistic interpretability! 🧵
We show that our approach outperforms previous methods, including rejection sampling from a vision-language transformer trained specifically on these tasks (CSGNet in this figure). 6/n
These languages are small, and we only demonstrate the approach on a fairly narrow inverse-graphics task. In the future, we hope to show that it works more generally, including on languages with loops and variables. 8/n