Tom Zahavy
572 posts
Tom Zahavy
@TZahavy
Building creative agents @GoogleDeepMind. AlphaProof, AlphaZero_db, PuzzleGen, Convex RL, meta gradients. Staff research scientist, discovery team
London, England
tomzahavy.com
Joined December 2018
438 Following
3,593 Followers
  • Pinned
    Tom Zahavy
    @TZahavy
    Oct 29, 2025
    I am excited to share work we did in the Discovery team at @GoogleDeepMind using RL and generative models to discover creative chess puzzles 🔊♟️♟️ #neurips2025 🎨 While strong chess players intuitively recognize the beauty of a position, articulating the precise elements that
    398K
  • Tom Zahavy
    @TZahavy
    Apr 10, 2025
    I am looking to hire a student researcher to work with AlphaProof on a project at the intersection of AI, math, computation, and creativity. Background in AI for math, and/or Lean is desired. If interested, please get in touch. The position will be based in London.
    55K
  • Tom Zahavy
    @TZahavy
    Aug 21, 2023
    I'm super excited to share AlphaZeroᵈᵇ, a team of diverse #AlphaZero agents that collaborate to solve #Chess puzzles and demonstrate increased creativity. Check out our paper to learn more! arxiv.org/abs/2308.09175 A quick 🧵(1/n)
    95K
  • Tom Zahavy
    @TZahavy
    Nov 12, 2025
    Excited to announce our recent @GoogleDeepMind paper, AlphaProof, out in @Nature today! It has been over a year since AlphaProof achieved a silver-medal standard in solving International Mathematical Olympiad (IMO) problems, by teaching itself mathematics in Lean (@leanprover).
    44K
  • Tom Zahavy
    @TZahavy
    Apr 7, 2024
    We are looking for brilliant and creative candidates with strong programming skills to join us at the Discovery team at @GoogleDeepMind 🧙 We build AI agents that discover new knowledge using RL, planning and LLMs. DM me if you have specific questions about working with us 🙏
    53K
  • Tom Zahavy
    @TZahavy
    Jul 29, 2024
    We are looking for brilliant and creative candidates with strong programming skills to join us at the Discovery team at @GoogleDeepMind 🧙 We are building AI agents that create new knowledge using RL, planning and LLMs in domains like Mathematics, chess and more. Please apply
    57K
  • Tom Zahavy
    @TZahavy
    Dec 3, 2021
    In our #Neurips2021 spotlight, we study RL problems where the goal is to minimize a cost over the state occupancy. When this cost is linear, we get the standard RL problem. When it is non-linear, we get apprenticeship learning, pure exploration, diversity and more. [1/7]
  • Tom Zahavy
    @TZahavy
    May 27, 2022
    Excited to share DOMiNO, a method to discover qualitatively diverse policies using a single latent-conditioned architecture and the "reward is enough" principle. Read more about it here: arxiv.org/pdf/2205.13521… DOMiNO's🍕 in Walker walk:
  • Tom Zahavy
    @TZahavy
    Apr 22, 2022
    Super excited to share that our Bootstrapped Meta Learning paper, led by @flennerhag, received an Outstanding Paper Award from #iclr2022. Better meta learning -> doubled the performance of STACX in Atari to a new SOTA. Come talk with us at the poster session! blog.iclr.cc/2022/04/20/ann…
    Sebastian Flennerhag
    @flennerhag
    Sep 13, 2021
    What should a meta-learner optimize? What if we make it chase its own future outputs? Turns out, it can improve meta-optimization, set new SOTAs, and lead to new types of meta-learning. arxiv.org/pdf/2109.04504… w. Y. Schroecker, @tomzhavy, @hado, D. Silver, S. Singh. 🧵👇
  • Tom Zahavy
    @TZahavy
    Nov 13, 2025
    We are hiring students in the discovery team. If you are interested in creativity and RL, consider applying ❤️
    Alex Havrilla
    @Dahoas1
    Nov 13, 2025
    📣 Hiring Alert: Student Researcher - 2026 @vivek_veeriah and I are looking for a PhD Student Researcher to join the GDM Discovery team in London 🇬🇧! We will be investigating how creativity in LLMs generalizes, with application to scientific discovery 🔭 Apply below! ⬇️
    13K
  • Tom Zahavy
    @TZahavy
    Jul 25, 2024
    Very excited to share AlphaProof, an agent that taught itself Mathematics in Lean and achieved a silver-medal standard in the International Math Olympiad 🥈🥈🥈🥈 @leanprover is a functional programming language for formal Mathematics and a theorem prover. It enables you to
    Google DeepMind
    @GoogleDeepMind
    Jul 25, 2024
    We’re presenting the first AI to solve International Mathematical Olympiad problems at a silver medalist level.🥈 It combines AlphaProof, a new breakthrough model for formal reasoning, and AlphaGeometry 2, an improved version of our previous system. 🧵 dpmd.ai/imo-silver
    13K
  • Tom Zahavy
    @TZahavy
    May 8, 2021
    A rejection story with a happy ending. A paper from my #Phd was accepted to #ICML2021 after 4-5 rejections (I lost count honestly). Each time we had reviewers who liked it and some who didn’t. Believing in it and continuing to improve it over time eventually got it in. Don’t lose hope!
  • Tom Zahavy
    @TZahavy
    Oct 29, 2025
    Replying to @TZahavy
    Read more about it: ♟️ @chesscom blogpost: chess.com/news/view/ai-l… 💻Booklet & Review: arxiv.org/abs/2510.23772 📃Paper: arxiv.org/abs/2510.23881
    DeepMind's AI Learns To Create Original Chess Puzzles, Praised By GMs
    From chess.com
    11K
  • Tom Zahavy
    @TZahavy
    Sep 15, 2022
    Late on arXiv (oral @CoLLAs_Conf): @jelennal_, who did a fantastic internship with us at the Discovery team @DeepMind, studies how adding context to meta gradients can help agents adapt when the environment changes. Thanks for sharing @_akhaliq
    AK
    @_akhaliq
    Sep 14, 2022
    Meta-Gradients in Non-Stationary Environments abs: arxiv.org/abs/2209.06159
