Rishabh Agarwal
Periodic Labs
1,609 posts
@agarwl_
Reinforcement Learner @periodiclabs, Adjunct Prof at McGill. Ex Meta, DeepMind, Brain, @iitbombay. NeurIPS Best Paper, On-Policy Distillation
agarwl.github.io
Joined May 2016
856
Following
22.3K
Followers
  • Pinned
    Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Aug 25, 2025
    This is my last week at @AIatMeta. It was a tough decision not to continue with the new Superintelligence TBD lab, especially given the talent and compute density. But after 7.5 years across Google Brain, DeepMind, and Meta, I felt the pull to take on a different kind of risk.
    463K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Feb 7, 2025
    I recently gave a tutorial on knowledge distillation for LLMs, explaining the mathematical derivations behind the commonly used methods. Sharing the slides here given the recent interest in this topic. drive.google.com/file/d/1xMohjQ…
    202K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Sep 17, 2024
    I gave my first guest lecture today in a grad course on LLMs as a (soon-to-be) adjunct prof at McGill. Putting the slides here, maybe useful to some folks ;) drive.google.com/file/d/1komQ7s…
    96K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Oct 23, 2025
    Yuandong is well-respected within Meta, detail-oriented, and technically sharp -- this layoff doesn't make sense and my hunch is that it might be targeted towards ex-GenAI people. Meta's loss, but could be your win if you're hiring frontier RL researchers ;)
    Yuandong Tian
    @tydsh
    Oct 23, 2025
    Several of my team members + myself are impacted by this layoff today. Welcome to connect :)
    160K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Oct 16, 2025
    *checks chatgpt* This paper costs ~4.2 million USD (400K GB200 hours) -- science! Our most expensive run was 100K GPU hours (the same amount as DeepSeek-R1-Zero, but on GB200s). One finding here was that once we have a scalable RL algorithm, RL compute scaling becomes predictable
    Devvrit
    @Devvrit_Khatri
    Oct 16, 2025
    Wish to build scaling laws for RL but not sure how to scale? Or what scales? Or would RL even scale predictably? We introduce: The Art of Scaling Reinforcement Learning Compute for LLMs
    235K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Apr 7, 2025
    Joined the Llama team @AIatMeta today to work on RL and reasoning
    AI at Meta
    Meta
    @AIatMeta
    Apr 5, 2025
    Today is the start of a new era of natively multimodal AI innovation. Today, we’re introducing the first Llama 4 models: Llama 4 Scout and Llama 4 Maverick — our most advanced models yet and the best in their class for multimodality. Llama 4 Scout • 17B-active-parameter model
    64K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Apr 3, 2025
    After nearly 7 years at Google -- today's officially my last day at Google DeepMind (after a one-month notice). This is what I sent to my co-workers: I joined Google Brain as an AI resident for a short stint but liked it here so much that I ended up staying until now! During
    50K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Sep 30, 2025
    I asked Liam: Why name our startup Periodic Labs? He said think about *periods* in time. And then it hit me: we define entire periods of history by their critical materials: Copper Age → Bronze Age → Iron Age → Silicon Age. The name says it all about our mission: to discover
    Liam Fedus
    Periodic Labs
    @LiamFedus
    Sep 30, 2025
    Today, @ekindogus and I are excited to introduce @periodiclabs. Our goal is to create an AI scientist. Science works by conjecturing how the world might be, running experiments, and learning from the results. Intelligence is necessary, but not sufficient. New knowledge is
    79K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Oct 27, 2025
    Very nice blog post from Thinky (@_kevinlu et al) about on-policy distillation for LLMs -- we published this idea back in 2023 and it is *publicly* known to be successfully applied to Gemma 2 & 3, and Qwen3-Thinking (and probably many closed frontier models)! The idea behind
    Thinking Machines
    @thinkymachines
    Oct 27, 2025
    Our latest post explores on-policy distillation, a training approach that unites the error-correcting relevance of RL with the reward density of SFT. When training it for math reasoning and as an internal chat assistant, we find that on-policy distillation can outperform other
    61K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Nov 6, 2025
    Don't sleep on PipelineRL -- this is one of the biggest jumps in compute efficiency of RL setups that we found in the ScaleRL paper (also validated by Magistral & others before)! What's the problem PipelineRL solves? In RL for LLMs, we need to send weight updates from trainer to
    Alexandre
    Periodic Labs
    @alexpiche_
    Nov 4, 2025
    In-flight weight updates have gone from a “weird trick” to a must to train LLMs with RL in the last few weeks. If you want to understand the on-policy and throughput benefits here’s the CoLM talk @DBahdanau and I gave: youtu.be/Z1uEuRKACRs
    132K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    May 16, 2025
    All you often need is just one lucky break. For me, it was @geoffreyhinton who took a bet on me about 7 years ago. He said something along the following lines that stuck with me: “You have tried a bunch of interesting research directions, and all of them failed — that’s what
    Nathan Lambert
    @natolambert
    May 14, 2025
    My path into AI: the sort of small wins that accumulate into a real career in AI. When I started grad school, AI profs didn't have space for me in their group, and when I ended I had no papers at NeurIPS/ICLR/ICML, yet the process can still work. interconnects.ai/p/my-path-into…
    49K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Oct 5, 2024
    Really promising results we got recently: Generative CoT Verifiers trained on only grade-school math problems in GSM8K generalize quite well to much harder *high-school competition* problems in MATH!
    64K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Oct 16, 2023
    Our team will likely be hiring student researchers at Google DeepMind. Please fill out the interest form if you would like to work with me. This role would start Jan / Feb 2024 and would be in-person in Montreal with 80-100% time at GDM.
    docs.google.com
    Student Researcher interest form
    This is for gathering interest for a student researcher I plan to host in the GDM Montreal office in 2024 (this has to be in-person). About me: You can see my webpage and recent papers at https://agarw...
    126K
  • Rishabh Agarwal
    Periodic Labs
    @agarwl_
    Jan 26, 2022
    The field of ML has seen massive growth and it is becoming apparent it may be in need of self-reflection to ensure that efforts are directed towards real progress. To this end, we are organizing an @iclr_conf workshop on "ML Evaluation Standards". ml-eval.github.io [1/N]
