Prithviraj (Raj) Ammanabrolu
3,415 posts
Prithviraj (Raj) Ammanabrolu
@rajammanabrolu
Reinforcement Learning and Language. Assistant Prof @UCSanDiego. Research Scientist @Nvidia.
San Diego, CA
prithvirajva.com
Joined April 2019
680 Following
8,248 Followers

  • Pinned
    Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Nov 24, 2025
    My entire PEARLS Lab, and many NVIDIA colleagues, will be at #neurips2025 in SD to chat about our latest. Some papers in the conf are already kinda outdated so just reach out to @bosungkim17 for all things VLA, embodied AI, and long context memory arxiv.org/abs/2505.16928
    Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Nov 3, 2025
    I've done a few versions of this talk but this is the first that's been recorded publicly, thanks to @IVADO_Qc! A good overview of things my lab has been up to in the last year or so at least in balancing safety/capabilities esp re embodied human-AI colab
    19K
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Mar 4, 2025
    I taught a grad course on AI Agents at UCSD CSE this past quarter. All lecture slides, homeworks & course projects are now open sourced! I provide a grounding going from Classical Planning & Simulations -> RL Control -> LLMs and how to put it all together pearls-lab.github.io/ai-agents-cour…
    A screenshot of the class description
    155K
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Jan 26, 2025
    Simply, no. I've been looking at my old results from doing RL with "verifiable" rewards (math puzzle games, python code to pass unit tests) starting from 2019 with GPT-1/2 to 2024 with Qwen Math. Deepseek's success likely lies in the base models improving, the RL is constant
    Kevin Patrick Murphy
    @sirbayes
    Jan 26, 2025
    Is it feasible to do a true tabula rasa version of deepseek R1 zero, starting from an LLM with random weights, similar to alpha zero? Or is starting with an LLM which is pre trained on math required?
    113K
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Jun 7, 2023
    Soon™, I'll be an Asst Prof @UCSanDiego @ucsd_cse focusing on interactive & grounded AI, RL, NLP. I will also be a research scientist @MosaicML helping lead efforts to make tech like RLHF more accessible. Looking for PhD students & research eng/scientists to join me in ☀️SoCal🏖️
    A photo of a sea side cliff with water, part of a city and flowers
    A photo of me in front of water with two lines of pillars in the back
    A picture of a library building in the shape of Diamond with a lot of glass
    191K
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    May 15, 2025
    I like the Ultra Scale Playbook from @huggingface and give it to my MS/first year PhD students to read as a prereq huggingface.co/spaces/nanotro… Is there an "RLSys" version of this on scaling RL+LLM training? If not + there's OSS community interest, I'll prob write one?
    35K
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Sep 21, 2020
    Using GPT-3 instead of regex
    James Farmer
    @JamesFarmer87
    Sep 19, 2020
    Well this has made my day.
    Image
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Apr 25, 2021
    I haven't been home in years. I stay up at night thinking of all the people I'll never see again. I'd like to have a home to go back to. All I can do is donate/RT so I'm boosting #CovidIndia posts that can help. If this bothers you, pls mute/unfollow. Don't send me DMs like this
    DM screenshot. Them: this is an AI account right? Maybe don't tweet RT covid stuff so much you lose followers just saying. Me: feel free to unfollow
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Oct 5, 2022
    The secret to aligning LMs to human preferences is reinforcement learning. But Why&How is it used? Announcing
    💻RL4LMs: library to train any @huggingface LM w/ RL github.com/allenai/RL4LMs
    👾GRUE: benchmark of 6 NLP tasks+rewards
    📈NLPO: new RL alg 4 LMs
    🌐rl4lms.apps.allenai.org
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Jun 14, 2024
    ML Systems people need to be stopped. Half of these kernel fusions are not numerically stable 😭 Yes it makes GPU go brr but it also breaks policy gradient theorem and makes me question my life decisions every day
    109K
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Nov 13, 2023
    The PEARLS Lab at @ucsd_cse is now open for business! I'm recruiting Fall 24 PhD students in all things interactive and grounded AI, RL, and NLP!! Join us in the land of 🏖️ beach (🧋pearl tea included). Apply by Dec 20. Please help spread the word! More: pearls.ucsd.edu
    A blue colored logo for the PEARLS Lab with a human and a robot hand collaborating to draw out a model of the world
    184K
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Oct 31, 2024
    The year is 2027, NeurIPS tickets are now sold on Ticketmaster and the black market for thousands. Only 5 companies and friends can get in. It's easier to get tickets to the Taylor Swift concert next door so you can sneak into the poster halls
    NeurIPS Conference
    @NeurIPSConf
    Oct 29, 2024
    Due to a high demand for registrations, NeurIPS will be moving towards a randomized lottery system, effective immediately. Authors of accepted conference and workshop papers are still guaranteed registration, but this may change as we release spots to the lottery, so we urge
    20K
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Dec 19, 2021
    If it doesn't work with seed 42, it'll never work.
    This Post is from a suspended account.
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Jul 24, 2020
    I have a language modeling joke, but it's too dangerous to be released.
    Ida Momennejad
    @criticalneuro
    Jul 24, 2020
    I have a reinforcement learning joke, but not sure it's rewarding.
  • Prithviraj (Raj) Ammanabrolu
    @rajammanabrolu
    Mar 17, 2022
    Why do ML academics have such knee jerk reactions to writing rules or engines to ground and control an ML system? "It won't work in the real world" is such an unsubstantiated argument. Have you ever actually put an ML system in production?? How do you think those work???