Isha Puri
71 posts
Isha Puri
@ishapuri101
AI PhD-ing @MIT_CSAIL, LLM RL and uncertainty interning @GoogleDeepMind, APC @NeurIPSConf, prev @AbridgeHQ, @Harvard
NY / BOS / SF
ishapuri.github.io
Joined July 2017
707 Following · 1,555 Followers
  • Isha Puri
    @ishapuri101
    Aug 6, 2025
    It seems GPT‑OSS is very prone to hallucinations … check out our RLCR paper to see how we trained reasoning models to know what they don't know. Website 🌐 and code 💻 out today! rl-calibration.github.io 🚀
    33K
  • Isha Puri
    @ishapuri101
    Feb 6, 2025
    [1/x] can we scale small, open LMs to o1 level? Using classical probabilistic inference methods, YES! Joint @MIT_CSAIL / @RedHat AI Innovation Team work introduces a particle filtering approach to scaling inference w/o any training! check out …abilistic-inference-scaling.github.io
    45K
  • Isha Puri
    @ishapuri101
    Mar 4, 2025
    had a great time giving a talk about probabilistic inference scaling and the power of small models at the IBM Research ML Seminar Series - the best talks end with tons of questions, and it was great to see everyone so engaged : )
    15K
  • Isha Puri
    @ishapuri101
    Jul 23, 2025
    fun new paper training LLMs to analyze their own uncertainty and be more calibrated in their confidence!
    Mehul Damani
    @MehulDamani2
    Jul 23, 2025
    🚨New Paper!🚨 We trained reasoning LLMs to reason about what they don't know. o1-style reasoning training improves accuracy but produces overconfident models that hallucinate more. Meet RLCR: a simple RL method that trains LLMs to reason and reflect on their uncertainty --
    arxiv.org
    Beyond Binary Rewards: Training LMs to Reason About Their Uncertainty
    When language models (LMs) are trained via reinforcement learning (RL) to generate natural language "reasoning chains", their performance improves on a variety of difficult question answering...
    10K
  • Isha Puri
    @ishapuri101
    Sep 5, 2025
    Excited to see our Beyond Binary Rewards paper referenced in OpenAI’s latest work on hallucinations - it's important to incentivize models to express uncertainty, not just guess!
    Adam Tauman Kalai
    @adamfungi
    Sep 5, 2025
    New research explains why LLMs hallucinate, through a connection between supervised and self-supervised learning. We also describe a key obstacle that can be removed to reduce them. 🧵openai.com/index/why-lang…
    12K
  • Isha Puri
    @ishapuri101
    Mar 11, 2025
    🚀🚀thank you to @rohanpaul_ai for sharing! 🚀🚀 check out …abilistic-inference-scaling.github.io for more information on this cool inference scaling technique and how it can be leveraged to transform smaller, open models into more powerful reasoning agents! 🧠
    Rohan Paul
    @rohanpaul_ai
    Mar 10, 2025
    Qwen2.5-Math-7B-Instruct can scale to o1-level accuracy in only 32 rollouts. This paper's method has a 4–16x better scaling rate than its deterministic search counterparts. Current inference-time scaling often relies on imperfect reward models that cause “reward hacking.”
    4.1K
  • Isha Puri
    @ishapuri101
    Jun 4, 2025
    just moved to SF to join @AbridgeHQ, working on AI and product! thrilled to be here working with amazing people in the health tech space :) let me know if you’re in the area - would love to chat!
    2.1K
  • Isha Puri
    @ishapuri101
    Feb 6, 2025
    Replying to @ishapuri101
    [6/x] joint work with @variational_i @xukai92 @GX_NLP @shivsr98 of the red-hat-ai-innovation-team.github.io Check out our website at …abilistic-inference-scaling.github.io and the arXiv at arxiv.org/abs/2502.01618!
    AI Innovation Team
    From ai-innovation.team
    1.1K
  • Isha Puri
    @ishapuri101
    Jun 25, 2025
    super excited to be working with such awesome people towards such a genuine mission :))
    621
  • Isha Puri
    @ishapuri101
    Mar 27, 2025
    MIT NLP is now on twitter!! follow along!
    MIT NLP
    @nlp_mit
    Mar 27, 2025
    Hello everyone! We are quite a bit late to the twitter party, but welcome to the MIT NLP Group account! follow along for the latest research from our labs as we dive deep into language, learning, and logic 🤖📚🧠
    1.1K
  • Isha Puri
    @ishapuri101
    Feb 6, 2025
    our new work w awesome @RedHat collaborators on novel inference scaling techniques - check out bit.ly/3CHs1Zz for more on how to scale small LMs to o1 performance! 🚀🚀🚀
    Red Hat
    @RedHat
    Feb 6, 2025
    Can we use classical probabilistic inference methods to scale small LMs to o1 level? 🤔 @MIT_CSAIL and Red Hat AI Innovation teams explore: bit.ly/3CHs1Zz
    1.3K
  • Isha Puri
    @ishapuri101
    Jul 27, 2025
    check out our beyond binary rewards paper :))
    The AI Timeline
    @TheAITimeline
    Jul 27, 2025
    🚨This week's top AI/ML research papers: - GSPO - Diffusion Beats Autoregressive in Data-Constrained Settings - Gemini 2.5 Pro Capable of Winning Gold at IMO 2025 - Rubrics as Rewards - Deep Researcher with Test-Time Diffusion - Learning without training - Stabilizing Knowledge,
    1.9K
  • Isha Puri
    @ishapuri101
    Feb 6, 2025
    Replying to @ishapuri101
    [5/x] some plots:
    [Three plot images, not included]
    1.1K
  • Isha Puri
    @ishapuri101
    Feb 6, 2025
    Replying to @ishapuri101
    [4/x] Without ANY training, we are able to scale 1) a Llama 1B model to almost reach Llama 70B, 2) a Llama 8B model to reach GPT-4o, and 3) a Qwen 7B Math model to reach o1! Our method is elegant and simple, and could even have novel applications in downstream training!
    945
