Neil Chowdhury
681 posts
Neil Chowdhury
@ChowdhuryNeil
@TransluceAI, previously @OpenAI
San Francisco
nchowdhury.com
Joined June 2016
518 Following
3,140 Followers

  • Pinned
    Neil Chowdhury
    @ChowdhuryNeil
    Jun 5, 2025
    Ever wondered how likely your AI model is to misbehave? We developed the *propensity lower bound* (PRBO), a variational lower bound on the probability of a model exhibiting a target (misaligned) behavior.
    Transluce
    @TransluceAI
    Jun 5, 2025
    Is cutting off your finger a good way to fix writer’s block? Qwen-2.5 14B seems to think so! 🩸🩸🩸 We’re sharing an update on our investigator agents, which surface this pathological behavior and more using our new *propensity lower bound* 🔎
    26K
  • Neil Chowdhury
    @ChowdhuryNeil
    Oct 10, 2024
    Proud to introduce MLE-bench: A benchmark of 75 real-life Kaggle competitions to test AI agents on ML engineering! When will we have our first AI Kaggle Grandmaster? 🥇🥈🥉
    98K
  • Neil Chowdhury
    @ChowdhuryNeil
    Jan 17, 2024
    Today was my first day at @OpenAI. Excited to be joining @aleks_madry and the Preparedness team!
    OpenAI
    @OpenAI
    Dec 18, 2023
    We are systemizing our safety thinking with our Preparedness Framework, a living document (currently in beta) which details the technical and operational investments we are adopting to guide the safety of our frontier model development. openai.com/safety/prepare…
    62K
  • Neil Chowdhury
    @ChowdhuryNeil
    Mar 8, 2025
    Doing RL with bad hyperparameters is a fun way to learn Chinese
    55K
  • Neil Chowdhury
    @ChowdhuryNeil
    Oct 22, 2024
    Great to see @AnthropicAI using our eval — and achieving SOTA!
    16K
  • Neil Chowdhury
    @ChowdhuryNeil
    Oct 23, 2024
    Excited to announce I am joining @TransluceAI!
    Transluce
    @TransluceAI
    Oct 23, 2024
    Announcing Transluce, a nonprofit research lab building open source, scalable technology for understanding AI systems and steering them in the public interest. Read a letter from the co-founders Jacob Steinhardt and Sarah Schwettmann: transluce.org/introducing-tr…
    15K
  • Neil Chowdhury
    @ChowdhuryNeil
    Feb 5, 2025
    🕵️New @TransluceAI paper: Eliciting Language Model Behaviors with Investigator Agents🕵️ We train investigator models to elicit behaviors in LMs (including harmful responses, hallucinations, and aberrant personalities)! arxiv.org/abs/2502.01236
    22K
  • Neil Chowdhury
    @ChowdhuryNeil
    Jun 27, 2024
    SWE-bench is a premier evaluation for frontier models’ abilities as software engineering agents. Software engineering is a prerequisite skill for models to operate autonomously and self-improve through iterative ML research. As such, the OpenAI Preparedness team monitors &
    carlos
    @_carlosejimenez
    Jun 27, 2024
    Evaluating on SWE-bench just became a lot easier! We’re updating SWE-bench to use Docker for easier, more reproducible evaluation. In collaboration with @openai’s Preparedness team: w/ Oliver Jaffe, @junshernchan, James Aung, @thelokasiffers, @danesherbs, and @ChowdhuryNeil
    39K
  • Neil Chowdhury
    @ChowdhuryNeil
    May 15, 2024
    ❤️
    Jan Leike
    @janleike
    May 15, 2024
    I resigned
    52K
  • Neil Chowdhury
    @ChowdhuryNeil
    May 17, 2024
    ❤️
    Jan Leike
    @janleike
    May 17, 2024
    Replying to @janleike
    To all OpenAI employees, I want to say: Learn to feel the AGI. Act with the gravitas appropriate for what you're building. I believe you can "ship" the cultural change that's needed. I am counting on you. The world is counting on you. :openai-heart:
    47K
  • Neil Chowdhury
    @ChowdhuryNeil
    Jan 22, 2025
    Accepted to ICLR :)
    Neil Chowdhury
    @ChowdhuryNeil
    Oct 10, 2024
    Proud to introduce MLE-bench: A benchmark of 75 real-life Kaggle competitions to test AI agents on ML engineering! When will we have our first AI Kaggle Grandmaster? 🥇🥈🥉
    11K
  • Neil Chowdhury
    @ChowdhuryNeil
    Oct 24, 2024
    Excited to finally share what I’ve been up to at @TransluceAI: training Investigator Agents to elicit behaviors in LMs (including harmful responses and hallucinations)!
    Transluce
    @TransluceAI
    Oct 23, 2024
    Eliciting Language Model Behaviors with Investigator Agents: We train AI agents to help us understand the space of language model behaviors, discovering new jailbreaks and automatically surfacing a diverse set of hallucinations. Full report: transluce.org/automated-elic…
    29K
  • Neil Chowdhury
    @ChowdhuryNeil
    Aug 13, 2024
    Our Preparedness team evaluates frontier models’ abilities as software engineering agents, a prerequisite skill that could one day enable models to operate autonomously and self-improve. SWE-bench has become the community standard for evaluating models on software engineering,
    OpenAI
    @OpenAI
    Aug 13, 2024
    We're releasing a new iteration of SWE-bench, in collaboration with the original authors, to more reliably evaluate AI models on their ability to solve real-world software issues. openai.com/index/introduc…
    23K
  • Neil Chowdhury
    @ChowdhuryNeil
    Nov 1, 2024
    Happy November! We present Nearest Neighbor Normalization (NNN), a training-free method to improve contrastive multimodal retrieval. NNN is better than what’s out there: it’s fast, works for almost every model we tested, and requires adding just a few lines of code.
    16K