Yann Dubois
604 posts
Yann Dubois
@yanndubs
Posttraining @OpenAI | PhD @StanfordAILab
San Francisco
yanndubs.github.io
Joined August 2017
1,320 Following · 14.8K Followers

  • Pinned
    Yann Dubois
    @yanndubs
    Mar 5
    🔥Two things I'm esp. excited about in 5.4: 1. Unification: we merged our codex & mainline models. 2. Efficiency: we brought the efficiency of 5.3-codex to CUA & knowledge work. We only showed 3 such plots in the blog, but many of our evals required less time (tokens/tools) than 5.2.
    50K
  • Yann Dubois
    @yanndubs
    Aug 12, 2025
    I saw a lot of people complaining about 32k context size in ChatGPT for plus users, which would be terrible for coding. But actually we are giving 196k context size for plus users when using GPT5 thinking and that’s the model you should use for coding use-cases! 32k is for the
    Mark Kretschmann
    @mark_k
    Aug 11, 2025
    GPT-5 with a 32K context: Why it causes problems for @OpenAI users! A 32K token window for GPT-5 sounds generous until you try to do real work with it. In multi turn chats and coding sessions, that budget melts fast. Every message carries overhead you never see, including system
    887K
  • Yann Dubois
    @yanndubs
    Jun 22, 2021
    Most data is processed by algorithms, but compressors (eg JPEG) are for human eyes. 🤓Our fix: formalize lossy compression that ensures perfect downstream predictions 🔥1000x gains vs JPEG on ImageNet🔥 arxiv.org/abs/2106.10800 w. Ben Bloem-Reddy @karen_ullrich @cjmaddison 1/9
  • Yann Dubois
    @yanndubs
    Jun 8, 2023
    Developing chat LLMs is hard without an automated way to measure improvements 🔥It just became easier with AlpacaEval🔥 An automated evaluation pipeline that’s - easy to use - fast - cheap - validated w/ 20K human annotations 🥇leaderboard: tatsu-lab.github.io/alpaca_eval/ 🧵
    167K
  • Yann Dubois
    @yanndubs
    Aug 10, 2025
    We significantly increased the rate limits for the reasoning model by popular demand. If correctness is really important for you, ask the model to “think deeper” or select “gpt5 thinking” in the model picker; this uses a higher reasoning effort than when you are auto switched to
    Sam Altman
    OpenAI
    @sama
    Aug 10, 2025
    Replying to @techikansh
    trying 3000 per week now!
    92K
  • Yann Dubois
    @yanndubs
    Mar 13, 2023
    🦙Excited to share this demo of Alpaca 🔥Highlights: ~GPT3.5 performance for <$600🔥 The goal was to have a simple model/training procedure that academics could study and improve with limited resources. We achieved that by finetuning a 7B LLaMA on 52K generated instructions
    215K
  • Yann Dubois
    @yanndubs
    Aug 12, 2025
    Now that we can vibe code with GPT5 thinking, there's no way we will mess up the GPT6 plot!
    59K
  • Yann Dubois
    @yanndubs
    Apr 17, 2025
    Our model is also pretty good at doing useless but fun stuff! @OpenAI
    OpenAI
    @OpenAI
    Apr 16, 2025
    Introducing OpenAI o3 and o4-mini—our smartest and most capable models to date. For the first time, our reasoning models can agentically use and combine every tool within ChatGPT, including web search, Python, image analysis, file interpretation, and image generation.
    58K
  • Yann Dubois
    @yanndubs
    Jul 27, 2023
    We have now also evaluated @metaai's LLaMA 2 70B model! 🔥 Exciting times ahead! We also updated ChatGPT, which seems to have improved over the last few months. Thanks for providing the compute/API endpoint @a16z @replicatehq @appenz @rajko_rad @Mascobot
    Yann Dubois
    @yanndubs
    Jul 19, 2023
    We evaluated LLaMA-2 Chat! It seems to be of similar quality to the latest Vicunas. Excited to see how much the community will be able to improve it using LLaMA-2 base and their fine-tuning pipelines! @WizardLM_AI @lmsysorg @huggingface 🚀 github.com/tatsu-lab/alpa…
    142K
  • Yann Dubois
    @yanndubs
    Aug 24, 2025
    You can thank @ericmitchellai for caring so much about that
    Kol Tregaskes
    @koltregaskes
    Aug 18, 2025
    GPT-5 says 'I don't know'. Love this, thank you.
    68K
  • Yann Dubois
    @yanndubs
    Aug 12, 2025
    Clearly, our GPT-5 UX for pro users is less than ideal… But thankfully, you can ask GPT-5 to vibe code it for you! Send us any design, and we’d love to consider it as we work on improving the UX! cc @ericmitchellai @max_a_schwarzer @sama
    95K
  • Yann Dubois
    @yanndubs
    Nov 17, 2025
    Our goal with "gpt5.1 thinking" was making thinking models usable as *daily drivers* for productive usecases. That's why we focused on improving the model's efficiency with its thinking (~60% less thinking on easy prod queries) while retaining accuracy! If you use ChatGPT for
    42K
  • Yann Dubois
    @yanndubs
    Jul 18, 2023
    I looked a little into the Gzip OOD results and there seems to be another big problem: train-test overlap. E.g. DengueFilipino has the same train and test set. KirundiNews has 90% overlap... Still nice to see people revisit old ideas and the use of information theory for ML :)
    Lucas Beyer (bl16)
    @giffmana
    Jul 18, 2023
    Looks like the gzip paper I was enthusiastic about over-estimated its scores because of a bug in the code: it did top-2 knn instead of k=2. We should remember this as (yet another) strong case for testing in ML code. I still like that it put a new idea in my toolbox.
    208K
  • Yann Dubois
    @yanndubs
    Dec 6, 2021
    #NeurIPS2021 Spotlight: Lossy Compression for Lossless Prediction We formalize compression for machine learning rather than human perception - 💥1000x💥 compression gains compared to JPEG - prove minimal bit-rate for given downstream perf. join us @ poster session 2 tomorrow
    Yann Dubois
    @yanndubs
    Jun 22, 2021
    Most data is processed by algorithms, but compressors (eg JPEG) are for human eyes. 🤓Our fix: formalize lossy compression that ensures perfect downstream predictions 🔥1000x gains vs JPEG on ImageNet🔥 arxiv.org/abs/2106.10800 w. Ben Bloem-Reddy @karen_ullrich @cjmaddison 1/9