David Dohan
571 posts
David Dohan
@dmdohan
reducing perplexity @openai | past: probabilistic programs, proteins, science & reasoning @ google brain 🧠
ddohan.com
Joined August 2011
1,620
Following
12K
Followers

  • Pinned
    David Dohan
    @dmdohan
    Jul 22, 2022
    Happy to release our work on Language Model Cascades. Read on to learn how we can unify existing methods for interacting models (scratchpad/chain of thought, verifiers, tool-use, …) in the language of probabilistic programming. paper: arxiv.org/abs/2207.10342
  • David Dohan
    @dmdohan
    Nov 6, 2022
    “99% of Americans don’t talk about AI at parties. You can too if you try!”
    A poster that says “Not AI room” with a drawing of a crossed out robot, and suggestions for alternate topics.
  • David Dohan
    @dmdohan
    Mar 1, 2023
    New chapter: Happy to share that I recently joined @OpenAI! Thankful for many collaborators, friends, and mentors who made my 6 years of research @Google Brain special🧠 Excited to collaborate toward reliable reasoning & alignment in AI systems and products like #ChatGPT
  • David Dohan
    @dmdohan
    Dec 20, 2024
    o3 @ 87.5% on ARC-AGI. It was 16 hours at an increase rate of 3.5% an hour to "solved"
    David Dohan
    @dmdohan
    Dec 20, 2024
    At this rate, how long til ARC-AGI is “solved”? For context:
    - gpt-4o @ 5%
    - Sonnet3.5 @ 14%
    - o1-preview @ 18%
    - o1 @ 32%
    - best scaffolded solution @ 54%
  • David Dohan
    @dmdohan
    Dec 20, 2024
    imo the improvements on FrontierMath are even more impressive than ARC-AGI. Jump from 2% to 25%. Terence Tao said the dataset should "resist AIs for several years at least" and "These are extremely challenging. I think that in the near term basically the only way to solve them,
    Nat McAleese
    @__nmca__
    Dec 20, 2024
    Replying to @__nmca__
    Well, on FrontierMath 2024-11-26 o3 improves the state of the art from 2% to 25% accuracy. These are absurdly hard strongly held out math questions. And on ARC, the semi-private test set and public validation set scores are 87.5% (private) and 91.5% (public). (7/n)
  • David Dohan
    @dmdohan
    Nov 20, 2023
    🩶🫶 Ilya and Sam’s yin/yang was a major reason I joined OpenAI. It is still possible to repair what was shattered.
    Ilya Sutskever
    @ilyasut
    Nov 20, 2023
    I deeply regret my participation in the board's actions. I never intended to harm OpenAI. I love everything we've built together and I will do everything I can to reunite the company.
  • David Dohan
    @dmdohan
    Sep 12, 2024
    Replying to @dmdohan
    It's important to emphasize that this is a huge leap /and/ we're still at the start. Give o1-preview a try, we think you'll like it. And in a month, give o1 a try and see all the ways it has improved in such a short time. And expect that to keep happening
  • David Dohan
    @dmdohan
    Nov 20, 2023
    OpenAI is nothing without its people
  • David Dohan
    @dmdohan
    Dec 20, 2024
    At this rate, how long til ARC-AGI is “solved”? For context:
    - gpt-4o @ 5%
    - Sonnet3.5 @ 14%
    - o1-preview @ 18%
    - o1 @ 32%
    - best scaffolded solution @ 54%
    ARC Prize
    @arcprize
    Dec 19, 2024
    Verified o1 performance on ARC-AGI's Semi-Private Eval (100 tasks):
    - o1, Low: 25% ($1.5/task)
    - o1, Medium: 31% ($2.5/task)
    - o1, High: 32% ($3.8/task)
  • David Dohan
    @dmdohan
    Sep 12, 2024
    🍓is ripe and is ready to think, fast and slow: check out OpenAI o1, trained to reason before answering. I joined OpenAI to push boundaries of science & reasoning with AI. Happy to share this result of team's amazing collaboration does just that. Try it on your hardest problems
    OpenAI
    @OpenAI
    Sep 12, 2024
    We're releasing a preview of OpenAI o1—a new series of AI models designed to spend more time thinking before they respond. These models can reason through complex tasks and solve harder problems than previous models in science, coding, and math. openai.com/index/introduc…
  • David Dohan
    @dmdohan
    Nov 19, 2023
    🩶🫶
    Sam Altman
    OpenAI
    @sama
    Nov 19, 2023
    i love the openai team so much
  • David Dohan
    @dmdohan
    Nov 24, 2023
    language models are superhuman at predicting the next word. try this yourself to see how hard it is: rr-lm-game.herokuapp.com
    Jason Wei
    @_jasonwei
    Nov 24, 2023
    Like the International Math Olympiad or Spelling Bee, there should be a “language modeling competition” where humans compete to predict the next word in a sequence. The best humans would probably still lose to GPT-2, and we’d have more empathy for how hard it is to be an LLM :)
  • David Dohan
    @dmdohan
    Feb 17, 2023
    LM performance typically gets worse given irrelevant info. Simple prompting improves it: "Feel free to ignore irrelevant information given in the questions." Work led by @fredahshi, with @xinyun_chen_, @kanishkamisra, @nkscales_google, @edchi, Nathanael Schärli, & @denny_zhou
  • David Dohan
    @dmdohan
    Dec 20, 2024
    We are used to the cadence of big model releases: GPT2->3->4 took two years each time. We’re in a different world now. o1 was announced months ago, now already on next generation. Expect faster improvement going forward: o1 is like gpt2 if we could jump to gpt4 ~immediately