Josh Albrecht
220 posts
Josh Albrecht
@joshalbrecht
Helping machines learn. CTO at Imbue (formerly Generally Intelligent)
joshalbrecht.com
Joined April 2009
66 Following
3,162 Followers
  • Pinned
    Josh Albrecht
    @joshalbrecht
    Apr 2
    mngr: programmatically manage 100s of claude code sessions in parallel 🤖 open source today. lets you do things like:
    - for each open GitHub issue, create a PR
    - for each flaky test in the past week, fix it
    - for each rule in style guide, scan codebase & fix all instances
    runs
  • Josh Albrecht
    @joshalbrecht
    Oct 20, 2022
    Exciting news: our simulator for training more general RL agents is now open source! It's one of the fastest 3D RL environments (~10,000 frames per second on a single GPU). We're excited for researchers to create more general, robust, & safe AI agents: generallyintelligent.com/launch/
    Kanjun 🐙
    @kanjun
    Oct 20, 2022
    Today, AI systems can create stunning art & beat humans at chess & Go. But they can't do things a 3-year-old can do. Why? We're launching @genintelligent & open sourcing our research environment, Avalon, so as a community we can answer these questions: generallyintelligent.com/launch/
  • Josh Albrecht
    @joshalbrecht
    Mar 10, 2023
    Jim Fan is one of my favorite people I met last year. He has a lot of really interesting ideas, and it's definitely worth checking out the conversation below if you're interested in RL, Minecraft, or considering a career in prompt engineering :)
    Imbue
    @imbue_ai
    Mar 9, 2023
    Learn about foundation models for embodied agents, why prompt engineering will become irrelevant, and much more with @DrJimFan from NVIDIA! Blog: generallyintelligent.com/podcast/2023-0… Video: youtu.be/17CeLwjwTVI
  • Josh Albrecht
    @joshalbrecht
    Nov 17, 2023
    This is exactly the kind of work I mean when I say "practical theory." Neural networks are not magical black boxes, and actually understanding what they're doing lets us waste a lot less time fiddling around. Very happy to be able to share this work by our very own Jamie Simon.
    Greg Yang
    @TheGregYang
    Nov 17, 2023
    1/ μP is optimal scaling rule of learning rate & init as network width → ∞. Been confused? 🆕μP = holding the "natural" (I'll explain) operator norm constant for every weight W & its updates ΔW: μP <=> ‖W‖_nat = Θ(1) = ‖ΔW‖_nat. 🆕Frobenius norm is the wrong norm to measure!
  • Josh Albrecht
    @joshalbrecht
    Dec 7, 2022
    Giving two talks about Avalon, our featured NeurIPS paper today:
    1. Discussion with MineDojo author @DrJimFan at NeurIPS, 10:45a PT: neurips.cc/virtual/2022/s…
    2. @ycombinator tech talk at 2p PT on "how we created one of the fastest RL simulators": ycombinator.com/events/ml-tech…
  • Josh Albrecht
    @joshalbrecht
    Sep 7, 2023
    Replying to @arram
    Awww, thanks @arram! I really appreciate the support, and the vote of confidence.
  • Josh Albrecht
    @joshalbrecht
    Jan 29, 2023
    Replying to @percyliang
    Two ways to guarantee LLMs don't train on test data:
    1. Retain a held-out set that is not published
    2. Create new test data each time
    Perhaps we should be moving towards the 2nd option anyway, since it enables adversarial testing, which better measures worst-case performance.
  • Josh Albrecht
    @joshalbrecht
    Aug 5, 2023
    Misha Belkin wrote a nice post pushing back on the narrative that "LLMs are too complicated to understand" generallyintelligent.com/perspectives/t…
  • Josh Albrecht
    @joshalbrecht
    May 31, 2021
    Financial instruments are the world doing computation about the future.
  • Josh Albrecht
    @joshalbrecht
    Jun 21, 2023
    Being able to hit "go", go to sleep, and wake up to having every hyperparameter perfectly tuned is so nice. Every point below is a fully trained network, but the whole process is only a few times slower than a single run (bc the optimizer is cost-aware and parallel) See the
    Imbue
    @imbue_ai
    Jun 21, 2023
    What if you could automatically discover both the scaling laws for every hyperparameter PLUS the best-performing network for every compute budget? Introducing CARBS, a cost-aware hyperparameter tuner that does this & more: 👇
  • Josh Albrecht
    @joshalbrecht
    Jun 25, 2024
    Today we're releasing:
    - Cleaned up (and extended) versions of 11 public NLP benchmarks
    - An open source method for automatically discovering scaling laws
    - A guide to bringing up a 4000 GPU cluster from bare metal
    - ...and more, see below!
    Imbue
    @imbue_ai
    Jun 25, 2024
    Early this year, we trained a 70B model optimized for reasoning and coding. This model roughly matches LLAMA 3 70B despite being trained on 7x less data. Today, we’re releasing a toolkit to help others do the same, including: • 11 sanitized and extended NLP reasoning
  • Josh Albrecht
    @joshalbrecht
    Mar 26, 2024
    If you're sad about not getting into NVDA years ago, go work with this team--they're great, and we're very excited to use their chips as soon as they're ready
    MatX
    @MatXComputing
    Mar 26, 2024
    Introducing MatX: we design hardware tailored for LLMs, to deliver an order of magnitude more computing power so AI labs can make their models an order of magnitude smarter. Our hardware would make it possible to train GPT-4 and run ChatGPT, but on the budget of a small startup.
  • Josh Albrecht
    @joshalbrecht
    Jun 26, 2024
    This was such a fun conversation with @swyx and @jefrankle. I'm really proud of the work our team did, and it was great to get a chance to share some of the details.
    Latent.Space
    @latentspacepod
    Jun 25, 2024
    🆕 State of the Art: Training >70B LLMs latent.space/p/llm-training… We are excited to deep dive into @imbue_ai's incredible new releases with @joshalbrecht (CTO of Imbue) AND with the best return GUEST COHOST possible for the job: @jefrankle (Chief AI Scientist of @DbrxMosaicAI
  • Josh Albrecht
    @joshalbrecht
    Oct 18, 2022
    This essay was one of the best things I've read this year.
    Kanjun 🐙
    @kanjun
    Oct 18, 2022
    In 2020, @michael_nielsen & I began a 2-month project to write: "how would we fund science?" 2 years & 40,000 words later, it's become: "how can the culture & institutions of science actually change, and ultimately become self-improving?" Our answer: scienceplusplus.org/metascience/in…
