Kyle Corbitt (@corbtt) / X

2,940 posts

Currently building @OpenPipeAI (acquired by @CoreWeave). Formerly @ycombinator, @google.

Seattle

Joined September 2012

Kyle Corbitt
@corbtt
May 28, 2025
At a recent dinner I met a very senior engineer at one of the Big Four tech cos. His team develops tooling for a 0-engineer future. They're not allowed to tell anyone internally what they're working on to avoid mass panic. He figures mega layoffs start in 18 months.
1.2M
Kyle Corbitt
@corbtt
Mar 25, 2024
Spoke to a Microsoft engineer on the GPT-6 training cluster project. He kvetched about the pain they're having provisioning infiniband-class links between GPUs in different regions. Me: "why not just colocate the cluster in one region?" Him: "Oh yeah we tried that first. We
1.9M
Kyle Corbitt
@corbtt
Oct 23, 2024
Just launched agent.exe, a free, open-source Mac/Windows/Linux app that lets you use Claude 3.5 Sonnet to control your computer! This was a fun little project to explore the API and see what the model can do. Computer use is really cool—I expect 2025 will be the year of agents.
636K
Kyle Corbitt
@corbtt
Jul 23, 2024
Guys, fine-tuned Llama 3.1 8B is completely cracked. Just ran it through our fine-tuning test suite and it blows GPT-4o mini out of the water on every task. There has never been an open model this small, this good.
348K
Kyle Corbitt
@corbtt
Aug 6, 2025
Announcing MCP•RL: teach your model how to use any MCP server automatically using reinforcement learning! Just connect any MCP server, and your model will start playing with it and (using RL) "learn from experience" how to use its tools most effectively!
206K
Kyle Corbitt
@corbtt
May 8, 2025
Wow, this may be the most significant paper of 2025! A team at Tsinghua has figured out how to get an AI to generate its own training data, and surpassed the performance of models trained on expert human-curated data. We may not hit another data wall between here and ASI.
Andrew Zhao
@_AndrewZhao
May 7, 2025
❄️Introducing Absolute Zero Reasoner: Our reasoner learns to both propose tasks that maximize learnability and improve reasoning by solving them, entirely through self-play—with no external data! It overall outperforms other "zero" models in math & coding domains. 🧵 1/
196K
Kyle Corbitt
@corbtt
Sep 27, 2024
Recently overheard a Groq employee: apparently their per-token costs are 1-2 orders of magnitude higher than what they charge, and the new chip won't materially help. There's no credible plan to fix this. This is why they aren't raising rate limits. Very bearish.
272K
Kyle Corbitt
@corbtt
Dec 30, 2024
A few weeks ago, OpenAI announced Reinforcement Fine-Tuning (RFT)—a new way to adapt LLMs to complex tasks with very little training data. Here’s a quick rundown of how it works, why it’s a big deal, and when you should use it. 🧵
263K
Kyle Corbitt
@corbtt
Sep 2, 2025
🚨 We’ve just published a recipe to train a frontier-level deep research agent using RL. With just 30 hours on an H200, any developer can now beat Sonnet-4 on DeepResearch Bench using open-source tools. (Thread 🧵)
213K
Kyle Corbitt
@corbtt
Jul 11, 2025
Big news: we've figured out how to make a *universal* reward function that lets you apply RL to any agent with: - no labeled data - no hand-crafted reward functions - no human feedback! A 🧵 on RULER
180K
Kyle Corbitt
@corbtt
Jun 12, 2024
Crazy fact that everyone deploying LLMs should know: GPT-4 is "smarter" at temperature=1 than temperature=0, even on deterministic tasks. I honestly didn't believe this myself until I tried it, but it shows up clearly on our evals. h/t to @eugeneyan for the tip!
246K
Kyle Corbitt
@corbtt
Apr 29, 2025
🚀 Meet ART·E—our open-source RL-trained email research agent that searches your inbox and answers questions more accurately, faster, and cheaper than o3. Let's go deeper on how we built it. 🧵
159K
Kyle Corbitt
@corbtt
May 27, 2025
"RL from a single example works" "RL with random rewards works" "Base model pass@256 can match RL model pass@1" "RL updates a small % of params" Recent papers all point in the same direction: RL is mostly just eliciting latent behavior already learned in pretraining, not
104K
Kyle Corbitt
@corbtt
Mar 26, 2024
Replying to @Jessassin
my general understanding of the business model is "whoever builds agi first wins the whole game." you can agree with them or not, but openai really does believe they're playing for all the marbles here.
99K