Kevin Lu (@_kevinlu) / X

Kevin Lu

76 posts

@_kevinlu

Research @thinkymachines

SF 🏳️‍🌈

Joined October 2020

Kevin Lu
@_kevinlu
Aug 18, 2025
I recently joined @thinkymachines -- super excited to work with the team, I think we have the highest density of research talent in the world 🙂 we have a very ambitious roadmap ahead, the right team to work on it, & I think now is a great time to join; you should reach out to
Mira Murati
@miramurati
Jul 15, 2025
Thinking Machines Lab exists to empower humanity through advancing collaborative general intelligence. We're building multimodal AI that works with how you naturally interact with the world - through conversation, through sight, through the messy way we collaborate. We're
Kevin Lu
@_kevinlu
Jan 31, 2025
We released o3-mini, available today to all users in ChatGPT (for free)! o3-mini-low is faster (and often better) than o1-mini, and o3-mini-high is the most capable publicly available reasoning model in the world. openai.com/index/openai-o… with @ren_hongyu @shengjia_zhao
Kevin Lu
@_kevinlu
Jul 18, 2024
I recently joined OpenAI! Come check out our new model: 82% MMLU at 60 cents per 1M output tokens! openai.com/index/gpt-4o-m…
Kevin Lu
@_kevinlu
Dec 20, 2024
We trained o3-mini: both more capable than o1-mini, and around 4x faster end-to-end when accounting for reasoning tokens with @ren_hongyu @shengjia_zhao & others
Kevin Lu
@_kevinlu
Oct 27, 2025
in our new post, we walk through great prior work from @agarwl_ & the @Alibaba_Qwen team exploring on-policy distillation using an open source recipe: you can run our experiments on Tinker today! github.com/thinking-machi… i'm especially excited by the use of on-policy
Thinking Machines
@thinkymachines
Oct 27, 2025
Our latest post explores on-policy distillation, a training approach that unites the error-correcting relevance of RL with the reward density of SFT. When training it for math reasoning and as an internal chat assistant, we find that on-policy distillation can outperform other
Kevin Lu
@_kevinlu
Sep 12, 2024
Come check out o1-mini: SoTA math reasoning in a small package openai.com/index/openai-o… with @ren_hongyu @shengjia_zhao @Eric_Wallace_ & the rest of the OpenAI team
Kevin Lu
@_kevinlu
Oct 1, 2025
anyone who's tried running RL on top of language models knows how painful it is -- building on top of new research, tinker makes finetuning frontier LLMs easy and performant! it's the latest in a long-standing dream to use finetuning to democratize training and personalization.
Thinking Machines
@thinkymachines
Oct 1, 2025
Introducing Tinker: a flexible API for fine-tuning language models. Write training loops in Python on your laptop; we'll run them on distributed GPUs. Private beta starts today. We can't wait to see what researchers and developers build with cutting-edge open models!
Kevin Lu
@_kevinlu
Sep 29, 2025
I used to be really excited about the properties of LoRAs for compositionality and personalization back in the Stable Diffusion days (kevinlu.ai/loras-as-progr…) -- turns out they are still promising! come check out @johnschulman2's modern analysis on LoRAs for modern LLM
Thinking Machines
@thinkymachines
Sep 29, 2025
LoRA makes fine-tuning more accessible, but it's unclear how it compares to full fine-tuning. We find that the performance often matches closely, more often than you might expect. In our latest Connectionism post, we share our experimental results and recommendations for LoRA.
kevinlu.ai
LoRAs as Composable Programs
There is a growing trend to think of large language models (LLMs) as operating systems (OS). They have the ability to read and write to short-term memory in ...
Kevin Lu
@_kevinlu
Jun 29, 2025
A somewhat little known fact about me is that I have a blog 😀 Over the weekend I got around to writing up some of my thoughts on the recent LLM-Pokemon craze, and why I think video games are more interesting than most (maybe older) AI researchers think - Why is Pokemon hard,
Kevin Lu
@_kevinlu
Jul 9, 2025
Replying to @_kevinlu
I really like this diagram from @_jasonwei and @hwchung27 about how to view the bitter lesson: It's a mistake not to add structure now, it's a mistake to not remove that structure later. We're at the precipice of setting up a huge, powerful RL training run that will define the
Jason Wei
@_jasonwei
Jun 12, 2024
Although the bitter lesson suggests that end-to-end eventually wins, this talk observes that at a given level of (compute, data, algorithm, architecture), there exists an optimal structure to add to make things work at all. But we often forget to remove these when more end-to-end
Kevin Lu
@_kevinlu
Jul 9, 2025
Replying to @_kevinlu
Ultimately, we want AGI that benefits and interacts with humans, not just something that lives in a toy cage (like AlphaZero for chess, or reasoning models in math). In contrast to other researchers, I think it is therefore imperative to work on product. Are researchers who say
Kevin Lu
@_kevinlu
Oct 29, 2025
thanks to multi-tenancy and the incredible engineering effort of the team, tinker is now both a joy to use, and super cheap! hope to see you try it out 🙂
Thinking Machines
@thinkymachines
Oct 29, 2025
Replying to @thinkymachines
Starting Monday, November 3rd, Tinker is switching to a pricing plan that reflects compute usage. This will ensure we have sufficient capacity to clear our waitlist by the end of the year, allowing anyone to sign up and start Tinkering. tinker-console.thinkingmachines.ai/rate-card
Kevin Lu
@_kevinlu
Apr 14, 2025
Today we released GPT-4.1 nano, an amazing effort led by @johnohallman and @SuvanshSanjeev! Some cool features of today's release:
- Faster & cheaper than 4o-mini
- Significantly cheaper for image processing
- Better reasoning across the board
- 1M input context
Kevin Lu
@_kevinlu
Dec 8, 2021
Come chat with us about sequence modeling for reinforcement learning @NeurIPSConf tomorrow (Thurs 12/9) at 8:30-10am PT! gather.town/app/XRWlik7kvt…
Igor Mordatch
@IMordatch
Jun 2, 2021
Can RL algorithms be replaced with transformer-based language models? We’ve looked at this question with our work on Decision Transformer:
Website: sites.google.com/corp/berkeley.…
Code: github.com/kzl/decision-t…
1/8