Ashvin Nair (@ashvinair) / X

Ashvin Nair

@ashvinair

RL foundations @cursor_ai. Prev: o1, o3, Code Interpreter @openai, 9 years learning to poke by poking at UC Berkeley

Berkeley, CA

Joined December 2014

Pinned
Ashvin Nair
@ashvinair
Oct 29, 2025
Some exciting news to share - I joined Cursor! We just shipped a model 🐆 It's really good - try it out! cursor.com/blog/composer I left OpenAI after 3 years there and moved to Cursor a few weeks ago. After working on RL for my whole career, it was incredible to see RL come alive
Cursor
@cursor_ai
Oct 29, 2025
Introducing Cursor 2.0. Our first coding model and the best way to code with agents.
Composer: Building a fast frontier model with RL · Cursor
From cursor.com
Ashvin Nair
@ashvinair
Oct 22, 2021
A PyTorch re-implementation is now available - see github.com/rail-berkeley/… Use IQL for all your offline RL and online finetuning needs!
Ilya Kostrikov
@ikostrikov
Oct 13, 2021
Excited to present our work with @ashvinair and @svlevine, Offline RL with Implicit Q-Learning (IQL), a simple method that achieves SOTA performance on D4RL arxiv.org/abs/2110.06169 and works 4x faster than prior SOTA github.com/ikostrikov/imp… Thread below
Ashvin Nair
@ashvinair
Sep 10, 2020
Blog post on accelerating online RL with offline data: bair.berkeley.edu/blog/2020/09/1… w/ code. AWAC is great for learning online with prior data (demos, off-policy data, etc). Joint with @mihdalal @abhishekunique7 @svlevine
Ashvin Nair
@ashvinair
Aug 10, 2021
Replying to @RyanDavidReece
Took a minute to understand, but it's this effect causing it: thestar.com/news/insight/2… For designing cockpits, out of 4,063 pilots, not a single airman fit within the average range on all 10 dimensions.
Ashvin Nair
@ashvinair
Aug 8, 2024
Headed to Amherst for @RL_Conference - I assume I'll find everyone organically but drop me a message if you'd like to catch up!
Ashvin Nair
@ashvinair
May 30, 2024
A little shook at the extent that ChatGPT memory distilled me into my essence
Ashvin Nair
@ashvinair
Jun 3, 2021
Super excited to release our latest work on visuomotor affordance learning (VAL)!! Highlights: 1. VAL utilizes a large prior dataset to fine-tune in new scenes with just 5 minutes of online robot interaction 2. Our high-quality robot interaction dataset now at
sites.google.com
Problem Statement How can robots learn about affordances from prior datasets and, when faced with new and unfamiliar environments, utilize this knowledge to practice relevant skills and update their...
Ashvin Nair
@ashvinair
Sep 12, 2024
Extremely excited about this release and for this next phase of progress with reinforcement learning+LLMs!
OpenAI
@OpenAI
Sep 12, 2024
We're releasing a preview of OpenAI o1—a new series of AI models designed to spend more time thinking before they respond. These models can reason through complex tasks and solve harder problems than previous models in science, coding, and math. openai.com/index/introduc…
Ashvin Nair
@ashvinair
Nov 6, 2025
Replying to @gabriel1
Mamdani is certainly more abundance-pilled than either Cuomo or Sliwa (check out his podcast with Derek Thompson; he might not agree with every point, but Mamdani seems to understand the issues pretty deeply). Who were you hoping for?
Ashvin Nair
@ashvinair
May 5, 2024
In Vienna for ICLR - reach out if you’re around and want to meet up!
Ashvin Nair
@ashvinair
Mar 25, 2025
885
Ashvin Nair
@ashvinair
Oct 30, 2024
Impressive!
Murtaza Dalal
@mihdalal
Oct 30, 2024
Can my robot cook my food, rearrange my dresser, tidy my messy table and do so much more without ANY demos or real-world training data? Introducing ManipGen: A generalist agent for manipulation that can solve long-horizon robotics tasks entirely zero shot, from text input! 1/N
Ashvin Nair
@ashvinair
Oct 13, 2021
Really excited about new work with @ikostrikov and @svlevine on offline RL. Implicit Q-Learning (IQL) gets great results offline and finetuning further online, and is really fast and simple to implement. See Ilya's thread for details
Ilya Kostrikov
@ikostrikov
Oct 13, 2021
Excited to present our work with @ashvinair and @svlevine, Offline RL with Implicit Q-Learning (IQL), a simple method that achieves SOTA performance on D4RL arxiv.org/abs/2110.06169 and works 4x faster than prior SOTA github.com/ikostrikov/imp… Thread below
Ashvin Nair
@ashvinair
Mar 23, 2023
Super excited for ChatGPT plugins - a way for LLMs to interact with the external world
Greg Brockman
@gdb
Mar 23, 2023
We’ve added initial support for ChatGPT plugins — a protocol for developers to build tools for ChatGPT, with safety as a core design principle. Deploying iteratively (starting with a small number of users & developers) to learn from contact with reality: openai.com/blog/chatgpt-p…