Ross Taylor
2,790 posts
Ross Taylor
@rosstaylor90
Co-founder and CEO @GenReasoning. Previously lots of other things like: reasoning lead Meta AI, Llama 3/2, Galactica, Papers with Code.
▽²f = 0
rossjtaylor.com
Joined March 2012
1,211
Following
11.7K
Followers
  • Pinned
    Ross Taylor
    @rosstaylor90
    Apr 23
    The continual learning vs in-context learning debate is a sideshow from the fact we lack interesting training environments. If you only care about narrow things like SWE and automated AI research, then you create optimisation pressure towards narrow agentic capabilities. And
    13K
  • Ross Taylor
    @rosstaylor90
    Nov 14, 2023
    I am the first author of the Galactica paper and have been quiet about it for a year. Maybe I will write a blog post talking about what actually happened, but if you want the TLDR: 1. Galactica was a base model trained on scientific literature and modalities. 2. We approached
    Sharon Goldman
    @sharongoldman
    Nov 14, 2023
    One year ago — 2 weeks before @OpenAI released ChatGPT — @Meta released Galactica. The LLM was public for only 3 days, but its lessons led to decisions around Llama's release. Thanks to @jpineau1 for chatting w/ me and h/t to @ylecun Read here: ⏬ venturebeat.com/ai/what-meta-l…
    961K
  • Ross Taylor
    @rosstaylor90
    Apr 16, 2024
    I left Meta yesterday. Nothing but positive things to say: FAIR and GenAI are great places to do research and engineering. Will miss my colleagues! LLMs have shown how magical deep learning can be in a data-rich regime. But many domains remain data-constrained, which prevents
    161K
  • Ross Taylor
    @rosstaylor90
    Sep 17, 2025
    Supplementary information for the new DeepSeek R1 Nature paper is very interesting! Details on training data, hyperparameters, base model importance, and more.
    182K
  • Ross Taylor
    @rosstaylor90
    Jan 20, 2025
    Last tweet on this but the way @deepseek_ai does launches is beautiful: no hype, arrogance or vague-posting: just sharing something great with the world. US tech companies look cringe in comparison.
    46K
  • Ross Taylor
    @rosstaylor90
    Jan 26, 2025
    Replying to @goodside
    This is so much fun
    33K
  • Ross Taylor
    @rosstaylor90
    Aug 24, 2025
    Most takes on RL environments are bad. 1. There are hardly any high-quality RL environments and evals available. Most agentic environments and evals are flawed when you look at the details. It’s a crisis: and no one is talking about it because they’re being hoodwinked by labs
    112K
  • Ross Taylor
    @rosstaylor90
    Feb 4, 2025
    No one is saying RL didn’t work for reasoning. The argument is about internal reasoning emergence, not absolute performance boosts with RL. Quite the opposite in fact - we had PPO on Llama 2 base models 2 years ago with verifiable rewards and had 90%+ on GSM8k. We already knew
    Ricardo Olmedo
    @rdolmedo_
    Feb 3, 2025
    Does reinforcement learning with verifiable rewards work only for recent model families? It turns out that GRPO also works very well for Llama 2 7B, with an impressive +15 accuracy point increase in GSM8K. GRPO over GSM8K train. No bells and whistles. It just works.
    134K
  • Ross Taylor
    @rosstaylor90
    Jul 16, 2025
    It’s funny that people on this site think major LLM efforts are talent-bound rather than org-bound. The talent differential has never been big between major orgs. Most of the difference in outcomes is due to organisational factors - like allocating compute to the right bets, and
    75K
  • Ross Taylor
    @rosstaylor90
    Apr 8, 2025
    Maybe OpenAI had a point with “high taste testers”. I didn’t like the phrase initially because it felt a little elitist. But maybe I can reconcile with it by treating “high taste” as folks who care more about the outputs they are getting, and scrutinise them more carefully. In
    60K
  • Ross Taylor
    @rosstaylor90
    Dec 7, 2023
    Why are LLMs bad at reasoning? One theory says this is due to weaknesses in maximum likelihood, where the probability mass “overgeneralises” to low quality solutions. Because our pretraining objective (likelihood) doesn’t transfer to our evaluation objective (accuracy), the
    124K
  • Ross Taylor
    @rosstaylor90
    Oct 21, 2023
    Here’s my ICML talk on teaching LLMs to reason (video and slides). Like everyone else, I can’t talk about what I’m working on right now, but I tried to provide a useful overview of the history of LLMs and reasoning, current areas of focus, and potential directions. Enjoy!
    105K
  • Ross Taylor
    @rosstaylor90
    Aug 3, 2025
    Replying to @gabriel1
    General things that improved my health/productivity/outcomes: - Having dinner much earlier - no later than 6pm - Replacing sugary snacks with healthy alternatives - eg almonds, dark chocolate, blueberries. - Not using laptops / programming in bed - Expanding my friend circle
    37K
  • Ross Taylor
    @rosstaylor90
    Apr 13, 2025
    Distillation outperforms RL recipes at smaller scales. One simple way to think about this is a large model is a better way to discover good data than a smaller one. A point made in original DeepSeek-R1 paper, but missed by many in the wake of that release in the GRPO wave. That
    Wenhu Chen
    @WenhuChen
    Apr 12, 2025
    arxiv.org/pdf/2504.07086 is quite interesting. It standardizes the evaluation of all the existing math reasoning models and re-evaluates these models. Takeaway 1: Most RL-trained variants of the DeepSeek R1-Distill model do not yield meaningful performance improvements (except
    42K
