Quentin Gallouédec
1,111 posts
Quentin Gallouédec
@QGallouedec
PhD - Post-training @huggingface 🤗 TRL lead maintainer 🇫🇷 in 🇨🇦
Joined May 2019
811
Following
4,456
Followers
  • Pinned
    Quentin Gallouédec
    @QGallouedec
    Mar 31
    We finally shipped TRL v1.0!! stable APIs, broad integrations, and a design built to absorb whatever the field throws at it next. Let's go! hf.co/blog/trl-v1
    17K
  • Quentin Gallouédec
    @QGallouedec
    Jan 25, 2025
    Last moments of closed-source AI 🪦: Hugging Face is openly reproducing the pipeline of 🐳 DeepSeek-R1. Open data, open training, open models, open collaboration. 🫵 Let's go!
    GitHub - huggingface/open-r1: Fully open reproduction of DeepSeek-R1
    From github.com
    180K
  • Quentin Gallouédec
    @QGallouedec
    Mar 24, 2025
    ☄️ GRPO now scales to 70B+ models with multi-node training and super-fast performance. Install the latest v0.16 version of TRL: `pip install trl`. With all the freshest features and optimizations that we've added, you can train up to 60 times faster! More details in the
    69K
  • Quentin Gallouédec
    @QGallouedec
    Feb 9, 2025
    Train an agent with GRPO? Yes, it works! I've made a small demo example if you're interested!
    70K
  • Quentin Gallouédec
    @QGallouedec
    Feb 2, 2025
    One week into Open-R1, our project to replicate its training pipeline and synthetic data. A thread 🧵 (0/13) More details here:
    Open-R1: Update #1
    From huggingface.co
    72K
  • Quentin Gallouédec
    @QGallouedec
    Apr 25, 2025
    just pip install trl
    61K
  • Quentin Gallouédec
    @QGallouedec
    Apr 22, 2024
    🆕 Introducing JAT, the first open-source multi-modal, multi-task, multi-domain agent! 🤖 A step toward open generalist agents! 🚀 📰 Blog: huggingface.co/blog/jat
    73K
  • Quentin Gallouédec
    @QGallouedec
    Mar 22, 2025
    🪂 Getting GRPO Done Right (Dr GRPO) is now in TRL @zzlccc proved that scaling by the std introduces question-level difficulty bias! You can now remove this bias 🗑️
    51K
  • Quentin Gallouédec
    @QGallouedec
    Apr 30, 2025
    GRPO x Curriculum learning 😳 The only difference is that I sorted the dataset (math questions) by difficulty. Do you agree that it's the kind of curve you'd expect? But the most interesting question is, does it give better results? Answer in the thread 🧵 (0/n)
    55K
  • Quentin Gallouédec
    @QGallouedec
    Aug 18, 2025
    Replying to @_ma_thusal_em
    SFR, 200%. What you are seeing is a fiber broken by the technician, yet the customer is the one who has to pay for the repair.
    84K
  • Quentin Gallouédec
    @QGallouedec
    Aug 14, 2025
    🚨 Big news! We decided that @huggingface's post-training library, TRL, will natively support training Vision Language Models 🖼️ This builds on our recent VLM support in SFTTrainer, and we're not stopping until TRL is the #1 VLM training library 🥇 More here 👉
    30K
  • Quentin Gallouédec
    @QGallouedec
    Mar 20, 2025
    🤹‍♀️ GRPO Trainer in TRL now handles mixed objectives! Simply return `None` if the reward function doesn’t apply to the sample. More in the documentation! Kudos to Shirin for contributing this feature to TRL.
    17K
  • Quentin Gallouédec
    @QGallouedec
    Jul 29, 2025
    📢 TRL 0.20 drops: Fine-tune your VLM with GRPO! And it also includes GSPO. So basically, fine-tune your VLM with GSPO.
    24K
  • Quentin Gallouédec
    @QGallouedec
    Jul 27, 2025
    Merry Christmas 🎁 GSPO is in TRL. Looking forward to seeing your reward curves 📈
    32K
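
The Mar 20, 2025 post above says that a GRPO reward function can simply return `None` when it does not apply to a sample. Here is a minimal sketch of what such a function might look like, written in plain Python with no TRL import. It assumes that the trainer passes completions as plain strings and dataset columns as keyword arguments; the column names `task` and `answer` and the exact-match scoring are hypothetical and only there for illustration.

```python
from typing import List, Optional


def math_reward(
    completions: List[str],
    task: List[str],
    answer: List[str],
    **kwargs,
) -> List[Optional[float]]:
    """Score only the samples tagged as math; return None for all others.

    `task` and `answer` are hypothetical dataset columns, assumed to be
    passed alongside the completions as keyword arguments.
    """
    rewards: List[Optional[float]] = []
    for completion, sample_task, gold in zip(completions, task, answer):
        if sample_task != "math":
            # This reward does not apply to the sample, so signal that with None.
            rewards.append(None)
        else:
            # Placeholder scoring: exact match after stripping whitespace.
            rewards.append(1.0 if completion.strip() == gold else 0.0)
    return rewards


if __name__ == "__main__":
    scores = math_reward(
        completions=["4", "def f(): pass", " 5 "],
        task=["math", "code", "math"],
        answer=["4", "", "6"],
    )
    print(scores)  # [1.0, None, 0.0]
```

A second function could cover the `"code"` samples in the same way, returning `None` for the math ones, so that each sample is scored only by the rewards that apply to it.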
