Lucas Beyer (bl16)
27.4K posts
@giffmana
Researcher (now: Meta. ex: OpenAI, DeepMind, Brain, RWTH Aachen), Gamer, Hacker, Belgian. Anon feedback: admonymous.co/giffmana ✗DMs → email
Zürich, Suisse
lucasb.eyer.be
Joined December 2013
611 Following
141.3K Followers
  • Pinned
    Lucas Beyer (bl16)
    @giffmana
    Sep 14, 2022
    My Transformer tutorial slides are now available at lucasb.eyer.be/transformer I'll append recordings to this thread as I get them. If you want to use some of the slides for your lecture, you may, as long as you credit me. If you'd like me to give the lecture: maybe; e-mail me.
    Image
    Lucas Beyer (bl16)
    @giffmana
    Sep 11, 2022
    Giving a lecture introducing the Transformer architecture in all gory details at @M2lSchool tomorrow. Also got permission to publish slides and will share recording if/when I get one. It's a pretty cool set of slides, largely thanks to @_basilM for inspiration!
  • Lucas Beyer (bl16)
    @giffmana
    Sep 23, 2025
    Did you know that when they say stuff like "The A18 uses TSMC's 3nm process" or "announced the 2nm node", the 3nm or 2nm doesn't actually mean anything?! It's just like a version number. They make it up. Literally nothing measures 2nm or 3nm. I certainly didn't know.
    Image
    773K
  • Lucas Beyer (bl16)
    @giffmana
    Nov 15, 2025
    I would like to, for no reason at all, remind my dear Googlers of the yearly training they get about insider trading.
    Image
    Sundar Pichai
    Google
    @sundarpichai
    Nov 14, 2025
    🤔🤔
    1.4M
  • Lucas Beyer (bl16)
    @giffmana
    Jun 26, 2025
    hey all, couple quick notes: 1) yes, we will be joining Meta. 2) no, we did not get 100M sign-on, that's fake news. Excited about what's ahead though, will share more in due time! cc @__kolesnikov__ and @XiaohuaZhai.
    705K
  • Lucas Beyer (bl16)
    @giffmana
    Dec 29, 2022
    How good of a BERT can one get in ONE DAY on ONE GPU? With all the recent studies about scaling compute up, this paper takes a refreshing turn and does a deep dive into scaling down compute. It's well written and chock-full of insights. Here is my summary and my opinions. 🧶 1/N
    Image
    851K
  • Lucas Beyer (bl16)
    @giffmana
    Sep 15, 2025
    Huh, did you really put your height and weight in your resume in the 70's!?
    Image
    Image
    Priyanshu Priyank
    @PriyanshuP1405
    Sep 14, 2025
    Bill Gates's resume as a fresher at Harvard. The resume and experience are still better than those of probably 99% of the college students in tech out there.
    352K
  • Lucas Beyer (bl16)
    @giffmana
    Feb 6, 2025
    Our PR folks somehow just forgot that they bought chat.com or something lol
    OpenAI
    @OpenAI
    Feb 5, 2025
    ChatGPT search is now available to everyone on chatgpt.com — no sign up required.
    Image
    764K
  • Lucas Beyer (bl16)
    @giffmana
    Jun 21, 2025
    Ladies and gentlemen, I present to you the most surreal exchange I've had on X, the everything app:
    Image
    230K
  • Lucas Beyer (bl16)
    @giffmana
    Dec 27, 2024
    This actually reproduces as of today. In 5 out of 8 generations, DeepSeekV3 claims to be ChatGPT (v4), while claiming to be DeepSeekV3 only 3 times. Gives you a rough idea of some of their training data distribution.
    Image
    Image
    Ross Lazer
    @rosslazer
    Dec 27, 2024
    Replying to @mathemagic1an
    LOL I'm coming around to your theory
    1.1M
  • Lucas Beyer (bl16)
    @giffmana
    Nov 20, 2024
    Hahahaha
    Image
    Liron Shapira
    @liron
    Nov 20, 2024
    Replying to @liron
    astralcodexten.com/p/how-did-you-…
    287K
  • Lucas Beyer (bl16)
    @giffmana
    Aug 19, 2025
    Looks like there's an ongoing xAI exodus. Wild.
    709K
  • Lucas Beyer (bl16)
    @giffmana
    Jul 1, 2025
    guys I'm under observation now👀
    Image
    234K
  • Lucas Beyer (bl16)
    @giffmana
    Feb 4, 2025
    I took a brief look at the Harmonic Loss paper. tl;dr: instead of dot-product with softmax, do Euclidean distance with normalized 1/d**n. I kinda want this to work. I've dabbled with preferring Euclidean distance many times throughout my career (e.g. triplet loss). However...
    David D. Baek
    @dbaek__
    Feb 4, 2025
    1/9 🚨 New Paper Alert: Cross-Entropy Loss is NOT What You Need! 🚨 We introduce harmonic loss as alternative to the standard CE loss for training neural networks and LLMs! Harmonic loss achieves 🛠️significantly better interpretability, ⚡faster convergence, and ⏳less grokking!
    Image
    GIF
    476K
  • Lucas Beyer (bl16)
    @giffmana
    Jan 10, 2025
    I've been trying out Cursor with o1 for a few weeks now, and it's been giving me proper "holy shit, this changes things a bit" vibes. The most impressive to me is not the "generate code for XYZ" you see everywhere. That's nice, but I can also do that myself just fine, so it's
    250K
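
The tl;dr in the Feb 4, 2025 post about the Harmonic Loss paper (Euclidean distance with normalized 1/d**n in place of dot-product with softmax) can be sketched roughly as below. This is a minimal illustration based only on that one-line summary, not the paper's implementation: the function names, the default exponent n=2, and the eps guard against zero distances are all assumptions made for the example. A plain dot-product softmax is included only for comparison.

```python
import math


def harmonic_probs(x, class_weights, n=2.0, eps=1e-12):
    """Class probabilities from normalized inverse distances, 1/d**n.

    x, class_weights, n and eps are hypothetical names for this sketch.
    """
    # Euclidean distance from the input vector to each class weight vector.
    dists = [math.dist(x, w) for w in class_weights]
    # Inverse-power scores; eps avoids division by zero when x equals a weight.
    scores = [1.0 / (d + eps) ** n for d in dists]
    total = sum(scores)
    # Normalize so the scores sum to 1, playing the role softmax usually plays.
    return [s / total for s in scores]


def harmonic_loss(x, class_weights, target, n=2.0):
    """Negative log-probability of the target class under harmonic_probs."""
    return -math.log(harmonic_probs(x, class_weights, n)[target])


def softmax_probs(x, class_weights):
    """Baseline for comparison: dot-product logits followed by softmax."""
    logits = [sum(a * b for a, b in zip(x, w)) for w in class_weights]
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]
```

For example, with x = [1.0, 0.0] and class weights [[1.0, 1.0], [4.0, 4.0]], the distances are 1 and 5, so with n=2 the first class gets probability 25/26. The closer weight vector wins, whereas the dot-product baseline favors the second class because its weight vector has the larger inner product with x.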
