Zihan "Zenus" Wang
1,237 posts
Zihan "Zenus" Wang
@wzenus
Reasoning agent / RL / efficiency research @NorthwesternU & @nvidia. Ex @Microsoft @yutori_ai @deepseek_ai @uiuc_nlp @RUC1937.
zenus.me
Joined March 2022
667
Following
23K
Followers
  • Pinned
    Zihan "Zenus" Wang
    @wzenus
    May 29
    🧵 Claude-Opus-4.8 costs you too many tokens - but is this issue general across agents? Do agents know how much they'll spend? Introducing Budget-Aware Agents (BAGEN): We study budget awareness across 4 envs & 5 frontier agents, and find structured failures in most of them.
    Image
    607K
  • Zihan "Zenus" Wang
    @wzenus
    Jan 28, 2025
    Wow. So why didn't you open-source it?
    Mark Chen
    @markchen90
    Jan 28, 2025
    Congrats to DeepSeek on producing an o1-level reasoning model! Their research paper demonstrates that they’ve independently found some of the core ideas that we did on our way to o1.
    3.1M
  • Zihan "Zenus" Wang
    @wzenus
    Jan 24, 2025
    DEEPSEEK NOW IS THE #1 IN THE WORLD. 🌍🚀 Never been prouder to say I got to work here. Ambition. Grit. Integrity. That’s how you build greatness. Brilliant researchers, engineers, all-knowing architects, and visionary leadership—this is just the beginning. Let’s. Go. 💥🔥
    Arena.ai
    @arena
    Jan 24, 2025
    Breaking News: DeepSeek-R1 surges to the top-3 in Arena🐳! Now ranked #3 Overall, matching the top reasoning model, o1, while being 20x cheaper and open-weight! Highlights: - #1 in technical domains: Hard Prompts, Coding, Math - Joint #1 under Style Control - MIT-licensed A
    Image
    1M
  • Zihan "Zenus" Wang
    @wzenus
    Jan 28, 2025
    I'm no longer responding to interview requests (if you do think there is a need, email me though). The best way to ask me questions is through X, where people all over the world can see who is correct and who is not. Being wrong is not bad; being arrogant is.
    623K
  • Zihan "Zenus" Wang
    @wzenus
    Jan 28, 2025
    I don't think Anthropic is behind. Please give them (as well as other respectful research teams) more time. There is an old saying in China: if you become more low-key, you will become richer (中国有句古话叫做,闷声发大财。). How they are coping with non-coping when others are
    Theo - t3.gg
    @theo
    Jan 28, 2025
    Prediction: Anthropic will rush out a reasoning model within the next 30 days
    339K
  • Zihan "Zenus" Wang
    @wzenus
    Jan 28, 2025
    🚀 Introducing RAGEN—the world’s first reproduction of DeepSeek-R1(-Zero) methods for training agentic AI models! We’re betting big on the future of RL + LLM + Agents 🤖✨. This release is a minimally viable leap toward that vision. Code and more intro 🔗:
    Image
    Image
    Image
    356K
  • Zihan "Zenus" Wang
    @wzenus
    Dec 28, 2024
    [Long Tweet Ahead] I just have to say, I’m genuinely impressed by DeepSeek. 💡 It’s no wonder their reports are so elegant and fluffless. Here’s what I noticed about their culture, a space where real innovation thrives, during my time there ↓ — — — — — 🌟 1. Be nice and careful
    266K
  • Zihan "Zenus" Wang
    @wzenus
    Jan 28, 2025
    Replying to @wzenus
    I mean no offense here (I know you don't believe it). I'm just wondering why they are becoming more and more tentative about what has been their advantage and the proudest trait of the "Open AI" company: focusing on what's really important, and trust people like how people
    175K
  • Zihan "Zenus" Wang
    @wzenus
    Jan 24, 2025
    What worries me now is how "tech old money" is trying to offer DeepSeek whales 🐳 those 💵5.5M/yr salaries, hoping to dissolve the team and disrupt such an opponent. No. I'd never want this to happen, and whale HRs please keep offering your members the best material benefits
    97K
  • Zihan "Zenus" Wang
    @wzenus
    Jan 28, 2025
    Replying to @wzenus
    These days DeepSeek has been gathering more and more attention. While I'm not sure if I have the right, I think it's a great time to put some questions to someone who is worth asking. It's not about competition or war, it's about the future of humanity. Do not be afraid of
    172K
  • Zihan "Zenus" Wang
    @wzenus
    Mar 3, 2025
    🚀 Introducing Chain-of-Experts (CoE), a free-lunch optimization method for DeepSeek-like MoE models! Within a $200 budget, we explore training MoEs that enable a 17.6-42% boost in memory efficiency! Code: github.com/ZihanWang314/c… Blog: notion.so/Chain-of-Exper…
    Image
    Image
    135K
  • Zihan "Zenus" Wang
    @wzenus
    Dec 25, 2024
    Rare life update: I'm married! 💍 And so surprised to hit "521" ("I love you" in Chinese) citations the same day :) It's really a great time and the start of a new journey. Merry Christmas and happy holidays to everyone!🎄🎉🎅🏻
    Image
    Image
    Image
    Image
    86K
  • Zihan "Zenus" Wang
    @wzenus
    Aug 6, 2025
    To everyone diving into fine-tuning open-source MoEs today: check out ESFT, our customized PEFT method for MoE models. Train with 90% fewer parameters, gain 95%+ task perf and keep 98% general perf :)
    DeepSeek
    @deepseek_ai
    Jul 5, 2024
    🚀 Introducing Expert-Specialized Fine-Tuning (ESFT) for Customizing LLMs with Sparse Architectures! 🌟 Highlights: - Train only task-relevant experts for LLM customization. - Reduces storage by up to 90% and training time by up to 30%. ✨ Performance: - Customizes LLMs
    Image
    48K
  • Zihan "Zenus" Wang
    @wzenus
    Jan 28, 2025
    If you've ever been inspired by @deepseek_ai, don't miss following @teortaxesTex, DeepSeek's cheerleader since day 1, as early as 2023, back when ZERO followers roamed here 🦗 Never underestimate this cheering maestro 💥 His words? Unpredictable. His takes? Fresh. His energy?
    Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞)
    @teortaxesTex
    Jan 27, 2025
    Badge of honor if I've ever had one
    102K