Sachin
2,292 posts
Sachin
@sachdh
cooking custom specialized models at @savantedotai
Joined April 2019
844 Following
4,016 Followers
  • Pinned
    Sachin
    @sachdh
    Jul 22, 2025
    Excited to share Aryabhatta 1.0, our leading model that scores 90.2% on JEE Mains, outperforming frontier models like o4 mini and Gemini Flash 2.5. Trained by us at @AthenaAgentRL, in collaboration with @physics__wallah, using custom RLVR training on 130K+ curated JEE problems
    198K
  • Sachin
    @sachdh
    Sep 12, 2025
    you don't need GRPO; plain old policy gradient works.
    this blog post is a short and sweet case study of how to train an LLM with RL for your product use case
    - they have a huge amount of data
    - experimented with reward structure
    - used simple policy gradient
    - worked on
    Cursor
    @cursor_ai
    Sep 11, 2025
    We've trained a new Tab model that is now the default in Cursor. This model makes 21% fewer suggestions than the previous model while having a 28% higher accept rate for the suggestions it makes. Learn more about how we improved Tab with online RL.
    70K
  • Sachin
    @sachdh
    Sep 23, 2025
    the reason why I don’t like Anthropic - this video shows pure arrogance and disdain for anyone else who isn’t big labs
    their words and actions around safetyism reflect the same - only we can train the models and you peasants can’t
    open source AI is a wonderful thing and will win
    高级分析师
    @techeconomyana
    Sep 22, 2025
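    Anthropic CEO Dario on open-source models:
    - Releasing open weights for a large model is not like open-sourcing software; there is no developer community contributing back.
    - Open source is just a front for grabbing attention; users only care whether the model works well. Whether Deepseek is open source or not makes no difference; as a very large model, it is hard to run inference on.
    - Open source does not mean free; running inference servers has a cost.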
    68K
  • Sachin
    @sachdh
    Apr 22, 2024
    YC Application submitted at 2:06 AM
    Ready to fly at 2:26 AM
    Peak engineer? Or master procrastinator? 🤔
    27K
  • Sachin
    @sachdh
    Oct 20, 2025
    PPO is one of the RL algorithms; not the magic algorithm for every RL problem
    same goes for GRPO / GSPO or any other current policy gradient algo
    rl as a problem formulation is excellent
    - agent sees state
    - agent takes action
    - agent gets reward and next state
    this is so
    Csaba Szepesvari
    @CsabaSzepesvari
    Oct 19, 2025
    Replying to @karpathy
    @karpathy I think it would be good to distinguish RL as a problem from the algorithms that people use to address RL problems. This would allow us to discuss if the problem is with the algorithms, or if the problem is with posing a problem as an RL problem. 1/x
    36K
  • Sachin
    @sachdh
    Sep 11, 2025
    best / super efficient RL framework doesn't exist. profile everything and write your own training scripts. experiment with everything - reward functions, calculations of advantages, objective functions, training prompt distributions. GRPO is good; it is not untouchable. it is just
    will brown
    Prime Intellect
    @willccbb
    Sep 11, 2025
    "veRL is the best RL framework it's super efficient" really. are you sure about that. are you sure that you need 16 GPUs to tune a 7B model at 8k context. do you think that it's reasonable each step takes 19 minutes for this
    26K
  • Sachin
    @sachdh
    Aug 5, 2025
    want to prepare for the same skills? want to train the models to solve hard tasks? want to join the early, chaotic but extremely ambitious pirate ship? we are hiring and DMs are open. if you don't care about leetcode and trad job routes; but just want to take on ambitious
    TDM (e/λ) (L8 vibe coder 💫)
    @cto_junior
    Aug 5, 2025
    Whatever gold rush is happening in valley, will happen in Bangalore in max next 2 years So if you're feeling lost seeing those large comps, just prepare for the same skills Time will reward you well (somewhat PPP adjusted)
    22K
  • Sachin
    @sachdh
    Jul 27, 2025
    I am not a devout Hindu but statues and temples in Bali make me feel proud of and connected to the Vedic roots
    we should revive Indian cities to reflect Indian Sanskriti
    7.3K
  • Sachin
    @sachdh
    Mar 23, 2025
    Paras is summarizing the most important lesson of Geeta - perform your own dharma - in the context of societal dynamics.
    Paras Chopra
    @paraschopra
    Mar 23, 2025
    Replying to @paraschopra
    30/ The point is not to avoid working hard, but it is to work on things you find fulfilling while completely ignoring what you think is the objectively right thing to do (which is often an idea implanted into your head by the society).
    9.5K
  • Sachin
    @sachdh
    Aug 14, 2025
    The Aryabhatta 1.0 paper drop happened! Finally! Sorry for the delay! This paper talks about the composition of our training datasets, some details about our training methodologies, and a few interesting tidbits about how we did effective exploration with RLVR. Do check it out and dm me /
    Sachin
    @sachdh
    Jul 22, 2025
    Excited to share Aryabhatta 1.0, our leading model that scores 90.2% on JEE Mains, outperforming frontier models like o4 mini and Gemini Flash 2.5. Trained by us at @AthenaAgentRL, in collaboration with @physics__wallah, using custom RLVR training on 130K+ curated JEE problems
    17K
  • Sachin
    @sachdh
    Sep 13, 2025
    it was so much fun discussing my learnings of training a sota maths llm - aryabhata 1.0
    @lossfunk is where all of this journey started with
    - curious conversations about o1
    - first talk about speculations of how o1 would have been trained
    - conversations with @paraschopra about
    Naveen Benny
    @navbenny
    Sep 13, 2025
    Yesterday's session by @sachdh at @lossfunk about how they achieved SOTA in JEE with RLFT+ was 🔥 I always learn something from him in every interaction and he is one of the few ML practitioners I deeply respect (he's too humble for his own sake). My learnings 🧵
    9K
  • Sachin
    @sachdh
    Aug 2, 2025
    story of how I got into machine learning
    - my first job was with a services company
    - my manager wanted me to maintain legacy C++ software & I wanted to train models
    - found a lead data scientist in the same company on LinkedIn and convinced him to interview me
    - manager got
    Raj Dabre
    @prajdabre
    Aug 1, 2025
    Share a piece of your "I almost quit my job to do XYZ" lore.
    10K
  • Sachin
    @sachdh
    Jul 22, 2025
    Replying to @sachdh
    Aryabhatta 1.0 proves that sometimes David wins, not because of brute force, but because of precision. You mostly don't need 100B+ parameters or 16k+ context lengths.
    9.3K
  • Sachin
    @sachdh
    Jul 27, 2025
    if you are one of these giants and want to train custom LLMs for your own use case, my DMs are open. we started @AthenaAgentRL to help you train sota and cheap inference LLMs. please check our proof of work - Aryabhatta 1.0 on HF and then ping me
    sphinx
    @protosphinx
    Jul 27, 2025
    So apparently China has several top models now. I can understand India’s manufacturing challenges. I can understand the infra gaps. What I can’t understand is how Indian IT giants sitting on billions have made no real AI progress while their core business is under direct threat.
    7K