Aksh Garg
Mercor
242 posts
@AkshGarg03
@mercor_ai, CS @stanford | ex @point72, @tesla, @spacex, @deshaw
Joined January 2022
318
Following
1,583
Followers

  • Pinned
    Aksh Garg
    Mercor
    @AkshGarg03
    May 15, 2024
    (1/5) @CKT_Conner, @dill_pkl, @emilyzsh, and I are excited to introduce Shard - a proof-of-concept for an infinitely scalable distributed system composed of consumer hardware for training and running ML models! Features: - Data + Pipeline Parallel for handling arbitrarily large
    86K
  • Aksh Garg
    Mercor
    @AkshGarg03
    May 10, 2024
    liked by @karpathy i’ve officially made it in life 😇
    Aksh Garg
    Mercor
    @AkshGarg03
    May 10, 2024
    stanford midterms be like #cs231n @karpathy
    190K
  • Aksh Garg
    Mercor
    @AkshGarg03
    Jun 3, 2024
    Re Llama3V: First of all, we want to apologize to the original authors of MiniCPM. We wanted Mustafa to make the original statement but have been unable to contact him since yesterday. @siddrrsh and I posted Llama3V with @mustafaaljadery. Mustafa wrote the code for the project.
    PrimerYang
    @yangzhizheng1
    Jun 2, 2024
    Shocked! Llama3-V project from a Stanford team plagiarized a lot from MiniCPM-Llama3-V 2.5! its code is a reformatting of MiniCPM-Llama3-V 2.5, and the model's behavior is highly similar to a noised version of MiniCPM-Llama3-V 2.5 checkpoint. Evidence: github.com/OpenBMB/MiniCP…
    355K
  • Aksh Garg
    Mercor
    @AkshGarg03
    May 10, 2024
    stanford midterms be like #cs231n @karpathy
    218K
  • Aksh Garg
    Mercor
    @AkshGarg03
    May 20, 2024
    1/ As promised, here's my thesis on the future of decentralized training of foundation models. Covers: 1) why decentralized makes sense from scaling, margins, and marketplace lenses 2) challenges 3) exciting enabling research shifts In long form at:
    Shard: On the Decentralized Training of Foundation Models
    From aksh-garg.medium.com
    42K
  • Aksh Garg
    Mercor
    @AkshGarg03
    May 21, 2024
    1/ @SohamGovande, @jameszhou02, @jerryzhou and I spent the weekend building PodPlex: A platform for distributed training & serverless inference at scale I'm very glad to say that we left $10,000 GPU credits richer and 36 hours of sleep poorer more details in 🧵
    30K
  • Aksh Garg
    Mercor
    @AkshGarg03
    May 9, 2024
    Releasing Gemma with a 10M context window! We feature: • 1250x context size • Local execution on <32GB of Ram • Infini-attention Check us out on: • 🤗: tinyurl.com/bdhu65xd • GitHub: tinyurl.com/mukxkve8 • Technical blog: tinyurl.com/558vfj85
    mustafaaljadery/gemma-2B-10M · Hugging Face
    From huggingface.co
    39K
  • Aksh Garg
    Mercor
    @AkshGarg03
    Sep 18, 2024
    it was beyond incredible working with @adarsh_exe, @BrendanFoody, @suryamidha, and the rest of the @mercor_ai team! truly hard to find a team that either moves as fast, with this much pmf, or has as ambitious founders let alone the combination of the three. So, so excited for
    13K
  • Aksh Garg
    Mercor
    @AkshGarg03
    Oct 16, 2024
    We made Specter at the @pearvc x @OpenAI hackathon! Using multi-agent trajectory rollouts, contextual retrieval, and real-time voice agents, it aims to provide a seamless negotiation prep experience for lawyers. Links: - Github: github.com/akshgarg7/Spec… - Website (laggy and
    14K
  • Aksh Garg
    Mercor
    @AkshGarg03
    Feb 20, 2025
    INSANE shoutout to @BrendanFoody @adarsh_exe and @suryamidha!! working at mercor feels like time-traveling. life is perpetually on 2x - 2x faster revenue growth, 2x faster eng velocity and 2x faster community building you’re always updating priors, sniffing our revenue, and
    Mercor
    @mercor_ai
    Feb 20, 2025
    Mercor is solving talent allocation in the AI economy. The difference between greatness and failure is the right person being in the right place at the right time. Putting them there is the hardest unsolved problem in capitalism. We’re excited to announce our $100M Series B at
    8.7K
  • Aksh Garg
    Mercor
    @AkshGarg03
    May 12, 2024
    Despite 43 million blind people worldwide, current assistive technology such as screen readers and braille displays are extremely expensive and limited. We’re hoping to change that with ApolloVision, a multimodal software layer that can help blind and visually impaired people
    5K
  • Aksh Garg
    Mercor
    @AkshGarg03
    May 10, 2024
    first page of 🤗 in <24 hours!! And ahead of whisper, LLama3-70b, and Phi-3 🤯. Really excited to see how the community interacts with the model. If you haven't seen it yet, 10M context window in <32GB RAM. See below 👇
    Aksh Garg
    Mercor
    @AkshGarg03
    May 9, 2024
    Releasing Gemma with a 10M context window! We feature: • 1250x context size • Local execution on <32GB of Ram • Infini-attention Check us out on: • 🤗: tinyurl.com/bdhu65xd • GitHub: tinyurl.com/mukxkve8 • Technical blog: tinyurl.com/558vfj85
    7.8K
  • Aksh Garg
    Mercor
    @AkshGarg03
    Mar 3, 2025
    huge congrats to @JoeLi5050, @radi_cho, @zeynebnkaya, and @nicolesplaining! SOTA improvements on DLMs in 24 hours are nuts
    Joe Li
    @JoeLi5050
    Mar 3, 2025
    Just won 1st Place and $40k at the Mercor x Etched x Cognition inference-time compute hackathon! 4 Stanford freshman (@radi_cho, @zeynebnkaya, @nicolesplaining, and I) built "LLaDA-R1: Scaling Reasoning at Inference Time with Diffusion-LLMs" in 24 hours
    5K
  • Aksh Garg
    Mercor
    @AkshGarg03
    May 10, 2024
    Replying to @karpathy
    you should check out our work 10M context window Gemma-2B with <32 GB RAM (pinned on profile) - maybe a reason to retweet is close ahead :)
    5.2K