Aksh Garg (@AkshGarg03) / X

242 posts

Aksh Garg

@AkshGarg03

@mercor_ai, CS @stanford | ex @point72, @tesla, @spacex, @deshaw

Joined January 2022

Pinned
Aksh Garg
@AkshGarg03
May 15, 2024
(1/5) @CKT_Conner, @dill_pkl, @emilyzsh, and I are excited to introduce Shard - a proof-of-concept for an infinitely scalable distributed system composed of consumer hardware for training and running ML models! Features: - Data + Pipeline Parallel for handling arbitrarily large
86K
Aksh Garg
@AkshGarg03
May 10, 2024
liked by @karpathy i’ve officially made it in life 😇
Aksh Garg
@AkshGarg03
May 10, 2024
stanford midterms be like #cs231n @karpathy
190K
Aksh Garg
@AkshGarg03
Jun 3, 2024
Re Llama3V: First of all, we want to apologize to the original authors of MiniCPM. We wanted Mustafa to make the original statement but have been unable to contact him since yesterday. @siddrrsh and I posted Llama3V with @mustafaaljadery. Mustafa wrote the code for the project.
PrimerYang
@yangzhizheng1
Jun 2, 2024
Shocked! Llama3-V project from a Stanford team plagiarized a lot from MiniCPM-Llama3-V 2.5! its code is a reformatting of MiniCPM-Llama3-V 2.5, and the model's behavior is highly similar to a noised version of MiniCPM-Llama3-V 2.5 checkpoint. Evidence: github.com/OpenBMB/MiniCP…
355K
Aksh Garg
@AkshGarg03
May 10, 2024
stanford midterms be like #cs231n @karpathy
218K
Aksh Garg
@AkshGarg03
May 20, 2024
1/ As promised, here's my thesis on the future of decentralized training of foundation models. Covers: 1) why decentralized makes sense from scaling, margins, and marketplace lenses 2) challenges 3) exciting enabling research shifts In long form at:
Shard: On the Decentralized Training of Foundation Models
From aksh-garg.medium.com
42K
Aksh Garg
@AkshGarg03
May 21, 2024
1/ @SohamGovande, @jameszhou02, @jerryzhou and I spent the weekend building PodPlex: A platform for distributed training & serverless inference at scale I'm very glad to say that we left $10,000 GPU credits richer and 36 hours of sleep poorer more details in 🧵
30K
Aksh Garg
@AkshGarg03
May 9, 2024
Releasing Gemma with a 10M context window! We feature: • 1250x context size • Local execution on <32GB of RAM • Infini-attention Check us out on: • 🤗: tinyurl.com/bdhu65xd • GitHub: tinyurl.com/mukxkve8 • Technical blog: tinyurl.com/558vfj85
mustafaaljadery/gemma-2B-10M · Hugging Face
From huggingface.co
39K
Aksh Garg
@AkshGarg03
Sep 18, 2024
it was beyond incredible working with @adarsh_exe, @BrendanFoody, @suryamidha, and the rest of the @mercor_ai team! truly hard to find a team that either moves as fast, with this much pmf, or has as ambitious founders let alone the combination of the three. So, so excited for
13K
Aksh Garg
@AkshGarg03
Oct 16, 2024
We made Specter at the @pearvc x @OpenAI hackathon! Using multi-agent trajectory rollouts, contextual retrieval, and real-time voice agents, it aims to provide a seamless negotiation prep experience for lawyers. Links: - Github: github.com/akshgarg7/Spec… - Website (laggy and
14K
Aksh Garg
@AkshGarg03
Feb 20, 2025
INSANE shoutout to @BrendanFoody @adarsh_exe and @suryamidha!! working at mercor feels like time-traveling. life is perpetually on 2x - 2x faster revenue growth, 2x faster eng velocity and 2x faster community building you’re always updating priors, sniffing out revenue, and
Mercor
@mercor_ai
Feb 20, 2025
Mercor is solving talent allocation in the AI economy. The difference between greatness and failure is the right person being in the right place at the right time. Putting them there is the hardest unsolved problem in capitalism. We’re excited to announce our $100M Series B at
8.7K
Aksh Garg
@AkshGarg03
May 12, 2024
Despite 43 million blind people worldwide, current assistive technology such as screen readers and braille displays are extremely expensive and limited. We’re hoping to change that with ApolloVision, a multimodal software layer that can help blind and visually impaired people
5K
Aksh Garg
@AkshGarg03
May 10, 2024
first page of 🤗 in <24 hours!! And ahead of whisper, LLama3-70b, and Phi-3 🤯. Really excited to see how the community interacts with the model. If you haven't seen it yet, 10M context window in <32GB RAM. See below 👇
Aksh Garg
@AkshGarg03
May 9, 2024
Releasing Gemma with a 10M context window! We feature: • 1250x context size • Local execution on <32GB of RAM • Infini-attention Check us out on: • 🤗: tinyurl.com/bdhu65xd • GitHub: tinyurl.com/mukxkve8 • Technical blog: tinyurl.com/558vfj85
7.8K
Aksh Garg
@AkshGarg03
Mar 3, 2025
huge congrats to @JoeLi5050, @radi_cho, @zeynebnkaya, and @nicolesplaining! SOTA improvements on DLMs in 24 hours are nuts
Joe Li
@JoeLi5050
Mar 3, 2025
Just won 1st Place and $40k at the Mercor x Etched x Cognition inference-time compute hackathon! 4 Stanford freshmen (@radi_cho, @zeynebnkaya, @nicolesplaining, and I) built "LLaDA-R1: Scaling Reasoning at Inference Time with Diffusion-LLMs" in 24 hours
5K
Aksh Garg
@AkshGarg03
May 10, 2024
Replying to @karpathy
you should check out our work 10M context window Gemma-2B with <32 GB RAM (pinned on profile) - maybe a reason to retweet is close ahead :)
5.2K