Zain
Together AI
4,588 posts
@ZainHasan6
I build and teach AI • AI/ML @togethercompute • EngSci ℕΨ/PhD @UofT • Previously: vector DBs, data scientist, lecturer & health tech founder • 🇺🇸🇨🇦🇵🇰
SF/Toronto
zainhas.github.io
Joined August 2012
2,002
Following
5,266
Followers
  • Pinned
    Zain
    Together AI
    @ZainHasan6
    Dec 28, 2023
    If you cannot explain something in simple terms, you don't understand it.
    18K
  • Zain
    Together AI
    @ZainHasan6
    Mar 31, 2025
    They tested SOTA LLMs on the 2025 US Math Olympiad hours after the problems were released. Tested on 6 problems and, spoiler alert, they all suck -> 5%
    1.2M
  • Zain
    Together AI
    @ZainHasan6
    Nov 21, 2023
    The clearest and crispest explanation I've ever heard of how large language models compress and capture a "world-model" in their weights simply by learning to predict the next word accurately. Furthermore, how the raw power of these base models can then be tamed by teaching
    971K
  • Zain
    Together AI
    @ZainHasan6
    Jun 22, 2024
    Curriculum for Karpathy's new planned course is 🌶️ github.com/karpathy/LLM10…
    154K
  • Zain
    Together AI
    @ZainHasan6
    Mar 11, 2025
    Very cool and detailed breakdown of 7 years' worth of advancements in post-training LLMs: SFT, RLHF, DPO, visual SFT, MoE, reasoning
    172K
  • Zain
    Together AI
    @ZainHasan6
    Feb 17, 2024
    "Minimal, clean, educational code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization."👀👀👀
    GitHub - karpathy/minbpe: Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly...
    From github.com
    165K
  • Zain
    Together AI
    @ZainHasan6
    Dec 7, 2023
    Anthropic was able to solve the "lost in the middle" problem "by adding the sentence “Here is the most relevant sentence in the context:” to the start of Claude’s response. This was enough to raise Claude 2.1’s score from 27% to 98% on the original evaluation." Does it just take
    176K
  • Zain
    Together AI
    @ZainHasan6
    Jul 5, 2024
    ⁉️ Should you fine-tune your LLM or just give relevant examples in the prompt? How many examples should you give for best performance?? If you give more will it hurt perf?? Does order of the examples matter!?? New paper from DeepMind answers all these questions and more, so much
    110K
  • Zain
    Together AI
    @ZainHasan6
    Jul 14, 2024
    You don't need a 2 trillion parameter model to tell you the capital of France is Paris. Be smart and route between a panel of models according to query difficulty and model specialty! New paper proposes a framework to train a router that routes queries to the appropriate LLM
    69K
  • Zain
    Together AI
    @ZainHasan6
    Feb 28, 2025
    NeoBERT - Another successor to BERT!
    >> SotA performance on MTEB benchmark
    >> 250M parameters
    >> outperforms BERT-large, RoBERTa-large, NomicBERT, and ModernBERT under identical fine-tuning conditions
    >> OSS - release all code, data, checkpoints, and training scripts
    37K
  • Zain
    Together AI
    @ZainHasan6
    Apr 4, 2025
    New DeepSeek paper + model: DeepSeek-GRM models automatically generate judging principles and critiques without needing a human in the loop to achieve better reward scaling with inference-time compute. Open-source model coming! 🧵
    34K
  • Zain
    Together AI
    @ZainHasan6
    Oct 25, 2024
    Contextual RAG from Anthropic is pretty cool. Here's an overview of how it works. 👇 Currently re-implementing every part of the pipeline below, to learn it better. Will share a cookbook soon! Contextual RAG: 1. For every chunk - prepend an explanatory context snippet that
    39K
  • Zain
    Together AI
    @ZainHasan6
    Jun 29, 2024
    🤖 Can multiple smaller open-source LLMs be combined to outperform larger monolithic LLMs? New paper shows that LLMs tend to generate better responses when presented with outputs from other models, even if less capable. They use this to build a Mixture of Agents (MoA)
    39K
  • Zain
    Together AI
    @ZainHasan6
    Feb 21, 2024
    Replying to @paulg
    Can't explain this T-Rex one yet 😅
    41K
