Awni Hannun
5,030 posts
Awni Hannun
@awnihannun
ow knee
awnihannun.com
Joined January 2011
348
Following
44.9K
Followers
  • Awni Hannun
    @awnihannun
    Jan 25, 2025
    DeepSeek R1 (the full 680B model) runs nicely in higher quality 4-bit on 3 M2 Ultras with MLX. Asked it a coding question and it thought for ~2k tokens and generated 3500 tokens overall:
    997K
  • Awni Hannun
    @awnihannun
    Jan 20, 2025
    DeepSeek R1 671B running on 2 M2 Ultras faster than reading speed. Getting close to open-source O1, at home, on consumer hardware. With mlx.distributed and mlx-lm, 3-bit quantization (~4 bpw)
    866K
  • Awni Hannun
    @awnihannun
    Jan 22, 2025
    If you want to really feel the future, take your iPhone out of its case and run a Deep Seek 7B reasoning model on it:
    557K
  • Awni Hannun
    @awnihannun
    Dec 5, 2023
    Just in time for the holidays, we are releasing some new software today from Apple machine learning research. MLX is an efficient machine learning framework specifically designed for Apple silicon (i.e. your laptop!) Code: github.com/ml-explore/mlx Docs: ml-explore.github.io/mlx/build/html…
    GitHub - ml-explore/mlx: MLX: An array framework for Apple silicon
    From github.com
    959K
  • Awni Hannun
    @awnihannun
    Jan 22, 2025
    DeepSeek R1 distilled to Qwen 1.5B easily runs on my iPhone 16 with MLX swift. Here's the 4-bit model reasoning entirely on device at almost 60 toks/sec:
    1.1M
  • Awni Hannun
    @awnihannun
    Mar 5, 2025
    512 GB in a single Mac Studio! That will fit 4-bit Deep Seek R1 with room to spare.
    888K
  • Awni Hannun
    @awnihannun
    Sep 26, 2024
    Llama 3.2 1B in 4-bit runs at ~60 toks/sec with MLX Swift on my iPhone 15 pro. It's quite good and easily runs on-device:
    492K
  • Awni Hannun
    @awnihannun
    Nov 7, 2025
    The new 1 Trillion parameter Kimi K2 Thinking model runs well on 2 M3 Ultras in its native format - no loss in quality! The model was quantization aware trained (qat) at int4. Here it generated ~3500 tokens at 15 toks/sec using pipeline-parallelism in mlx-lm:
    501K
  • Awni Hannun
    @awnihannun
    Jul 31, 2024
    Quantized Gemma 2B runs pretty fast on my iPhone 15 pro in MLX Swift. code & docs: github.com/ml-explore/mlx… Comparable to GPT 3.5 turbo and Mixtral 8x7B in @lmsysorg benchmarks but runs efficiently on an iPhone. Pretty wild.
    80K
  • Awni Hannun
    @awnihannun
    Jul 1, 2022
    Read a bit about Grokking recently. Here's some learnings: "Grokking" is a curious neural net behavior observed ~1 year ago (arxiv.org/abs/2201.02177). Continue optimizing a model long after perfect training accuracy and it suddenly generalizes. Figure:
  • Awni Hannun
    @awnihannun
    Sep 20, 2025
    Running Qwen3 8B thinking on an iPhone Air with MLX. The model is quantized to 4-bit and runs pretty well.
    216K
  • Awni Hannun
    @awnihannun
    Jan 20, 2025
    Wow, DeepSeek R1 Distill Qwen 7B (in 4-bit) nailed the first hard math question I asked it. Thought for ~3200 tokens in about 35 seconds on M4 Max with mlx-lm.
    278K
  • Awni Hannun
    @awnihannun
    Jul 11, 2025
    The new Kimi K2 1T model (4-bit quant) runs on 2 512GB M3 Ultras with mlx-lm and mx.distributed. 1 trillion params, at a speed that's actually quite usable:
    Kimi.ai
    @Kimi_Moonshot
    Jul 11, 2025
    🚀 Hello, Kimi K2! Open-Source Agentic Model! 🔹 1T total / 32B active MoE model 🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models 🔹Strong in coding and agentic tasks 🐤 Multimodal & thought-mode not supported for now With Kimi K2, advanced agentic intelligence
    238K
  • Awni Hannun
    @awnihannun
    Sep 26, 2025
    2023 LLM training vs 2025 LLM training
    160K
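
Several of the posts above make memory-fit claims (4-bit DeepSeek R1 in a single 512 GB Mac Studio, the 1T Kimi K2 on two 512 GB M3 Ultras) and one gives enough numbers to estimate decode speed (~3200 tokens in about 35 seconds). A rough sanity check of that arithmetic follows. It is only a sketch under stated assumptions: weights dominate memory, KV cache and activations are ignored, and a "4-bit" quantization costs about 4.5 bits per weight once per-group scales and biases are counted. The 4.5 figure and the helper names are illustrative assumptions, not values taken from the posts.

```python
# Back-of-the-envelope checks for the memory and speed figures quoted in the posts.
# ASSUMPTION: 4.5 effective bits per weight for a "4-bit" quant (scales and biases
# included). The posts do not state this number.

ASSUMED_BITS_PER_WEIGHT_4BIT = 4.5  # hypothetical, for illustration only


def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes). Ignores KV cache."""
    return num_params * bits_per_weight / 8 / 1e9


def fits(required_gb: float, available_gb: float) -> bool:
    """True if the weights alone fit in the given unified memory."""
    return required_gb < available_gb


def tokens_per_second(num_tokens: float, seconds: float) -> float:
    """Average generation speed."""
    return num_tokens / seconds


# DeepSeek R1, 671B parameters, on one 512 GB Mac Studio.
r1_gb = weight_memory_gb(671e9, ASSUMED_BITS_PER_WEIGHT_4BIT)

# Kimi K2, 1T parameters, on two 512 GB M3 Ultras.
k2_gb = weight_memory_gb(1e12, ASSUMED_BITS_PER_WEIGHT_4BIT)

# R1 Distill Qwen 7B: ~3200 tokens in about 35 seconds on an M4 Max.
distill_tps = tokens_per_second(3200, 35)

if __name__ == "__main__":
    print(f"R1 4-bit weights: ~{r1_gb:.0f} GB, fits in 512 GB: {fits(r1_gb, 512)}")
    print(f"K2 4-bit weights: ~{k2_gb:.0f} GB, fits in 2 x 512 GB: {fits(k2_gb, 2 * 512)}")
    print(f"R1 Distill 7B speed: ~{distill_tps:.0f} toks/sec")
```

Under these assumptions the R1 weights come to roughly 377 GB, which is consistent with the "room to spare" remark for a 512 GB machine. The headroom shrinks once the KV cache for long reasoning traces is included.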
