Atila
argmax
843 posts
Atila
argmax
@atiorh
on-device AI at @argmax
San Francisco, CA
Joined July 2016
198
Following
2,721
Followers
  • Pinned
    Atila
    argmax
    @atiorh
    Jul 24, 2025
    Argmax Pro SDK is now generally available!!
    argmax
    @argmax
    Jul 24, 2025
    Introducing Real-time Transcription with Nvidia Parakeet - Same top accuracy as file transcription - Best-in-market 160 ms lips-to-screen latency - 744x more cost-efficient compared to cloud APIs - Available in Argmax Pro SDK starting today! Link in comments
    9.6K
  • Atila
    argmax
    @atiorh
    Jun 14, 2023
    Exciting updates to #stablediffusion with Core ML! - 6-bit weight compression that yields just under 1 GB - Up to 30% improved Neural Engine performance - New benchmarks on iPhone, iPad and Macs - Multilingual system text encoder support - ControlNet github.com/apple/ml-stabl… 🧵
    GitHub - apple/ml-stable-diffusion: Stable Diffusion with Core ML on Apple Silicon
    From github.com
    377K
  • Atila
    argmax
    @atiorh
    Sep 28, 2023
    Stable Diffusion XL on iPhone with Core ML! - 4-bit weight compression - Works on iOS 17 & iPhone 13 Pro or newer - Other features and improvements to the repo 🧵
    GitHub - apple/ml-stable-diffusion: Stable Diffusion with Core ML on Apple Silicon
    From github.com
    232K
  • Atila
    argmax
    @atiorh
    Jun 6, 2022
    As part of #WWDC22 , we are open-sourcing a reference implementation of the Transformer architecture optimized for the Apple Neural Engine (ANE)! github.com/apple/ml-ane-t… (1/n) 🧵
    GitHub - apple/ml-ane-transformers: Reference implementation of the Transformer architecture...
    From github.com
  • Atila
    argmax
    @atiorh
    Dec 1, 2022
    Delighted to share #stablediffusion with Core ML on Apple Silicon built on top of @huggingface diffusers! 🧵
  • Atila
    argmax
    @atiorh
    May 4, 2024
    Thanks for updating the license to MIT @Apple ! Let's build 🫡
    ml-stable-diffusion/LICENSE.md at main · apple/ml-stable-diffusion
    From github.com
    49K
  • Atila
    argmax
    @atiorh
    Dec 21, 2023
    My takeaways from Apple's “LLM in a flash” (1/n)
    AK
    @_akhaliq
    Dec 20, 2023
    Apple announces LLM in a flash: Efficient Large Language Model Inference with Limited Memory paper page: huggingface.co/papers/2312.11… Large language models (LLMs) are central to modern natural language processing, delivering exceptional performance in various tasks. However, their
    164K
  • Atila
    argmax
    @atiorh
    Jul 27, 2023
    Stable Diffusion XL with Core ML on Apple Silicon! #SDXL The model grew 3x in size to ~2.6 billion parameters so we are releasing a new model compression technique that yields variants quantized to as little as 3 bits with minimal output difference 🧵
    69K
  • Atila
    argmax
    @atiorh
    Sep 12, 2023
    35 TFlops of ML compute in your pocket! (#iPhone15Pro) On-device inference is getting interesting.. #AppleEvent
    117K
  • Atila
    argmax
    @atiorh
    Jul 29, 2024
    Apple Intelligence hits the market in beta today: A pretty impressive 2.6b on-device LLM running on the Neural Engine compressed down to ~1GB. It consumes way below 10W. Congrats to my former teammates & colleagues on landing this! Tech report is also out:
    Max Weinbach
    Creative Strategies, Inc
    @mweinbach
    Jul 29, 2024
    Depending on the task you give Apple Intelligence, it can peak up to ~5.5W on the ANE. Mail summarization is less than 1-2W, but rewriting here hits up to around 5.9W. This is admittedly very efficient. Also, it did a better job at re-writing this document than Gemini did lol
    25K
  • Atila
    argmax
    @atiorh
    Dec 1, 2022
    Replying to @atiorh
    Please refer to our code repository for details:
    GitHub - apple/ml-stable-diffusion: Stable Diffusion with Core ML on Apple Silicon
    From github.com
  • Atila
    argmax
    @atiorh
    Jun 18, 2024
    Thanks @Apple
    argmax
    @argmax
    Jun 18, 2024
    WhisperKit is 40% faster on iOS 18. Improved from 165 to 237 tok/s on whisper-base. Repo: github.com/argmaxinc/Whis… Test App: testflight.apple.com/join/LPVOyJZW
    22K
  • Atila
    argmax
    @atiorh
    Nov 9, 2023
    Persimmon-8b LLM (@AdeptAILabs) has ~95% activation sparsity in many of its layers which is crazy! Here is a gist that prints some stats. Most zeros are shared across tokens too:
    Activation Sparsity in LLMs
    From gist.github.com
    40K
  • Atila
    argmax
    @atiorh
    Dec 1, 2022
    Replying to @atiorh
    Today's releases of macOS Ventura 13.1 Beta 4 and iOS and iPadOS 16.2 Beta 4 include optimizations that let Stable Diffusion run with improved efficiency on the Apple Neural Engine as well as on Apple Silicon GPU
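
The Nov 9, 2023 post above reports ~95% activation sparsity in many Persimmon-8b layers, with most zeros shared across tokens. As a rough illustration of what those two statistics mean, here is a minimal sketch in plain Python. It is not the code from the linked gist: the function name `sparsity_stats`, the `eps` threshold, and the toy activations are all assumptions made for this example.

```python
def sparsity_stats(activations, eps=0.0):
    """Summarize zeros in a (tokens x hidden units) activation matrix.

    `activations` is a list of per-token lists of floats. This layout and
    the `eps` threshold are illustrative assumptions, not the gist's API.
    """
    num_tokens = len(activations)
    width = len(activations[0])

    # Mark every entry whose magnitude is at or below the threshold.
    is_zero = [[abs(v) <= eps for v in row] for row in activations]

    # Fraction of zero units for each token, averaged over tokens.
    per_token = [sum(row) / width for row in is_zero]
    mean_sparsity = sum(per_token) / num_tokens

    # Fraction of hidden units that are zero for every token at once.
    shared = sum(
        all(is_zero[t][j] for t in range(num_tokens)) for j in range(width)
    ) / width

    return {"mean_sparsity": mean_sparsity, "shared_zero_fraction": shared}


# Hypothetical post-activation values for 3 tokens and 4 hidden units.
toy = [
    [0.0, 0.0, 1.5, 0.0],
    [0.0, 0.2, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
stats = sparsity_stats(toy)
print(stats)
```

A high `shared_zero_fraction` would mean the same hidden units are inactive for every token, which is the property the post highlights.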