Amir Haghighat (@amiruci) / X

749 posts

@amiruci

Co-founder @baseten

San Francisco, CA

Joined May 2009

Amir Haghighat
@amiruci
Nov 11, 2025
A few days ago Kimi K2 Thinking significantly narrowed the capability gap between open and closed LLMs. Today Baseten is the only provider to deliver over 100 tok/sec on this massive 1T-parameter model.
Amir Haghighat
@amiruci
Sep 5, 2025
We closed our series D at $2.1b. It happened 8 months after our series C, which seems too fast until you consider the facts: 2 years' worth of growth in 8 months, virtually 0 customer churn, healthy margins, and QoQ NDR numbers that are considered top-tier YoY. The market demand
Amir Haghighat
@amiruci
Feb 19, 2025
Today we announced our $75m series C after growing revenue 6x in a year. But this milestone seemed impossible 3 years ago. This post is mostly about that. Baseten is 5.5 years old. The company truly wasn’t working for the first 3 years. Even back then we were an ML infra
Amir Haghighat
@amiruci
Mar 1, 2023
We launched Blueprint today: an easy way for engineers to fine-tune and serve open source foundation models. 🧵
Amir Haghighat
@amiruci
Aug 6, 2025
It's important to support newly released open-weight models on day 1. But it's not noteworthy. What's noteworthy is to have the inference optimization muscle to immediately blow the competition out of the water on latency and throughput. As measured by OpenRouter:
Amir Haghighat
@amiruci
Jun 10, 2025
"Where are your GPUs?" I get this question on sales calls. The answer is 10 different public clouds in 40+ regions. The hard part wasn't acquiring compute; it was using them dynamically to scale a single model across the world. It took us time to build, but the gains are worth
Amir Haghighat
@amiruci
Jun 12, 2025
And this tweet is the reason why:
Amir Haghighat
@amiruci
Jun 10, 2025
"Where are your GPUs?" I get this question on sales calls. The answer is 10 different public clouds in 40+ regions. The hard part wasn't acquiring compute; it was using them dynamically to scale a single model across the world. It took us time to build, but the gains are worth
Amir Haghighat
@amiruci
Oct 24, 2025
There's an obsession with tok/sec as *the* metric in LLM inference. But in latency-sensitive use cases the metric that matters more is time-to-first-token:
- Code edit use cases have short outputs and overall latency is heavily determined by ttft
- Voice AI use cases care about
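The point about short outputs can be illustrated with some rough arithmetic. The sketch below assumes a deliberately simple latency model (total latency = ttft + output tokens / tok/sec); the function names and all numbers are hypothetical, chosen only for illustration, and are not measurements from Baseten or any other provider.

```python
def total_latency_s(ttft_s: float, output_tokens: int, tok_per_sec: float) -> float:
    """End-to-end latency under a simplified model:
    wait for the first token, then stream the output at a constant rate."""
    return ttft_s + output_tokens / tok_per_sec


def ttft_share(ttft_s: float, output_tokens: int, tok_per_sec: float) -> float:
    """Fraction of the end-to-end latency that is spent waiting for the first token."""
    return ttft_s / total_latency_s(ttft_s, output_tokens, tok_per_sec)


if __name__ == "__main__":
    # Hypothetical numbers: 0.5 s ttft, 100 tok/sec decode speed.
    for label, tokens in [("short code edit", 20), ("long generation", 2000)]:
        total = total_latency_s(0.5, tokens, 100.0)
        share = ttft_share(0.5, tokens, 100.0)
        print(f"{label}: total={total:.2f}s, ttft share={share:.0%}")

    # For the short output, compare halving ttft with doubling tok/sec.
    print("halve ttft:    ", total_latency_s(0.25, 20, 100.0))
    print("double tok/sec:", total_latency_s(0.5, 20, 200.0))
```

Under these assumed numbers, ttft accounts for most of the latency of the 20-token response and only a small fraction of the 2000-token one, so for the short response halving ttft helps more than doubling tok/sec.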
Amir Haghighat
@amiruci
May 21, 2025
Product launch with the backstory: Internally we had always said let's do *1 thing* but do it well. For us that was inference. And we said at some point we'll earn the right to expand the surface area beyond that. That some point is today. The vast majority of our revenue
Amir Haghighat
@amiruci
Oct 7, 2025
Go team @FactoryAI!
Factory
@FactoryAI
Oct 7, 2025
Replying to @FactoryAI
Deploy and serve custom models with enterprise-grade infrastructure on @baseten. Special promo for Factory users: receive $500 Model API credits when you fill out this form. baseten.co/talk-to-us/fac…
Amir Haghighat
@amiruci
Apr 26, 2022
Announcement time: today anyone can use @baseten. During the past 2 years we've been busy:
◾ Building the product we wish we had in our previous jobs
◾ Onboarding customers, getting them to tangible value, and iterating 🧵
Amir Haghighat
@amiruci
Apr 14, 2023
Many highly requested ML infra features are packed in 1 screenshot. I'll unpack some:
Amir Haghighat
@amiruci
Jun 27, 2024
Quick story: our customers kept telling us: "Baseten has the inference layer covered; great. But our workflow isn't 'call a single model and run with the result'. We need to call a series of custom models. And calling model-after-model is adding a) latency, b) egress cost, and
Amir Haghighat
@amiruci
Dec 25, 2021
🎄🎅 Something fun for holiday family gatherings: restoring old photos. Here's one with my mom and me :) You can try it: app.baseten.co/applications/Q…. It's a @baseten app + GFP GAN.