Baseten (@baseten) / X

2,384 posts

Inference is everything.

San Francisco and New York

Joined March 2021

Pinned
Baseten
@baseten
May 13
Intelligence should be defined by the people closest to the work. Intelligence should be owned by all of us. Let’s build a many model future!
Tuhin Srivastava
@tuhinone
May 13
Article
A many model future
Obsessives have always moved the world forward. They are responsible for our most beloved products, proudest scientific achievements, most moving art, the greatest leaps in what we're capable of....
Baseten
@baseten
15h
The longer the context, the more memory your LLM needs. We introduce research techniques to compress that memory 200x on the fly without changing the base model.
Charlie O'Neill
@oneill_c
15h
1/ You can shrink a language model's KV cache by 200×, in a single forward pass, and it still answers correctly. At 256k context that's 36 GiB of cache down to ~360 MiB, with no change to the base model. Here's how we did it 👇
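For a sense of where a figure like 36 GiB can come from, here is a minimal sketch of the standard KV-cache size estimate. The model shape (36 layers, 8 KV heads, head dimension 128, 16-bit values) and the reading of "256k" as 256 × 1024 tokens are assumptions chosen for illustration; the post does not say which model or precision was used.

```python
GIB = 1024 ** 3


def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   bytes_per_value=2, batch_size=1):
    """Estimate the uncompressed KV cache size in bytes.

    The leading factor of 2 counts one key tensor and one value tensor
    per layer; everything else is the number of cached values times
    their size in bytes.
    """
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * bytes_per_value * batch_size)


# Hypothetical configuration, NOT taken from the post.
full_cache = kv_cache_bytes(
    num_layers=36,
    num_kv_heads=8,
    head_dim=128,
    seq_len=256 * 1024,   # assumed meaning of "256k context"
    bytes_per_value=2,    # assumed 16-bit (fp16/bf16) cache entries
)

print(f"Uncompressed KV cache: {full_cache / GIB:.1f} GiB")
```

The cache grows linearly with sequence length, which is why long contexts dominate memory use under this estimate.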
Baseten
@baseten
Jun 9
Replying to @baseten
respan.ai
AI Gateway for Production LLM Routing | Respan
OpenAI-compatible gateway with failover, response caching, per-key limits, and production tracing on one platform.
Baseten
@baseten
Jun 9
Baseten is live on the Respan Gateway. Congratulations to the @RespanAI team on their Gateway launch as they bring observability, evals, and routing to agents. Try Baseten Model APIs now on Respan.
Baseten reposted
Sarah Sachs
@sarahmsachs
Jun 8
Model selection isn't just a fancy term for "looking at benchmarks". If you're just auto-updating and going off twitter vibes, you're not really adding any value to your business or your customers. To do this well, it means you need to deeply understand your use cases, how much
Charlie O'Neill
@oneill_c
Jun 8
Working in the Training team at Baseten, I often see companies agonize over which model to use. So many people worry about how to keep up with benchmarks and new releases. But with post-training and specialization, and as we see a rising tide in the intelligence of many
How to choose an AI model with Gamma and Notion · Luma
From luma.com
Baseten
@baseten
Jun 8
Replying to @baseten @thatsjonsense and 3 others
Sign up here:
How to choose an AI model with Gamma and Notion · Luma
From luma.com
Baseten
@baseten
Jun 8
Join Charlie for a conversation with @thatsjonsense and @sarahmsachs on how @GammaApp and @NotionHQ think about model selection on June 25th.
Baseten
@baseten
Jun 5
Replying to @baseten
Check it out here:
GLM 5.1 | Model library
From baseten.co
Baseten
@baseten
Jun 5
GLM 5.1 now achieves 160+ TPS and <2-second TTFT on Baseten. Ideal for agentic workloads that need high throughput and low latency.
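As a rough illustration of what those two numbers mean for one response, the sketch below adds time to first token (TTFT) to the decode time implied by tokens per second (TPS). It assumes the quoted TPS is per-request output throughput and treats the post's figures as bounds, so the result is an approximate upper estimate, not a measurement. The function name and defaults are hypothetical.

```python
def estimated_response_seconds(output_tokens,
                               ttft_seconds=2.0,
                               tokens_per_second=160.0):
    """Rough end-to-end latency: wait for the first token, then decode.

    Simplification: every output token is charged at the steady decode
    rate, even though the first token arrives at the end of TTFT.
    """
    return ttft_seconds + output_tokens / tokens_per_second


# Example: a 1,000-token reply at the figures quoted in the post.
latency = estimated_response_seconds(1000)
print(f"~{latency:.2f} s for 1,000 output tokens")
```

For agentic workloads that chain many model calls, this per-call estimate would be multiplied by the number of sequential steps.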
Baseten
@baseten
Jun 4
Replying to @baseten
Read our full write-up:
Introducing NVIDIA Nemotron 3 Ultra: The Nemotron 3.x family is here!
From baseten.co
Baseten
@baseten
Jun 4
Are you tired of waiting 17 minutes for an AI agent to finish a code change? As an agent’s context grows, standard transformer attention can turn long runs into a bottleneck. @NVIDIAAI Nemotron 3 Ultra addresses this with a hybrid architecture that replaces several
NVIDIA
@nvidia
Jun 4
Introducing NVIDIA Nemotron 3 Ultra. A frontier smart open model built for long-running agents that need to plan, reason, use tools and keep working across complex coding, research and enterprise workflows. Up to 5x faster inference and up to 30% lower cost for agentic tasks.
Baseten reposted
Tuhin Srivastava
@tuhinone
Jun 2
Today we're announcing MAI-Thinking-1 with Microsoft and it will be available on Baseten soon. Microsoft built something genuinely different here: a commercial-grade thinking model trained on clean data with no distillation from third-party models and designed to be fine-tuned
Baseten reposted
Dannie Herzberg
@DannieHerz
Jun 4
I’m thrilled to welcome Gabe Stern to Baseten to lead Legal. Gabe is the whole package: deeply experienced, sharp, highly trusted, and commercially minded. We first got to work together at Slack, where he was an exceptional partner and played a critical role through Slack's
Baseten
@baseten
Jun 4
We are excited to welcome Gabe Stern as General Counsel. Welcome, Gabe!