
Announcing our $50M Seed Round, led by Menlo Ventures

The Fastest LLMs Ever Built

Diffusion LLMs: A Breakthrough for Speed and Quality



Powering Cutting-Edge AI Applications

The Mercury Diffusion Models


Blazing fast inference with frontier quality at a fraction of the cost.


The Diffusion Difference: From Sequential to Parallel Text Generation

Traditional LLMs generate text one token at a time. Mercury diffusion LLMs (dLLMs) generate tokens in parallel, increasing speed and maximizing GPU efficiency.
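To make the contrast concrete, here is a toy Python sketch. It is illustrative only, not Mercury's actual decoding algorithm: the "model" is a stand-in that already knows the target text, and every name in it is invented. The point it shows is that an autoregressive decoder spends one model call per token, while a diffusion-style decoder starts from a fully masked sequence and refines every position in parallel over a small, fixed number of calls.

```python
# Toy contrast between autoregressive and diffusion-style decoding.
# Illustrative only: the "model" here is a stand-in that already knows the
# target text; this is not Mercury's actual algorithm.
import random

TARGET = "diffusion models denoise whole sequences in parallel".split()

def autoregressive_decode():
    """One model call per generated token: calls grow with sequence length."""
    out, calls = [], 0
    for i in range(len(TARGET)):
        calls += 1                      # one forward pass ...
        out.append(TARGET[i])           # ... yields exactly one token
    return out, calls

def diffusion_decode(num_steps=3):
    """Every position is refined in parallel on each call."""
    seq, calls = ["<mask>"] * len(TARGET), 0
    for _ in range(num_steps):
        calls += 1                      # one forward pass ...
        # ... proposes a token for EVERY position; low-confidence positions
        # (simulated here with a coin flip) stay masked for the next step
        seq = [TARGET[i] if random.random() < 0.8 else seq[i]
               for i in range(len(TARGET))]
    return seq, calls

if __name__ == "__main__":
    print(autoregressive_decode())      # 7 tokens -> 7 model calls
    print(diffusion_decode())           # 7 tokens -> 3 model calls
```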


AI Applications Made Possible with Mercury

Lightning-fast code editing

Stay in flow with responsive autocomplete, intelligent tab suggestions, fast chat responses, and more.

Real-time voice agents

Engage naturally with AI for customer support, translation, and beyond.

Fast, creative co-pilots

Supercharge editorial and creative work—less waiting, more creating.

Rapid enterprise search

Instantly surface the right data from across your organization’s knowledge base.

Seamless enterprise workflows

Automate complex routing, analytics, and decision processes with ultra-responsive AI.


Our Models



Mercury Coder


dLLM optimized to accelerate coding workflows


Streaming, tool use, and structured output

128K context window

Input $0.25 | Output $1 per 1M tokens


Mercury


General-purpose dLLM that provides ultra-low latency 


Streaming, tool use, and structured output

128K context window

Input $0.25 | Output $1 per 1M tokens
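Both model cards above list streaming among the supported features. The sketch below shows what a streaming call could look like through the OpenAI-compatible endpoint described in the integration section; the base URL, model name, and environment variable are assumptions to verify against the API documentation.

```python
# Minimal streaming sketch against Mercury's OpenAI-compatible API.
# Assumptions (verify in the docs): the base URL, the model name, and the
# INCEPTION_API_KEY environment variable are placeholders.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inceptionlabs.ai/v1",  # assumed endpoint
    api_key=os.environ["INCEPTION_API_KEY"],
)

stream = client.chat.completions.create(
    model="mercury-coder",                       # assumed model name
    messages=[{"role": "user", "content": "Write a binary search in Python."}],
    stream=True,
)

# Print tokens as the chunks arrive.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```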


An Enterprise AI Partner

We’re available through major cloud platforms such as Amazon Bedrock. Talk with us about fine-tuning, private deployments, and forward-deployed engineering support.


Integrate in Seconds

Our models are OpenAI API compatible and a drop-in replacement for traditional LLMs.
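As a sketch of what "drop-in" means in practice: an existing OpenAI SDK integration only needs a different base_url and model name. The endpoint and model identifier below are assumptions; check the API documentation for the exact values.

```python
# Drop-in sketch: an existing OpenAI SDK call, pointed at Mercury instead.
# The base_url and model name are assumptions; confirm them in the API docs.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://api.inceptionlabs.ai/v1",  # change #1: the endpoint
    api_key=os.environ["INCEPTION_API_KEY"],     # assumed env var name
)

response = client.chat.completions.create(
    model="mercury",                             # change #2: the model name
    messages=[
        {"role": "system", "content": "You are a concise assistant."},
        {"role": "user", "content": "Summarize what a diffusion LLM is."},
    ],
)

print(response.choices[0].message.content)
```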



Providers

What Customers Are Saying



"I was amazed by how fast it was. The multi-thousand tokens per second was absolutely wild, nothing like I've ever seen."

Jacob Kim

Software Engineer


"After trying Mercury, it's hard to go back. We are excited to roll out Mercury to support all of our voice agents."

Oliver Silverstein

CEO


"We cut routing and classification overheads to sub-second latencies even on complex agent traces."

Damian Tran

CEO
