Open-source data infrastructure for AI

Fast, serverless, and scalable infrastructure supporting vector, full-text, regex, and metadata search. Built on object storage and trusted by millions of developers. Open source under the Apache 2.0 license.

Or, get started locally.

15M+ monthly downloads

Apache 2.0
27k GitHub stars

Low latency search

Fast queries over billions of multi-tenant indexes.

Up to 10x cheaper

Built on object storage with automatic data tiering.

No engineering ops

Scales with your data and traffic. SOC 2 Type II.

Features

◆ Vector search: semantic similarity search
◇ Sparse vector search: lexical search (BM25, SPLADE)
● Full-text search: trigram and regex search
◐ Metadata search: filtering and faceted search
◊ Forking: dataset versioning, A/B testing, and roll-outs
▣ CLI: command-line tools for development
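To make "trigram search" concrete, here is a minimal sketch of the general trigram-index technique: index every three-character window of each document, intersect the posting lists for the query's trigrams, then verify the surviving candidates. The names `trigrams` and `TrigramIndex` are hypothetical and this is not Chroma's implementation; it handles literal substrings only, whereas a regex engine would first derive required trigrams from the pattern.

```typescript
// Illustrative sketch only; helper names are hypothetical, not part of any Chroma API.

// Return the distinct lowercase 3-character windows of a string.
function trigrams(text: string): string[] {
  const lower = text.toLowerCase();
  const grams = new Set<string>();
  for (let i = 0; i + 3 <= lower.length; i++) {
    grams.add(lower.slice(i, i + 3));
  }
  return Array.from(grams);
}

class TrigramIndex {
  private postings = new Map<string, Set<string>>(); // trigram -> document ids
  private docs = new Map<string, string>();          // document id -> text

  add(id: string, text: string): void {
    this.docs.set(id, text);
    trigrams(text).forEach((gram) => {
      let ids = this.postings.get(gram);
      if (!ids) {
        ids = new Set<string>();
        this.postings.set(gram, ids);
      }
      ids.add(id);
    });
  }

  // Case-insensitive substring search.
  search(needle: string): string[] {
    const grams = trigrams(needle);
    let candidates: string[];
    if (grams.length === 0) {
      // Needles shorter than 3 characters cannot be narrowed by the index: scan all documents.
      candidates = Array.from(this.docs.keys());
    } else {
      const lists = grams.map((g) => this.postings.get(g) ?? new Set<string>());
      lists.sort((a, b) => a.size - b.size); // start from the rarest trigram
      const smallest = lists[0]!;
      candidates = Array.from(smallest).filter((id) => lists.every((l) => l.has(id)));
    }
    // Sharing all trigrams does not guarantee a match, so verify each candidate.
    const lowerNeedle = needle.toLowerCase();
    return candidates.filter((id) => (this.docs.get(id) ?? "").toLowerCase().includes(lowerNeedle));
  }
}

const textIndex = new TrigramIndex();
textIndex.add("id1", "This is a document");
textIndex.add("id2", "Another doc");

const matchesDocu = textIndex.search("docu");
const matchesDoc = textIndex.search("doc");
const matchesNone = textIndex.search("xyz");
console.log(matchesDocu, matchesDoc, matchesNone);
```

The index only narrows the candidate set; the final `includes` check is what guarantees correctness.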

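import { ChromaClient } from 'chromadb'
const client = new ChromaClient()

// Reuse the collection if it already exists, otherwise create it
const collection = await client.getOrCreateCollection({
  name: "my_collection"
})

// Add documents with embeddings
// ("..." stands for the remaining vector components; all embeddings
// in a collection must have the same number of dimensions)
await collection.add({
  ids: ["id1", "id2"],
  documents: ["This is a document", "Another doc"],
  embeddings: [[1.2, 2.3, ...], [3.4, 4.5, ...]]
})

// Query by vector similarity, returning the 10 nearest records
const results = await collection.query({
  queryEmbeddings: [[1.1, 2.2, ...]],
  nResults: 10
})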
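For intuition about what a query by vector similarity computes, the sketch below ranks a handful of records by brute force. It assumes cosine similarity purely for illustration; this page does not state which distance metric or index structure Chroma uses, and the names `StoredRecord`, `cosineSimilarity`, and `topK` are hypothetical.

```typescript
// Illustrative sketch only: exhaustive cosine-similarity ranking over an in-memory array.
type StoredRecord = { id: string; embedding: number[] };

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embeddings must have the same dimensionality");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i]!;
    const y = b[i]!;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) {
    return 0; // treat a zero vector as similar to nothing
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every record against the query and keep the k highest scores.
function topK(query: number[], records: StoredRecord[], k: number): { id: string; score: number }[] {
  return records
    .map((r) => ({ id: r.id, score: cosineSimilarity(query, r.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Toy 3-dimensional data, made up for this example.
const storedRecords: StoredRecord[] = [
  { id: "id1", embedding: [1, 0, 0] },
  { id: "id2", embedding: [0, 1, 0] },
  { id: "id3", embedding: [0.9, 0.1, 0] },
];

const nearest = topK([1, 0, 0], storedRecords, 2);
console.log(nearest.map((r) => r.id));
```

Exhaustive scoring like this is linear in the number of records, which is why production systems rely on approximate indexes instead.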

Performance

Fast search over billions of multi-tenant indexes

Chroma's indexes are built and optimized for object storage, offering unparalleled cost and performance, with state-of-the-art vector, full-text, and regex search.

Latency

Query latency at 384 dimensions, 100k vectors

        Warm     Cold
p50     20ms     650ms
p90     27ms     1.2s
p99     57ms     1.5s
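As a reminder of how p50, p90, and p99 figures are derived from raw measurements, here is a small sketch using the nearest-rank method. The sample latencies are made up for illustration and are not Chroma benchmark data; the function name `percentile` is hypothetical, and other percentile definitions (for example, interpolated ones) give slightly different values.

```typescript
// Illustrative sketch only: nearest-rank percentile over a list of latency samples.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) {
    throw new Error("at least one sample is required");
  }
  if (p <= 0 || p > 100) {
    throw new Error("p must be in the range (0, 100]");
  }
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  // 1-based rank of the smallest sample that is >= p percent of all samples.
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1]!;
}

// Hypothetical warm-query latencies in milliseconds.
const latencySamplesMs = [18, 20, 19, 22, 25, 21, 27, 20, 57, 23];

const p50 = percentile(latencySamplesMs, 50);
const p90 = percentile(latencySamplesMs, 90);
const p99 = percentile(latencySamplesMs, 99);
console.log({ p50, p90, p99 });
```

A single slow request dominates p99 here, which is why tail percentiles are reported separately from the median.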

Technical specs

Write throughput (per collection): 30 MB/s (2000+ QPS)
Concurrent reads (per collection): 10 (200+ QPS)
Collections per database: 1M
Records per collection: 5M
Recall: 90-100%
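Recall for approximate search is usually measured by comparing the ids an index returns against the ids an exact, exhaustive search would return for the same query. The sketch below shows that calculation; `recallAtK` is a hypothetical helper and the id lists are invented for the example, not taken from any Chroma benchmark.

```typescript
// Illustrative sketch only: recall@k = (true neighbours found) / (true neighbours expected).
function recallAtK(approxIds: string[], exactIds: string[], k: number): number {
  if (k <= 0) {
    throw new Error("k must be positive");
  }
  const truth = exactIds.slice(0, k);     // ground-truth top-k from exhaustive search
  const returned = approxIds.slice(0, k); // top-k from the approximate index
  if (truth.length === 0) {
    throw new Error("ground truth must not be empty");
  }
  let hits = 0;
  for (const id of returned) {
    if (truth.indexOf(id) !== -1) {
      hits++;
    }
  }
  return hits / truth.length;
}

// Hypothetical results for one query with k = 10: the index missed "j" and returned "z".
const exactTop10 = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "j"];
const approxTop10 = ["a", "b", "c", "d", "e", "f", "g", "h", "i", "z"];

const recall = recallAtK(approxTop10, exactTop10, 10);
console.log(`recall@10 = ${(recall * 100).toFixed(0)}%`);
```

In practice the figure is averaged over many queries rather than computed from one.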

Zero-ops infra

┌───────────────────────────────┐
│ Query Layer                   │
│   Fast memory cache (hot)     │
│   SSD cache (warm)            │
└───────────────────────────────┘

↕ Intelligent tiering

┌───────────────────────────────┐
│ Storage Layer                 │
│   S3 / GCS (cold)             │
│     • All vectors             │
│     • All metadata            │
│     • All indexes             │
└───────────────────────────────┘
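The read path implied by the diagram can be sketched as a lookup that tries memory first, then SSD, then object storage, promoting data toward the faster tiers as it is read. This is a simplified illustration with plain least-recently-used eviction and in-memory maps standing in for each tier; the class name `TieredStore` is hypothetical, and Chroma's actual query-aware tiering policy is not described on this page.

```typescript
// Illustrative sketch only: three tiers modelled as Maps, with LRU eviction from the hot tier.
type Tier = "memory" | "ssd" | "object-storage";

class TieredStore {
  private memory = new Map<string, string>();        // hot: small and fast
  private ssd = new Map<string, string>();           // warm: larger local cache
  private objectStorage = new Map<string, string>(); // cold: holds all data
  private memoryCapacity: number;

  constructor(memoryCapacity: number) {
    this.memoryCapacity = memoryCapacity;
  }

  // Writes go to the durable cold tier; stale cached copies are dropped.
  put(key: string, value: string): void {
    this.objectStorage.set(key, value);
    this.memory.delete(key);
    this.ssd.delete(key);
  }

  get(key: string): { value: string; servedFrom: Tier } | undefined {
    const hot = this.memory.get(key);
    if (hot !== undefined) {
      this.promote(key, hot); // refresh its recency
      return { value: hot, servedFrom: "memory" };
    }
    const warm = this.ssd.get(key);
    if (warm !== undefined) {
      this.promote(key, warm);
      return { value: warm, servedFrom: "ssd" };
    }
    const cold = this.objectStorage.get(key);
    if (cold === undefined) {
      return undefined;
    }
    this.ssd.set(key, cold); // keep a warm copy for later reads
    this.promote(key, cold);
    return { value: cold, servedFrom: "object-storage" };
  }

  private promote(key: string, value: string): void {
    this.memory.delete(key);
    this.memory.set(key, value);
    if (this.memory.size > this.memoryCapacity) {
      // Map iterates in insertion order, so the first key is the least recently used.
      const oldest = this.memory.keys().next().value;
      if (oldest !== undefined) {
        this.memory.delete(oldest);
      }
    }
  }
}

const tieredStore = new TieredStore(1); // hot tier holds a single entry
tieredStore.put("a", "vectors-a");
tieredStore.put("b", "vectors-b");

const readColdA = tieredStore.get("a");  // first read comes from object storage
const readHotA = tieredStore.get("a");   // now cached in memory
const readColdB = tieredStore.get("b");  // evicts "a" from memory
const readWarmA = tieredStore.get("a");  // still on SSD
console.log(readColdA?.servedFrom, readHotA?.servedFrom, readColdB?.servedFrom, readWarmA?.servedFrom);
```

The gap between the warm and cold paths in a layout like this is what separates warm from cold query latency.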

Unlike legacy search systems, Chroma is a database you'll want to be on-call for.

✓ Auto-scales with usage

✓ No manual tuning

✓ Serverless pricing

Chroma takes full advantage of object storage with automatic query-aware data tiering and caching.

✓ Vectors are large: 1 GB of text → 15 GB of vectors

✓ Memory is expensive: $5/GB/mo

✓ Object storage is not: $0.02/GB/mo
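Putting the three figures above together gives a rough sense of scale. The sketch below only multiplies the numbers quoted on this page; it is a raw storage-price comparison that ignores cache, compute, and request costs, so it should not be read as a prediction of an actual bill. The constant and function names are hypothetical.

```typescript
// Illustrative arithmetic only, using the figures quoted on this page.
const VECTOR_GB_PER_TEXT_GB = 15;              // 1 GB of text -> 15 GB of vectors
const MEMORY_USD_PER_GB_MONTH = 5;             // memory price
const OBJECT_STORAGE_USD_PER_GB_MONTH = 0.02;  // object storage price

// Monthly cost of holding the vectors for `textGb` of source text at a given storage price.
function monthlyVectorCostUsd(textGb: number, usdPerGbMonth: number): number {
  return textGb * VECTOR_GB_PER_TEXT_GB * usdPerGbMonth;
}

const memoryCostUsd = monthlyVectorCostUsd(1, MEMORY_USD_PER_GB_MONTH);
const objectStorageCostUsd = monthlyVectorCostUsd(1, OBJECT_STORAGE_USD_PER_GB_MONTH);
const priceRatio = memoryCostUsd / objectStorageCostUsd;

console.log(`All in memory:         $${memoryCostUsd.toFixed(2)}/mo`);
console.log(`All in object storage: $${objectStorageCostUsd.toFixed(2)}/mo`);
console.log(`Raw price ratio:       ${priceRatio.toFixed(0)}x`);
```

For 1 GB of text this works out to $75.00 per month in memory versus $0.30 per month in object storage, before accounting for the caches a tiered system keeps in front of it.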

Enterprise

Chroma brings the security, compliance, education and operational model enterprises need with our Apache 2.0 architecture.

BYOC in your VPC, multi-cloud/multi-region replication, and point-in-time recovery ensure a resilient and scalable search system with the same 0-ops story as Cloud.

 ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
 ▓░                                         ░▓
 ▓░  ┌──────────── YOUR VPC ─────────────┐  ░▓
 ▓░  │                                   │  ░▓
 ▓░  │   █ DATA PLANE █                  │  ░▓
 ▓░  │                                   │  ░▓
 ▓░  │   Your data, your cloud           │  ░▓
 ▓░  │                                   │  ░▓
 ▓░  │                                   │  ░▓
 ▓░  └───────────────────────────────────┘  ░▓
 ▓░                    │                    ░▓
 ▓░                    │                    ░▓
 ▓░                    ▼                    ░▓
 ▓░  ═════════════════════════════════════  ░▓
 ▓░  ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░  ░▓
 ▓░                                         ░▓
 ▓░  ┌────────── CHROMA VPC ─────────────┐  ░▓
 ▓░  │                                   │  ░▓
 ▓░  │   █ CONTROL PLANE █               │  ░▓
 ▓░  │                                   │  ░▓
 ▓░  │   Managed by Chroma               │  ░▓
 ▓░  │   Monitoring, backups, ops        │  ░▓
 ▓░  │                                   │  ░▓
 ▓░  └───────────────────────────────────┘  ░▓
 ▓░                                         ░▓
 ▓░  ✓ BYOC in your VPC                     ░▓
 ▓░  ✓ Multi-region replication             ░▓
 ▓░  ✓ 0-ops management                     ░▓
 ▓░                                         ░▓
 ▓░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░▓
 ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒

[▶] Videos

Chroma Context-1