Multi-provider · Cache-aware · Edge-friendly

One API for embeddings.
OpenAI, Gemini, Voyage & local Llama.

One API call. Any provider. Built-in caching, usage tracking, and token overflow handling — so you can focus on your product, not your embedding pipeline.

Unified models

Route requests across providers with a single schema. Switch providers or models without touching client code.

Need a model that’s not listed? Request it and we’ll make it available on the Llama provider.
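Because every provider shares one request schema, switching models is just a change to the request body. A minimal sketch, assuming the same /v1/embeddings request shape shown in the Quickstart (model names taken from the dashboard examples on this page):

```bash
# Same request shape, different provider: only "model" and "provider" change.
curl -X POST https://api.vectors.space/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "gemini-embedding-001", "provider": "gemini", "input": "Your text here" }'

# Switch to the local Llama provider without touching any other client code:
curl -X POST https://api.vectors.space/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "embeddinggemma-300m", "provider": "llama", "input": "Your text here" }'
```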

Smart caching

Reduce cost and latency with deterministic cache keys.

When an embedding is already cached, external provider calls are skipped entirely — so you only pay once per unique input.
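In practice this means an identical request repeated twice only hits the upstream provider once. A sketch, assuming the request shape from the Quickstart below:

```bash
# First call computes the embedding via the provider and caches the result.
curl -s -X POST https://api.vectors.space/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "embeddinggemma-300m", "provider": "llama", "input": "cache me" }'

# Repeating the exact same request should be served from cache:
# same vector back, no external provider call, no second provider charge.
curl -s -X POST https://api.vectors.space/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "embeddinggemma-300m", "provider": "llama", "input": "cache me" }'
```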

Bring your own keys

Add your provider keys once in the dashboard, then use a single vectors.space key in your app — no secrets scattered across services.

Dashboard

Usage, performance, and request insights at a glance.

Request Distribution

API calls by model

gemini-embedding-001 (Gemini): 31%
voyage-4 (Voyage): 7%
embeddinggemma-300m (Llama): 7%
voyage-4-lite (Voyage): 6%

Token Usage

Last 12 months

4.50M
Total tokens processed

Cache Hit Rate

Last 30 days

34.0%
Efficiency over 12 months

Provider Key Vault

Store your keys for OpenAI, Gemini, and Voyage securely. Our built-in Llama provider requires no configuration.

Service Keys

Create and rotate vectors.space API keys for your applications and environments.

Detailed Request Logs

Inspect searchable logs to debug errors, monitor performance, and understand usage patterns.

Quickstart

Generate embeddings in a single request with your API key.

Simple endpoints

GET /v1/models to list models, POST /v1/embeddings to generate vectors.
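Listing models is a single authenticated GET, for example:

```bash
# Fetch the catalog of available models across all providers:
curl https://api.vectors.space/v1/models \
  -H "Authorization: Bearer $API_KEY"
```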

Overflow strategies

fail, truncate, or chunk — control what happens when inputs are too long.
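A strategy is set per request via the strategy object, as in the Quickstart example. A sketch using chunk (the exact response shape for chunked input, e.g. one vector per chunk, is an assumption here):

```bash
# Split an over-length input into chunks instead of failing or truncating:
curl -X POST https://api.vectors.space/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "embeddinggemma-300m", "provider": "llama", "input": "A very long document...", "strategy": { "type": "chunk" } }'
```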

Custom dimensions

Reduce size with output_dimension for faster search and lower storage.
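A sketch of a request with a reduced output size; the value 256 is illustrative, and the dimensions a given model supports will vary:

```bash
# Request smaller vectors for cheaper storage and faster similarity search:
curl -X POST https://api.vectors.space/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "voyage-4-lite", "provider": "voyage", "input": "Your text here", "output_dimension": 256 }'
```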

Usage info

Every response includes token usage for monitoring and cost estimation.
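To pull just the usage figures out of a response, pipe it through jq; the top-level field name `usage` is assumed here, and the exact shape may differ:

```bash
# Generate an embedding and extract only the token-usage block:
curl -s -X POST https://api.vectors.space/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "model": "embeddinggemma-300m", "provider": "llama", "input": "hello" }' \
  | jq '.usage'
```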

bash — cURL
curl -X POST https://api.vectors.space/v1/embeddings \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "embeddinggemma-300m",
    "provider": "llama",
    "input": "Your text here",
    "strategy": { "type": "truncate" }
  }'

Built by the team at
SQLite AI

We are building developer-first AI infrastructure. Our goal is to make the "data plane" of AI applications as reliable and predictable as a standard database.

Model Portability

Switch between OpenAI, Gemini, Voyage, or local models instantly. One unified API means you aren't tied to a single model provider's SDK.

Operational Stability

We focus on the boring parts of AI—uptime, latency, and caching—so you can focus on building your core product features.

Neutral Infrastructure

We provide the routing layer, not the models. We remain provider-neutral to ensure you always have access to the best performance.

Contact

Want early access or help integrating? Send a message.

By sending, you agree to be contacted about vectors.space.