Web Server

Overview

msgvault serve starts an HTTP server that exposes your local email archive over a REST API. It optionally runs a background sync scheduler to keep accounts up to date on a cron-based schedule.

The API queries the same SQLite database and attachment store as the CLI and TUI. Keyword search and ordinary archive reads stay local. If vector search is enabled, semantic and hybrid search also call the embedding endpoint configured in [vector.embeddings]. The server is designed for local integrations, dashboards, and automation scripts.

Quick Start

Add a [server] section to your config.toml:

[server]
api_port = 8080
api_key = "your-secret-key"

Start the server:

msgvault serve

Test connectivity:

# Health check (no auth required)
curl http://localhost:8080/health

# Archive stats (auth required)
curl -H "Authorization: Bearer your-secret-key" http://localhost:8080/api/v1/stats

Authentication

All endpoints except /health require authentication when api_key is set in your config. Three authentication methods are supported:

Method	Header	Example
Bearer token	`Authorization: Bearer <key>`	`Authorization: Bearer my-secret`
API key header	`X-API-Key: <key>`	`X-API-Key: my-secret`
Plain auth header	`Authorization: <key>`	`Authorization: my-secret`

If no api_key is configured, authentication is not required regardless of bind address. The separate allow_insecure / security validation prevents starting without an API key on non-loopback addresses.

API Endpoints

GET `/health`

Health check endpoint. Does not require authentication.

Response:

{"status": "ok"}

GET `/api/v1/stats`

Archive statistics. When vector search is configured on the server, the response also includes a vector_search sub-object describing the state of the index.

Response (vector search disabled):

{
  "total_messages": 142857,
  "total_threads": 48293,
  "total_accounts": 2,
  "total_labels": 47,
  "total_attachments": 31204,
  "database_size_bytes": 8589934592
}

Response (vector search enabled):

{
  "total_messages": 142857,
  "total_threads": 48293,
  "total_accounts": 2,
  "total_labels": 47,
  "total_attachments": 31204,
  "database_size_bytes": 8589934592,
  "vector_search": {
    "enabled": true,
    "active_generation": {
      "id": 3,
      "model": "nomic-embed-text-v1.5",
      "dimension": 768,
      "fingerprint": "nomic-embed-text-v1.5:768:p1-111111:c32768:e1",
      "state": "active",
      "activated_at": "2026-04-18T15:12:33Z",
      "message_count": 142820
    },
    "building_generation": {
      "id": 4,
      "model": "nomic-embed-text-v2",
      "dimension": 768,
      "started_at": "2026-04-19T09:02:10Z",
      "progress": { "done": 8200, "total": 142857 }
    },
    "pending_embeddings_total": 134657
  }
}

active_generation is always present in the object (null until the first build completes). building_generation is omitted when no rebuild is in flight. pending_embeddings_total is the sum of rows still pending across the active and building generations. See Vector Search for the end-to-end workflow.

GET `/api/v1/messages`

Paginated message list.

Parameter	Type	Default	Description
`page`	int	`1`	Page number
`page_size`	int	`20`	Results per page

Response:

{
  "total": 142857,
  "page": 1,
  "page_size": 20,
  "messages": [
    {
      "id": 12345,
      "subject": "Q4 Planning",
      "from": "[email protected]",
      "to": ["[email protected]"],
      "cc": ["[email protected]"],
      "sent_at": "2024-10-15T09:30:00Z",
      "snippet": "Here's the draft for Q4...",
      "labels": ["INBOX", "IMPORTANT"],
      "has_attachments": true,
      "size_bytes": 52480
    }
  ]
}

GET `/api/v1/messages/{id}`

Full message details including body and attachment metadata.

Response:

{
  "id": 12345,
  "subject": "Q4 Planning",
  "from": "[email protected]",
  "to": ["[email protected]"],
  "cc": ["[email protected]"],
  "bcc": ["[email protected]"],
  "sent_at": "2024-10-15T09:30:00Z",
  "snippet": "Here's the draft for Q4...",
  "labels": ["INBOX", "IMPORTANT"],
  "has_attachments": true,
  "size_bytes": 52480,
  "body": "<plain text body, or HTML when no plain text body exists>",
  "body_html": "<html><body><p>Full HTML body</p></body></html>",
  "attachments": [
    {
      "filename": "q4-plan.pdf",
      "mime_type": "application/pdf",
      "size_bytes": 204800
    }
  ]
}

The cc, bcc, and body_html fields are included only when present. body is the plain-text body when one exists; for HTML-only messages, it falls back to the HTML body so callers still receive message content.

GET `/api/v1/messages/{id}/inline?cid=<content-id>`

Fetch an inline MIME image part by content ID. This is intended for rendering cid: images referenced by body_html.

Parameter	Type	Default	Description
`cid`	string	(required)	MIME `Content-ID` to fetch

Only inline image parts are served. SVG images and non-image inline parts are rejected with 415 unsupported_type. If the query engine cannot fetch raw MIME, the endpoint returns 501 not_supported.

Successful responses set:

Header	Description
`Content-Type`	Inline image content type
`Content-Disposition`	`inline`
`Cache-Control`	`private, max-age=31536000, immutable`
`X-Content-Type-Options`	`nosniff`

GET `/api/v1/search`

Search messages. The default mode is full-text search (FTS5 with LIKE fallback). When the server is configured for vector search, mode=vector runs semantic-only search and mode=hybrid fuses BM25 and vector ranking via Reciprocal Rank Fusion.

mode=vector and mode=hybrid both require at least one free-text term in q — the free text is what gets embedded as the query vector. Operator-only queries such as q=from:alice have nothing to embed and return 400 missing_free_text; route filter-only requests to mode=fts instead.

Parameter	Type	Default	Description
`q`	string	(required)	Search query
`mode`	enum	`fts`	`fts`, `vector`, or `hybrid`
`page`	int	`1`	Page number (FTS only — vector/hybrid reject `page>1`)
`page_size`	int	`20`	Results per page (max 100 for FTS, max `[vector].search.max_page_size_hybrid` for vector/hybrid)
`explain`	0/1	`0`	When `1` and `mode=vector

Response (mode=fts, default):

{
  "query": "quarterly report",
  "total": 23,
  "page": 1,
  "page_size": 20,
  "messages": [
    {
      "id": 12345,
      "subject": "Q4 Planning",
      "from": "[email protected]",
      "to": ["[email protected]"],
      "cc": ["[email protected]"],
      "sent_at": "2024-10-15T09:30:00Z",
      "snippet": "Here's the draft for Q4...",
      "labels": ["INBOX", "IMPORTANT"],
      "has_attachments": true,
      "size_bytes": 52480
    }
  ]
}

Response (mode=vector or mode=hybrid):

{
  "query": "when is the planning offsite",
  "mode": "hybrid",
  "returned": 12,
  "pool_saturated": false,
  "generation": {
      "id": 3,
      "model": "nomic-embed-text-v1.5",
      "dimension": 768,
      "fingerprint": "nomic-embed-text-v1.5:768:p1-111111:c32768:e1",
      "state": "active"
    },
  "took_ms": 84,
  "results": [
    {
      "id": 12345,
      "subject": "Q2 planning offsite agenda",
      "from": "[email protected]",
      "to": ["[email protected]"],
      "sent_at": "2024-01-15T10:30:00Z",
      "snippet": "Proposed agenda for the offsite on...",
      "labels": ["INBOX"],
      "has_attachments": false,
      "size_bytes": 2048
    }
  ]
}

Vector and hybrid responses expose returned instead of total (ANN search does not have a meaningful total count), add a generation sub-object naming the index generation that answered the query, and include took_ms. The top-level results array replaces messages. pool_saturated is true when a vector or BM25 candidate pool hit its configured cap (or pure vector search returned as many hits as requested), hinting that increasing the limit or narrowing the query may expose more relevant results.

When explain=1, each element of results carries an extra score object exposing the fused-score components:

{
  "id": 12345,
  "subject": "...",
  "score": {
    "rrf": 0.032,
    "bm25": 7.4,
    "vector": 0.82,
    "subject_boosted": true
  }
}

bm25 and vector are omitted when the message did not appear in that signal (BM25 missed it or the ANN pool did not include it). rrf is omitted in mode=vector (only one signal — there is nothing to fuse). subject_boosted is true when the subject-line boost was applied.

See Searching for the full query syntax reference and Vector Search for vector / hybrid setup.

GET `/api/v1/accounts`

List configured accounts with sync status.

Response:

{
  "accounts": [
    {
      "email": "[email protected]",
      "display_name": "Your Name",
      "last_sync_at": "2024-10-15T08:00:00Z",
      "next_sync_at": "2024-10-15T09:00:00Z",
      "schedule": "0 * * * *",
      "enabled": true
    }
  ]
}

POST `/api/v1/auth/token/{email}`

Upload an OAuth token JSON file generated by a local msgvault client.

This endpoint is used by msgvault export-token during remote/headless deployment workflows.

Request headers:

X-API-Key: <api-key> (or any supported auth header)
Content-Type: application/json

Example request body (/api/v1/auth/token/[email protected]):

{
  "access_token": "ya29...",
  "token_type": "Bearer",
  "refresh_token": "1//0g...",
  "expiry": "2024-12-31T23:59:59Z",
  "scopes": ["https://www.googleapis.com/auth/gmail.modify"]
}

Successful response (201 Created):

{
  "status": "created",
  "message": "Token saved for [email protected]"
}

POST `/api/v1/accounts`

msgvault export-token posts to this endpoint automatically after uploading a token.

{
  "email": "[email protected]",
  "schedule": "0 2 * * *"
}

The enabled field is always set to true server-side.

If the account already exists (200 OK):

{
  "status": "exists",
  "message": "Account already configured for [email protected]"
}

On success (201 Created):

{
  "status": "created",
  "message": "Account added for [email protected]"
}

POST `/api/v1/sync/{account}`

Trigger a manual sync for an account. Returns immediately with a 202 status while the sync runs in the background.

Response (202 Accepted):

{
  "status": "accepted",
  "message": "Sync started for [email protected]"
}

GET `/api/v1/scheduler/status`

Scheduler state and per-account schedule details.

Response:

{
  "running": true,
  "accounts": [
    {
      "email": "[email protected]",
      "running": false,
      "last_run": "2024-10-15T08:00:00Z",
      "next_run": "2024-10-15T09:00:00Z",
      "schedule": "0 * * * *"
    }
  ]
}

Rate Limiting

The API enforces rate limiting of 10 requests per second per client IP, with a burst allowance of 20 requests. When the limit is exceeded, the server responds with HTTP 429 and includes a Retry-After header indicating how long to wait before retrying.

CORS

Cross-Origin Resource Sharing is disabled by default. To allow browser-based clients, configure allowed origins in your config.toml:

[server]
cors_origins = ["http://localhost:3000", "https://my-dashboard.example.com"]
cors_credentials = true
cors_max_age = 3600

Scheduled Sync

The server can automatically sync Gmail and IMAP accounts on a cron-based schedule. Add [[accounts]] sections to your config:

[[accounts]]
email = "[email protected]"
schedule = "0 * * * *"    # every hour
enabled = true

[[accounts]]
email = "[email protected]"
schedule = "*/15 * * * *" # every 15 minutes
enabled = true

The scheduler starts automatically with msgvault serve when account schedules are configured. It resolves each entry to a syncable source type, including Gmail and IMAP. Use the /api/v1/scheduler/status endpoint to monitor schedule state, and /api/v1/sync/{account} to trigger a manual sync outside the schedule.

msgvault serve also runs scheduled SyncTech SMS Backup & Restore Drive sources configured under [[synctech_sms.sources]]; see Configuration.

Security Model

The server is designed for local use:

Loopback-only by default. The default bind address is 127.0.0.1, restricting access to the local machine.
API key required for non-loopback. If you bind to a non-loopback address (e.g., 0.0.0.0), the server requires api_key to be set and will refuse to start without it.
Opt-in for insecure binding. To bind to a non-loopback address without an API key (not recommended), set allow_insecure = true.

Configuration Reference

All server settings go in the [server] section of config.toml. Account schedules use [[accounts]] sections.

`[server]`

Key	Default	Description
`api_port`	`8080`	Port the server listens on
`bind_addr`	`127.0.0.1`	Bind address
`api_key`	—	API key for authentication
`allow_insecure`	`false`	Allow non-loopback binding without `api_key`
`cors_origins`	`[]`	Allowed CORS origins
`cors_credentials`	`false`	Allow credentials in CORS requests
`cors_max_age`	`0`	CORS preflight cache duration in seconds (defaults to `86400` when `cors_origins` is set)

`[[accounts]]`

Key	Default	Description
`email`	(required)	Gmail/IMAP account identifier or display name
`schedule`	—	Cron expression for sync schedule
`enabled`	`true`	Whether scheduled sync is active

See the Configuration page for the full config file reference.

Web Server

Overview

Quick Start

Authentication

API Endpoints

GET /health

GET /api/v1/stats

GET /api/v1/messages

GET /api/v1/messages/{id}

GET /api/v1/messages/{id}/inline?cid=<content-id>

GET /api/v1/search

GET /api/v1/accounts

POST /api/v1/auth/token/{email}

POST /api/v1/accounts

POST /api/v1/sync/{account}

GET /api/v1/scheduler/status

Rate Limiting

CORS

Scheduled Sync

Security Model

Configuration Reference

[server]

[[accounts]]

GET `/health`

GET `/api/v1/stats`

GET `/api/v1/messages`

GET `/api/v1/messages/{id}`

GET `/api/v1/messages/{id}/inline?cid=<content-id>`

GET `/api/v1/search`

GET `/api/v1/accounts`

POST `/api/v1/auth/token/{email}`

POST `/api/v1/accounts`

POST `/api/v1/sync/{account}`

GET `/api/v1/scheduler/status`

`[server]`

`[[accounts]]`