msgvault

Archive a lifetime of email and chat. Fast keyword search, opt-in semantic search, and local AI workflows.

Supports Gmail sync, IMAP, Microsoft 365, PST import, MBOX import, Apple Mail import, and chat/text import (WhatsApp, iMessage, Google Voice, Facebook Messenger, SMS Backup & Restore).

Read the Introduction to learn more about why this project was created.

Install

curl -fsSL https://msgvault.io/install.sh | bash

Windows (PowerShell):

powershell -ExecutionPolicy ByPass -c "irm https://msgvault.io/install.ps1 | iex"

Then set up OAuth credentials and start syncing. You can also build from source.

Why msgvault?

Your email and message data is yours. msgvault downloads a complete local copy of your email (from Gmail, IMAP, or local archives) and imports chats and texts from WhatsApp, iMessage, Google Voice, Facebook Messenger, and SMS Backup & Restore. Keyword search, analytics, the TUI, and the MCP server query local SQLite and Parquet files. Nothing contacts your live mailbox outside sync and deletion commands that you run explicitly. Optional vector search calls only the embedding endpoint you configure; use a local or self-hosted endpoint if message text must never leave your machine or network.

Years of PDFs, photos, documents, and spreadsheets buried in your inbox become ordinary files on your filesystem, deduplicated and instantly searchable. Your data is no longer locked behind a web interface or an API. It’s just files on disk that you own and control.

Features

Full Email Backup

Downloads complete messages from Gmail or any IMAP server, including raw MIME, labels, metadata, and every attachment. Every PDF, photo, spreadsheet, and document you’ve ever received or sent is extracted and stored locally, deduplicated by content hash.
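
As a rough illustration of deduplication by content hash, the sketch below stores each attachment at a path derived from its digest, so identical bytes are written only once. SHA-256, the two-character directory fan-out, and the `store_attachment` name are assumptions made for this example, not a description of msgvault's on-disk layout.

```python
import hashlib
from pathlib import Path


def store_attachment(data: bytes, root: Path) -> Path:
    """Write data under root at a path derived from its SHA-256 digest.

    Hypothetical helper: the hash function and directory layout are
    assumptions for illustration only.
    """
    digest = hashlib.sha256(data).hexdigest()
    # Fan out by the first two hex characters to keep directories small.
    target = root / digest[:2] / digest
    if not target.exists():
        # Only the first copy of a given byte sequence is written.
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(data)
    return target
```

Because the path depends only on the bytes, the same PDF attached to fifty messages resolves to one file, and each message only needs to record the digest.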

Lightning-Fast TUI

Explore hundreds of thousands of messages with instant aggregation and drill-down. Analytics run on DuckDB over Parquet files, hundreds of times faster than the equivalent SQL JOINs, with a small footprint.

Full-Text Search

SQLite FTS5-powered search with Gmail-like query syntax. Search by sender, date, label, size, attachments, and more.
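
To give a feel for Gmail-like query syntax, here is a minimal sketch that splits a query into operator filters and free-text terms. The operator names follow Gmail's own search operators; which of them msgvault accepts, and how it hands the free-text terms to FTS5, is not stated here, so treat `OPERATORS` and `parse_query` as hypothetical.

```python
import shlex

# Operators modeled on Gmail's search syntax (assumption: the set that
# msgvault actually accepts may differ).
OPERATORS = {"from", "to", "label", "has", "before", "after", "larger", "smaller"}


def parse_query(query: str) -> tuple[dict[str, list[str]], list[str]]:
    """Split a query into operator filters and free-text terms."""
    filters: dict[str, list[str]] = {}
    terms: list[str] = []
    # shlex keeps a double-quoted phrase together as a single token.
    for token in shlex.split(query):
        key, sep, value = token.partition(":")
        if sep and value and key.lower() in OPERATORS:
            filters.setdefault(key.lower(), []).append(value)
        else:
            # Anything that is not a known operator is a plain search term.
            terms.append(token)
    return filters, terms


filters, terms = parse_query('from:alice@example.com has:attachment larger:5M "quarterly report"')
```

In a design like this, the filters would typically become ordinary SQL conditions while the remaining terms go to the full-text index.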

Semantic & Hybrid Search

Opt-in semantic search with vectors stored locally, plus hybrid ranking that fuses BM25 and vector similarity via Reciprocal Rank Fusion. Point msgvault at a local or self-hosted OpenAI-compatible embedding endpoint and query by meaning, not just keywords. Exposed through local CLI search, the HTTP API, and the MCP server.
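
Reciprocal Rank Fusion combines ranked lists using only each item's position, so BM25 scores and vector similarities never need to share a scale. The sketch below is a generic version of the technique; the constant `k = 60` comes from the original RRF paper, and the value msgvault uses is not stated here. The message IDs are made up.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists: each list adds 1 / (k + rank) to an item's score."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


bm25_hits = ["m1", "m2", "m3"]      # hypothetical keyword ranking
vector_hits = ["m3", "m1", "m4"]    # hypothetical semantic ranking
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
# fused == ["m1", "m3", "m2", "m4"]
```

Messages that appear in both lists rise to the top, while a message found by only one method still makes the fused list.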

Multi-Account

Archive multiple sources in a single database, group accounts into collections, manage per-account identities, and deduplicate safely.

Incremental Sync

Uses the Gmail History API for efficient updates after the initial full sync. Checkpoints let an interrupted sync resume where it left off.
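
One way to picture a resumable incremental sync is a loop that saves its position after every page. The sketch below assumes a hypothetical `fetch_page` callable whose return value is shaped like a Gmail `users.history.list` response (`history`, `nextPageToken`, `historyId`) and a JSON checkpoint file. It illustrates the idea and is not msgvault's implementation.

```python
import json
from pathlib import Path
from typing import Callable, Optional

# Hypothetical page fetcher: (start_history_id, page_token) -> response dict
# shaped like Gmail's users.history.list reply.
FetchPage = Callable[[str, Optional[str]], dict]


def incremental_sync(
    fetch_page: FetchPage,
    checkpoint: Path,
    apply_record: Callable[[dict], None],
) -> str:
    """Apply history records page by page, saving a checkpoint after each page."""
    state = json.loads(checkpoint.read_text())
    start_id = state["history_id"]
    page_token = state.get("page_token")  # set only if a previous run was interrupted
    while True:
        page = fetch_page(start_id, page_token)
        for record in page.get("history", []):
            apply_record(record)
        page_token = page.get("nextPageToken")
        if page_token is None:
            # Last page: advance the cursor so the next run starts from here.
            checkpoint.write_text(json.dumps({"history_id": page["historyId"]}))
            return page["historyId"]
        # More pages remain: remember where to resume after a crash.
        checkpoint.write_text(
            json.dumps({"history_id": start_id, "page_token": page_token})
        )
```

A crash in the middle of a page replays that page on the next run, so `apply_record` needs to be idempotent in a design like this.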

MCP Server

Expose your archive to AI assistants like Claude Desktop via the Model Context Protocol. Search, read, and analyze your messages from any MCP-compatible LLM.

Web Server

REST API for programmatic access to your archive. Optional cron-based background sync scheduling. Build dashboards, automations, and integrations.

Local Import

Import PST archives, MBOX archives, Apple Mail .emlx exports, and chats/texts from WhatsApp, iMessage, Google Voice, Facebook Messenger, and SMS Backup & Restore. Messages are indexed and searchable alongside your email data.

Safe Deletion

Stage messages for deletion in the TUI or via an AI assistant, review the manifests, then permanently delete them from Gmail or your IMAP provider.

How It Works

Architecture diagram: the Gmail API syncs into SQLite, which then feeds offline Parquet analytics, FTS5 search, the TUI, and the MCP server.