A fast, SQLite-like embedded vector database with graph-based approximate nearest neighbor search
Open-source project from the team behind Pardus AI
PardusDB is designed for developers building local AI applications — RAG pipelines, semantic search, recommendation systems, or any project that needs lightweight, persistent vector storage without external dependencies.
While Pardus AI gives non-technical users a powerful no-code platform to ask questions of their CSV, JSON, and PDF data in plain English, PardusDB gives developers the same speed and privacy in an embeddable, fully open-source vector database.
- Single-file storage — Everything lives in one .pardus file, just like SQLite
- Multiple tables — Store different vector dimensions and metadata in the same database
- Familiar SQL-like syntax — CREATE, INSERT, SELECT, UPDATE, DELETE feel natural
- UNIQUE constraints — O(1) duplicate detection using HashSet
- GROUP BY with aggregates — O(n) hash aggregation with COUNT, SUM, AVG, MIN, MAX
- JOINs — O(n+m) hash join algorithm for INNER, LEFT, RIGHT joins
- Fast vector similarity search — Graph-based approximate nearest neighbor search
- Thread-safe — Safe concurrent reads in multi-threaded applications
- Full transactions — BEGIN/COMMIT/ROLLBACK for atomic operations
- Optional GPU acceleration — For large batch inserts and queries
- Zero external dependencies — Pure Rust, MIT licensed
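The complexity figures in the list above follow from standard hash-based techniques. The Python sketch below only illustrates those general ideas: a set for duplicate detection, a dictionary for grouping, and a build/probe hash join. It is not PardusDB's Rust implementation, and every function name in it is hypothetical.

```python
from collections import defaultdict

# Hypothetical helpers illustrating the general techniques; not PardusDB code.

def insert_unique(seen, value):
    """Average O(1) duplicate check: membership test on a hash set."""
    if value in seen:
        raise ValueError(f"Duplicate value for UNIQUE column: {value!r}")
    seen.add(value)


def group_sum(rows, key, field):
    """O(n) hash aggregation: one pass, one dictionary slot per group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[field]
    return dict(totals)


def inner_hash_join(left, right, left_key, right_key):
    """Hash join: build a table on one side (O(m)), probe with the other (O(n))."""
    index = defaultdict(list)
    for r in right:                      # build phase
        index[r[right_key]].append(r)
    joined = []
    for l in left:                       # probe phase
        for r in index.get(l[left_key], []):
            joined.append({**l, **r})
    return joined


# Toy data for illustration only.
seen_emails = set()
insert_unique(seen_emails, "a@example.com")

sales = [
    {"category": "books", "amount": 700.0},
    {"category": "books", "amount": 400.0},
    {"category": "toys", "amount": 250.0},
]
totals = group_sum(sales, "category", "amount")

orders = [{"user_id": 1, "product": "lamp"}, {"user_id": 3, "product": "desk"}]
users = [{"id": 1, "email": "a@example.com"}]
matches = inner_hash_join(orders, users, "user_id", "id")
```

Inserting the same email a second time raises an error without scanning existing rows, and the join touches each input row once plus the matches it emits.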
git clone https://github.com/pardus-ai/pardusdb
cd pardusdb
./setup.sh
This will build PardusDB and install it as the pardusdb command, available system-wide.
git clone https://github.com/pardus-ai/pardusdb
cd pardusdb
cargo build --release
The binary will be at target/release/pardusdb.
pardusdb
╔═══════════════════════════════════════════════════════════════╗
║ PardusDB REPL ║
║ Vector Database with SQL Interface ║
╚═══════════════════════════════════════════════════════════════╝
pardusdb [memory]> .create mydb.pardus
Created and opened: mydb.pardus
pardusdb [mydb.pardus]> CREATE TABLE docs (embedding VECTOR(768), content TEXT);
Table 'docs' created
pardusdb [mydb.pardus]> INSERT INTO docs (embedding, content)
VALUES ([0.1, 0.2, 0.3, ...], 'Hello World');
Inserted row with id=1
pardusdb [mydb.pardus]> SELECT * FROM docs
WHERE embedding SIMILARITY [0.1, 0.2, 0.3, ...] LIMIT 5;
Found 1 similar rows:
id=1, distance=0.0000, values=[Vector([...]), Text("Hello World")]
pardusdb [mydb.pardus]> quit
Saved to: mydb.pardus
Goodbye!
# Persistent file
pardusdb mydata.pardus
# In-memory only
pardusdb

| Type | Description | Example |
|---|---|---|
| VECTOR(n) | n-dimensional float vector | VECTOR(768) |
| TEXT | UTF-8 string | 'hello world' |
| INTEGER | 64-bit integer | 42 |
| FLOAT | 64-bit float | 3.14 |
| BOOLEAN | true/false | true |

CREATE TABLE documents (
id INTEGER PRIMARY KEY,
embedding VECTOR(768),
title TEXT,
category TEXT,
score FLOAT
);
INSERT INTO documents (embedding, title, category, score)
VALUES ([0.1, 0.2, ...], 'Introduction to Rust', 'tutorial', 0.95);
SELECT * FROM documents WHERE category = 'tutorial' LIMIT 10;
UPDATE documents SET score = 0.99 WHERE id = 1;
DELETE FROM documents WHERE id = 1;
Ensure column values are unique with O(1) duplicate detection:
CREATE TABLE users (
embedding VECTOR(128),
id INTEGER PRIMARY KEY,
email TEXT UNIQUE
);
INSERT INTO users (embedding, id, email) VALUES ([0.1, ...], 1, '[email protected]');
-- This will fail: duplicate email
INSERT INTO users (embedding, id, email) VALUES ([0.2, ...], 2, '[email protected]');
-- Error: Duplicate value for UNIQUE column 'email'
Group and aggregate data with O(n) hash aggregation:
-- Aggregate functions: COUNT, SUM, AVG, MIN, MAX
SELECT category, COUNT(*), AVG(score), SUM(amount)
FROM sales
GROUP BY category;
-- With HAVING clause for filtered groups
SELECT category, SUM(amount) as total
FROM sales
GROUP BY category
HAVING SUM(amount) > 1000;
Join tables with O(n+m) hash join algorithm:
-- INNER JOIN
SELECT * FROM orders
INNER JOIN users ON orders.user_id = users.id;
-- LEFT JOIN (include all left rows)
SELECT users.email, orders.product
FROM users
LEFT JOIN orders ON users.id = orders.user_id;
-- RIGHT JOIN (include all right rows)
SELECT * FROM users
RIGHT JOIN orders ON users.id = orders.user_id;
SELECT * FROM documents
WHERE embedding SIMILARITY [0.12, 0.24, ...]
LIMIT 10;
Results are automatically ordered by distance (closest first).
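To make "ordered by distance (closest first)" concrete, here is a minimal brute-force sketch in Python. It assumes Euclidean distance, which the text above does not specify, and it scans every row instead of using a graph index, so it shows the ordering of results rather than how PardusDB actually searches. The function name `nearest` is hypothetical.

```python
import math

def nearest(rows, query, limit):
    """Return up to `limit` (id, distance) pairs, closest first.

    Assumption: Euclidean distance; exact linear scan, no graph index.
    """
    scored = [(row_id, math.dist(vec, query)) for row_id, vec in rows]
    scored.sort(key=lambda pair: pair[1])
    return scored[:limit]

# Toy 2-dimensional data for illustration only.
rows = [(1, [0.0, 0.0]), (2, [3.0, 4.0]), (3, [1.0, 0.0])]
results = nearest(rows, [0.0, 0.0], limit=2)
print(results)
```

An exact scan like this is also the kind of ground truth an approximate index is usually compared against when measuring recall.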
SHOW TABLES;
DROP TABLE documents;

| Command | Description |
|---|---|
| .create <file> | Create and open a new database |
| .open <file> | Open an existing database |
| .save | Force save current database |
| .tables | List tables |
| .clear | Clear screen |
| help | Show help |
| quit | Exit (auto-saves if file open) |

| Operation | Time |
|---|---|
| Single insert | ~160 µs/doc |
| Batch insert (1,000 docs) | ~6 ms |
| Query (k=10) | ~3 µs |
Real-world benchmark comparing PardusDB against Neo4j 5.15 for vector similarity operations.
Test Configuration:
- Vector dimension: 128
- Number of vectors: 10,000
- Number of queries: 100
- Top-K: 10
| Database | Insert (10K vectors) | Search (100 queries) | Single Search |
|---|---|---|---|
| PardusDB | 18ms (543K/s) | 355µs (281K/s) | 3µs |
| Neo4j | 35.70s (280/s) | 153ms (650/s) | 1ms |
| Operation | PardusDB Advantage |
|---|---|
| Insert | 1983x faster |
| Search | 431x faster |
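
The advantage figures above are simple ratios of the measured times in the preceding table. A quick check in Python, using only the numbers reported there:

```python
# Timings copied from the PardusDB vs Neo4j table, converted to seconds.
neo4j_insert_s, pardus_insert_s = 35.70, 18e-3
neo4j_search_s, pardus_search_s = 153e-3, 355e-6

insert_speedup = neo4j_insert_s / pardus_insert_s
search_speedup = neo4j_search_s / pardus_search_s

print(f"Insert: {insert_speedup:.0f}x faster")
print(f"Search: {search_speedup:.0f}x faster")
```
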
PardusDB supports batch inserts for massive performance gains:
| Batch Size | Insert (10K vecs) | Speedup vs Individual |
|---|---|---|
| Individual | 1.52s | 1.0x |
| 100 | 33ms | 45x |
| 500 | 10ms | 149x |
| 1000 | 6ms | 220x |
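
Dividing each total by the 10,000 vectors gives the amortized cost per vector, which makes the effect of batching easier to see. The sketch below uses only the totals from the table; the variable names are illustrative.

```python
N_VECTORS = 10_000

# Total insert times from the batch-size table, in seconds.
totals_s = {"individual": 1.52, "100": 33e-3, "500": 10e-3, "1000": 6e-3}

# Amortized cost per vector, in microseconds.
per_vector_us = {name: t / N_VECTORS * 1e6 for name, t in totals_s.items()}

for name, cost in per_vector_us.items():
    print(f"batch={name:>10}: {cost:6.1f} us/vector")
```
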
| Feature | PardusDB | Neo4j |
|---|---|---|
| Architecture | Embedded (SQLite-like) | Client-Server |
| Implementation | Rust (native) | Java (JVM) |
| Setup Time | 0 seconds | 5-10 minutes |
| Memory Overhead | Minimal (~50MB) | High (JVM ~1GB+) |
| Deployment | Single binary/file | Server + Docker/K8s |
| Query Language | SQL-like | Cypher |
Run the benchmark yourself:
# Without Neo4j (PardusDB only)
cargo run --release --bin benchmark_neo4j
# With Neo4j comparison (requires Neo4j running)
docker run -d -p 7687:7687 -e NEO4J_AUTH=neo4j/password123 neo4j:5.15
cargo run --release --features neo4j --bin benchmark_neo4j
Accuracy comparison against brute-force exact search (ground truth).
PardusDB Results:
| Metric | K=10 | K=5 | K=1 | Description |
|---|---|---|---|---|
| Recall@K | 99.2% | 94.8% | 68.0% | True neighbors found |
| Precision@K | 99.2% | 94.8% | 68.0% | Correct results ratio |
| MRR | 0.292 | 0.439 | 0.680 | Mean Reciprocal Rank |
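
For reference, metrics of this kind can be computed as in the Python sketch below. It uses common textbook definitions: recall@K as the share of the true K nearest neighbors that were returned, and reciprocal rank as 1 over the position of the first returned result that is a true neighbor. The benchmark's exact definitions are not given here, so treat these as assumptions; the function names are hypothetical. When exactly K results are returned and compared against K true neighbors, recall@K and precision@K are the same ratio, which is consistent with the identical rows above.

```python
def recall_at_k(returned, truth, k):
    """Fraction of the true top-k ids that appear in the returned top-k."""
    return len(set(returned[:k]) & set(truth[:k])) / k


def reciprocal_rank(returned, truth):
    """1 / rank of the first returned id that is a true neighbor, else 0."""
    relevant = set(truth)
    for rank, item in enumerate(returned, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (returned, truth) pairs."""
    return sum(reciprocal_rank(r, t) for r, t in runs) / len(runs)


# Toy ids for illustration only, not benchmark output.
returned = [7, 2, 9, 4]
truth = [2, 7, 5, 4]
r4 = recall_at_k(returned, truth, 4)
mrr = mean_reciprocal_rank([(returned, truth), ([8, 5], [5])])
print(r4, mrr)
```
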
PardusDB vs Neo4j Accuracy Comparison:
| Metric | PardusDB | Neo4j | Winner |
|---|---|---|---|
| Recall@10 | 99.2% | 3.0% | PardusDB |
| Recall@5 | 94.8% | 2.8% | PardusDB |
| Recall@1 | 68.0% | 2.0% | PardusDB |
| MRR | 0.292 | 0.010 | PardusDB |
Run accuracy benchmark:
# Without Neo4j (PardusDB only)
cargo run --release --bin benchmark_accuracy
# With Neo4j comparison (requires Neo4j running)
cargo run --release --features neo4j --bin benchmark_accuracy
Comparison against HelixDB, an open-source graph-vector database built in Rust.
Test Configuration:
- Vector dimension: 128
- Number of vectors: 10,000
- Number of queries: 100
- Top-K: 10
| Database | Insert (10K vectors) | Search (100 queries) | Single Search |
|---|---|---|---|
| PardusDB | 14ms (696K/s) | 280µs (357K/s) | 2µs |
| HelixDB | 2.87s (3.5K/s) | 17ms (5.8K/s) | 172µs |
| Operation | PardusDB Advantage |
|---|---|
| Insert | 200x faster |
| Search | 62x faster |
| Feature | PardusDB | HelixDB |
|---|---|---|
| Architecture | Embedded (SQLite-like) | Server (Docker) |
| Implementation | Rust (native) | Rust (native) |
| Vector Index | HNSW (optimized) | HNSW |
| Graph Support | No | Yes |
| Deployment | Single binary/file | Docker + CLI |
| Setup Time | 0 seconds | 5-10 minutes |
| Memory Overhead | Minimal (~50MB) | Docker container |
| Query Language | SQL-like | HelixQL |
| Network Latency | None (in-process) | HTTP API overhead |
| Persistence | Single file (.pardus) | LMDB |
| License | MIT | AGPL-3.0 |
Run the benchmark yourself:
# Without HelixDB (PardusDB only)
cargo run --release --bin benchmark_helix
# With HelixDB comparison (requires HelixDB running)
curl -sSL "https://install.helix-db.com" | bash
mkdir helix_bench && cd helix_bench
helix init
# Add schema.hx and queries.hx for vectors
helix push dev
cargo run --release --features helix --bin benchmark_helix
A complete RAG example demonstrating PardusDB's features:
cargo run --example simple_rag --release
This shows:
- Creating tables with VECTOR columns
- Individual inserts with insert_direct()
- Batch inserts with insert_batch_direct()
- Similarity search with search_similar()
See examples/python/simple_rag.py — a RAG demo using Ollama for embeddings and PardusDB as the vector store.
cd examples/python
pip install requests
python simple_rag.py
The Pardus AI team built PardusDB because we believe private, local-first AI tools should be accessible to everyone, from individual developers to large teams.
PardusDB gives you the low-level building block for fast, private vector search, while Pardus AI delivers the high-level no-code experience for analysts, marketers, and business users who just want answers from their data.
If you enjoy working with PardusDB, we’d love for you to try Pardus AI — upload your spreadsheets or documents and ask questions in plain English. Free tier available, no credit card required.
MIT License — use it freely in personal and commercial projects.
⭐ Star us on GitHub if you find this useful!
🚀 Building something cool with PardusDB? Share it with us on X or Discord — we’d love to hear from you.
Pardus AI — https://pardusai.org/