PardusDB

A fast, SQLite-like embedded vector database with graph-based approximate nearest neighbor search
Open-source project from the team behind Pardus AI

PardusDB is designed for developers building local AI applications — RAG pipelines, semantic search, recommendation systems, or any project that needs lightweight, persistent vector storage without external dependencies.

While Pardus AI gives non-technical users a powerful no-code platform to ask questions of their CSV, JSON, and PDF data in plain English, PardusDB gives developers the same speed and privacy in an embeddable, fully open-source vector database.

Features

Single-file storage — Everything lives in one .pardus file, just like SQLite
Multiple tables — Store different vector dimensions and metadata in the same database
Familiar SQL-like syntax — CREATE, INSERT, SELECT, UPDATE, DELETE feel natural
UNIQUE constraints — O(1) duplicate detection using HashSet
GROUP BY with aggregates — O(n) hash aggregation with COUNT, SUM, AVG, MIN, MAX
JOINs — O(n+m) hash join algorithm for INNER, LEFT, RIGHT joins
Fast vector similarity search — Graph-based approximate nearest neighbor search
Thread-safe — Safe concurrent reads in multi-threaded applications
Full transactions — BEGIN/COMMIT/ROLLBACK for atomic operations
Optional GPU acceleration — For large batch inserts and queries
Zero external dependencies — Pure Rust, MIT licensed

Installation

Quick Install (Recommended)

git clone https://github.com/pardus-ai/pardusdb
cd pardusdb
./setup.sh

This will build PardusDB and install it as the pardusdb command, available system-wide.

Manual Install

git clone https://github.com/pardus-ai/pardusdb
cd pardusdb
cargo build --release

The binary will be at target/release/pardusdb.

Quick Start

Interactive REPL

pardusdb

╔═══════════════════════════════════════════════════════════════╗
║                    PardusDB REPL                      ║
║          Vector Database with SQL Interface           ║
╚═══════════════════════════════════════════════════════════════╝

pardusdb [memory]> .create mydb.pardus
Created and opened: mydb.pardus

pardusdb [mydb.pardus]> CREATE TABLE docs (embedding VECTOR(768), content TEXT);
Table 'docs' created

pardusdb [mydb.pardus]> INSERT INTO docs (embedding, content)
VALUES ([0.1, 0.2, 0.3, ...], 'Hello World');
Inserted row with id=1

pardusdb [mydb.pardus]> SELECT * FROM docs
WHERE embedding SIMILARITY [0.1, 0.2, 0.3, ...] LIMIT 5;

Found 1 similar rows:
  id=1, distance=0.0000, values=[Vector([...]), Text("Hello World")]

pardusdb [mydb.pardus]> quit
Saved to: mydb.pardus
Goodbye!

Command Line

# Persistent file
pardusdb mydata.pardus

# In-memory only
pardusdb

SQL Syntax

Supported Data Types

Type	Description	Example
`VECTOR(n)`	n-dimensional float vector	`VECTOR(768)`
`TEXT`	UTF-8 string	`'hello world'`
`INTEGER`	64-bit integer	`42`
`FLOAT`	64-bit float	`3.14`
`BOOLEAN`	true/false	`true`

Basic Operations

CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    embedding VECTOR(768),
    title TEXT,
    category TEXT,
    score FLOAT
);

INSERT INTO documents (embedding, title, category, score)
VALUES ([0.1, 0.2, ...], 'Introduction to Rust', 'tutorial', 0.95);

SELECT * FROM documents WHERE category = 'tutorial' LIMIT 10;

UPDATE documents SET score = 0.99 WHERE id = 1;

DELETE FROM documents WHERE id = 1;
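
When statements like these are generated from application code, the embedding has to be rendered as a bracketed vector literal. The Python sketch below builds an INSERT string in the syntax shown above. The helper names (vector_literal, quote_text, build_insert) are hypothetical, and doubling single quotes to escape them is an assumption borrowed from standard SQL, not confirmed PardusDB behavior.

```python
def vector_literal(values):
    """Render a list of numbers as a bracketed vector literal, e.g. [0.1, 0.2]."""
    return "[" + ", ".join(repr(float(v)) for v in values) + "]"


def quote_text(value):
    """Quote a string, doubling single quotes (assumed, standard-SQL style)."""
    return "'" + value.replace("'", "''") + "'"


def build_insert(table, embedding, title, category, score):
    """Build an INSERT statement for the documents table defined above."""
    return (
        f"INSERT INTO {table} (embedding, title, category, score) "
        f"VALUES ({vector_literal(embedding)}, {quote_text(title)}, "
        f"{quote_text(category)}, {float(score)!r});"
    )


# A real row needs as many values as the column's VECTOR(n) dimension;
# three values are used here only to keep the output short.
print(build_insert("documents", [0.1, 0.2, 0.3],
                   "Introduction to Rust", "tutorial", 0.95))
```

Keeping the literal formatting in one helper makes it easy to adjust if the actual parser turns out to expect a different number or quoting format.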

UNIQUE Constraint

Enforce unique column values with hash-based duplicate detection (O(1) on average):

CREATE TABLE users (
    embedding VECTOR(128),
    id INTEGER PRIMARY KEY,
    email TEXT UNIQUE
);

-- The first insert succeeds
INSERT INTO users (embedding, id, email) VALUES ([0.1, ...], 1, 'user@example.com');
-- The second insert fails: duplicate email
INSERT INTO users (embedding, id, email) VALUES ([0.2, ...], 2, 'user@example.com');
-- Error: Duplicate value for UNIQUE column 'email'
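
The idea behind hash-based duplicate detection can be sketched in a few lines of Python. This illustrates the general technique only, not PardusDB's Rust implementation; the UniqueColumn class is hypothetical.

```python
class UniqueColumn:
    """Tracks the values already stored in a UNIQUE column."""

    def __init__(self, name):
        self.name = name
        self.seen = set()  # hash set: membership test and insert are O(1) on average

    def check_and_add(self, value):
        """Reject a value that is already present, otherwise record it."""
        if value in self.seen:
            raise ValueError(f"Duplicate value for UNIQUE column '{self.name}'")
        self.seen.add(value)


email = UniqueColumn("email")
email.check_and_add("a@example.com")      # first insert is accepted
try:
    email.check_and_add("a@example.com")  # same value again is rejected
except ValueError as err:
    print(err)
```

The check costs one hash lookup per insert, so it does not slow down as the table grows, at the price of keeping every distinct value in memory.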

GROUP BY with Aggregates

Group and aggregate data with O(n) hash aggregation:

-- Aggregate functions: COUNT, SUM, AVG, MIN, MAX
SELECT category, COUNT(*), AVG(score), SUM(amount)
FROM sales
GROUP BY category;

-- With HAVING clause for filtered groups
SELECT category, SUM(amount) as total
FROM sales
GROUP BY category
HAVING SUM(amount) > 1000;
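
For readers unfamiliar with hash aggregation, here is a minimal Python sketch of the technique: one pass over the rows, with one accumulator per distinct group key. The sample rows and function names are made up for illustration, and this is not PardusDB's actual code.

```python
from collections import defaultdict


def group_by_category(rows):
    """Single pass over rows; each category gets its own accumulator."""
    acc = defaultdict(lambda: {"count": 0, "sum_amount": 0.0, "sum_score": 0.0})
    for row in rows:
        a = acc[row["category"]]  # hash lookup, O(1) on average
        a["count"] += 1
        a["sum_amount"] += row["amount"]
        a["sum_score"] += row["score"]
    # AVG is derived from the running sum and count once the pass is done.
    return {
        cat: {
            "count": a["count"],
            "sum_amount": a["sum_amount"],
            "avg_score": a["sum_score"] / a["count"],
        }
        for cat, a in acc.items()
    }


def having_total_over(groups, threshold):
    """HAVING-style filter applied to the finished groups."""
    return {cat: g for cat, g in groups.items() if g["sum_amount"] > threshold}


sales = [
    {"category": "books", "amount": 600.0, "score": 0.5},
    {"category": "toys", "amount": 200.0, "score": 0.25},
    {"category": "books", "amount": 700.0, "score": 1.0},
]
groups = group_by_category(sales)
print(groups)
print(having_total_over(groups, 1000))
```

HAVING is applied after aggregation, which is why it can filter on SUM(amount) while WHERE cannot.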

JOINs

Join tables with an O(n+m) hash join algorithm:

-- INNER JOIN
SELECT * FROM orders
INNER JOIN users ON orders.user_id = users.id;

-- LEFT JOIN (include all left rows)
SELECT users.email, orders.product
FROM users
LEFT JOIN orders ON users.id = orders.user_id;

-- RIGHT JOIN (include all right rows)
SELECT * FROM users
RIGHT JOIN orders ON users.id = orders.user_id;
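
A hash join reaches O(n+m) by building a hash index on one table and probing it with the other. The Python sketch below shows the general algorithm for INNER and LEFT joins; it is illustrative only, the hash_join function is hypothetical, and the total cost also grows with the number of matched pairs returned.

```python
from collections import defaultdict


def hash_join(left, right, left_key, right_key, how="inner"):
    """Join two lists of dicts: build an index on `right`, probe with `left`."""
    # Build phase: one pass over the right table, O(m).
    index = defaultdict(list)
    for r in right:
        index[r[right_key]].append(r)

    # Probe phase: one pass over the left table, O(n) hash lookups.
    out = []
    for l in left:
        matches = index.get(l[left_key], [])
        for r in matches:
            out.append((l, r))
        if how == "left" and not matches:
            out.append((l, None))  # keep unmatched left rows
    return out


users = [{"id": 1, "email": "a@example.com"}, {"id": 2, "email": "b@example.com"}]
orders = [{"user_id": 1, "product": "book"}, {"user_id": 1, "product": "pen"}]

print(hash_join(users, orders, "id", "user_id"))               # INNER JOIN
print(hash_join(users, orders, "id", "user_id", how="left"))   # LEFT JOIN
```

A RIGHT join can be expressed with the same function by swapping the two tables and their keys.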

Vector Similarity Search

SELECT * FROM documents
WHERE embedding SIMILARITY [0.12, 0.24, ...]
LIMIT 10;

Results are automatically ordered by distance (closest first).
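
To make the ordering concrete, here is an exact, brute-force version of the same query in Python. It scans every row instead of using a graph index, and it assumes Euclidean distance, which this README does not confirm as the metric PardusDB uses. The search_exact function is hypothetical.

```python
import math


def search_exact(rows, query, limit):
    """Return (distance, row_id) pairs for the `limit` closest rows, closest first."""
    scored = [(math.dist(vec, query), row_id) for row_id, vec in rows]
    scored.sort()  # ascending distance
    return scored[:limit]


rows = [
    (1, [0.0, 0.0]),
    (2, [3.0, 4.0]),
    (3, [1.0, 0.0]),
]
print(search_exact(rows, [0.0, 0.0], limit=2))
```

An approximate index trades a small chance of missing a true neighbor for avoiding this full scan on every query.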

Utility Commands

SHOW TABLES;
DROP TABLE documents;

REPL Commands

Command	Description
`.create <file>`	Create and open a new database
`.open <file>`	Open an existing database
`.save`	Force save current database
`.tables`	List tables
`.clear`	Clear screen
`help`	Show help
`quit`	Exit (auto-saves if file open)

Performance (Apple Silicon M-series)

Operation	Time
Single insert	~160 µs/doc
Batch insert (1,000 docs)	~6 ms
Query (k=10)	~3 µs

Benchmark: PardusDB vs Neo4j

Benchmark comparing PardusDB against Neo4j 5.15 on vector insert and similarity search.

Test Configuration:

Vector dimension: 128
Number of vectors: 10,000
Number of queries: 100
Top-K: 10

Results

Database	Insert (10K vectors)	Search (100 queries)	Single Search
PardusDB	18ms (543K/s)	355µs (281K/s)	3µs
Neo4j	35.70s (280/s)	153ms (650/s)	1ms

Speedup

Operation	PardusDB Advantage
Insert	1983x faster
Search	431x faster
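
The speedup figures follow directly from the timings in the Results table: each is the Neo4j time divided by the PardusDB time. A short Python check, using only the rounded numbers published above:

```python
def speedup(baseline_seconds, candidate_seconds):
    """How many times faster the candidate is than the baseline."""
    return baseline_seconds / candidate_seconds


insert_speedup = speedup(35.70, 18e-3)    # Neo4j 35.70 s vs PardusDB 18 ms
search_speedup = speedup(153e-3, 355e-6)  # Neo4j 153 ms vs PardusDB 355 µs

print(round(insert_speedup), round(search_speedup))  # prints: 1983 431
```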

Batch Insert Performance

Batching inserts cuts the time to load 10,000 vectors from 1.52 s (one at a time) to 6 ms at a batch size of 1,000:

Batch Size	Insert (10K vecs)	Speedup vs Individual
Individual	1.52s	1.0x
100	33ms	45x
500	10ms	149x
1000	6ms	220x

Feature Comparison

Feature	PardusDB	Neo4j
Architecture	Embedded (SQLite-like)	Client-Server
Implementation	Rust (native)	Java (JVM)
Setup Time	0 seconds	5-10 minutes
Memory Overhead	Minimal (~50MB)	High (JVM ~1GB+)
Deployment	Single binary/file	Server + Docker/K8s
Query Language	SQL-like	Cypher

Run the benchmark yourself:

# Without Neo4j (PardusDB only)
cargo run --release --bin benchmark_neo4j

# With Neo4j comparison (requires Neo4j running)
docker run -d -p 7687:7687 -e NEO4J_AUTH=neo4j/password123 neo4j:5.15
cargo run --release --features neo4j --bin benchmark_neo4j

Search Accuracy

Accuracy comparison against brute-force exact search (ground truth).

PardusDB Results:

Metric	K=10	K=5	K=1	Description
Recall@K	99.2%	94.8%	68.0%	True neighbors found
Precision@K	99.2%	94.8%	68.0%	Correct results ratio
MRR	0.292	0.439	0.680	Mean Reciprocal Rank
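
For reference, the sketch below computes Recall@K and MRR using common textbook definitions. The benchmark's own definitions are not shown in this README and may differ, so treat this as an illustration rather than a reproduction. Function names and sample IDs are made up.

```python
def recall_at_k(retrieved, truth, k):
    """Fraction of the true top-k neighbors that appear in the retrieved top-k."""
    return len(set(retrieved[:k]) & set(truth[:k])) / k


def reciprocal_rank(retrieved, true_nearest):
    """1/rank of the true nearest neighbor in the retrieved list, 0.0 if absent."""
    for rank, item in enumerate(retrieved, start=1):
        if item == true_nearest:
            return 1.0 / rank
    return 0.0


def mean(values):
    return sum(values) / len(values)


# Two queries: IDs returned by the index vs. brute-force ground truth.
retrieved = [[1, 2, 3], [5, 4, 6]]
truth = [[1, 3, 9], [4, 5, 6]]

recalls = [recall_at_k(r, t, 3) for r, t in zip(retrieved, truth)]
mrr = mean([reciprocal_rank(r, t[0]) for r, t in zip(retrieved, truth)])
print(mean(recalls), mrr)
```

When the index returns exactly K results and the ground truth holds exactly K neighbors, Precision@K and Recall@K share the same numerator and denominator, which is consistent with the two rows in the table being identical.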

PardusDB vs Neo4j Accuracy Comparison:

Metric	PardusDB	Neo4j	Winner
Recall@10	99.2%	3.0%	PardusDB
Recall@5	94.8%	2.8%	PardusDB
Recall@1	68.0%	2.0%	PardusDB
MRR	0.292	0.010	PardusDB

Run accuracy benchmark:

# Without Neo4j (PardusDB only)
cargo run --release --bin benchmark_accuracy

# With Neo4j comparison (requires Neo4j running)
cargo run --release --features neo4j --bin benchmark_accuracy

Benchmark: PardusDB vs HelixDB

Comparison against HelixDB, an open-source graph-vector database built in Rust.

Test Configuration:

Vector dimension: 128
Number of vectors: 10,000
Number of queries: 100
Top-K: 10

Results

Database	Insert (10K vectors)	Search (100 queries)	Single Search
PardusDB	14ms (696K/s)	280µs (357K/s)	2µs
HelixDB	2.87s (3.5K/s)	17ms (5.8K/s)	172µs

Speedup

Operation	PardusDB Advantage
Insert	200x faster
Search	62x faster

Feature Comparison

Feature	PardusDB	HelixDB
Architecture	Embedded (SQLite-like)	Server (Docker)
Implementation	Rust (native)	Rust (native)
Vector Index	HNSW (optimized)	HNSW
Graph Support	No	Yes
Deployment	Single binary/file	Docker + CLI
Setup Time	0 seconds	5-10 minutes
Memory Overhead	Minimal (~50MB)	Docker container
Query Language	SQL-like	HelixQL
Network Latency	None (in-process)	HTTP API overhead
Persistence	Single file (.pardus)	LMDB
License	MIT	AGPL-3.0

Run the benchmark yourself:

# Without HelixDB (PardusDB only)
cargo run --release --bin benchmark_helix

# With HelixDB comparison (requires HelixDB running)
curl -sSL "https://install.helix-db.com" | bash
mkdir helix_bench && cd helix_bench
helix init
# Add schema.hx and queries.hx for vectors
helix push dev
cargo run --release --features helix --bin benchmark_helix

Examples

Rust Example

A complete RAG example demonstrating PardusDB's features:

cargo run --example simple_rag --release

This shows:

Creating tables with VECTOR columns
Individual inserts with insert_direct()
Batch inserts with insert_batch_direct()
Similarity search with search_similar()

Python Example

See examples/python/simple_rag.py — a RAG demo using Ollama for embeddings and PardusDB as the vector store.

cd examples/python
pip install requests
python simple_rag.py

Why We Built PardusDB

The Pardus AI team built PardusDB because we believe private, local-first AI tools should be accessible to everyone — from individual developers to large teams.

PardusDB gives you the low-level building block for fast, private vector search, while Pardus AI delivers the high-level no-code experience for analysts, marketers, and business users who just want answers from their data.

If you enjoy working with PardusDB, we’d love for you to try Pardus AI — upload your spreadsheets or documents and ask questions in plain English. Free tier available, no credit card required.

License

MIT License — use it freely in personal and commercial projects.

⭐ Star us on GitHub if you find this useful!
🚀 Building something cool with PardusDB? Share it with us on X or Discord — we’d love to hear from you.

Pardus AI — https://pardusai.org/
