On-device intelligence
SQLite-AI brings model interaction directly into the database so edge applications can reason locally with simple SQL.
SQLite-AI
SQLite-native extension
Best for: Local inference
Repository: GitHub
SQLite-AI is an AI-native SQLite extension for loading GGUF models, generating text, creating embeddings, handling chat sessions, transcribing audio, and analyzing images from SQL.
It targets on-device and edge applications where low latency, privacy, offline inference, and a composable SQL interface matter more than calling a remote model API for every request.
Why it matters
Capabilities
Run transformer models directly from SQL queries without a separate model server.
Stream generated tokens through SQL aggregate-style interfaces for responsive UX.
Generate text embeddings locally for search, classification, RAG, and memory systems.
Create chat contexts, preserve history, and restore sessions across interactions.
Use Whisper models for speech-to-text across common audio formats.
Analyze images with multimodal models for local visual understanding.
Sample code
Load a GGUF model, create the right context, and call model functions from the database.
-- Load the SQLite-AI extension (.load is a sqlite3 shell command)
.load ./ai

-- Text generation: load a GGUF model, create a text-generation context, then prompt it
SELECT llm_model_load('./models/Qwen2.5-3B-Q4_K_M.gguf', 'gpu_layers=99');
SELECT llm_context_create_textgen();
SELECT llm_text_generate('Explain local-first AI in one paragraph.');

-- Embedding generation: load an embedding model, create an embedding context, then embed text
SELECT llm_model_load('./models/nomic-embed-text-v1.5-Q8_0.gguf', 'gpu_layers=99');
SELECT llm_context_create_embedding('embedding_type=FLOAT32');
SELECT llm_embed_generate('SQLite is the edge data layer.');
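Once embeddings are stored next to the rows they describe, local search can be an ordinary SQL query. The sketch below illustrates that pattern with Python's standard sqlite3 module. It does not load SQLite-AI or any model: embed_text is a toy stand-in for llm_embed_generate, and cosine_sim is a hypothetical helper registered from Python. It also assumes that embeddings are stored as BLOBs of packed little-endian float32 values, which the embedding_type=FLOAT32 option suggests but this page does not confirm.

```python
import math
import sqlite3
import struct


def embed_text(text: str) -> bytes:
    """Toy stand-in for llm_embed_generate (NOT the real model).

    Counts letters a-z and packs the 26 counts as little-endian float32,
    the BLOB layout this sketch assumes for stored embeddings.
    """
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return struct.pack(f"<{len(counts)}f", *counts)


def cosine_sim(a: bytes, b: bytes) -> float:
    """Cosine similarity between two packed float32 BLOBs (hypothetical helper)."""
    va = struct.unpack(f"<{len(a) // 4}f", a)
    vb = struct.unpack(f"<{len(b) // 4}f", b)
    dot = sum(x * y for x, y in zip(va, vb))
    na = math.sqrt(sum(x * x for x in va))
    nb = math.sqrt(sum(y * y for y in vb))
    return dot / (na * nb) if na and nb else 0.0


def build_db(docs):
    """Create an in-memory database with one embedding BLOB per document."""
    db = sqlite3.connect(":memory:")
    # Register the stand-ins as SQL functions so the queries stay pure SQL.
    db.create_function("embed_text", 1, embed_text)
    db.create_function("cosine_sim", 2, cosine_sim)
    db.execute(
        "CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, embedding BLOB)"
    )
    db.executemany(
        "INSERT INTO docs (body, embedding) VALUES (?, embed_text(?))",
        [(d, d) for d in docs],
    )
    return db


def search(db, query, k=3):
    """Return the k documents most similar to the query, best match first."""
    return db.execute(
        "SELECT body, cosine_sim(embedding, embed_text(?)) AS score "
        "FROM docs ORDER BY score DESC LIMIT ?",
        (query, k),
    ).fetchall()


if __name__ == "__main__":
    db = build_db([
        "SQLite is the edge data layer.",
        "Whisper transcribes audio.",
        "Local-first AI keeps data on the device.",
    ])
    for body, score in search(db, "edge data in SQLite", k=2):
        print(f"{score:.3f}  {body}")
```

In a real deployment the stand-in would presumably be replaced by the extension's own embedding call after loading a model and creating an embedding context. A full-table scan like this is only reasonable for small tables.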