Your AI. Your Hardware.
Your Privacy.
ezLocalai is a local AI inference server with an OpenAI-compatible API. Text generation, vision, voice cloning, speech-to-text, and image generation—all running on your machine. No cloud. No subscriptions. No data leaves your network.
The models you run today will still work tomorrow—unchanged, unmonitored, and uncontrolled by anyone but you.
$ pip install ezlocalai && ezlocalai start
Everything You Need, Running Locally
A complete AI inference stack that runs entirely on your hardware. No API keys, no rate limits, no usage fees.
Text Generation
Run powerful language models locally with llama.cpp. Chat completions, text completions, and function calling with any GGUF model.
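Since the server exposes the standard OpenAI routes, a chat request needs nothing beyond the Python standard library. A minimal sketch, assuming the default port 8091 from the quick-start; the model name `"ezlocalai"` is a placeholder for whatever model your server has loaded:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8091/v1"  # ezLocalai's default API port


def build_chat_request(prompt: str, model: str = "ezlocalai") -> dict:
    # "ezlocalai" is a placeholder model name; use the identifier
    # your server actually reports.
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def chat(prompt: str) -> str:
    """Send a single-turn chat completion and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

With the server running, `chat("Hello")` returns the assistant's reply as a string; plain completions and function calling go through the same base URL with their usual OpenAI routes.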
Vision & Analysis
Analyze images and documents with multimodal models. Describe images, extract text, answer visual questions—all offline.
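A sketch of a visual question, assuming ezLocalai accepts the OpenAI vision message format (text plus an inline base64 `image_url` part); the model name is a placeholder:

```python
import base64
import json
import urllib.request

BASE_URL = "http://localhost:8091/v1"


def build_vision_message(question: str, image_bytes: bytes,
                         mime: str = "image/png") -> dict:
    """Pack a question and an inline base64 image, OpenAI vision style."""
    data_url = f"data:{mime};base64,{base64.b64encode(image_bytes).decode()}"
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": data_url}},
        ],
    }


def describe(path: str) -> str:
    """Ask the local multimodal model what an image contains."""
    with open(path, "rb") as f:
        message = build_vision_message("What is in this image?", f.read())
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps({"model": "ezlocalai", "messages": [message]}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```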
Voice Cloning TTS
Clone any voice from a short audio sample. Generate natural speech in cloned voices for applications, assistants, and content.
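A hedged sketch of a speech request, assuming an OpenAI-style `/audio/speech` route; how a cloned voice is registered is server-specific, so the voice name and the `"tts"` model identifier here are hypothetical:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8091/v1"


def build_speech_request(text: str, voice: str = "default") -> dict:
    # Follows the OpenAI /audio/speech payload shape; "voice" would name
    # a cloned voice on the server (hypothetical value here).
    return {"model": "tts", "input": text, "voice": voice}


def speak(text: str, voice: str, out_path: str = "speech.wav") -> None:
    """Synthesize text in a cloned voice and save the audio locally."""
    req = urllib.request.Request(
        f"{BASE_URL}/audio/speech",
        data=json.dumps(build_speech_request(text, voice)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
```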
Speech-to-Text
Transcribe audio with Whisper models. Real-time transcription, multi-language support, and speaker diarization.
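OpenAI-style transcription endpoints take a multipart file upload. A stdlib-only sketch, assuming an `/audio/transcriptions` route and a Whisper-style model name:

```python
import io
import json
import urllib.request
import uuid

BASE_URL = "http://localhost:8091/v1"


def build_multipart(audio: bytes, filename: str, model: str = "whisper-1"):
    """Hand-encode an OpenAI-style multipart form body for transcription."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    buf.write(
        f'--{boundary}\r\nContent-Disposition: form-data; '
        f'name="model"\r\n\r\n{model}\r\n'.encode()
    )
    buf.write(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\nContent-Type: audio/wav\r\n\r\n'.encode()
    )
    buf.write(audio)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), boundary


def transcribe(path: str) -> str:
    """Upload a local audio file and return the transcript text."""
    with open(path, "rb") as f:
        body, boundary = build_multipart(f.read(), path)
    req = urllib.request.Request(
        f"{BASE_URL}/audio/transcriptions",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["text"]
```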
Image Generation
Generate images from text prompts with Stable Diffusion. Create artwork, illustrations, and visual content without any cloud service.
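A sketch assuming an OpenAI-style `/images/generations` route that can return base64 image data:

```python
import base64
import json
import urllib.request

BASE_URL = "http://localhost:8091/v1"


def build_image_request(prompt: str, size: str = "512x512") -> dict:
    # OpenAI-style images payload; b64_json keeps the result local
    # instead of returning a URL.
    return {"prompt": prompt, "size": size, "response_format": "b64_json"}


def generate(prompt: str, out_path: str = "image.png") -> None:
    """Generate an image from a text prompt and write it to disk."""
    req = urllib.request.Request(
        f"{BASE_URL}/images/generations",
        data=json.dumps(build_image_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        b64 = json.load(resp)["data"][0]["b64_json"]
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(b64))
```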
OpenAI-Compatible API
Drop-in replacement for the OpenAI API. Use your existing code, libraries, and tools—just point them at your local server.
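In practice, "pointing them at your local server" can be as small as two environment variables. The official OpenAI SDKs read `OPENAI_BASE_URL` from the environment, so existing code can often be redirected without edits; a configuration sketch, assuming the default port 8091 (the script name is hypothetical):

```shell
# Redirect an existing OpenAI-based script to the local server.
export OPENAI_BASE_URL="http://localhost:8091/v1"
export OPENAI_API_KEY="none"     # no real key needed locally; the SDK just wants a value
python your_existing_script.py   # hypothetical script already written against the OpenAI API
```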
Auto GPU Detection
Automatically detects your GPU and optimizes model loading. Supports NVIDIA CUDA, AMD ROCm, and CPU-only inference.
Distributed Inference
Spread workloads across multiple machines. Automatic fallback to cloud providers if local resources are unavailable.
One-Command Setup
Install with pip and start with a single command. Auto-downloads models, configures GPU settings, and launches the API server.
Remember When Your AI “Understood” You?
Millions of people felt like they finally had an AI that “got” them. Then the provider replaced it with a “better” model—and suddenly, that connection was gone. You can't own something that lives on someone else's servers.
Cloud AI (ChatGPT, Claude, etc.)
- ✗ Your data trains their models
Every conversation improves their AI, not yours
- ✗ Models change without warning
The AI you rely on today can be deprecated tomorrow
- ✗ Models get worse over time
Remember when GPT-4o "understood" you? Then they replaced it.
- ✗ Zero transparency
You'll never know how your data is used, sold, or shared
- ✗ Subscription prison
Stop paying, lose access. Your workflow is rented.
- ✗ Corporate censorship
Policies change, capabilities get restricted without notice
ezLocalai (Your Machine)
- ✓ Your data stays yours
Runs 100% locally, nothing phones home, ever
- ✓ Your models never change
Unless YOU decide to update them. Frozen in time, by your choice.
- ✓ Models stay consistent
No surprise "improvements" that break your workflows
- ✓ Complete transparency
You own the weights, see the code, control everything
- ✓ Free forever
Open source, no subscriptions, no lock-in, no ransom
- ✓ No corporate overlords
No usage policies, no nannying, no restrictions
True AI Ownership Means Nobody Can Take It Away
When you run your models locally with ezLocalai, your interactions stay private. No company is training on your prompts. No one is analyzing your conversations. The AI that works for you today will still work for you in 10 years—unchanged, unmonitored, and uncontrolled by anyone but you.
Your Models Don't Get “Updated” Out From Under You
When a cloud provider “improves” their model, your carefully crafted prompts break. Your fine-tuned workflows stop working. Your users notice the difference. With ezLocalai, you choose when and if to change models. Your AI stack is as stable as the hardware it runs on.
Powering the DevXT Ecosystem
ezLocalai is the inference engine behind a growing family of AI products. Every one of these platforms can run entirely on your local hardware.
AGiXT
AI Agent Automation Platform. Build intelligent agents that can reason, plan, and execute tasks autonomously.
🔗 Uses ezLocalai as its primary local inference engine for agent interactions.
📖 Documentation
DevXT
AI-powered development tools and services. Custom AI appliances built and configured for your infrastructure.
🔗 Offers professional AI appliance setup services powered by ezLocalai and AGiXT.
XT Systems
Enterprise AI infrastructure solutions. Turnkey AI deployments for businesses that need local AI capabilities.
🔗 Enterprise deployments built on the ezLocalai + AGiXT stack.
📖 Documentation
NurseXT
AI-powered nursing education and clinical decision support. HIPAA-compliant, privacy-first healthcare AI.
🔗 Runs entirely on ezLocalai for HIPAA-compliant local inference—no patient data leaves the network.
📖 Documentation
iAmCopy
Your personal AI clone, trained on your data. Runs locally so your digital twin stays private.
🔗 Uses ezLocalai for local inference so your personal AI clone never sends data to the cloud.
📖 Documentation
Need a Pre-Built AI Appliance?
Don't want to set up the hardware yourself? DevXT builds custom AI appliances pre-loaded with ezLocalai, AGiXT, and your choice of models. Plug it in, turn it on, and start using AI—no cloud required.
Explore DevXT Services
Up and Running in Minutes
No account required. No API key. No cloud service. Just install, start, and go.
Install
$ pip install ezlocalai
One command. Python 3.10+ required. Docker is installed automatically if needed.
Run
$ ezlocalai start
Models download automatically. GPU is auto-detected. The OpenAI-compatible API starts on port 8091.
Choose a Model
$ ezlocalai start --model unsloth/Qwen3-4B-Instruct-2507-GGUF
Use any GGUF model from Hugging Face. Just pass the repo name and ezLocalai handles the rest.