100% Private · 100% Local · 100% Free

Your AI. Your Hardware.
Your Privacy.

ezLocalai is a local AI inference server with an OpenAI-compatible API. Text generation, vision, voice cloning, speech-to-text, and image generation—all running on your machine. No cloud. No subscriptions. No data leaves your network.

The models you run today will still work tomorrow—unchanged, unmonitored, and uncontrolled by anyone but you.

$ pip install ezlocalai && ezlocalai start
🔓 Open Source & Free · ☁️ Zero Cloud Dependency · 🔒 No Data Collection · 🔌 OpenAI-Compatible API
Capabilities

Everything You Need, Running Locally

A complete AI inference stack that runs entirely on your hardware. No API keys, no rate limits, no usage fees.

Text Generation

Run powerful language models locally with llama.cpp. Chat completions, text completions, and function calling with any GGUF model.

Vision & Analysis

Analyze images and documents with multimodal models. Describe images, extract text, answer visual questions—all offline.

Voice Cloning TTS

Clone any voice from a short audio sample. Generate natural speech in cloned voices for applications, assistants, and content.
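Because the server exposes an OpenAI-compatible API, speech requests can follow OpenAI's `/v1/audio/speech` convention. A minimal sketch using only the standard library, assuming ezLocalai serves this route on its default port 8091; the `voice` value naming a cloned sample (`my_clone`) and the `model` name are hypothetical placeholders, not confirmed parameters:

```python
"""Sketch: requesting speech in a cloned voice over an OpenAI-style
/v1/audio/speech endpoint. Endpoint path and field names follow the
OpenAI API convention; port 8091 is ezLocalai's documented default."""
import json
import urllib.request


def build_speech_request(text, voice="default",
                         base="http://localhost:8091/v1"):
    """Build (but do not send) a speech-synthesis request."""
    payload = json.dumps({
        "model": "tts-1",   # assumed model alias
        "input": text,
        "voice": voice,     # assumed: name of a cloned voice sample
    }).encode()
    return urllib.request.Request(
        f"{base}/audio/speech",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Sending the request would return audio bytes to write to disk:
# with urllib.request.urlopen(build_speech_request("Hello", "my_clone")) as r:
#     open("out.wav", "wb").write(r.read())
```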

Speech-to-Text

Transcribe audio with Whisper models. Real-time transcription, multi-language support, and speaker diarization.
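Transcription in the OpenAI API is a multipart file upload to `/v1/audio/transcriptions`. A stdlib-only sketch of building that request, assuming ezLocalai mirrors the route; the `whisper-1` model alias is an assumption:

```python
"""Sketch: multipart transcription request in the OpenAI API style,
built with only the standard library. Route and field names follow
OpenAI conventions; whether ezLocalai accepts them verbatim is assumed."""
import io
import urllib.request
import uuid


def build_transcription_request(audio_bytes, filename="audio.wav",
                                model="whisper-1",
                                base="http://localhost:8091/v1"):
    """Encode the audio file and model name as multipart/form-data."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    # Plain form field for the model name
    body.write((f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="model"\r\n\r\n'
                f"{model}\r\n").encode())
    # File field carrying the raw audio bytes
    body.write((f"--{boundary}\r\n"
                f'Content-Disposition: form-data; name="file"; '
                f'filename="{filename}"\r\n'
                f"Content-Type: audio/wav\r\n\r\n").encode())
    body.write(audio_bytes)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return urllib.request.Request(
        f"{base}/audio/transcriptions",
        data=body.getvalue(),
        headers={"Content-Type":
                 f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` would return a JSON body whose `text` field holds the transcript, per the OpenAI response shape.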

Image Generation

Generate images from text prompts with Stable Diffusion. Create artwork, illustrations, and visual content without any cloud service.
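Image generation in the OpenAI API is a JSON POST to `/v1/images/generations`. A hedged stdlib sketch, assuming ezLocalai exposes the same route; the `size` value and response format are conventions, not verified behavior:

```python
"""Sketch: image-generation request in the OpenAI API style.
Route and parameters follow OpenAI conventions; port 8091 is
ezLocalai's documented default."""
import json
import urllib.request


def build_image_request(prompt, size="512x512",
                        base="http://localhost:8091/v1"):
    """Build (but do not send) an image-generation request."""
    payload = json.dumps({
        "prompt": prompt,
        "n": 1,            # number of images to generate
        "size": size,      # assumed to be accepted, as in OpenAI's API
    }).encode()
    return urllib.request.Request(
        f"{base}/images/generations",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```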

OpenAI-Compatible API

Drop-in replacement for the OpenAI API. Use your existing code, libraries, and tools—just point them at your local server.
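In practice this means any OpenAI-style client works once it targets `http://localhost:8091/v1` (port 8091 is the documented default). A minimal stdlib sketch of a chat completion; with the official `openai` SDK the equivalent is setting `base_url` to the same address:

```python
"""Sketch: chat completion against a local ezLocalai server using only
the standard library. The /v1/chat/completions path and response shape
follow the OpenAI API; the model name "ezlocalai" is a placeholder."""
import json
import urllib.request

BASE_URL = "http://localhost:8091/v1"  # local ezLocalai server


def build_chat_request(messages, model="ezlocalai"):
    """Build an OpenAI-style chat completion request (not yet sent)."""
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(messages):
    """Send the request and return the assistant's reply text."""
    req = build_chat_request(messages)
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Standard OpenAI response shape: choices[0].message.content
    return body["choices"][0]["message"]["content"]


# With a server running:
# chat([{"role": "user", "content": "Hello!"}])
```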

Auto GPU Detection

Automatically detects your GPU and optimizes model loading. Supports NVIDIA CUDA, AMD ROCm, and CPU-only inference.

Distributed Inference

Spread workloads across multiple machines. Automatic fallback to cloud providers if local resources are unavailable.

One-Command Setup

Install with pip and start with a single command. Auto-downloads models, configures GPU settings, and launches the API server.

⚠️ The Problem With Cloud AI

Remember When Your AI “Understood” You?

Millions of people felt like they finally had an AI that “got” them. Then the provider replaced it with a “better” model—and suddenly, that connection was gone. You can't own something that lives on someone else's servers.

Cloud AI (ChatGPT, Claude, etc.)

  • Your data trains their models

    Every conversation improves their AI, not yours

  • Models change without warning

    The AI you rely on today can be deprecated tomorrow

  • Models get worse over time

    Remember when GPT-4o "understood" you? Then they replaced it.

  • Zero transparency

    You'll never know how your data is used, sold, or shared

  • Subscription prison

    Stop paying, lose access. Your workflow is rented.

  • Corporate censorship

    Policies change, capabilities get restricted without notice

ezLocalai (Your Machine)

  • Your data stays yours

    Runs 100% locally, nothing phones home, ever

  • Your models never change

    Unless YOU decide to update them. Frozen in time, by your choice.

  • Models stay consistent

    No surprise "improvements" that break your workflows

  • Complete transparency

    You own the weights, see the code, control everything

  • Free forever

    Open source, no subscriptions, no lock-in, no ransom

  • No corporate overlords

    No usage policies, no nannying, no restrictions

True AI Ownership Means Nobody Can Take It Away

When you run your models locally with ezLocalai, your interactions stay private. No company is training on your prompts. No one is analyzing your conversations. The AI that works for you today will still work for you in 10 years—unchanged, unmonitored, and uncontrolled by anyone but you.

🦙 Llama.cpp Runtime · 🔒 Zero Cloud Dependency · 📖 100% Open Source · 🌐 Works Fully Offline

Your Models Don't Get “Updated” Out From Under You

When a cloud provider “improves” their model, your carefully crafted prompts break. Your fine-tuned workflows stop working. Your users notice the difference. With ezLocalai, you choose when and if to change models. Your AI stack is as stable as the hardware it runs on.

Ecosystem

Powering the DevXT Ecosystem

ezLocalai is the inference engine behind a growing family of AI products. Every one of these platforms can run entirely on your local hardware.

DevXT

AI-powered development tools and services. Custom AI appliances built and configured for your infrastructure.

🔗 Offers professional AI appliance setup services powered by ezLocalai and AGiXT.

NurseXT

AI-powered nursing education and clinical decision support. HIPAA-compliant, privacy-first healthcare AI.

🔗 Runs entirely on ezLocalai for HIPAA-compliant local inference—no patient data leaves the network.

📖 Documentation

iAmCopy

Your personal AI clone, trained on your data. Runs locally so your digital twin stays private.

🔗 Uses ezLocalai for local inference so your personal AI clone never sends data to the cloud.

📖 Documentation

Need a Pre-Built AI Appliance?

Don't want to set up the hardware yourself? DevXT builds custom AI appliances pre-loaded with ezLocalai, AGiXT, and your choice of models. Plug it in, turn it on, and start using AI—no cloud required.

Explore DevXT Services
Get Started

Up and Running in Minutes

No account required. No API key. No cloud service. Just install, start, and go.

Step 01

Install

$ pip install ezlocalai

One command. Python 3.10+ required. Docker is installed automatically if needed.

Step 02

Run

$ ezlocalai start

Models download automatically. GPU is auto-detected. OpenAI-compatible API starts on port 8091.

Step 03

Choose a Model

$ ezlocalai start --model unsloth/Qwen3-4B-Instruct-2507-GGUF

Use any GGUF model from Hugging Face. Just pass the repo name and ezLocalai handles the rest.
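A quick way to confirm the server came up is to list its models over the OpenAI-style `GET /v1/models` route; this sketch assumes ezLocalai mirrors that endpoint and the standard response shape:

```python
"""Sketch: smoke-testing a local ezLocalai server by listing models
via the OpenAI-style /v1/models endpoint (assumed to be available)."""
import json
import urllib.request


def models_url(base="http://localhost:8091/v1"):
    """URL of the OpenAI-convention model listing endpoint."""
    return f"{base}/models"


def list_models(base="http://localhost:8091/v1"):
    """Return the model ids the server reports (requires a running server)."""
    with urllib.request.urlopen(models_url(base)) as resp:
        return [m["id"] for m in json.load(resp)["data"]]


# With a server running:
# print(list_models())
```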