This document provides a high-level introduction to Cog, an open-source tool for packaging machine learning models into production-ready Docker containers. It covers Cog's purpose, architecture, and core workflow from model definition to deployment.
For detailed information about specific subsystems, see the related wiki pages.
Cog is a tool that packages machine learning models into standard Docker containers with an HTTP API for running predictions. It was created by Andreas Jansson and Ben Firshman to solve the complexity of deploying ML models to production.
Core Value Proposition:
- Define the environment declaratively in cog.yaml instead of writing complex Dockerfiles
- Implement the model behind a standard Python BasePredictor class
- Serve predictions over an automatically generated HTTP API (the coglet runtime)

Primary Use Case: Researchers and ML engineers can ship models to production without deep Docker or deployment expertise. Cog bridges the gap between model code and production infrastructure.
Sources: README.md1-117 docs/llms.txt1-18
Cog uses a three-layer architecture that separates concerns between build-time tooling, user-facing interfaces, and serving infrastructure.
Layer Responsibilities:
CLI Layer (Go): User-facing commands (cog build, cog predict, cog push) orchestrate the build and runtime operations. Entry point is pkg/cli/root.go:NewRootCommand().
Build System (Go): Transforms cog.yaml into a Docker image by parsing configuration (pkg/config), generating Dockerfiles (pkg/dockerfile), and handling image builds (pkg/image).
Python SDK Layer: Provides the user interface for defining models. Users subclass BasePredictor and implement setup() and predict() methods. The cog.server.http module acts as a shim to coglet.
Runtime Layer (Rust): The coglet binary serves HTTP requests at /predictions and other endpoints, manages prediction lifecycle, and handles metrics collection.
Sources: pkg/cli/root.go14-60 README.md19-53 docs/python.md42-63
The typical Cog workflow progresses from model definition to containerized deployment:
Workflow Steps:
1. Write cog.yaml specifying the Python version, packages, system dependencies, and GPU requirements
2. Write predict.py with a Predictor class that inherits from BasePredictor
3. Run cog build, which:
   - parses cog.yaml via pkg/config/Config
   - generates a Dockerfile with pkg/dockerfile/StandardGenerator
   - resolves cog and coglet wheel versions via pkg/wheels
   - runs docker build
4. Test locally with cog predict -i <inputs> or by running the image with docker run
5. Push the image with cog push to a registry and deploy
6. In production, the container runs python -m cog.server.http, which starts the coglet Rust server

Sources: README.md54-78 docs/getting-started.md44-172 pkg/cli/root.go45-56
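The steps above map onto a command sequence like the following (the image tag and input name are illustrative, not taken from this document):

```shell
cog init                           # generate starter cog.yaml and predict.py
cog build -t my-model              # parse config, generate Dockerfile, run docker build
cog predict -i prompt="hello"      # test a prediction locally
cog push r8.im/username/my-model   # push to a registry and deploy
```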
The command-line interface is implemented in Go and provides these commands:
| Command | Module | Purpose |
|---|---|---|
| `cog build` | pkg/cli/build.go | Build Docker image from cog.yaml |
| `cog predict` | pkg/cli/predict.go | Run a prediction on a model |
| `cog push` | pkg/cli/push.go | Build and push image to registry |
| `cog run` | pkg/cli/run.go | Run arbitrary commands in Docker environment |
| `cog serve` | pkg/cli/serve.go | Start HTTP prediction server |
| `cog init` | pkg/cli/init.go | Generate starter cog.yaml and predict.py |
| `cog login` | pkg/cli/login.go | Authenticate with container registry |
| `cog train` | pkg/cli/train.go | Run model training (fine-tuning API) |
The CLI uses the Cobra framework and is built into a single binary. Global state, such as the version and debug flags, lives in pkg/global/global.go.
Sources: pkg/cli/root.go45-57 pkg/global/global.go1-30
Users interact with Cog through the Python SDK, defining their model's prediction interface:
Core Classes:
- `BasePredictor`: Base class for model implementation with setup() and predict() methods
- `Input()`: Function to define input parameters with validation and documentation
- `Path`: Type for file inputs/outputs (paths on disk)
- `File`: Type for file-like objects
- `Secret`: Type for sensitive string inputs
- `BaseModel`: Base class for custom Output objects

Example Usage:
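A sketch of how these classes fit together; the model, the `load_model` helper, and the parameter names are illustrative, not part of Cog's API:

```python
from cog import BasePredictor, Input, Path

class Predictor(BasePredictor):
    def setup(self):
        # Runs once at container start: load weights into memory.
        # load_model is a hypothetical stand-in for your framework's loader.
        self.model = load_model("./weights")

    def predict(
        self,
        image: Path = Input(description="Input image"),
        scale: float = Input(description="Upscaling factor", ge=1, le=4, default=2),
    ) -> Path:
        # Runs per request: Input() provides validation (ge/le bounds)
        # and documentation for the generated HTTP API.
        output_path = self.model(image, scale)
        return Path(output_path)
```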
The SDK is installed as the cog Python package in the Docker container. It integrates with coglet through the cog.server.http module.
Sources: docs/python.md42-100 README.md34-53
The coglet binary is a high-performance HTTP server written in Rust. It serves the /predictions, /health-check, and /openapi.json endpoints, manages the prediction lifecycle, and collects metrics.

When a Docker image starts, it runs python -m cog.server.http, which launches coglet. The coglet server then imports the user's Predictor class and routes HTTP requests to the predict() method.
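Once the container is running, the server can be exercised with plain HTTP. The port and input name below are illustrative; the `{"input": {...}}` body shape follows Cog's HTTP API:

```shell
# Poll until setup() has completed and the server reports READY
curl http://localhost:5000/health-check

# Run a prediction; inputs go under the "input" key
curl -X POST http://localhost:5000/predictions \
  -H "Content-Type: application/json" \
  -d '{"input": {"prompt": "hello"}}'
```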
Sources: docs/http.md1-86 README.md15-16 docs/deploy.md1-85
Cog integrates with:

- Container registries: r8.im (Replicate), Docker Hub, or any OCI-compliant registry
- PyPI: the source of the cog and coglet Python wheels during build
- GitHub Actions: workflows in .github/workflows/ for releases and testing

Sources: go.mod1-43 docs/deploy.md1-48
The cog.yaml file defines the Docker environment and prediction interface:
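A representative cog.yaml; the package names and versions are illustrative:

```yaml
build:
  gpu: true
  python_version: "3.12"
  python_packages:
    - "torch==2.3.1"
  system_packages:
    - "libgl1"
predict: "predict.py:Predictor"
```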
The configuration is parsed by pkg/config/Config and validated during build.
Sources: docs/yaml.md1-231 README.md21-31
The predict.py file implements the model interface:
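A minimal sketch of such a file; the greeting logic stands in for real model code. With this file, the cog.yaml `predict` entry would be "predict.py:Predictor":

```python
from cog import BasePredictor, Input

class Predictor(BasePredictor):
    def setup(self):
        # Runs once when the container starts; load weights here.
        self.prefix = "hello"  # stand-in for real weight loading

    def predict(self, text: str = Input(description="Text to greet")) -> str:
        # Runs for each prediction request.
        return f"{self.prefix} {text}"
```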
The predict field in cog.yaml must reference this class in the format "file.py:ClassName".
Sources: docs/python.md42-100 README.md34-53
Key Build Steps:
1. pkg/config/Config.Parse() reads cog.yaml and validates settings
2. A base image is selected: cog-base images (fastest), NVIDIA CUDA images, or Python slim images
3. pkg/dockerfile/StandardGenerator creates a multi-stage Dockerfile
4. pkg/wheels/WheelConfig determines where to get the cog and coglet packages (PyPI, local dist/, or a URL via environment variables)
5. The image is built with docker build using BuildKit

Build artifacts are stored in .cog/ (defined as global.CogBuildArtifactsFolder).
Sources: pkg/cli/build.go pkg/dockerfile pkg/image pkg/config pkg/wheels
Execution Flow:
1. The container runs python -m cog.server.http
2. The coglet Rust binary starts and imports predict.py
3. Predictor.setup() runs once to load weights
4. coglet exposes the /health-check endpoint, returning {"status": "READY"}
5. A client POSTs to /predictions with inputs
6. coglet creates a prediction scope for metrics and lifecycle management
7. coglet calls Predictor.predict() with the inputs
8. Outputs are streamed as predict() yields values
9. Custom metrics can be recorded with self.record_metric()

Sources: docs/http.md1-220 docs/deploy.md14-85 docs/python.md249-303
Cog uses lockstep versioning where all packages share the same version number:
- crates/Cargo.toml contains the canonical version (e.g., 0.17.1)
- cog CLI binary (Go): version embedded at build time
- cog Python SDK: published to PyPI
- coglet binary (Rust): published to PyPI as wheels
- coglet crate: published on crates.io

Release Process:
1. The version is bumped in crates/Cargo.toml
2. Pushing a tag such as v0.17.1 triggers .github/workflows/release-build.yaml
3. Artifacts are published by .github/workflows/release-publish.yaml

The cog and coglet wheels can be overridden at build time using environment variables:
- COG_SDK_WHEEL: path/URL to a custom cog wheel
- COGLET_WHEEL: path/URL to a custom coglet wheel

Sources: pkg/global/global.go10-12 docs/yaml.md151-172
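For example, a build against locally built wheels might look like the following; the wheel paths and image tag are illustrative:

```shell
# Use locally built wheels from dist/ instead of the published packages
COG_SDK_WHEEL=./dist/cog-0.17.1-py3-none-any.whl \
COGLET_WHEEL=./dist/coglet-0.17.1-py3-none-any.whl \
cog build -t my-model
```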
| Component | Language | Key Libraries | Purpose |
|---|---|---|---|
| CLI | Go 1.26 | cobra, docker/docker, google/go-containerregistry | User commands, Docker orchestration |
| Python SDK | Python 3.10-3.13 | pydantic | Model interface, input validation |
| Runtime Server | Rust | axum, tokio | High-performance HTTP serving |
| Build System | Go | moby/buildkit | Dockerfile generation, image building |
Sources: go.mod1-43 README.md1-117
For detailed information on specific subsystems, see the related wiki pages.