
Local-first Control Plane for Existing Ops Automation and AI Agent Workflows
Define workflows in simple declarative YAML, execute them anywhere with a single binary, compose complex pipelines from reusable sub-workflows, and distribute tasks across workers. The built-in Web UI removes the need to SSH into servers to debug failed runs, check logs, or retry steps manually. None of this requires databases, message brokers, or code changes to your existing scripts. Dagu natively supports running commands over SSH, Docker containers, and Kubernetes Jobs, and you can extend it with custom step types for your own use case.
Built for developers who want powerful workflow orchestration without the operational overhead.
Motivation
Legacy systems often have complex and implicit dependencies between jobs. When there are hundreds of cron jobs on a server, it can be difficult to keep track of these dependencies and to determine which job to rerun if one fails. It can also be a hassle to SSH into a server to view logs and manually rerun shell scripts one by one. Dagu aims to solve these problems by allowing you to explicitly visualize and manage pipeline dependencies as a DAG, and by providing a web UI for checking dependencies, execution status, and logs and for rerunning or stopping jobs with a simple mouse click.
How a Workflow Runs
Dagu does not make you rewrite the work. Your scripts, SQL files, containers, SSH commands, APIs, and services can stay as they are; the YAML adds inputs, order, logs, retries, approvals, artifacts, and recovery controls around them.
```yaml
params:
  - name: customer_id
    type: string
    description: Customer or account identifier
  - name: change_scope
    type: string
    description: What the repair is allowed to change
    enum:
      - metadata_only
      - permissions
      - full_account
    default: metadata_only
  - name: dry_run
    type: boolean
    default: true
steps:
  - id: inspect_account
    run: ./scripts/inspect-account.sh --customer "${customer_id}"
    stdout:
      artifact: reports/inspection.md
  - id: review
    action: noop
    depends: inspect_account
    approval:
      prompt: Review the inspection report before running the repair.
  - id: repair_account
    run: >-
      ./scripts/repair-account.sh
      --customer "${customer_id}"
      --scope "${change_scope}"
      --dry-run="${dry_run}"
    depends: review
    stdout:
      artifact: reports/repair.log
```

In this example, the DAG turns an existing account-repair runbook into a reviewed workflow. The params block gives Dagu enough information to render a guided input form before the run starts. The inspection output is stored as an artifact, the repair waits for explicit approval, and the submitted values, logs, artifacts, and status stay attached to the run history.
During a run, Dagu resolves dependencies, starts ready steps, captures stdout and stderr, tracks status, applies retry rules, pauses for approvals, stores artifacts, and updates the Web UI in real time.
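The dependency-resolution part of that description can be illustrated with a small Python sketch. This is not Dagu's implementation; it only shows the general idea of starting a step once everything it depends on has finished, using the standard-library `graphlib` module. The `execution_waves` helper and the `dependencies` mapping are hypothetical names introduced for illustration, with the step ids taken from the example above.

```python
from graphlib import TopologicalSorter

# Step -> set of steps it depends on, mirroring the `depends` fields
# in the account-repair example. This mapping is an illustration only.
dependencies = {
    "inspect_account": set(),
    "review": {"inspect_account"},
    "repair_account": {"review"},
}


def execution_waves(deps):
    """Group steps into waves; each step in a wave has all dependencies done."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph contains a cycle
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # steps whose dependencies are complete
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished so dependents become ready
    return waves


waves = execution_waves(dependencies)
print(waves)  # [['inspect_account'], ['review'], ['repair_account']]
```

Steps that share a wave have no dependency on each other, so a scheduler working this way could start them in parallel. In the example every step depends on the previous one, so each wave holds a single step.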
Core Terminology
Understanding Dagu is easier once the main terms are clear.
| Term | Meaning |
|---|---|
| DAG | A workflow file written in YAML. Steps run according to dependencies, so the execution order is explicit. |
| Step | One unit of work. A step can run a command, container, SSH command, HTTP request, SQL query, readiness wait, sub-workflow, or AI agent task. |
| Action | The kind of work a step runs, such as run, docker.run, kubernetes.run, ssh.run, http.request, postgres.query, wait.http, s3.upload, or agent.run. You can also define custom actions, call third-party actions, or use official actions such as duckdb@v1. |
| Dagu Action | A versioned action package such as python-script@v1, duckdb@v1, or ffmpeg@v1. |
| Parameter | A declared run input with a name, type, default, description, or allowed values. Parameters power the generated Web UI start form and keep submitted values visible with the run. |
| Tool | A pinned CLI package declared with tools. Dagu installs these before the run so host command steps use the expected binary version. |
| Run | One execution of a DAG. Runs keep status, logs, timing, outputs, and artifacts. |
| Notification | A UI-managed route that sends run events to Slack, email, Telegram, Google Chat, or webhooks. |
| Incident | A provider-backed failure lifecycle that opens on final failure, deduplicates repeated failures, and resolves after recovery. |
| Schedule | Cron-based automation for starting DAG runs, including timezone support. |
| Queue | Concurrency control for workflows, useful when jobs must not overlap or when workers are shared. |
| Worker | A machine that executes tasks in distributed mode. Workers can be selected by labels such as region, GPU, or environment. |
| Artifact | A file produced by a run and stored with the run history for preview, download, or audit. |
See Core Concepts for the deeper model.
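To make the Parameter term more concrete, the following Python sketch shows one way declared parameters could be turned into validated run inputs. It is a simplified model under stated assumptions, not Dagu's validation logic: `PARAMS`, `TYPES`, and `resolve_params` are hypothetical names, only the `string` and `boolean` types from the earlier example are handled, and treating a parameter without a default as required is an assumption made for the sketch.

```python
# Hypothetical in-memory form of the `params` block from the earlier example.
PARAMS = [
    {"name": "customer_id", "type": "string"},
    {
        "name": "change_scope",
        "type": "string",
        "enum": ["metadata_only", "permissions", "full_account"],
        "default": "metadata_only",
    },
    {"name": "dry_run", "type": "boolean", "default": True},
]

# Only the two types used in the example are modeled here.
TYPES = {"string": str, "boolean": bool}


def resolve_params(declared, submitted):
    """Apply defaults, then check the type and allowed values of each parameter."""
    resolved = {}
    for spec in declared:
        name = spec["name"]
        if name in submitted:
            value = submitted[name]
        elif "default" in spec:
            value = spec["default"]
        else:
            # Assumption for this sketch: no default means the input is required.
            raise ValueError(f"missing required parameter: {name}")
        if not isinstance(value, TYPES[spec["type"]]):
            raise ValueError(f"{name} must be of type {spec['type']}")
        if "enum" in spec and value not in spec["enum"]:
            raise ValueError(f"{name} must be one of {spec['enum']}")
        resolved[name] = value
    return resolved


values = resolve_params(PARAMS, {"customer_id": "c-123"})
print(values)
# {'customer_id': 'c-123', 'change_scope': 'metadata_only', 'dry_run': True}
```

Declaring the allowed values once lets the same information drive both the form a user sees and the check applied to what they submit, which is the role the table assigns to parameters.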
Why Teams Choose Dagu
The main reason teams choose Dagu is that it turns existing automation into safe, visible workflows without turning that work into a platform rollout.
Single binary
Install one executable. The default quickstart setup runs without an external database or broker and without splitting the scheduler, queue, or Web UI into separate required services.
Local-first storage
Run history, logs, and artifacts stay local by default, which keeps self-hosting simple and fits private-network, data-local, and air-gapped deployment patterns.
No-rewrite workflows
Wrap existing scripts and commands, SQL, dbt commands, DuckDB jobs, containers, SSH operations, HTTP calls, and other automation tasks instead of converting them into framework-specific jobs.
Generated forms
Declare typed parameters in YAML and Dagu automatically presents the right inputs in the Web UI, including defaults, descriptions, and allowed values.
Reproducible CLI tools
Declare pinned external tool packages in the DAG so portable CLIs such as jq, yq, and duckdb are installed automatically and every worker runs the same binary version.
Observable by default
Every run has status, per-step logs, timing and history, artifacts, approvals, notifications, incident routing, and UI controls for debugging, recovery, and handoff.
Scales gradually
Start on one machine, then move heavy, regional, or specialized jobs to distributed workers with label-based routing.
Plain YAML
Workflows live as plain YAML, can be reviewed in Git, generated with reusable tooling, edited by AI agents, and checked with validation before they run.
Architecture at a Glance
Dagu can run in a small local setup or scale out when workloads grow. The operating model changes, but the workflow YAML does not need to be rewritten.
Standalone
dagu start-all runs the Web UI, scheduler, and workflow runtime in one process.
Best for one server, a team utility box, a private automation host, or getting started quickly.
Headless
Run workflows from the CLI or API without relying on the Web UI.
Best for CI-like automation, locked-down servers, or environments where Dagu is managed by another system.
Coordinator and Workers
The scheduler queues work, the coordinator assigns tasks, and workers execute DAGs over gRPC.
Best for many machines, GPU jobs, regional routing, mixed workloads, and high-throughput batch processing.
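Label-based worker selection can be pictured with a short Python sketch. It assumes, purely for illustration, that a worker is eligible when its labels contain every key/value pair a task asks for; the actual matching rules in Dagu may differ. The `eligible_workers` function, the worker names, and the label values are all hypothetical.

```python
def eligible_workers(workers, required_labels):
    """Return names of workers whose labels include every required key/value pair."""
    return [
        name
        for name, labels in workers.items()
        if all(labels.get(key) == value for key, value in required_labels.items())
    ]


# Hypothetical worker pool: worker name -> labels it advertises.
workers = {
    "worker-a": {"region": "us-east", "gpu": "true"},
    "worker-b": {"region": "us-east", "gpu": "false"},
    "worker-c": {"region": "eu-west", "gpu": "true"},
}

gpu_us = eligible_workers(workers, {"region": "us-east", "gpu": "true"})
print(gpu_us)  # ['worker-a']
```

A task with no label requirements matches every worker in this model, which fits the idea of starting without routing and adding labels only for heavy, regional, or specialized jobs.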
See Architecture for internals and storage, and Deployment Models for local, self-hosted, managed, and hybrid deployment options.
How Dagu Is Different
| Existing problem | Dagu path |
|---|---|
| Operational tasks are scattered across scripts, SQL files, SSH commands, API calls, cron entries, and engineer runbooks | One YAML workflow with parameters, dependencies, approvals, retries, logs, artifacts, and run controls. |
| A custom admin UI is needed just to let support or operations teams run a safe command | Declare parameters in YAML and let Dagu generate the Web UI input form, validation, logs, and run history. |
| A cloud job platform would move execution away from private data, credentials, and internal networks | Run workflows where the data, credentials, files, and existing CLIs already live. |
| A large orchestrator is too much infrastructure for scripts and runbooks | Start with one binary and file-backed state, then add queues and distributed workers only when needed. |
| Important runbooks still require manual SSH sessions and tribal knowledge | Reviewed workflows give operators safe execution while engineers keep commands, logs, outputs, and approvals traceable. |
Real-World Use Cases
Dagu is useful anywhere existing scripts, containers, SQL jobs, operational tasks, or agent-driven jobs need parameters, approvals, scheduling, retries, visibility, and a safe way for a team to run them.
ETL and Data Operations
Run: PostgreSQL and SQLite queries, DuckDB through the official action, dbt commands, S3 transfers, pinned jq or yq tools, readiness waits, validation steps, and sub-workflows.
Why Dagu fits: daily data workflows stay declarative, run close to private data, remain easy to inspect in the Web UI, and are straightforward to retry when one step fails.
Cron and Legacy Script Management
Run: existing shell scripts, Python scripts, HTTP calls, and scheduled jobs without rewriting them.
Why Dagu fits: dependencies, logs, retries, and run history become visible in the Web UI instead of being hidden across crontabs and server log files.
Media Conversion
Run: shell-driven media tools like ffmpeg, thumbnail extraction, audio normalization, image processing, and other compute-heavy jobs.
Why Dagu fits: conversion work can run across distributed workers while run history, logs, and artifacts stay visible in one place for monitoring, debugging, and retries.
Infrastructure and Server Automation
Run: SSH backups, cleanup jobs, deploy scripts, patch windows, precondition checks, and lifecycle hooks.
Why Dagu fits: remote operations get schedules, retries, notifications, incident routing, and per-step logs without requiring operators to SSH into servers for every recovery.
GitHub-driven Workflows
Run: PR validation, preview deployments, release workflows, check reruns, workflow_dispatch, and repository_dispatch from GitHub.
Why Dagu fits: GitHub Integration keeps GitHub as the trigger source while Dagu executes the DAG on your licensed server and reports checks, reactions, and comments back to GitHub.
Container and Kubernetes Workflows
Run: Docker images, Kubernetes Jobs, shell glue, and follow-up validation steps.
Why Dagu fits: teams can compose image-based tasks and route them to the right workers with worker labels instead of building a custom control plane.
Customer Support Automation
Run: diagnostics, account repair jobs, data checks, and approval-gated support actions.
Why Dagu fits: non-engineers can run reviewed workflows from the Web UI while engineers keep logs and results traceable.
IoT and Edge Workflows
Run: sensor polling, local cleanup, offline sync, health checks, and device maintenance jobs.
Why Dagu fits: the single binary works well on small devices while still providing visibility through the Web UI.
AI Agent Workflows
Run: AI agent steps, agent-authored YAML workflows, log analysis, repair steps, and human-reviewed automation.
Why Dagu fits: workflows stay in plain YAML, so agents can create and debug them while humans keep logs, approvals, and run history in one place.
TIP
If it can run from a shell command, Docker image, Kubernetes Job, SSH session, HTTP call, readiness wait, or SQL query, Dagu can usually orchestrate it without rewriting the underlying tool. For portable host CLIs, add tools so the DAG controls the binary version too.
AI Agents and Workflow Operator
Dagu includes AI features, but they build on the same local-first workflow model. The built-in MCP server lets MCP-capable agents read Dagu state, preview or apply workflow changes, and start, enqueue, retry, or stop runs. Agent steps and external agent CLIs can also run inside workflows, with the same scheduling, logs, retries, approvals, and run history as any other step.
```yaml
steps:
  - id: analyze_logs
    action: agent.run
    with:
      task: |
        Analyze /var/log/app/errors.log from the last hour.
        Summarize likely causes and suggest a safe recovery plan.
    output: ANALYSIS_RESULT
```

Workflow Operator connects Slack, Telegram, Discord, or LINE to the built-in steward, so teams can ask for run status, debug failures, re-run workflows, and approve actions from chat.
- MCP Server explains how agents can inspect state and operate workflows through Dagu.
- AI Agent Authoring explains workflow generation and debugging with coding agents.
- Agent Step explains how to run agent tasks inside DAGs.
- Workflow Operator explains chat-operator setup.
