HolmesGPT — The CNCF SRE Agent

Open-source AI agent for investigating production incidents and finding root causes. A CNCF Sandbox project by Robusta.dev.

Petabyte-scale data: Server-side filtering, JSON tree traversal, and tool output transformers keep large payloads out of context windows
Deep integrations: Prometheus, Grafana, Datadog, Kubernetes, and many more—plus any REST API
Bidirectional alert integrations: Fetch alerts from AlertManager, PagerDuty, OpsGenie, or Jira—and write findings back
Any LLM provider: OpenAI, Anthropic, Azure, Bedrock, Gemini, and more
Operator mode: Run investigations on a schedule as a Kubernetes operator
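
To make the "Petabyte-scale data" point above concrete, here is a minimal sketch of how a large JSON payload might be reduced before it reaches a model. It is an illustration under stated assumptions, not HolmesGPT's actual transformer code: the function names `summarize` and `pick`, the depth and item limits, and the sample payload are all hypothetical.

```python
from typing import Any


def summarize(node: Any, max_depth: int = 2, max_items: int = 3) -> Any:
    """Return a shallow outline of a JSON-like tree (hypothetical helper).

    Containers deeper than max_depth are replaced by a one-line description,
    and long arrays are cut to their first max_items entries.
    """
    if isinstance(node, dict):
        if max_depth == 0:
            return f"<object with {len(node)} keys>"
        return {k: summarize(v, max_depth - 1, max_items) for k, v in node.items()}
    if isinstance(node, list):
        if max_depth == 0:
            return f"<array of {len(node)} items>"
        head = [summarize(v, max_depth - 1, max_items) for v in node[:max_items]]
        if len(node) > max_items:
            head.append(f"<{len(node) - max_items} more items>")
        return head
    return node  # scalars pass through unchanged


def pick(node: Any, path: str) -> Any:
    """Follow a dotted path such as 'items.0.status.phase' into the tree."""
    for part in path.split("."):
        node = node[int(part)] if isinstance(node, list) else node[part]
    return node


# Sample data invented for this example.
payload = {
    "items": [
        {"name": f"pod-{i}", "status": {"phase": "Running"}} for i in range(100)
    ]
}
outline = summarize(payload)
phase = pick(payload, "items.0.status.phase")
```

The idea is that an agent can first look at the small outline, then request only the subtree it needs with a path, so the full payload never has to enter the context window.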

How it Works

HolmesGPT uses an agentic loop to query live observability data from multiple sources and identify root causes.
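
The agentic loop described above can be sketched in a few lines. This is a simplified illustration, not the project's implementation: `fake_llm`, `fake_pod_logs`, `investigate`, and the message format are hypothetical stand-ins, and the "model" is a canned function so that the example runs without an API key.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical tool type: takes one string argument, returns text.
Tool = Callable[[str], str]


def fake_pod_logs(pod: str) -> str:
    """Stand-in for a real log query; returns canned data."""
    return f"{pod}: OOMKilled, container exceeded memory limit 256Mi"


def fake_llm(history: List[str]) -> Tuple[str, str]:
    """Stand-in for an LLM call.

    Returns ("call", "<tool> <arg>") to request a tool,
    or ("answer", text) once it has seen an observation.
    """
    if not any(line.startswith("observation:") for line in history):
        return ("call", "pod_logs checkout-7d9f")
    return ("answer", "Root cause: checkout-7d9f was OOMKilled (limit 256Mi).")


def investigate(question: str, tools: Dict[str, Tool], max_steps: int = 5) -> str:
    """Ask the model, run the tool it requests, feed the result back, repeat."""
    history = [f"question: {question}"]
    for _ in range(max_steps):
        kind, payload = fake_llm(history)
        if kind == "answer":
            return payload
        name, _, arg = payload.partition(" ")
        result = tools[name](arg) if name in tools else f"unknown tool {name}"
        history.append(f"observation: {result}")
    return "No conclusion within the step budget."


answer = investigate("Why is checkout crashing?", {"pod_logs": fake_pod_logs})
```

The step budget (`max_steps`) is one simple way to bound such a loop; a real agent would also need to handle tool errors and model output that does not parse.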

🔗 Data Sources

HolmesGPT integrates with popular observability and cloud platforms. The data sources ("toolsets") below are built in, and you can add your own.

Data Source	Notes
AKS	Azure Kubernetes Service cluster and node health diagnostics
ArgoCD	Status, history, manifests, and more for apps, projects, and clusters
AWS	RDS events, instances, slow query logs, and more (MCP)
Azure	Azure resources and diagnostics (MCP)
Azure SQL	Database health, performance, connections, and slow queries
Confluence	Private runbooks and documentation
Coralogix	Retrieve logs for any resource
Datadog	Query logs, metrics, and traces
Docker	Get images, logs, events, history and more
Elasticsearch / OpenSearch	Query logs, cluster health, shard and index diagnostics
GCP	Google Cloud Platform resources (MCP)
GitHub	Repositories, issues, and pull requests (MCP)
Grafana	Query and analyze dashboard configurations and panels
Helm	Release status, chart metadata, and values
Internet	Public runbooks, community docs, etc.
Kafka	Fetch metadata, list consumers and topics or find lagging consumer groups
Kubernetes	Pod logs, K8s events, and resource status (kubectl describe)
Loki	Query logs for Kubernetes resources or any query
MariaDB	MariaDB database queries and diagnostics (MCP)
MongoDB Atlas	Cluster health, slow queries, and performance diagnostics
NewRelic	Investigate alerts, query tracing data
OpenShift	Projects, routes, builds, security context constraints, and deployment configs
Prometheus	Investigate alerts, query metrics and generate PromQL queries
RabbitMQ	Partitions, memory/disk alerts, troubleshoot split-brain scenarios and more
Robusta	Multi-cluster monitoring, historical change data, runbooks, PromQL graphs and more
ServiceNow	Query tables and incident records
Sentry	Error tracking, issues, and performance monitoring (MCP)
Slab	Team knowledge base and runbooks on demand
Splunk	Log search and analysis (MCP)
SQL Databases	PostgreSQL, MySQL, ClickHouse, MariaDB, SQL Server, SQLite
Tempo	Fetch trace info and debug issues such as high application latency

See the full list of built-in toolsets for additional integrations including Cilium, KubeVela, Notion, Prefect, and more.

🚀 End-to-End Automation

HolmesGPT can fetch alerts/tickets to investigate from external systems, then write the analysis back to the source or Slack.

Integration	Status	Notes
Slack	✅	Demo. Available via Robusta.dev (commercial platform)
Microsoft Teams	✅	Available via Robusta.dev (commercial platform)
Prometheus/AlertManager	✅	Robusta SaaS or HolmesGPT CLI
PagerDuty	✅	HolmesGPT CLI only
OpsGenie	✅	HolmesGPT CLI only
Jira	✅	HolmesGPT CLI only
GitHub	✅	HolmesGPT CLI only
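
As a rough sketch of the fetch-then-write-back flow described above, the snippet below turns an alert into an investigation question and formats the resulting analysis as a comment for the source system. It assumes an alert shaped like an AlertManager alert (a dict with `labels` and `annotations`); the helper names `alert_to_question` and `format_finding` and the sample alert are hypothetical and not part of HolmesGPT.

```python
from typing import Any, Dict


def alert_to_question(alert: Dict[str, Any]) -> str:
    """Build a one-line investigation prompt from an alert (hypothetical helper)."""
    labels = alert.get("labels", {})
    annotations = alert.get("annotations", {})
    name = labels.get("alertname", "unknown alert")
    # Sort the remaining labels so the prompt is stable across runs.
    scope = ", ".join(
        f"{k}={v}" for k, v in sorted(labels.items()) if k != "alertname"
    ) or "no labels"
    summary = annotations.get("summary", "no summary provided")
    return f"Investigate alert {name} ({scope}): {summary}"


def format_finding(alert: Dict[str, Any], analysis: str) -> str:
    """Format an analysis as a comment to post back to the alert's source."""
    name = alert.get("labels", {}).get("alertname", "unknown alert")
    return f"[{name}] Automated analysis:\n{analysis}"


# Sample alert invented for this example.
alert = {
    "labels": {
        "alertname": "KubePodCrashLooping",
        "namespace": "shop",
        "pod": "checkout-7d9f",
    },
    "annotations": {"summary": "Pod is restarting"},
}
question = alert_to_question(alert)
comment = format_finding(alert, "Container exceeded its memory limit.")
```

In a real pipeline, the question would go to the investigation step and the comment would be sent through the source system's own API, which is omitted here.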

Installation

Read the installation documentation to learn how to install HolmesGPT.

Supported LLM Providers

Read the LLM Providers documentation to learn how to set up your LLM API key.

Using HolmesGPT

See the walkthrough documentation for usage guides, including:

Interactive mode for asking questions and follow-ups
Investigating Prometheus alerts
CI/CD troubleshooting

🔐 Data Privacy

By design, HolmesGPT has read-only access and respects RBAC permissions. It is safe to run in production environments.

License

Distributed under the Apache 2.0 License. See LICENSE for more information.

Community

Join our community to discuss the HolmesGPT roadmap and share feedback:

Community Meetups

Support

If you have any questions, feel free to message us on the HolmesGPT Slack channel.

How to Contribute

Please read our CONTRIBUTING.md for guidelines and instructions.

For help, contact us on Slack or ask DeepWiki AI your questions.

Please make sure to follow the CNCF code of conduct - details here.
