HolmesGPT/holmesgpt

HolmesGPT — The CNCF SRE Agent

Installation | Docs | Ask DeepWiki

Open-source AI agent for investigating production incidents and finding root causes. A CNCF Sandbox project by Robusta.Dev.

  • Petabyte-scale data: Server-side filtering, JSON tree traversal, and tool output transformers keep large payloads out of context windows
  • Deep integrations: Prometheus, Grafana, Datadog, Kubernetes, and many more—plus any REST API
  • Bidirectional alert integrations: Fetch alerts from AlertManager, PagerDuty, OpsGenie, or Jira—and write findings back
  • Any LLM provider: OpenAI, Anthropic, Azure, Bedrock, Gemini, and more
  • Operator mode: Run investigations on a schedule as a Kubernetes operator
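As an illustration of the first bullet, the general idea of walking a JSON tree and keeping only an outline of a large payload can be sketched as follows. This is a minimal sketch of the technique, not HolmesGPT's implementation: the `prune` helper, its parameters, and the sample payload are all assumptions made up for this example.

```python
import json
from typing import Any


def prune(node: Any, max_depth: int = 2, max_items: int = 3) -> Any:
    """Return a shallow outline of a JSON value (hypothetical helper)."""
    if isinstance(node, dict):
        if max_depth == 0:
            # Too deep: replace the subtree with a one-line summary.
            return f"<dict with {len(node)} keys>"
        return {k: prune(v, max_depth - 1, max_items) for k, v in node.items()}
    if isinstance(node, list):
        if max_depth == 0:
            return f"<list with {len(node)} items>"
        # Keep only the first few elements and note how many were dropped.
        kept = [prune(v, max_depth - 1, max_items) for v in node[:max_items]]
        if len(node) > max_items:
            kept.append(f"<{len(node) - max_items} more items>")
        return kept
    return node  # scalars pass through unchanged


# Made-up payload shaped like a metrics query response with 500 series.
payload = {
    "status": "success",
    "data": {
        "result": [{"metric": {"pod": f"api-{i}"}, "value": i} for i in range(500)]
    },
}

outline = prune(payload)
print(json.dumps(outline))
```

The outline is small enough to hand to a model, which can then ask for a specific subtree instead of receiving all 500 entries at once.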

How it Works

HolmesGPT uses an agentic loop to query live observability data from multiple sources and identify root causes.
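The shape of such an agentic loop can be sketched in a few lines. Everything here is an assumption for illustration: `fake_model` stands in for a real LLM call, the two tools return canned strings instead of querying live systems, and none of the names come from the HolmesGPT codebase.

```python
from typing import Callable

# Hypothetical tools: each returns canned text instead of querying a real system.
TOOLS: dict[str, Callable[[str], str]] = {
    "pod_status": lambda pod: f"{pod}: CrashLoopBackOff, 12 restarts",
    "pod_logs": lambda pod: f"{pod}: OOMKilled, memory limit 128Mi exceeded",
}


def fake_model(question: str, observations: list[str]) -> dict:
    """Stand-in for an LLM call: pick the next tool, or give a final answer."""
    if len(observations) == 0:
        return {"tool": "pod_status", "arg": "checkout-7d9f"}
    if len(observations) == 1:
        return {"tool": "pod_logs", "arg": "checkout-7d9f"}
    return {"answer": "Likely root cause: container exceeds its memory limit."}


def investigate(question: str, max_steps: int = 5) -> tuple[str, list[str]]:
    """Alternate between asking the model and running the tool it requests."""
    observations: list[str] = []
    for _ in range(max_steps):
        decision = fake_model(question, observations)
        if "answer" in decision:
            return decision["answer"], observations
        # Run the requested tool and feed its output back into the next turn.
        result = TOOLS[decision["tool"]](decision["arg"])
        observations.append(result)
    return "No conclusion within the step budget.", observations


answer, evidence = investigate("Why is checkout failing?")
print(answer)
for line in evidence:
    print(" -", line)
```

The step budget (`max_steps`) is a design choice in this sketch: it bounds cost and guarantees termination even if the model never settles on an answer.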

[Image: HolmesGPT architecture diagram]

[Image: HolmesGPT Investigation Demo]

🔗 Data Sources

HolmesGPT integrates with popular observability and cloud platforms. The following data sources ("toolsets") are built-in. Add your own.

Data Source | Notes
AKS | Azure Kubernetes Service cluster and node health diagnostics
ArgoCD | Get status, history, manifests, and more for apps, projects, and clusters
AWS | RDS events, instances, slow query logs, and more (MCP)
Azure | Azure resources and diagnostics (MCP)
Azure SQL | Database health, performance, connections, and slow queries
Confluence | Private runbooks and documentation
Coralogix | Retrieve logs for any resource
Datadog | Query logs, metrics, and traces
Docker | Get images, logs, events, history, and more
Elasticsearch / OpenSearch | Query logs, cluster health, shard and index diagnostics
GCP | Google Cloud Platform resources (MCP)
GitHub | Repositories, issues, and pull requests (MCP)
Grafana | Query and analyze dashboard configurations and panels
Helm | Release status, chart metadata, and values
Internet | Public runbooks, community docs, etc.
Kafka | Fetch metadata, list consumers and topics, or find lagging consumer groups
Kubernetes | Pod logs, K8s events, and resource status (kubectl describe)
Loki | Query logs for Kubernetes resources or any query
MariaDB | MariaDB database queries and diagnostics (MCP)
MongoDB Atlas | Cluster health, slow queries, and performance diagnostics
NewRelic | Investigate alerts, query tracing data
OpenShift | Projects, routes, builds, security context constraints, and deployment configs
Prometheus | Investigate alerts, query metrics, and generate PromQL queries
RabbitMQ | Partitions, memory/disk alerts, troubleshoot split-brain scenarios, and more
Robusta | Multi-cluster monitoring, historical change data, runbooks, PromQL graphs, and more
ServiceNow | Query tables and incident records
Sentry | Error tracking, issues, and performance monitoring (MCP)
Slab | Team knowledge base and runbooks on demand
Splunk | Log search and analysis (MCP)
SQL Databases | PostgreSQL, MySQL, ClickHouse, MariaDB, SQL Server, SQLite
Tempo | Fetch trace info and debug issues such as high application latency

See the full list of built-in toolsets for additional integrations including Cilium, KubeVela, Notion, Prefect, and more.

🚀 End-to-End Automation

HolmesGPT can fetch alerts/tickets to investigate from external systems, then write the analysis back to the source or Slack.
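The fetch, investigate, and write-back flow described above can be sketched with an in-memory stand-in for the external system. All class and function names below (`Alert`, `InMemoryAlertSource`, `analyze`, `run_once`) are hypothetical and exist only for this example; a real integration would call the alerting or ticketing system's API and run an actual investigation in place of the placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class Alert:
    id: str
    title: str
    comments: list[str] = field(default_factory=list)


class InMemoryAlertSource:
    """Hypothetical stand-in for an alerting or ticketing system."""

    def __init__(self, alerts: list[Alert]) -> None:
        self._alerts = {a.id: a for a in alerts}

    def fetch_open_alerts(self) -> list[Alert]:
        return list(self._alerts.values())

    def write_back(self, alert_id: str, analysis: str) -> None:
        # A real system would post a comment or note on the alert or ticket.
        self._alerts[alert_id].comments.append(analysis)


def analyze(alert: Alert) -> str:
    # Placeholder for the actual investigation step.
    return f"Analysis for '{alert.title}': (findings would go here)"


def run_once(source: InMemoryAlertSource) -> int:
    """Fetch every open alert, analyze it, and write the result back."""
    handled = 0
    for alert in source.fetch_open_alerts():
        source.write_back(alert.id, analyze(alert))
        handled += 1
    return handled


source = InMemoryAlertSource(
    [Alert("A-1", "KubePodCrashLooping"), Alert("A-2", "HighLatency")]
)
handled = run_once(source)
print(handled, source.fetch_open_alerts()[0].comments)
```

Keeping fetch and write-back behind one small interface means the same loop could, in principle, be pointed at a different source without changing the analysis step.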

Integration | Status | Notes
Slack | | Demo. Available via Robusta.dev (commercial platform)
Microsoft Teams | | Available via Robusta.dev (commercial platform)
Prometheus/AlertManager | | Robusta SaaS or HolmesGPT CLI
PagerDuty | | HolmesGPT CLI only
OpsGenie | | HolmesGPT CLI only
Jira | | HolmesGPT CLI only
GitHub | | HolmesGPT CLI only

Installation

Read the installation documentation to learn how to install HolmesGPT.

Supported LLM Providers

Read the LLM Providers documentation to learn how to set up your LLM API key.

Using HolmesGPT

See the walkthrough documentation for usage guides.

🔐 Data Privacy

By design, HolmesGPT has read-only access and respects RBAC permissions. It is safe to run in production environments.

License

Distributed under the Apache 2.0 License. See LICENSE for more information.

Community

Join our community to discuss the HolmesGPT roadmap and share feedback.

Support

If you have any questions, feel free to message us on the HolmesGPT Slack channel.

How to Contribute

Please read our CONTRIBUTING.md for guidelines and instructions.

For help, contact us on Slack or ask DeepWiki AI your questions.

Please make sure to follow the CNCF Code of Conduct (details here).
