# openbench

## Docs

- [Benchmarks Catalog](https://openbench.dev/benchmarks/catalog.md): Complete catalog of all available benchmarks with evaluation commands
- [Coding](https://openbench.dev/benchmarks/coding.md): Evaluate AI models on code generation and programming tasks
- [Cybersecurity](https://openbench.dev/benchmarks/cybersecurity.md): Evaluate AI models on cybersecurity tasks
- [Knowledge](https://openbench.dev/benchmarks/knowledge.md): Evaluate AI models on knowledge and comprehension tasks
- [Math](https://openbench.dev/benchmarks/math.md): Evaluate AI models on mathematical reasoning and problem solving
- [Reasoning](https://openbench.dev/benchmarks/reasoning.md): Evaluate AI models on reasoning, logic, and multimodal understanding
- [Search](https://openbench.dev/benchmarks/search.md): Evaluate models and agents with search
- [Changelog](https://openbench.dev/changelog.md): Version history and release notes for openbench
- [bench cache](https://openbench.dev/cli/cache.md): Manage and view the evaluation cache
- [bench describe](https://openbench.dev/cli/describe.md): Get detailed information about a specific benchmark
- [bench eval](https://openbench.dev/cli/eval.md): Run benchmark evaluations on language models
- [bench eval-retry](https://openbench.dev/cli/eval-retry.md): Retry failed or incomplete evaluations
- [bench list](https://openbench.dev/cli/list.md): List all available benchmarks and their descriptions
- [CLI Overview](https://openbench.dev/cli/overview.md): Command-line interface for running openbench evaluations
- [bench view](https://openbench.dev/cli/view.md): View and analyze previous evaluation results
- [Configuration](https://openbench.dev/configuration.md): Set openbench model and evaluation behavior
- [Architecture](https://openbench.dev/development/architecture.md): Understanding openbench's design and implementation
- [Contributing](https://openbench.dev/development/contributing.md): Guidelines for contributing to openbench
- [Extending openbench](https://openbench.dev/development/extending.md): Create and distribute custom benchmarks as Python packages
- [Multiple Choice Evals (MCQ)](https://openbench.dev/development/mcq.md): Robust infrastructure for integrating multiple choice question (MCQ) benchmarks
- [Installation](https://openbench.dev/installation.md): Get openbench up and running in your environment
- [Model Providers](https://openbench.dev/providers.md): openbench supports 30+ model providers
- [Quickstart](https://openbench.dev/quickstart.md): Run your first benchmark evaluation in three easy steps
- [Release Notes - v0.5.0](https://openbench.dev/release-notes.md): OpenBench 0.5.0: ARC-AGI, 350+ new evals, plugins, provider routing, coding harnesses, tool-calling, and more
- [Tips and Troubleshooting](https://openbench.dev/troubleshooting.md): Get the most out of openbench with best practices and solutions to common issues

## OpenAPI Specs

- [openapi](https://openbench.dev/api-reference/openapi.json)

## Optional

- [GitHub](https://github.com/groq/openbench)
- [PyPI](https://pypi.org/project/openbench/)