# openbench

## Docs

- [Benchmarks Catalog](https://openbench.dev/benchmarks/catalog.md): Complete catalog of all available benchmarks with evaluation commands
- [Coding](https://openbench.dev/benchmarks/coding.md): Evaluate AI models on code generation and programming tasks
- [Cybersecurity](https://openbench.dev/benchmarks/cybersecurity.md): Evaluate AI models on cybersecurity tasks
- [Knowledge](https://openbench.dev/benchmarks/knowledge.md): Evaluate AI models on knowledge and comprehension tasks
- [Math](https://openbench.dev/benchmarks/math.md): Evaluate AI models on mathematical reasoning and problem solving
- [Reasoning](https://openbench.dev/benchmarks/reasoning.md): Evaluate AI models on reasoning, logic, and multimodal understanding
- [Search](https://openbench.dev/benchmarks/search.md): Evaluate models and agents with search
- [Changelog](https://openbench.dev/changelog.md): Version history and release notes for openbench
- [bench cache](https://openbench.dev/cli/cache.md): Manage and view the evaluation cache
- [bench describe](https://openbench.dev/cli/describe.md): Get detailed information about a specific benchmark
- [bench eval](https://openbench.dev/cli/eval.md): Run benchmark evaluations on language models
- [bench eval-retry](https://openbench.dev/cli/eval-retry.md): Retry failed or incomplete evaluations
- [bench list](https://openbench.dev/cli/list.md): List all available benchmarks and their descriptions
- [CLI Overview](https://openbench.dev/cli/overview.md): Command-line interface for running openbench evaluations
- [bench view](https://openbench.dev/cli/view.md): View and analyze previous evaluation results
- [Configuration](https://openbench.dev/configuration.md): Set openbench model and evaluation behavior
- [Architecture](https://openbench.dev/development/architecture.md): Understanding openbench's design and implementation
- [Contributing](https://openbench.dev/development/contributing.md): Guidelines for contributing to openbench
- [Extending openbench](https://openbench.dev/development/extending.md): Create and distribute custom benchmarks as Python packages
- [Multiple Choice Evals (MCQ)](https://openbench.dev/development/mcq.md): Robust infrastructure for integrating multiple choice question (MCQ) benchmarks
- [Installation](https://openbench.dev/installation.md): Get openbench up and running in your environment
- [Model Providers](https://openbench.dev/providers.md): openbench supports 30+ model providers
- [Quickstart](https://openbench.dev/quickstart.md): Run your first benchmark evaluation in three easy steps
- [Release Notes - v0.5.0](https://openbench.dev/release-notes.md): OpenBench 0.5.0: ARC-AGI, 350+ new evals, plugins, provider routing, coding harnesses, tool-calling, and more
- [Tips and Troubleshooting](https://openbench.dev/troubleshooting.md): Get the most out of openbench with best practices and solutions to common issues

## OpenAPI Specs

- [openapi](https://openbench.dev/api-reference/openapi.json)

## Optional

- [GitHub](https://github.com/groq/openbench)
- [PyPI](https://pypi.org/project/openbench/)