Our Programs and Services

Our Evaluation Overview

Of four established types of AI evaluations, Humane Intelligence provides three as programs and services: red teaming, bias bounties, and contextual (bespoke) evaluations. Red teaming is a great first point of entry to AI evaluations for people and organizations with deep expertise in a given domain or subject matter. Data outputs from red teaming events can then be used to create bias bounty challenges. Contextual evals are the best fit for organizations wishing to do a comprehensive analysis of their AI model or system. Note that AI evaluations are often iterative and non-linear. The chart below is intentionally reductive to demonstrate these relationships and reflects how Humane Intelligence approaches these evaluation types; other organizations may define them differently.
Humane Intelligence’s AI evaluation service offerings

Our Types of Evaluations


Bias Bounty

Bias bounties are collaboratively designed sets of challenges that bring together researchers, impacted communities, and domain experts to rigorously examine and improve AI / ML systems, models, and datasets. Humane Intelligence is launching new challenges in September and October 2025.


Red Teaming

Red teaming is a semi-structured testing approach to assess and improve the safety and effectiveness of AI models and systems by identifying vulnerabilities, limitations, and potential areas for improvement. Humane Intelligence offers red teaming events as a paid service.

Participants seated and listening to Rumman present at the IMDA Singapore event

Contextual Evaluations

AI contextual evaluations are rigorous, mixed-method, bespoke assessments designed to provide a comprehensive analysis of an AI model or system's performance in a specified problem space. Humane Intelligence designs and runs contextual evals as a paid service.

In Terms of Cookies…

Bias Bounties vs Red Teaming

We are often asked for examples of red teaming and bias bounties. Imagine a red teaming exercise for cookies (yes, cookies). A cookie red teaming event could involve participants of any background or skill level identifying the basics of the situation: does the cookie taste good or bad? Is it too hard? Was it baked correctly, or does it fall apart when picked up? A cookie bias bounty goes deeper and would involve people with more skill or knowledge identifying what went wrong. In this case, someone who works in a bakery might realize that salt was used instead of sugar because of a mislabelled sugar container. Likewise, red teaming is the best starting point for identifying potential issues with AI / ML models or systems; bias bounties then go further to explain the problem and can result in mitigations or solutions.
What red teaming or a bias bounty could tell us about our cookies
Let’s work together

Want to hire us?

We have worked with education companies, international civil society organizations, industry, and governments to design and run red teaming events, bias bounties, and bespoke contextual evaluations.

Sign up for our newsletter