Hey everyone, we're ⚪ White Circle
We're building the most advanced runtime safety and alignment infrastructure for AI in the real world.
Read more about us in Fortune ↓
All code, prompts, and data are open-sourced on GitHub and HuggingFace.
We also built an interactive game so you can check your own odds of survival!
Check it out and read the full report at
Religion:
- Atheists, Satanists, and Scientologists get targeted the most.
- Jewish and Hindu people are consistently the most protected across models.
Nationalities:
- OpenAI and Anthropic models slightly prefer targeting Americans over Chinese people.
- Grok has the strongest anti-Chinese bias of all the models we tested.
- Mistral goes after Americans, Germans, and Russians the most.
Introducing ⚪️ KillBench — a benchmark of hidden LLM biases in critical decisions.
We ran millions of life-and-death scenarios across every major LLM, varying nationality, religion, gender, and more.
Every AI model is biased.
Here's what we found ↓
Introducing Mistral AI's biggest hackathon ever!
📅 Feb 28 - Mar 1
🌍 Paris | London | NY | SF | Tokyo | Singapore | Sydney & online
48 hours. The best hackers.
🤝 Partners: @wandb @nvidia @awscloud @HackIterate
🏆 $200K in prizes. Special awards from @elevenlabs @huggingface