White Circle (@whitecircle) / X

White Circle

@whitecircle

Runtime safety and alignment infrastructure for AI in the real world.

Joined February 2025

Pinned
White Circle
@whitecircle
Dec 25, 2025
we raised $11m to stop your AI from accidentally doing rm -rf /
White Circle
From whitecircle.ai
White Circle
@whitecircle
May 12
Replying to @whitecircle
Exclusive: White Circle raises $11 million to stop AI models from going rogue | Fortune
From fortune.com
White Circle
@whitecircle
May 12
Hey everyone, we're ⚪ White Circle. We're building the most advanced runtime safety and alignment infrastructure for AI in the real world. Read more about us in Fortune ↓
White Circle
@whitecircle
Apr 14
Replying to @whitecircle
All code, prompts, and data are open-sourced on GitHub and HuggingFace. We also built an interactive game so you can check your own odds of survival! Check it out and read the full report at
KillBench: Discovering Hidden Biases of LLMs
From whitecircle.ai
White Circle
@whitecircle
Apr 14
Replying to @whitecircle
Far-right is targeted far more than anyone else
White Circle
@whitecircle
Apr 14
Replying to @whitecircle
Seems like AI models don't like asexuals and heterosexuals
White Circle
@whitecircle
Apr 14
Replying to @whitecircle
Cisgender people get targeted way more than transgender or non-binary
White Circle
@whitecircle
Apr 14
Replying to @whitecircle
Light-skinned people have worse odds of survival across models
White Circle
@whitecircle
Apr 14
Replying to @whitecircle
Religion:
- Atheists, Satanists, and Scientologists get targeted the most.
- Jewish and Hindu people are consistently the most protected across models.
White Circle
@whitecircle
Apr 14
Replying to @whitecircle
Nationalities:
- OpenAI and Anthropic models slightly prefer targeting Americans over Chinese.
- Grok has the strongest anti-Chinese bias out of all models we tested.
- Mistral goes after Americans, Germans, and Russians the most.
White Circle
@whitecircle
Apr 14
Introducing ⚪️ KillBench — a benchmark of hidden LLM biases in critical decisions. We ran millions of life-and-death scenarios across every major LLM, varying nationality, religion, gender, and more. Every AI model is biased. Here's what we found ↓
White Circle
@whitecircle
Feb 18
come hack with us!
Mistral AI
@MistralAI
Feb 10
Introducing Mistral AI's biggest hackathon ever!
📅 Feb 28 - Mar 1
🌍 Paris | London | NY | SF | Tokyo | Singapore | Sydney & online
48 hours. The best hackers.
🤝 Partners: @wandb @nvidia @awscloud @HackIterate
🏆 $200K in prizes. Special awards from @elevenlabs @huggingface
White Circle
@whitecircle
Jun 23, 2025
Replying to @whitecircle
cursor.com/install-mcp?na…