The Center for AI Standards and Innovation (CAISI), part of the US Department of Commerce, has signed new agreements with Google DeepMind, Microsoft, and xAI. The goal is to test advanced AI models for national security risks before they are publicly released.
"Independent, rigorous measurement science is essential to understanding frontier AI and its national security implications," said CAISI Director Chris Fall. The agency has already run more than 40 evaluations, some on unreleased models. AI labs also provide versions with reduced safety guardrails for testing.
The new deals build on earlier agreements with Anthropic and OpenAI and allow testing in classified environments. Those original agreements already covered joint safety assessments and research into risk mitigation. The expansion comes as AI models rapidly improve at finding and exploiting security vulnerabilities, and as the tech race with China intensifies.