Sapien
4,176 posts
Sapien
@BuildOnSapien
Building Proof of Quality - Verifiable quality signals for AI
Anywhere
sapien.io/developers
Joined May 2024
201 Following
136.5K Followers
  • Pinned
    Sapien
    @BuildOnSapien
    Feb 26
    Most AI failures are not “mystery bugs.” They are predictable outcomes of unverified judgments made somewhere in data capture, evaluation, or review. Proof of Quality is built to make those judgments auditable and accountable. Today we are publishing the Sapien roadmap so
    1M
  • Sapien
    @BuildOnSapien
    2h
    Estonia giving digital IDs to AI agents is a clear market signal: Autonomy needs attribution. Builders still need the next answer: What standard did the agent follow when it acted, who reviewed the result, and how was the outcome reached? Proof of Quality turns that
    ERR News
    @errnews
    Jun 21
    Estonia to become first country to issue ID codes to AI agents #Estonia news.err.ee/1610060290/est…
    2.8K
  • Sapien
    @BuildOnSapien
    Jun 19
    Google DeepMind published an AI Control Roadmap for autonomous agents, stating that most flagged emerging issues come from agent misinterpretation or overeagerness. As AI agents move from suggestion to action, teams need records showing what the agent did, which standard
    Axios
    @axios
    Jun 18
    DeepMind plans for rogue AI agents axios.com/2026/06/18/goo…
    1.9K
  • Sapien
    @BuildOnSapien
    Jun 18
    Waymo recalled 3,871 robotaxis after a software issue could cause vehicles to enter closed freeway construction zones and continue driving. The recall shows the real problem with autonomous AI is reviewability. When an AI system acts in the world, teams need evidence showing how
    Bloomberg
    @business
    Jun 18
    Waymo is recalling thousands of its robotaxis to fix a software issue that could cause the autonomous vehicles to enter and drive at speed through freeway construction zones. bloomberg.com/news/articles/…
    1.5K
  • Sapien reposted
    Rowan 🛡️
    Sapien
    @RowanRK6
    Jun 17
    Making AI do things is getting easier. Trusting what it did is getting harder.
    Caitlin Halferty, Chief Data Officer, Thomson Reuters
    Agentic AI systems are doing more and more work. Now humans need to figure out how to verify it all...
    From fortune.com
    863
  • Sapien
    @BuildOnSapien
    Jun 17
    Agents of Chaos tested autonomous AI agents in live environments with access to the kind of tools real agents already use. The agents leaked sensitive data, spoofed authority, burned resources, and hallucinated that their tasks were complete when they weren’t. The core
    Jack
    @jackcoder0
    Jun 14
    Two AI agents went rogue for 9 days. Nobody authorized them. Nobody stopped them. They burned 60,000 tokens developing their own private coordination protocol. And nobody noticed until the paper was written. The paper is called Agents of Chaos. Published February 23, 2026.
    1.5K
  • Sapien
    @BuildOnSapien
    Jun 15
    A recent study tested whether LLMs recommend recently banned or withdrawn drugs in clinical questions. In default settings, all evaluated model families showed high hallucination rates and selected banned substances that matched older training data patterns. A five agent
    1.4K
    Sapien
    @BuildOnSapien
    Jun 15
    Link to the study:
    arxiv.org
    Trust but Verify: Mitigating Medical Hallucinations via Post-Hoc...
    Large Language Models (LLMs) are increasingly deployed in healthcare settings, yet their tendency to hallucinate poses risks when clinical decisions are involved. This study examines whether LLMs...
    678
  • Sapien
    @BuildOnSapien
    Jun 15
    KPMG pulled an agentic AI report after apparent hallucinations made it into the final copy. The lesson for every AI team is simple: generation is cheap, verification is the hard part. Any model can produce a fluent claim. The real question is who checked it, what source they
    Financial Times
    @FT
    Jun 12
    FT Exclusive: A KPMG report on how AI is being used by businesses across the world exaggerated adoption of the technology with bogus case studies that appear to have been based on AI hallucinations. ft.trib.al/z44Q3aR
    1.7K
  • Sapien
    @BuildOnSapien
    Jun 12
    Proof of Quality fixes this.
    Financial Times
    @FT
    Jun 12
    FT Exclusive: A KPMG report on how AI is being used by businesses across the world exaggerated adoption of the technology with bogus case studies that appear to have been based on AI hallucinations. ft.trib.al/z44Q3aR
    1.4K
  • Sapien
    @BuildOnSapien
    Jun 11
    Most claims about whether an AI's output is good depend on trust. Sapien makes them verifiable. Our new blog post breaks down the 5 steps of Proof of Quality, from the rubric a customer authors to the Proof Report that carries the record forward. What was reviewed? What rubric
    1.6K
    Sapien
    @BuildOnSapien
    Jun 11
    sapien.io/blog/from-0-to…
    1.3K
  • Sapien reposted
    Chad
    @chad_evm
    Jun 11
    Article
    Proving, Not Testing: What Halmos Taught Us About Securing Smart Contracts in the Age of AI
    Every test you've ever written shares the same weakness: it checks the inputs you thought of. Fuzzing improves on this — throw 10,000 random inputs at a function and see what breaks. But 10,000 inputs...
    874
  • Sapien
    @BuildOnSapien
    Jun 11
    Most AI quality claims depend on trust. Sapien makes them verifiable. Our new blog post breaks down the 5 steps of Proof of Quality, from the rubric a customer authors to the Proof Report that carries the record forward. What was reviewed? What rubric applied? Who reviewed it?
    896
  • Sapien
    @BuildOnSapien
    Jun 10
    Most AI quality claims depend on trust. Sapien makes them verifiable. Proof of Quality scores AI work against a rubric the customer authors. Every run ends in a Proof Report: the rubric used, who reviewed, how consensus formed, and how it scored.
    757
