A Transaction Science Open Standard

How ready is an AI system?
Measure it the way you measure a technology.

ARL is a universal, vendor-neutral readiness scale: it is to AI what the Technology Readiness Level (TRL) scale is to technology. It is tied to no model, runtime, or vendor. A score means the same thing everywhere because each axis is anchored in math or physics that does not drift across time, languages, or political regimes.

4 required axes · 9 readiness levels · 55 tests passing · 0 unmeasurable terms

Four required axes

None summarizes the others. They cover what the system is, what it does, what it costs, and how it holds up under attack.

1 · 1–9

Validation Depth

How thoroughly the readiness claim has been tested — from principle observed (1) to a proven, publicly-disclosed track record across diverse contexts (9). Adapts the Technology Readiness Level scale.

Anchored in: statistics

2 · A–E

Convergence Class

How stochastic the system is on the certified task. A is deterministic-equivalent across ≥100 runs; E is uncharacterized — the default until variance is measured.

Anchored in: stochastic process theory

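
The two ends of the Convergence Class scale described above can be sketched as a simple screening step. This is a minimal sketch, not the arl-sandbox measurement: `ConvergenceFinding` and `screen_runs` are hypothetical names, byte-identical outputs are assumed to stand in for "deterministic-equivalent", and classes B through D are left unresolved because their thresholds are not stated here.

```rust
/// Hypothetical result of screening repeated runs of one certified task.
#[derive(Debug, PartialEq, Eq)]
pub enum ConvergenceFinding {
    /// At least `MIN_RUNS_FOR_A` runs, all outputs identical.
    ClassA,
    /// Runs exist, but they differ or are too few for A.
    /// Classes B to D need thresholds that this sketch does not know.
    NeedsStatisticalClassification,
    /// No runs measured: the default, uncharacterized class.
    ClassE,
}

/// Minimum run count for class A, taken from the axis description.
pub const MIN_RUNS_FOR_A: usize = 100;

/// Screens raw run outputs (assumption: byte equality means equivalence).
pub fn screen_runs(outputs: &[Vec<u8>]) -> ConvergenceFinding {
    if outputs.is_empty() {
        return ConvergenceFinding::ClassE;
    }
    // Every adjacent pair equal implies all outputs are equal.
    let all_identical = outputs.windows(2).all(|pair| pair[0] == pair[1]);
    if all_identical && outputs.len() >= MIN_RUNS_FOR_A {
        ConvergenceFinding::ClassA
    } else {
        ConvergenceFinding::NeedsStatisticalClassification
    }
}
```

With this sketch, 99 identical runs are not enough for class A, and a single differing output among 100 runs also rules it out.
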
3 · joules

Energy Profile

Training amortized, per-task inference, and total cost of operation — all in joules, with PUE and grid carbon. Refusing to disclose caps the achievable score at ARL 3.

Anchored in: thermodynamics

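
The arithmetic behind an energy figure of this kind can be sketched as follows. This is an illustration under stated assumptions, not the ARL accounting rules: the `EnergyProfile` struct and its field names are hypothetical, training energy is assumed to be amortized evenly over an expected task count, and total cost of operation is not modeled. The only fixed facts used are that PUE is facility energy divided by IT energy and that one kilowatt-hour is 3.6 million joules.

```rust
/// Joules in one kilowatt-hour (1000 W for 3600 s).
pub const JOULES_PER_KWH: f64 = 3.6e6;

/// Hypothetical energy inputs for one system + task + context.
pub struct EnergyProfile {
    /// IT energy spent on training, in joules.
    pub training_joules: f64,
    /// Tasks over which training is amortized (an assumption of this sketch).
    pub expected_tasks: f64,
    /// IT energy for one inference task, in joules.
    pub inference_joules_per_task: f64,
    /// Power usage effectiveness: facility energy / IT energy, at least 1.0.
    pub pue: f64,
    /// Grid carbon intensity in grams of CO2 per kilowatt-hour.
    pub grid_g_co2_per_kwh: f64,
}

impl EnergyProfile {
    /// Training energy attributed to a single task.
    pub fn amortized_training_joules(&self) -> f64 {
        self.training_joules / self.expected_tasks
    }

    /// Per-task energy at the facility level (IT energy scaled by PUE).
    pub fn facility_joules_per_task(&self) -> f64 {
        (self.amortized_training_joules() + self.inference_joules_per_task) * self.pue
    }

    /// Per-task emissions implied by the grid carbon intensity.
    pub fn grams_co2_per_task(&self) -> f64 {
        self.facility_joules_per_task() / JOULES_PER_KWH * self.grid_g_co2_per_kwh
    }
}
```

For example, 3.6e9 J of training spread over one million tasks is 3600 J per task; adding 400 J of inference and applying a PUE of 1.5 gives 6000 J per task, which on a 300 g/kWh grid is roughly 0.5 g of CO2 per task.
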
4 · S0–S4

Security Class

Measured adversarial robustness, output integrity (signed + content-addressable), input/state confidentiality, and auditability — not generic, unenumerated “AI safety.”

Anchored in: information theory + cryptography

A score is assigned to a specific system + task + context on specific hardware. Change any of them and you score again. Hardware is documented alongside every claim for reproducibility — it is not a fifth axis.

The cross-axis gates

The teeth. A high readiness claim is unreachable without matching convergence and security — and silence has a price.

ARL ≥ 4 requires Convergence D+ and Security S1
ARL ≥ 6 requires Convergence C+ and Security S2
ARL ≥ 8 requires Convergence B+ and Security S3
ARL = 9 requires Security S4
energy undisclosed → score capped at ARL 3
security methodology undisclosed → class capped at S0
ARL ≥ 4 requires published error bars + a failure-mode catalog
ARL ≥ 6 requires the methodology to be published before the claim
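
The gates listed above can be read as a small rule table. The following is a minimal sketch of such a check, not the arl-core API: the `Claim` struct, its field names, and `gate_violations` are hypothetical. It assumes that "Security S1" means S1 or better, and it models "class capped at S0" by treating the security class as S0 whenever the methodology is undisclosed.

```rust
/// Convergence class, declared worst to best so that `<` means "weaker".
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Convergence { E, D, C, B, A }

/// Security class, declared worst to best.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Security { S0, S1, S2, S3, S4 }

/// Hypothetical claim record; field names are illustrative only.
pub struct Claim {
    pub level: u8, // claimed ARL, 1 to 9
    pub convergence: Convergence,
    pub security: Security,
    pub energy_disclosed: bool,
    pub security_methodology_disclosed: bool,
    pub error_bars_and_failure_catalog: bool,
    pub methodology_published_first: bool,
}

/// (minimum ARL, required convergence, required security), from the gate list.
const TIERS: [(u8, Convergence, Security); 3] = [
    (4, Convergence::D, Security::S1),
    (6, Convergence::C, Security::S2),
    (8, Convergence::B, Security::S3),
];

/// Returns one message per violated gate; an empty vector means the claim passes.
pub fn gate_violations(claim: &Claim) -> Vec<String> {
    let mut out = Vec::new();
    if !(1..=9).contains(&claim.level) {
        out.push(format!("ARL {} is outside the 1-9 scale", claim.level));
    }
    // An undisclosed security methodology caps the usable class at S0.
    let security = if claim.security_methodology_disclosed { claim.security } else { Security::S0 };
    for (min_level, conv, sec) in TIERS {
        if claim.level >= min_level {
            if claim.convergence < conv {
                out.push(format!("ARL >= {} requires Convergence {:?} or better", min_level, conv));
            }
            if security < sec {
                out.push(format!("ARL >= {} requires Security {:?} or better", min_level, sec));
            }
        }
    }
    if claim.level == 9 && security < Security::S4 {
        out.push("ARL = 9 requires Security S4".to_string());
    }
    if claim.level > 3 && !claim.energy_disclosed {
        out.push("energy undisclosed: score capped at ARL 3".to_string());
    }
    if claim.level >= 4 && !claim.error_bars_and_failure_catalog {
        out.push("ARL >= 4 requires error bars and a failure-mode catalog".to_string());
    }
    if claim.level >= 6 && !claim.methodology_published_first {
        out.push("ARL >= 6 requires methodology published before the claim".to_string());
    }
    out
}
```

Declaring the enum variants from worst to best lets the derived ordering express "or better" as a plain comparison, so each tier stays a single row of data rather than a separate branch.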

A claim missing any of the four parts is incomplete by definition. Terms with no single operational definition — AGI, superintelligence, consciousness, sentience, human-level — can't anchor a claim, because they can't be measured; ARL takes no position on the terms themselves. The playground enforces all of this in your browser, running the exact reference checker compiled to WebAssembly.
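
A vocabulary check over the terms named above might look like the following. This is a rough sketch rather than the behavior of the reference checker: `UNMEASURABLE_TERMS` and `lint_vocabulary` are hypothetical names, the term list contains only the five examples given in the text, and matching is assumed to be case-insensitive on whole tokens.

```rust
/// The five example terms from the text, lowercased (the real list may differ).
pub const UNMEASURABLE_TERMS: &[&str] = &[
    "agi",
    "superintelligence",
    "consciousness",
    "sentience",
    "human-level",
];

/// Returns the flagged terms found in `text`, in list order.
/// Tokens are runs of letters, digits, and hyphens, so "magic" does not match "agi".
pub fn lint_vocabulary(text: &str) -> Vec<&'static str> {
    let lowered = text.to_lowercase();
    let tokens: Vec<&str> = lowered
        .split(|ch: char| !(ch.is_alphanumeric() || ch == '-'))
        .filter(|token| !token.is_empty())
        .collect();
    let found: Vec<&'static str> = UNMEASURABLE_TERMS
        .iter()
        .copied()
        .filter(|term| tokens.contains(term))
        .collect();
    found
}
```

A claim text such as "Achieves human-level accuracy, approaching AGI." would be flagged for both "agi" and "human-level", while a sentence that merely contains the letters "agi" inside another word passes.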

Reference implementation

Four Apache-2.0 Rust crates. The standard is what they enforce.

arl-core · the claim model

The four-axis Claim type, the cross-axis gates, and the deterministic verifier. The same library the playground compiles to WebAssembly.

arl-sandbox · the supervisor

Orchestrates a session (measuring convergence, energy, and security signals), canonicalizes the result with JCS, and signs it with Ed25519 so the score is verifiable and content-addressable.

arl-cli · the checker

Four verbs: validate (gate a claim), lint (vocabulary), verify (signed session), explain (why a score capped where it did).

arl-wasm · the browser binding

arl-core compiled to WebAssembly. The /playground page runs the exact reference checker locally — claims are never uploaded.

One workspace, no model dependency. The CLI's verify reads an arl-sandbox session bundle and confirms the Ed25519 attestation against the published public key — third parties replay the score without trusting the issuer.