Atla


The eval and improvement platform for AI agents

About us

Atla is the eval and improvement platform for AI agents. We help teams find and fix agent failures, fast. As agents grow more complex, debugging and improving them has become a significant challenge. Atla brings clarity by tracing every step, surfacing error patterns across runs, and delivering specific suggestions to improve agent performance. With real-time monitoring, automated error detection, and tools for prompt experimentation, Atla gives teams the visibility and control needed to confidently ship agentic systems that work.

We’re a team of researchers, engineers, entrepreneurs, and operational leaders. Our expertise in evals was honed through training our own purpose-built LLM Judges, Selene and Selene Mini, which are available open-source and have been downloaded 40,000+ times.

Atla is backed by Y Combinator, Creandum, and the founders of Reddit, Cruise, Rappi, Instacart, and more.

Blog: https://atlaai.substack.com/

Website
www.atla-ai.com
Industry
Software Development
Company size
2-10 employees
Headquarters
London
Type
Privately Held

Updates

  • Atla reposted this

    Trees, not logs? Most approaches to automated trace evaluation for agents use traces that have been flattened into a single block of text. It’s simple, but it discards the actual tree structure created by spans, parent-child relationships, and branching tool calls. In our latest blog, we dig into why it's useful to keep the tree structure of traces intact when evaluating agents (something we do at Atla). Treating traces as trees allows us to:

    1) represent the trace much more compactly, as a collection of conversation prefix trees,
    2) traverse the tree, pausing at every LLM span and leaving “notes”, and
    3) traverse both forwards and backwards, where a backward pass prunes our judgements with the benefit of hindsight.

    Our view: preserving the structure of traces leads to more precise and more faithful evaluations. As agents become more complex, evaluation methods need to account for that complexity. Trees, not logs 🌲

    ➡️ Check out the blog for more details: https://lnkd.in/ea4CkvHj

  • Atla reposted this

    Most teams debug AI agents like it’s 1999: staring at logs, guessing what went wrong. Today we shipped something different. Our platform pinpoints the exact step that caused an agent to fail, the first real step toward solving the credit assignment problem for AI agents. If you can’t pinpoint why an agent failed, you can’t make it better. Now, you can.

  • Atla reposted this

    There's a problem with voice agents that no one's talking about. We've extended our applied research to voice agent evaluation 📊. We noted:

    1. A surge in demand for voice agents 📈
    2. A lack of informative resources and tooling for evaluating them 👎
    3. Nothing that evaluates end-to-end, including how paralinguistic cues, audio understanding, and text-to-speech contribute to (un)successful interactions.

    So there's a big blind spot in what we can scalably oversee (overhear?) today. But we're on our way to tuning into "the sound of alignment" by extending our error analysis framework with open-coding of audio-native failure modes. In the meantime, I've made a quick explainer on voice agents and Atla's high-level thoughts on evaluating them. Stand by for the tooling ⏯️.

  • Atla reposted this

    We're fans of error analysis at Atla, so we took Shreya Shankar and Hamel Husain's advice seriously and experimented with automating various parts of it. We've written up our findings on what works (and what doesn't) in a new blog post! TL;DR: with a bit of intentional context engineering and machine learning, we can do a lot better than "stuff everything into an LLM". Read more here: https://lnkd.in/dVKaTKTZ

  • Atla reposted this

    Had a blast 💥 sharing our work recently at both AI demo nights (thanks Andy Tyler and Anoushka Patel @ MMC Ventures) and at the very first London AI Builders (thanks Arthur Poot). I presented Atla's vision, plus a little of what's going on under the bonnet of https://app.atla-ai.com/.

    The big alpha is not in telling you *that* something has gone wrong, like the vast majority of standard evals, but **where** it has gone wrong and, importantly, **why**. Backpropagating this on-policy feedback into the agent's codebase has delivered boosts of 10 percentage points on tasks within a couple of hours 🚀!

    I've had some super interesting chats with founders, devs, and AI Tinkerers about evaluation, alignment, RL, etc. at these events, and would highly recommend joining. Looking forward to the CodeWords Talk Night later @ incident.io, where Roman Engeler will reveal more about what we're doing. Come say hi 👋!

  • Atla reposted this

    Today we’re launching Atla, the improvement engine for AI agents. Engineering teams spend 5+ hours every week digging through traces, chasing failures one by one, and still miss the critical errors that matter most. As agents grow more autonomous and complex, this problem only gets worse.

    So we built Atla: a platform that finds and fixes your agent’s critical failures in minutes, not days. We surface dynamic patterns specific to your use case, suggest precise fixes, and measure whether changes are really improving your agents. Atla makes closing the loop on agent improvements fast and effective. Developers use it to focus on what matters, cut debugging time by 5X, and double shipping speed.

    Start using Atla for free: https://lnkd.in/exmPXVwY

Funding

Total rounds: 1
Last round: Seed (US$5.0M)