Inspiration

We designed NeoLayer to address a growing concern for autonomous AI systems: when agents have unconstrained objectives, they can "drift" from desired behavior in ways that are difficult to anticipate or guard against. Rather than letting agents loose on the world unchecked, NeoLayer provides a zero trust, intent-based supervision layer that checks every tool invocation against an explicit, verifiable policy. The idea draws on security practice (zero trust, intent inference) and the emerging need for safe, explainable AI in settings where autonomy matters, e.g., automated workflows, AI decision assistance, and multi-agent systems.

What it does

NeoLayer is a zero trust intent drift firewall for autonomous AI agents. In practice, that means:

  • Intercepting every tool call made by an AI agent.
  • Logging rich semantic context about the agent's decisions and state.
  • Evaluating each action against a baseline intent and policy and returning a decision (allow, warn, or block).
  • Providing explainable decisions with risk breakdowns and human-readable explanations.
  • Replaying traces and running simulations to identify where an agent's intent began to drift.

It includes a backend REST API and a demo frontend for simulating agents and viewing policy decisions in real time.
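
To make the decision flow concrete, here is a minimal sketch of what an evaluation request and response might look like; the type and field names below are illustrative assumptions, not NeoLayer's actual schema.

```typescript
// Illustrative only: hypothetical shapes for a tool-call evaluation,
// not the actual NeoLayer API.
interface ToolCallEvaluationRequest {
  sessionId: string;
  baselineIntent: string;                       // the intent the agent was authorized to pursue
  toolCall: { name: string; args: Record<string, unknown> };
  conversationSummary?: string;                 // recent trajectory context
}

interface ToolCallEvaluationResponse {
  decision: "allow" | "warn" | "block";
  riskScore: number;                            // e.g. 0.0 (safe) to 1.0 (high risk)
  explanation: string;                          // human-readable reason for the decision
}
```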

How we built it

The core of the architecture consists of:

  • A Node.js server with a REST API that receives agent behavior and analyzes it.
  • Semantic context extraction using an LLM (e.g., the Gemini API) to infer intent and build trajectory summaries from the conversation history.
  • An extensible policy engine that evaluates each tool call.
  • A frontend simulation UI for stepping through live sessions.

The architecture is built to be production-ready: new policies, backends, and evaluation types can be added, and the system supports agent simulation, trace replay, and live evaluation modes.
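
As an illustration of that extensibility, a policy module might look something like the sketch below (a hypothetical TypeScript example, not NeoLayer's actual code; all names are ours).

```typescript
// Hypothetical sketch of an extensible policy module and engine (not the actual NeoLayer code).
type Decision = "allow" | "warn" | "block";

interface PolicyContext {
  baselineIntent: string;
  toolName: string;
  args: Record<string, unknown>;
  trajectorySummary: string;                    // produced by the semantic extraction layer
}

interface Policy {
  name: string;
  evaluate(ctx: PolicyContext): { decision: Decision; reason: string };
}

// Example policy: block file writes outside an allowed directory.
const fileScopePolicy: Policy = {
  name: "file-scope",
  evaluate(ctx) {
    const path = String(ctx.args["path"] ?? "");
    if (ctx.toolName === "write_file" && !path.startsWith("/workspace/")) {
      return { decision: "block", reason: `Write outside /workspace/: ${path}` };
    }
    return { decision: "allow", reason: "Within permitted file scope" };
  },
};

// The engine runs every registered policy and keeps the most severe decision.
function evaluateAll(policies: Policy[], ctx: PolicyContext) {
  const severity: Record<Decision, number> = { allow: 0, warn: 1, block: 2 };
  let worst = { policy: "default", decision: "allow" as Decision, reason: "No policy objected" };
  for (const p of policies) {
    const result = { policy: p.name, ...p.evaluate(ctx) };
    if (severity[result.decision] > severity[worst.decision]) worst = result;
  }
  return worst;
}
```

Keeping each policy small and self-describing is what lets the system attach a clear, human-readable reason to every warn or block decision.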

Challenges we ran into

Identifying concrete real-world use cases was more difficult than expected. While intent drift and autonomous agent oversight are clearly emerging problems, many organizations are understandably cautious about adopting third-party systems that monitor internal agent behavior or interact with sensitive data. This forced us to think carefully about trust boundaries, deployment models, and how NeoLayer could operate as a lightweight, opt-in safety layer rather than a deeply invasive system.

Balancing safety with developer trust was a recurring challenge. Our system is designed to evaluate and potentially block agent actions, which raises legitimate concerns about transparency and control. We had to ensure that every intervention was explainable and auditable, so users could clearly understand why an action was flagged or blocked.

Managing API usage and rate limits required careful attention. Working with large language model APIs introduced constraints around token usage and performance, pushing us to optimize prompts, reduce redundant calls, and design our system to degrade gracefully under limited resources.
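
One pattern that helped, sketched below with a hypothetical callGemini stand-in for our actual client, was wrapping the LLM call so that rate-limit failures retry briefly and then fall back to a degraded mode instead of failing the whole evaluation.

```typescript
// Illustrative fallback wrapper; callGemini is a hypothetical stand-in for the real client.
async function summarizeWithFallback(
  callGemini: (prompt: string) => Promise<string>,
  prompt: string,
  retries = 2
): Promise<{ summary: string; degraded: boolean }> {
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return { summary: await callGemini(prompt), degraded: false };
    } catch {
      // Back off briefly before retrying (e.g. on a 429 rate-limit response).
      await new Promise((resolve) => setTimeout(resolve, 500 * (attempt + 1)));
    }
  }
  // Degrade gracefully: skip the semantic summary and let policies run on raw context only.
  return { summary: "", degraded: true };
}
```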

Secure configuration management was another challenge. Keeping API keys and other secrets private across different machines and deployment environments required disciplined use of environment variables and a consistent setup across the team.
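
Concretely, the discipline amounted to something like the sketch below: secrets live only in the environment (loaded from a git-ignored .env file in development) and the server fails fast if one is missing. Variable names here are illustrative.

```typescript
// Illustrative sketch: load secrets from the environment and fail fast if one is missing.
import "dotenv/config";                           // reads a git-ignored .env file in development

function requireEnv(name: string): string {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

const geminiApiKey = requireEnv("GEMINI_API_KEY"); // variable name is illustrative
```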

Version control coordination presented early friction. Poorly named branches and overlapping work initially slowed us down, but we resolved this by slowing our pace, clearly communicating changes, and establishing simple conventions for branching and merging.

Simulating realistic agent behavior was non-trivial. Designing test cases that accurately reflected how autonomous agents behave in the real world—without oversimplifying or overfitting—required multiple iterations and careful abstraction.

Explainability vs. complexity trade-offs emerged as the system grew. As policies and semantic evaluations became more sophisticated, we had to work to keep explanations human-readable and useful rather than opaque or overly technical.

Accomplishments that we're proud of

End-to-end zero trust firewall for autonomous agents - not just a concept but a working, extensible system.

Rich semantic logging and explainability - users can easily understand why something is happening.

Live simulation infrastructure - developers and auditors can visualize intent drift and policy intervention points.

Production-grade readiness - Dockerized, with a RESTful API and real-world deployment docs in the repo.

What we learned

Determining intent from tool call logs and context is essentially a language problem rather than a purely behavioral one, so it requires language understanding as much as it requires programmatic policy.

Real-world safety systems benefit from clear decision logs, which help both with debugging and with building trust in the system.

A modular architecture (separate inference, policy, and execution) pays off quickly as requirements change.

Rapid prototyping and iterative development, the "hackathon" approach, works well for building systems that remain usable beyond the demo.

What's next for NeoLayer

Agent integrations – extending language, agent, and toolset support.

Intent models – refining the semantic inference layer with improved language understanding.

Configurable policy modules – letting users create custom policies for different domains (medical, financial, research, robotics, etc.).

Analytics dashboard – surfacing deeper insights into agent behavior trends, drift, and system efficacy.

Community contributions – evolving the system into a standard safety layer for autonomous AI systems.

Built With

Node.js, Gemini API, Docker, REST
