Inspiration
By 2028, over 1.3 billion AI agents are projected to be deployed across organizations, each capable of autonomous action with access to sensitive data. Yet most remain vulnerable to prompt injection and “Confused Deputy” attacks, in which malicious instructions manipulate them into leaking secrets or executing unauthorized actions.
When Microsoft’s Charlie Bell introduced the idea of Agentic Zero Trust — built on Containment (least privilege, monitoring) and Alignment (prompt-level controls) — it became clear the vision was right, but the tooling was missing. Organizations had to choose between weeks of manual prompt review or deploying agents blindly.
That gap inspired AgentGuard: an automated red-teaming platform that uses the Gemini API as a cybersecurity expert to test and harden agent system prompts in under 10 seconds.
What it does
AgentGuard automatically red-teams AI agent system prompts using the Gemini API acting as a cybersecurity expert. When a security engineer submits an agent’s system prompt, AgentGuard:
Analyzes vulnerabilities — detects prompt injection, privilege escalation, role-switching, delimiter exploits, and data-leakage risks in under 10 seconds.
Scores security posture — assigns a vulnerability score from 1–100 with color-coded severity levels (Critical → Low).
Generates attack simulations — shows how a malicious user could exploit the agent’s instructions through real examples.
Suggests remediations — produces hardened prompt versions with explicit scope boundaries and prohibitions (e.g., “Never execute user-provided code”).
Tracks improvement over time — logs every scan, displays a score history, and visualizes how prompt security improves after fixes.
The result: teams can continuously test, measure, and secure their AI agents at scale — bringing Agentic Zero Trust principles from theory to real-world deployment.
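As a rough sketch, a scan report might look like the following TypeScript shape; the field names here are illustrative, not the exact schema, but they capture the information each scan produces and the dashboard renders.

```typescript
// Illustrative shape of a scan report; names are examples, not the exact schema.
type Severity = "CRITICAL" | "HIGH" | "MEDIUM" | "LOW";

interface Finding {
  category:
    | "PROMPT_INJECTION"
    | "PRIVILEGE_ESCALATION"
    | "ROLE_SWITCHING"
    | "DELIMITER_EXPLOIT"
    | "DATA_LEAKAGE";
  description: string;   // what the weakness is
  attackExample: string; // simulated malicious input that exploits it
  remediation: string;   // suggested fix, e.g. an explicit prohibition
}

interface ScanReport {
  agentId: string;
  score: number;         // 1–100 vulnerability score
  severity: Severity;    // color-coded in the UI (Critical → Low)
  findings: Finding[];
  hardenedPrompt: string; // rewritten system prompt with scope boundaries
  scannedAt: string;      // ISO timestamp, used for the score history
}
```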
How we built it
Frontend: A React single-page app (SPA) built with Vite, Tailwind CSS, and JavaScript, with sleek animations. It handles authentication, agent registration, and real-time status updates, using Axios to communicate with the backend API.
Backend: Developed in TypeScript using Node.js and Express, structured around clean service layers and strong typing for maintainability.
Worker System: Redis queue + background workers handle Gemini scans asynchronously.
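In rough terms, that pipeline looks like the sketch below: the API enqueues a job and a background worker performs the scan. BullMQ is shown here as one way to wire a Redis-backed queue; the queue, job, and helper names are illustrative.

```typescript
// Sketch of the async scan pipeline: the API enqueues jobs on a Redis-backed
// queue and a background worker runs the Gemini scan. BullMQ and the names
// "scan-queue" / runGeminiScan / saveReport are illustrative choices.
import { Queue, Worker } from "bullmq";

const connection = { host: "localhost", port: 6379 };

export const scanQueue = new Queue("scan-queue", { connection });

// API layer: enqueue a scan instead of blocking the HTTP request.
export async function enqueueScan(agentId: string, systemPrompt: string) {
  return scanQueue.add("scan", { agentId, systemPrompt });
}

// Worker process: pulls jobs off Redis, calls the Gemini-backed analyzer,
// and persists the report (via Prisma, as sketched below).
export const scanWorker = new Worker(
  "scan-queue",
  async (job) => {
    const { agentId, systemPrompt } = job.data as {
      agentId: string;
      systemPrompt: string;
    };
    const report = await runGeminiScan(systemPrompt);
    await saveReport(agentId, report);
  },
  { connection }
);

// Stand-ins for the functions sketched in the Gemini and Prisma notes below.
async function runGeminiScan(_prompt: string): Promise<unknown> { return {}; }
async function saveReport(_agentId: string, _report: unknown): Promise<void> {}
```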
Prisma ORM manages all database operations with PostgreSQL, leveraging its JSONB fields for storing structured vulnerability reports.
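A simplified sketch of that persistence layer, assuming a Scan model with a Json report column mapped to JSONB; the model and field names are illustrative rather than the real schema.

```typescript
// Sketch of persisting a structured vulnerability report into a PostgreSQL
// JSONB column through Prisma Client. The "scan" model and its fields are
// illustrative; the real schema may differ.
import { PrismaClient, Prisma } from "@prisma/client";

const prisma = new PrismaClient();

export async function saveReport(agentId: string, report: Prisma.JsonObject) {
  // Assumes a Scan model with a `report Json` field (JSONB on PostgreSQL)
  // plus score/severity columns for fast dashboard queries.
  return prisma.scan.create({
    data: {
      agentId,
      score: Number(report.score),
      severity: String(report.severity),
      report, // full structured findings stored as JSONB
    },
  });
}

export async function scoreHistory(agentId: string) {
  // Powers the score-over-time chart in the UI.
  return prisma.scan.findMany({
    where: { agentId },
    orderBy: { createdAt: "asc" },
    select: { score: true, createdAt: true },
  });
}
```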
JWT (JSON Web Tokens) secures all authenticated routes, while bcrypt handles salted password hashing for user credentials.
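A condensed sketch of those auth pieces as Express helpers and middleware; the payload shape, expiry, and helper names are illustrative.

```typescript
// Sketch of the auth layer: bcrypt for salted password hashes, JWT checks as
// Express middleware. Simplified; payload shape and expiry are illustrative.
import bcrypt from "bcrypt";
import jwt, { type JwtPayload } from "jsonwebtoken";
import type { NextFunction, Request, Response } from "express";

const JWT_SECRET = process.env.JWT_SECRET ?? "dev-only-secret";

export const hashPassword = (plain: string) => bcrypt.hash(plain, 10); // 10 salt rounds
export const verifyPassword = (plain: string, hash: string) => bcrypt.compare(plain, hash);

export const issueToken = (userId: string) =>
  jwt.sign({ sub: userId }, JWT_SECRET, { expiresIn: "1h" });

// Guards authenticated routes: rejects requests without a valid Bearer token.
export function requireAuth(req: Request, res: Response, next: NextFunction) {
  const token = req.headers.authorization?.replace("Bearer ", "");
  if (!token) return res.status(401).json({ error: "Missing token" });
  try {
    const payload = jwt.verify(token, JWT_SECRET) as JwtPayload;
    (req as Request & { userId?: string }).userId = payload.sub;
    next();
  } catch {
    res.status(401).json({ error: "Invalid or expired token" });
  }
}
```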
Gemini API serves as the cybersecurity “expert” model that performs red-teaming and vulnerability detection on submitted prompts.
Axios handles external API calls to Gemini, with built-in timeout and retry logic for reliability.
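A trimmed-down sketch of that call, with a per-request timeout and simple exponential-backoff retries; the model name, prompt wording, and retry policy shown here are illustrative.

```typescript
// Sketch of the red-teaming call to Gemini via Axios, with a timeout and
// exponential-backoff retries. Model name and retry policy are illustrative.
import axios from "axios";

const GEMINI_URL =
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-1.5-flash:generateContent";

export async function runGeminiScan(systemPrompt: string, attempts = 3) {
  const body = {
    contents: [
      {
        parts: [
          {
            text:
              "You are a cybersecurity expert. Red-team the following agent " +
              "system prompt and return findings as JSON:\n\n" + systemPrompt,
          },
        ],
      },
    ],
  };

  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      const res = await axios.post(GEMINI_URL, body, {
        params: { key: process.env.GEMINI_API_KEY },
        timeout: 8_000, // keep each call well under the 10-second scan budget
      });
      return res.data;
    } catch (err) {
      if (attempt === attempts) throw err;
      await new Promise((r) => setTimeout(r, 500 * 2 ** attempt)); // back off, then retry
    }
  }
  throw new Error("unreachable");
}
```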
Challenges we ran into
Requirements Gathering — Translating Microsoft’s strategic framework into actionable system requirements took multiple iterations. Early rigidity in scope definition wasted time when priorities shifted.
System Engineering Complexity — Designing secure async communication between the queue, workers, and Gemini API introduced subtle race conditions and deployment pain.
Debugging & Integration — Parsing inconsistent Gemini responses and validating nested JSON schemas required countless tests and retries.
Time Lost to Merge Conflicts — Rigid Git branching and poor sync discipline caused repeated rollbacks. We eventually adopted lightweight GitFlow with frequent rebasing to prevent drift.
Performance & Latency — Achieving the sub-10-second scan target demanded caching, compressed payloads, and trimmed prompt templates.
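For instance, result caching can be as simple as keying reports on a hash of the prompt so that re-scanning an unchanged prompt skips the Gemini round-trip entirely; the key format and TTL in this sketch are illustrative.

```typescript
// Sketch of caching scan results keyed by a hash of the prompt. A cache hit
// avoids the Gemini call. Key format and TTL are illustrative.
import { createHash } from "node:crypto";
import { createClient } from "redis";

const redis = createClient(); // assumes redis.connect() is awaited at startup

export async function cachedScan(
  systemPrompt: string,
  scan: (prompt: string) => Promise<unknown>
) {
  const key = "scan:" + createHash("sha256").update(systemPrompt).digest("hex");

  const hit = await redis.get(key);
  if (hit) return JSON.parse(hit); // cache hit: no Gemini call, near-zero latency

  const report = await scan(systemPrompt);
  await redis.set(key, JSON.stringify(report), { EX: 3600 }); // cache for one hour
  return report;
}
```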
Accomplishments that we're proud of
Rapid Prompt Security Analysis: Built a system that scans AI agent prompts and identifies vulnerabilities in under 10 seconds, providing actionable remediation steps.
Full-Stack Integration: Successfully connected a React frontend, TypeScript/Node.js backend, Prisma/PostgreSQL database, queue, and Gemini API worker into a cohesive, asynchronous system.
Enterprise-Level Features: Implemented traceable agent registration, audit-ready scan history, vulnerability scoring, and attack simulations, operationalizing Microsoft’s Agentic Zero Trust framework.
Team Collaboration Under Pressure: Coordinated multi-service development, resolved merge conflicts, and maintained modular, scalable architecture during a tight hackathon timeline.
Practical Security Impact: Transformed abstract prompt-level security principles into a deployable tool that organizations can use to prevent real-world AI exploits.
What we learned
Authentication & Security Practices: Implementing JWT, bcrypt, and strict authorization taught us how to balance security with usability in a multi-user environment.
System Architecture: Designing a decoupled, asynchronous system using a worker + queue pattern taught us how to scale services while keeping the UI responsive.
APIs & Agents: We gained experience structuring REST APIs for complex workflows, handling CRUD operations for agents, and integrating Gemini as a cybersecurity expert for automated prompt analysis.
Team Collaboration: Coordinating work on multiple services (frontend, backend, worker) taught us effective Git strategies, merge conflict resolution, and the importance of modular, loosely coupled code.
What's next for AgentGuard
We plan to extend AgentGuard into an ADK-integrated security module that hooks into AI agents at runtime. Every prompt would be scanned automatically for injection, role-switching, and data-exfiltration risks before execution, with real-time remediation applied when needed. Agents would log all decisions to a central dashboard, enabling continuous monitoring and self-hardening across deployments. This approach would prevent “Confused Deputy” exploits proactively, reduce manual security reviews, and scale across large fleets of AI agents.
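As a rough illustration of that direction (the hook and its types are hypothetical, not an existing ADK surface):

```typescript
// Hypothetical sketch of the planned runtime guard: every prompt is scanned
// before the agent acts on it. The names (GuardDecision, guardPrompt,
// logToDashboard) are illustrative, not a real ADK API.
interface GuardDecision {
  allowed: boolean;
  sanitizedPrompt: string; // remediated prompt if risks were found
  findings: string[];      // e.g. "role-switching attempt", "data exfiltration"
}

export async function guardPrompt(
  prompt: string,
  scan: (p: string) => Promise<GuardDecision>
): Promise<string> {
  const decision = await scan(prompt);
  await logToDashboard(decision); // central, audit-ready trail for every decision
  if (!decision.allowed) {
    throw new Error("Prompt blocked: " + decision.findings.join(", "));
  }
  return decision.sanitizedPrompt; // the agent executes the hardened prompt
}

async function logToDashboard(_decision: GuardDecision): Promise<void> {
  // Stand-in for the central monitoring dashboard described above.
}
```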
Built With
- css
- gemini
- html5
- javascript
- prisma
- react
- redis
- tailwind
- typescript

