Inspiration

I was inspired by the hacker and builder community—many developers focus on creating features and shipping fast, but often overlook security. I wanted to make security approachable and easy to integrate into the development process.

Another motivation was the hackathon challenge itself, especially the prize for best use of Auth0, which gave me the push to explore authentication and security in a creative way. With this project, I aimed to build something that helps developers identify vulnerabilities and take action before attackers do.

What it does

The platform takes a GitHub repository and automatically performs a security-focused code review. Using the strategic goals you provide, it simulates a team of AI “cybersecurity engineers” that identify vulnerabilities, refactor code, and implement improvements to harden your project. Once the code has been updated, the system generates a link that lets you pull the secure, modified version directly back into your GitHub repository.

How we built it

We built a lightweight HTML/JavaScript frontend that collects three inputs: a GitHub repository link, user-defined strategic goals, and a GitHub Personal Access Token. These inputs are securely forwarded through an ngrok tunnel to a local virtualized Python environment, which orchestrates the backend pipeline.
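To make the handoff concrete, here is a minimal sketch of the backend entry point. This is illustrative only: we use Flask for the example (the writeup does not name a framework), and the JSON field names and the `run_security_review` orchestrator hook are hypothetical.

```python
# Minimal sketch of the backend entry point. Flask, the JSON field names, and
# run_security_review are illustrative assumptions, not the exact implementation.
from flask import Flask, request, jsonify

app = Flask(__name__)

def run_security_review(repo_url: str, goals: str, token: str) -> str:
    """Hypothetical hook into the multi-agent pipeline described below."""
    raise NotImplementedError

@app.route("/review", methods=["POST"])
def review():
    data = request.get_json()
    repo_url = data["repo_url"]        # GitHub repository link
    goals = data["strategic_goals"]    # user-defined strategic goals
    token = data["github_token"]       # GitHub Personal Access Token
    # Hand the three inputs to the multi-agent pipeline.
    pull_link = run_security_review(repo_url, goals, token)
    return jsonify({"pull_link": pull_link})

if __name__ == "__main__":
    # ngrok exposes this local port publicly, e.g. `ngrok http 5000`.
    app.run(port=5000)
```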

The backend is a multi-agent system implemented with Google AI APIs, structured to mimic a cybersecurity software team with distinct roles:

Software Engineering Manager (Manager Agent) – Interprets the strategic goals, translates them into actionable tasks, and delegates responsibilities to the rest of the team.

Senior Penetration Tester (Engineer Agent 1) – Clones the repository, performs static and dynamic testing, and identifies critical vulnerabilities or potential exploits.

Software Engineer (Engineer Agent 2) – Receives vulnerability reports and implements targeted fixes, refactoring code to improve resilience.

Security Engineer (Engineer Agent 3) – Advises on secure design principles, validates fixes, and ensures changes align with industry best practices.

Communication flows as a task-allocation loop: the Manager assigns objectives, the Penetration Tester reports vulnerabilities, the Software Engineer patches the code, and the Security Engineer verifies the resulting security posture. This iterative workflow simulates a real-world security review engagement, but one powered entirely by AI agents.
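A minimal sketch of that loop, assuming the `google-generativeai` Python SDK (the writeup says only “Google AI APIs”). The role prompts, model name, round limit, and “APPROVED” stopping signal are all illustrative assumptions:

```python
# Sketch of the task-allocation loop. Role prompts, model name, round limit,
# and the "approved" stop signal are assumptions for illustration.
import google.generativeai as genai

genai.configure(api_key="YOUR_GOOGLE_AI_API_KEY")
model = genai.GenerativeModel("gemini-1.5-pro")

ROLES = {
    "manager":   "You are a software engineering manager. Turn these strategic goals into concrete security tasks:\n",
    "pentester": "You are a senior penetration tester. Report critical vulnerabilities in this code:\n",
    "engineer":  "You are a software engineer. Patch the code to fix this vulnerability report:\n",
    "security":  "You are a security engineer. Verify the fixes follow secure design best practices; reply APPROVED if they do:\n",
}

def ask(role: str, payload: str) -> str:
    return model.generate_content(ROLES[role] + payload).text

def review_loop(goals: str, code: str, max_rounds: int = 3) -> str:
    tasks = ask("manager", goals)                       # Manager delegates objectives
    for _ in range(max_rounds):
        report = ask("pentester", tasks + "\n" + code)  # Pentester reports vulnerabilities
        code = ask("engineer", report + "\n" + code)    # Engineer patches the code
        verdict = ask("security", code)                 # Security engineer verifies posture
        if "approved" in verdict.lower():
            break
    return code
```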

Challenges we ran into

One of the biggest challenges was troubleshooting the ngrok tunnel. We initially ran into persistent port mismatches between the local Python backend and the public-facing ngrok endpoint, which required careful debugging of environment configurations before stable communication was established.
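One pattern that avoids this class of mismatch is binding both sides to a single source of truth for the port. A minimal sketch (the environment-variable name is an assumption):

```python
# Read the port from one place so the Flask bind and the ngrok tunnel cannot
# drift apart. BACKEND_PORT is an assumed variable name.
import os
from flask import Flask

app = Flask(__name__)
PORT = int(os.environ.get("BACKEND_PORT", "5000"))

if __name__ == "__main__":
    # Start the tunnel against the same value: ngrok http "$BACKEND_PORT"
    app.run(port=PORT)
```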

Another major challenge was integrating Auth0 into the frontend. Setting up secure authentication within a lightweight HTML/JavaScript page required custom handling of redirects, tokens, and callback flows. Ensuring that the authentication layer worked seamlessly with our input pipeline added both complexity and time to development, but ultimately strengthened the security posture of the overall system.

Accomplishments that we're proud of

We built the entire project exclusively through AI-assisted development, without manually coding from scratch: at most, we copy-pasted AI-generated code, and every component of the system was designed and implemented with the help of AI tools. This gave us a unique opportunity to push the limits of what modern AI coding assistants can do when orchestrated together.

We also stress-tested the boundaries of multiple platforms: our Gemini Pro subscription was quickly maxed out and rate-limited for nearly 10 hours, forcing us to adapt by switching between Perplexity Pro in Comet and Copilot Pro in VS Code to keep development moving. Despite these hurdles, we successfully integrated all the moving parts of the project end-to-end using only AI-generated code.

What we learned

One of the biggest takeaways was how critical prompt engineering is when working with large language models. We quickly realized that learning to “speak their language” is not just a convenience—it’s a core skill. The quality, precision, and structure of our prompts directly determined whether the AI produced useful, actionable code or something completely off-track. Developing this skill became just as important as traditional coding, especially since our entire project relied on AI-generated output.
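As an illustration (not our verbatim prompts), the difference between a vague ask and a structured, role-scoped prompt looked roughly like this:

```python
# Illustrative only: a vague ask versus a structured, role-scoped prompt
# with explicit constraints and output format.
VAGUE = "Fix the security issues in this code."

STRUCTURED = """You are a senior penetration tester reviewing a web app.
Strategic goals: {goals}
For each vulnerability you find:
1. Name the weakness category.
2. Quote the exact vulnerable lines.
3. Propose a minimal patch as a unified diff.
Do not modify unrelated code.""".format(goals="harden authentication")
```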

What's next for Alfred

We plan to expand the system by automating additional specialized “AI teams” beyond cybersecurity—such as performance optimization teams, documentation teams, and UI/UX review teams. Each team would act as a set of role-based agents that collaborate to improve different aspects of a codebase, creating not just secure code, but truly production-ready, high-quality “vibe code.”

Built With

Auth0, Google AI (Gemini), HTML/JavaScript, ngrok, Python
