🧠 About the Project

*Autonomous UX Experimentation*

🌱 Inspiration

The project was inspired by a recurring pattern in modern development: UX problems are usually discovered late, diagnosed manually, and fixed reactively. Teams rely on guesswork, scattered feedback, and slow iteration cycles.

I wanted to explore a world where UX improvement could be proactive and autonomous: agents could experience an interface the way a human would, understand what feels confusing or inefficient, and then repair the issues directly in code.

Instead of “testing variants,” the system acts more like an AI-powered UX engineer that continuously explores a product, identifies friction points, and generates improved versions in isolated environments.


🚀 What It Does

The platform introduces an entirely new workflow:

  • Browser-use agents interact with your application as if they were real users, discovering friction and surfacing usability gaps.
  • Anthropic Claude Code autonomously implements UX improvements inside isolated workspaces.
  • Daytona sandboxes spin up clean, ephemeral environments for every improvement idea.
  • Gemini evaluates behaviors and analyzes agent logs to produce structured insight.
  • Inngest orchestrates the multi-step pipeline reliably and in parallel.

The end result is a system that transforms a GitHub repository and a UX problem into multiple improved, auto-implemented versions of your application, each fully deployed and tested.
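To make that chain concrete, here is a minimal sketch of how such a pipeline can be wired together as an Inngest function, with parallel steps for each candidate improvement. The event name, step names, and helper functions below are hypothetical stand-ins for the real integrations:

```typescript
import { Inngest } from "inngest";

// Hypothetical helpers standing in for the real agent integrations.
interface Finding { summary: string }
declare function exploreWithBrowserAgent(url: string, problem: string): Promise<string>;
declare function analyzeWithGemini(logs: string): Promise<Finding[]>;
declare function implementInSandbox(repoUrl: string, finding: Finding): Promise<string>;

const inngest = new Inngest({ id: "ux-improver" });

export const improveUx = inngest.createFunction(
  { id: "improve-ux" },
  { event: "ux/improvement.requested" },
  async ({ event, step }) => {
    // 1. A browser agent explores the deployed app and records what it did.
    const logs = await step.run("explore", () =>
      exploreWithBrowserAgent(event.data.previewUrl, event.data.problem)
    );

    // 2. Gemini turns the raw session log into structured findings.
    const findings = await step.run("analyze", () => analyzeWithGemini(logs));

    // 3. Each finding is implemented in its own sandbox; invoking the
    //    step.run calls together lets Inngest execute them in parallel.
    const previewUrls = await Promise.all(
      findings.map((finding, i) =>
        step.run(`implement-${i}`, () =>
          implementInSandbox(event.data.repoUrl, finding)
        )
      )
    );

    return { previewUrls };
  }
);
```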


🧩 How It Was Built

Building this required combining several complex technologies into one pipeline:

  • Daytona SDK for on-demand cloud developer environments
  • Claude Code Agent SDK for autonomous codebase modifications
  • Browser-use SDK for realistic, task-based agent sessions
  • Google Gemini for log analysis and insight extraction
  • Inngest for orchestrating asynchronous jobs and multi-agent workflows
  • Next.js + Bun + Elysia + PostgreSQL for the control plane, API, and frontend

The biggest challenge was ensuring that all three agents (the browser explorer, the code engineer, and the orchestrator) communicated cleanly and could operate in parallel without interfering with each other.
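Isolation is what made that parallelism tractable: each improvement idea gets its own sandbox, so agents never share files, ports, or state. A minimal sketch of the sandbox-per-idea pattern using the Daytona TypeScript SDK; the method names follow its documented surface, but treat the details (including the repo URL and idea names) as illustrative:

```typescript
import { Daytona } from "@daytonaio/sdk";

const daytona = new Daytona(); // reads DAYTONA_API_KEY from the environment

// One clean, ephemeral environment per improvement idea.
async function createPreview(repoUrl: string, idea: string): Promise<string> {
  const sandbox = await daytona.create();
  await sandbox.git.clone(repoUrl, "/workspace/app");
  await sandbox.process.executeCommand("bun install", "/workspace/app");
  // PM2 keeps the dev server alive after the setup command returns.
  await sandbox.process.executeCommand(
    `pm2 start 'bun run dev' --name ${idea}`,
    "/workspace/app"
  );
  return sandbox.id;
}

// Sandboxes are independent, so creating them in parallel is safe.
const ideas = ["simplify-nav", "inline-validation", "shorter-flow"];
const sandboxIds = await Promise.all(
  ideas.map((idea) => createPreview("https://github.com/acme/app", idea))
);
```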


๐Ÿ” What I Learned

Building an autonomous UX engineering system taught me:

  • How to design multi-agent workflows where each agent contributes a different layer of reasoning or capability
  • How to use Daytona effectively for large-scale, parallel sandbox creation
  • How to guide Claude Code to make reliable, traceable, and reversible code changes
  • How to extract actionable insight from unstructured browser logs using Gemini (sketched below)
  • How to handle state, orchestration, and error recovery using Inngest
  • How to produce a workflow that feels like a new type of intelligence layer on top of a codebase
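The Gemini step, for example, boils down to one structured-extraction call. A minimal sketch assuming the `@google/generative-ai` client; the model name, prompt, and output schema are assumptions, and production code would validate the JSON before trusting it:

```typescript
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

// Turn an unstructured browser-agent session log into structured findings.
export async function extractFindings(agentLog: string) {
  const prompt = [
    "You are a UX analyst. From this browser-agent session log, list each",
    "point of friction as a JSON array:",
    '[{ "issue": string, "evidence": string, "suggestedFix": string }]',
    "Return only JSON.",
    "",
    agentLog,
  ].join("\n");

  const result = await model.generateContent(prompt);
  // Sketch only: real code should strip code fences and validate the schema.
  return JSON.parse(result.response.text());
}
```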

🔧 Technical Innovation

This is a self-improving UX engineering system, capable of:

  • Understanding UX problems through autonomous exploration
  • Generating multiple possible improvements
  • Implementing and deploying each improvement without human intervention
  • Verifying that each improvement actually works through agent-based testing
  • Producing isolated previews for developers to inspect

The innovation lies in closing the loop: exploration → insight → code change → deployment → evaluation.

Most tools solve one piece of that loop; this platform solves the entire chain end-to-end.
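The code-change step drives the Claude Agent SDK's `query` API inside each checkout. A minimal sketch; the prompt, tool list, and option values are illustrative rather than the platform's exact configuration:

```typescript
import { query } from "@anthropic-ai/claude-agent-sdk";

// Ask Claude Code to implement one finding inside an isolated checkout,
// keeping the full message stream as an audit trail.
export async function implementFix(repoDir: string, finding: string) {
  const transcript: string[] = [];

  for await (const message of query({
    prompt: `Fix this UX issue with a small, reversible change:\n${finding}`,
    options: {
      cwd: repoDir,                  // operate only on this checkout
      permissionMode: "acceptEdits", // apply file edits without prompting
      allowedTools: ["Read", "Edit", "Bash"],
    },
  })) {
    transcript.push(JSON.stringify(message)); // every action stays traceable
  }

  return transcript;
}
```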


๐Ÿ† Challenges

  1. Coordinating asynchronous agent activity with consistent state
  2. Keeping sandboxed development servers alive using PM2 inside Daytona
  3. Building a bi-directional webhook system for Claude Code to report back (see the sketch after this list)
  4. Ensuring browser agents behaved naturally and not like brittle scripts
  5. Designing an audit trail so every agent action is transparent and traceable
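As a sketch of the report-back channel from challenge 3 (which also feeds the audit trail from challenge 5), here is an Elysia endpoint that sandboxed agents can POST progress events to; the route, payload shape, and persistence helper are hypothetical:

```typescript
import { Elysia, t } from "elysia";

// Hypothetical persistence helper backed by PostgreSQL.
declare function saveAuditEvent(event: {
  sandboxId: string;
  phase: string;
  detail: string;
}): Promise<void>;

const app = new Elysia()
  .post(
    "/hooks/agent",
    async ({ body }) => {
      // Persist first, then react: the audit trail must never lose events.
      await saveAuditEvent(body);
      return { ok: true };
    },
    {
      // Elysia validates the payload against this schema before the handler runs.
      body: t.Object({
        sandboxId: t.String(),
        phase: t.String(), // e.g. "editing" | "testing" | "done"
        detail: t.String(),
      }),
    }
  )
  .listen(3000);
```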

🔮 Future Directions

  • Continuous “watch mode” where agents re-check UX after every commit
  • Automatic PR creation for improvements that pass evaluation
  • Visual regression and accessibility scanning
  • Cross-browser automatic comparison
  • Multi-step user journey evaluation (e.g., onboarding → checkout → review)

Built by SATHVIK VEMPATI
Powered by Daytona, Claude Code, Browser-use, Gemini, and Inngest
