kumQAt | Devpost

About Kumqat

Software teams move fast, but QA often becomes the bottleneck. At hackathons and on real product teams, we kept seeing the same pattern: features ship quickly, manual testing lags behind, and regressions are discovered too late.

We built Kumqat to change that workflow. Instead of writing tests from scratch, you provide:

a target URL
a plain-English requirement

Kumqat then turns that into an executable QA run and gives back structured, actionable results.

What Kumqat does

Kumqat is an AI-powered QA copilot for web apps that:

generates structured test cases from natural-language requirements
executes automated checks against the target site
streams run progress live to the UI
classifies outcomes as pass, fail, blocked, or flaky
provides triage-ready result cards (expected vs actual, severity, confidence, repro context)
stores history, supports reruns, and enables recurring scheduled runs
offers an interactive chat/discuss mode for asking follow-up questions about a run
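
To make the result-card idea concrete, here is a minimal sketch of what one card could look like as a data structure. The four statuses and the expected/actual, severity, confidence, and repro fields come from the list above; the class names, field names, severity scale, and 0 to 1 confidence range are assumptions for illustration, not Kumqat's actual schema.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(str, Enum):
    """The four outcomes named in the write-up."""
    PASS = "pass"
    FAIL = "fail"
    BLOCKED = "blocked"
    FLAKY = "flaky"


@dataclass
class ResultCard:
    """Hypothetical shape of one triage-ready result."""
    case_title: str
    status: Status
    expected: str
    actual: str
    severity: str      # assumed scale: "low" | "medium" | "high"
    confidence: float  # assumed range: 0.0 to 1.0
    repro_steps: list[str] = field(default_factory=list)

    def needs_triage(self) -> bool:
        # Anything other than a clean pass is surfaced for human review.
        return self.status is not Status.PASS


card = ResultCard(
    case_title="Login with valid credentials",
    status=Status.FAIL,
    expected="User lands on the dashboard",
    actual="Error banner shown after submit",
    severity="high",
    confidence=0.8,
    repro_steps=["Open /login", "Enter valid credentials", "Click Sign in"],
)
```

Keeping the status as an enum rather than free text is what lets the UI filter and group cards consistently.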

How we built it

We built Kumqat as a full-stack system with:

Frontend: Next.js + TypeScript + Tailwind
Backend: FastAPI + SQLModel + SQLite
AI + Agent layer: Gemini for planning/validation/Q&A, Browser Use Cloud for autonomous browser actions
Realtime architecture: Server-Sent Events (SSE) for live run updates
Orchestration: async run pipeline with per-case lifecycle events and structured result persistence
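
For readers unfamiliar with SSE, the wire format is simple: each message is a few `field: value` lines followed by a blank line. The sketch below shows how a per-case lifecycle event could be framed. The event name and payload keys are hypothetical; only the `event:`/`data:` framing is part of the SSE format itself.

```python
import json


def format_sse(event: str, data: dict) -> str:
    """Frame one Server-Sent Events message.

    json.dumps escapes newlines, so the payload always fits on one data line.
    """
    payload = json.dumps(data, separators=(",", ":"))
    return f"event: {event}\ndata: {payload}\n\n"


# Hypothetical lifecycle event for one test case.
frame = format_sse("case_finished", {"case_id": 3, "status": "pass"})
print(frame, end="")
```

A backend would typically yield frames like this from a streaming response with the `text/event-stream` content type, and the browser would consume them with `EventSource`.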

Architecturally, we focused on an end-to-end QA loop:

Requirement in
Test plan generation
Autonomous execution
Structured validation
Human-readable triage output
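
The loop above can be sketched as a small async pipeline. This is an illustration under heavy assumptions: the three stage functions are stand-ins for the model-backed planner, the browser agent, and the validator, and the event names are invented. Only the ordering of the stages and the per-case lifecycle events reflect the description.

```python
import asyncio


async def plan_cases(requirement: str) -> list[str]:
    # Stand-in for the planner: one test case per sentence.
    return [s.strip() for s in requirement.split(".") if s.strip()]


async def execute_case(url: str, case: str) -> dict:
    # Stand-in for the browser agent: records what it was asked to do.
    await asyncio.sleep(0)
    return {"case": case, "url": url, "observed": f"visited {url}"}


async def validate(observation: dict) -> dict:
    # Stand-in for structured validation of the observation.
    status = "pass" if observation["observed"] else "fail"
    return {**observation, "status": status}


async def run_pipeline(url: str, requirement: str, emit) -> list[dict]:
    """Requirement in, structured results out, with lifecycle events on the way."""
    cases = await plan_cases(requirement)
    emit("plan_ready", {"count": len(cases)})
    results = []
    for case in cases:
        emit("case_started", {"case": case})
        observation = await execute_case(url, case)
        result = await validate(observation)
        results.append(result)
        emit("case_finished", {"case": case, "status": result["status"]})
    return results


events = []
results = asyncio.run(
    run_pipeline(
        "https://example.com",
        "User can log in. Cart shows total.",
        lambda name, payload: events.append((name, payload)),
    )
)
```

Passing `emit` in as a callback keeps the pipeline independent of the transport, so the same run can feed a live stream, a log, or a test.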

Challenges we faced

Building a trustworthy AI QA system required solving practical reliability issues:

External websites can block automation with login walls, CAPTCHA, or WAF protections.
Live progress needed to be stable and understandable even with partial/in-flight results.
We had to distinguish real product failures from environmental constraints.
We needed graceful fallbacks when API keys or external services are unavailable, instead of hard-failing the product.
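
As an illustration of the fallback point, the sketch below degrades to a simple template-based plan when no model is available. The environment variable name `GEMINI_API_KEY`, the function names, and the template wording are all assumptions; the write-up does not describe what the actual fallback produces.

```python
import os


def template_plan(requirement: str) -> list[str]:
    # Deterministic fallback: a single generic check built from the requirement.
    return [f"Verify: {requirement.strip()}"]


def plan_with_fallback(requirement, model_planner=None, env=os.environ):
    """Return (cases, source), where source is "model" or "template".

    GEMINI_API_KEY is an assumed variable name, not a confirmed setting.
    """
    if model_planner is None or not env.get("GEMINI_API_KEY"):
        return template_plan(requirement), "template"
    try:
        return model_planner(requirement), "model"
    except Exception:
        # A failing service degrades the plan quality instead of failing the run.
        return template_plan(requirement), "template"


cases, source = plan_with_fallback("User can log in", env={})
```

Returning the source alongside the plan lets the UI say plainly when a run used the degraded path.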

A big design decision was making “blocked” a first-class status, so teams can clearly separate “the feature is broken” from “automation couldn’t access the flow.”
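
One possible way to express that separation in code is to check for access barriers before judging the assertion at all. The signal strings, the attempt dictionary shape, and the rule that disagreeing attempts mean "flaky" are assumptions for this sketch, not Kumqat's actual classifier.

```python
# Hypothetical markers of an environmental barrier rather than a product bug.
BLOCK_SIGNALS = ("captcha", "login required", "access denied")


def classify(attempts: list[dict]) -> str:
    """Classify one test case from one or more attempts.

    Each attempt is assumed to look like:
    {"assertion_passed": bool, "page_text": str}
    """
    if not attempts:
        raise ValueError("at least one attempt is required")
    # Barriers take priority: a blocked flow says nothing about the feature.
    for attempt in attempts:
        text = attempt["page_text"].lower()
        if any(signal in text for signal in BLOCK_SIGNALS):
            return "blocked"
    outcomes = {attempt["assertion_passed"] for attempt in attempts}
    if outcomes == {True}:
        return "pass"
    if outcomes == {False}:
        return "fail"
    return "flaky"  # attempts disagreed with each other


status = classify([{"assertion_passed": False, "page_text": "Please complete the CAPTCHA"}])
```

Checking barriers first means a CAPTCHA page is never reported as a failed feature, which is the ambiguity the "blocked" status is meant to remove.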

What we learned

This project taught us that useful AI products are not just about model calls:

Structure beats raw output: developers need consistent, actionable result formats.
Transparency builds trust: explicit statuses and evidence reduce ambiguity.
Fallbacks are product features: robust degradation matters as much as best-case intelligence.
UX is critical for AI tools: real-time clarity and explainability drive adoption.