Inspiration

Startups often ship before they test, leading to broken UX, security vulnerabilities, and confused users.

We noticed that a lot of products are vibe coded: they look pretty but are dangerously untested.

Manual testing is slow, QA is expensive, and users are unpredictable.

So we thought: what if we could automate this testing process?

We also love cats, so we thought: what if each “agent” was a quirky cat with its own testing behavior?

We drew inspiration from tools like Playwright MCP and LangChain, as well as AI persona systems.

What it does

Simulates real users and attackers using LLM-powered agents.

Agents take on specific personas (e.g. hacker, grandpa, mobile-first) to explore your site or codebase.
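As a rough illustration of how a persona could be turned into LLM instructions, here is a minimal sketch. The `Persona` shape, its field names, the example values, and `buildSystemPrompt` are all hypothetical and are not taken from Pawditor's actual code.

```typescript
// Hypothetical persona shape; the field names are illustrative only.
interface Persona {
  name: string;
  goal: string;
  viewport: { width: number; height: number };
  traits: string[];
}

// Example personas matching the ones named above (values are assumptions).
const personas: Persona[] = [
  {
    name: "hacker",
    goal: "Probe inputs for XSS and injection flaws.",
    viewport: { width: 1280, height: 800 },
    traits: ["adversarial", "methodical"],
  },
  {
    name: "grandpa",
    goal: "Finish the main task with no prior knowledge of the site.",
    viewport: { width: 1024, height: 768 },
    traits: ["easily confused", "reads every label"],
  },
  {
    name: "mobile-first",
    goal: "Use the site on a small touch screen only.",
    viewport: { width: 390, height: 844 },
    traits: ["impatient", "never rotates the phone"],
  },
];

// Turn a persona into a system prompt for the LLM that drives the agent.
function buildSystemPrompt(p: Persona, targetUrl: string): string {
  return [
    `You are a "${p.name}" test user exploring ${targetUrl}.`,
    `Goal: ${p.goal}`,
    `Traits: ${p.traits.join(", ")}.`,
    `Viewport: ${p.viewport.width}x${p.viewport.height}.`,
  ].join("\n");
}
```

Keeping personas as plain data like this would also make it easy to let users define their own later.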

Automatically:

Finds broken flows, UI bugs, or bad UX

Detects common security issues (e.g., XSS, injection)

Reviews your codebase for dev alignment issues

Outputs a JSON/Markdown report with issues + suggested fixes.

Scores your app on readiness (UX, security, polish).
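To make the report and scoring idea concrete, here is a small sketch of what an issue record, a per-category score, and a Markdown rendering might look like. The `Issue` fields, the penalty weights, and the "start at 100 and subtract" formula are assumptions made for illustration; the real scoring may work differently.

```typescript
type Category = "ux" | "security" | "polish";
type Severity = "low" | "medium" | "high";

// Hypothetical issue record; serializing an array of these gives the JSON report.
interface Issue {
  category: Category;
  severity: Severity;
  title: string;
  suggestedFix: string;
}

// Assumed penalty per issue, by severity.
const PENALTY: Record<Severity, number> = { low: 5, medium: 15, high: 30 };

// Score one category: start at 100, subtract penalties, never go below 0.
function scoreCategory(issues: Issue[], category: Category): number {
  const lost = issues
    .filter((i) => i.category === category)
    .reduce((sum, i) => sum + PENALTY[i.severity], 0);
  return Math.max(0, 100 - lost);
}

// Render the same data as a Markdown report.
function toMarkdown(issues: Issue[]): string {
  const categories: Category[] = ["ux", "security", "polish"];
  const lines: string[] = ["# Readiness report", ""];
  for (const c of categories) {
    lines.push(`- ${c}: ${scoreCategory(issues, c)}/100`);
  }
  lines.push("", "## Issues");
  for (const i of issues) {
    lines.push(`- [${i.severity}] ${i.title} (fix: ${i.suggestedFix})`);
  }
  return lines.join("\n");
}
```

The JSON form of the report is then just `JSON.stringify(issues, null, 2)`.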

How we built it

Frontend: SvelteKit interface to submit URLs/repos + configure test agents.

Backend Controller: ElysiaJS server that dispatches test tasks to agent containers.

Agents:

Gemini (LLM instructions)

Playwright (site interaction)

Custom LLM <--> Playwright workflow

Security tools (LLM-based + manual checks)

Accessibility Tools: Aria snapshots
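The LLM <--> Playwright workflow can be pictured as an observe, decide, act loop. The sketch below is a simplified guess at that shape, not the project's actual implementation: `BrowserDriver`, `Planner`, `Action`, and `runAgent` are hypothetical names. In a real agent, `snapshot` could be backed by something like Playwright's `locator.ariaSnapshot()`, `perform` by `page.goto`, `page.click`, and `page.fill`, and the planner by a Gemini call.

```typescript
// Actions the LLM is allowed to request (an assumed, minimal set).
type Action =
  | { kind: "goto"; url: string }
  | { kind: "click"; selector: string }
  | { kind: "fill"; selector: string; value: string }
  | { kind: "done"; summary: string };

type BrowserAction = Exclude<Action, { kind: "done" }>;

// Thin wrapper around the browser; kept abstract so it can be faked in tests.
interface BrowserDriver {
  snapshot(): Promise<string>; // e.g. an ARIA snapshot of the page
  perform(action: BrowserAction): Promise<void>;
}

// The LLM side: given the page state and past actions, pick the next action.
type Planner = (snapshot: string, history: Action[]) => Promise<Action>;

// Observe -> decide -> act, until the planner says "done" or steps run out.
async function runAgent(
  browser: BrowserDriver,
  plan: Planner,
  maxSteps = 10,
): Promise<Action[]> {
  const history: Action[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const snapshot = await browser.snapshot();
    const action = await plan(snapshot, history);
    history.push(action);
    if (action.kind === "done") break;
    await browser.perform(action);
  }
  return history;
}
```

The `maxSteps` cap is there so a confused agent cannot loop forever, and the returned history doubles as a trace for the report.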

Challenges we ran into

Designing agent personas that behaved realistically with minimal prompt tuning.

Making LLMs simulate broken behavior or confused users (e.g., testing the UX as a senior would).

Ensuring consistent site behavior across agent runs (timing, session handling).

Managing latency between LLM calls and frontend responsiveness.

Accomplishments that we're proud of

Deployed a functional distributed swarm of agents!

Built a fun, memorable UX with cat-themed characters and branding.

Agents actually found bugs in a test site — including mobile layout issues and unhandled form errors.

Created a working loop: simulate → test → report → fix.

Made a boring concept (automated testing) feel fun and demoable.

What we learned

How to use Playwright + LLMs in custom workflows (modeled on MCP servers).

Real-world challenges of scaling LLM agents.

SvelteKit and Playwright integration techniques for smart testing flows.

What's next for Pawditor

GitHub App integration: run on every push with PR feedback.

Custom agent designer: let users create testing personas via prompt templates.

Startup Dashboard: track readiness score over time and across teammates.

More agent types (e.g., Accessibility Cat, Performance Cat, Rage Clicker Cat).

Polished hosted version with team-based workspaces.
