Inspiration

DocuFlow AI was created to simplify one of the most time-consuming parts of any process: writing documentation. When building workflows or tutorials, documenting each step manually can take longer than doing the task itself. Chrome’s built-in AI APIs, backed by the on-device Gemini Nano model, made it possible to capture user actions and generate documentation locally, without relying on servers or external AI services. The goal was to make documentation fast, accurate, and privacy-first, directly from the browser.

What it does

DocuFlow AI is a Chrome extension that automatically captures on-screen actions and converts them into step-by-step documentation enhanced with AI. It records clicks, inputs, and navigation events, takes screenshots automatically, and highlights visual elements. Then, using Chrome’s on-device Gemini Nano, it generates professional text descriptions, titles, and summaries for each step.

The extension can also validate documentation quality, detecting duplicate or missing steps, and supports multi-language translation with automatic language detection. Users can review and refine the generated guide in a live markdown editor, preview the final result, and export it as PDF or markdown for sharing or version control.
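
The duplicate-step check could be sketched roughly as follows. The step shape (`action`, `selector`, `value`) and the rule that only two consecutive identical records count as a duplicate are assumptions made for illustration, not the extension's actual validation logic.

```javascript
// Hypothetical step shape:
// { action: "click" | "input" | "navigate", selector: string, value?: string }

// Build a comparison key from the fields that identify a step.
function stepKey(step) {
  return [step.action, step.selector, step.value ?? ""].join("|");
}

// Return the indexes of steps that repeat the step immediately before them.
function findDuplicateSteps(steps) {
  const duplicates = [];
  for (let i = 1; i < steps.length; i++) {
    if (stepKey(steps[i]) === stepKey(steps[i - 1])) {
      duplicates.push(i);
    }
  }
  return duplicates;
}
```

A reviewer UI could use the returned indexes to flag steps for removal rather than deleting them automatically, since a repeated click is sometimes intentional.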

Everything happens locally inside Chrome — no servers, no API keys, full privacy — delivering instant AI-powered documentation generation right where users work.

How we built it

DocuFlow AI was built entirely with Chrome Extensions Manifest V3 and the new Chrome Built-in AI APIs. It uses:

  • Prompt API for generating natural-language step descriptions and article summaries.
  • Gemini Nano, the on-device model behind the Prompt API, for interpreting recorded user actions.
  • Translator API for multilingual documentation.
  • Chrome Storage API for local data handling and session management.
  • Chrome Scripting API and Offscreen Documents for background AI processing.
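
As a minimal sketch of how the Prompt API might be used for step descriptions: the step shape, the prompt wording, and the helper names below are hypothetical. `LanguageModel.availability()` and `LanguageModel.create()` follow Chrome's Prompt API as documented at the time of writing and may differ between Chrome versions.

```javascript
// Build a plain-text prompt from one recorded step.
// Hypothetical step shape: { action, selector, label, url }
function buildStepPrompt(step) {
  const target = step.label || step.selector;
  return [
    "Write one clear instruction sentence for a how-to guide.",
    `Action: ${step.action}`,
    `Target: ${target}`,
    `Page: ${step.url}`,
  ].join("\n");
}

// `session` is any object with an async prompt(text) method, such as the
// session returned by LanguageModel.create() in Chrome.
async function describeStep(step, session) {
  const text = await session.prompt(buildStepPrompt(step));
  return text.trim();
}

// Browser-only helper: returns a session, or null where the API is missing
// or the model cannot be used on this device.
async function createSessionIfAvailable() {
  if (typeof LanguageModel === "undefined") return null;
  const availability = await LanguageModel.availability();
  if (availability === "unavailable") return null;
  return LanguageModel.create();
}
```

Passing the session in as a parameter keeps the prompt logic testable outside Chrome, where a stub object can stand in for the model.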

All logic is implemented in vanilla JavaScript, using Chrome’s built-in capabilities. Each recorded interaction triggers a capture via content scripts, and the offscreen document communicates with the Gemini model to produce contextual, high-quality text. This architecture allows the extension to work offline, privately, and with no dependencies beyond Chrome itself.
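
The capture path described above could look something like this sketch. The record fields, the `step-recorded` message type, and the selector heuristic are assumptions for illustration. `chrome.runtime.sendMessage` is the standard extension messaging call.

```javascript
// Convert a DOM event into a small, serializable step record.
// Only visible text is read, so typed field values are not captured here.
function toStepRecord(event, pageUrl) {
  const el = event.target;
  const selector = el.id ? `#${el.id}` : el.tagName.toLowerCase();
  return {
    action: event.type,
    selector,
    label: (el.innerText || "").trim().slice(0, 80),
    url: pageUrl,
    timestamp: Date.now(),
  };
}

// In a content script: forward each click to the rest of the extension.
// The guards let this file load outside an extension page without errors.
if (
  typeof document !== "undefined" &&
  typeof chrome !== "undefined" &&
  chrome.runtime?.sendMessage
) {
  document.addEventListener(
    "click",
    (event) => {
      chrome.runtime.sendMessage({
        type: "step-recorded", // hypothetical message type
        step: toStepRecord(event, location.href),
      });
    },
    true // capture phase, so clicks are seen even if the page stops propagation
  );
}
```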

Challenges we ran into

One major challenge was synchronizing screen capture, AI generation, and UI updates in real time without affecting browser performance. Keeping the browser responsive while processing screenshots and generating AI text locally required fine-tuning. We also faced design challenges in making the interface simple enough for general users yet powerful enough for documentation professionals.
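
One common way to keep capture and generation from competing with each other is to run the heavy jobs one at a time. The sketch below shows that general technique. It is not necessarily how DocuFlow AI handles it, and `createSerialQueue` is a hypothetical helper name.

```javascript
// Runs async jobs one at a time, in the order they were added.
function createSerialQueue() {
  let tail = Promise.resolve();
  return {
    enqueue(job) {
      // Start this job only after everything queued before it has settled.
      const result = tail.then(() => job());
      // Swallow failures on the internal chain so one failed job
      // does not block the jobs queued after it.
      tail = result.catch(() => {});
      return result;
    },
  };
}
```

Each call to `enqueue` returns the job's own promise, so the UI can update as soon as an individual step's description is ready while later steps wait their turn.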

Accomplishments that we're proud of

We built a fully functional Chrome extension that turns any workflow into clean, professional documentation. DocuFlow AI integrates multiple built-in AI APIs, supports multilingual output, and works fully offline, so recorded data stays on the user's device. It demonstrates that Chrome’s on-device AI can automate documentation, a complex creative process, without external infrastructure.

What we learned

We learned how capable Chrome’s built-in AI really is for real-world automation. Understanding the Prompt API and Gemini Nano integration was key to building an extension that feels fast and local. We also gained experience optimizing memory and storage to handle image data efficiently inside the browser environment.

What's next for DocuFlow AI

The vision is to make DocuFlow AI the fastest way to document and share any process, powered entirely by Chrome’s local AI.

Built With

  • extension