Overview

What it does

OneShot is a multi-agent software factory generator. It takes a high-level user prompt describing a desired software project and autonomously generates complete, working code, associated tests, and documentation. The system orchestrates various specialized AI agents (e.g., Architect, Prompt Analyzer, Dynamic Worker) to understand requirements, plan implementation, research, generate code, and validate outputs. This includes:

  • Analyzing user prompts for project requirements.
  • Creating detailed implementation plans.
  • Generating code files, ensuring adherence to quality and best practices.
  • Producing comprehensive project documentation.
  • Handling validation logic, reflected in generated snippets such as nullValidator(t){return ab()} and i => i !== null.
  • Managing configuration dynamically, e.g. handleUpdateConfig({ allow_for_replans: checked }).
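The validation idea behind the project name can be sketched in Python (the names below are illustrative; the snippets above come from the generated frontend code, not from this sketch):

```python
from typing import Any, Callable

# A validator is just a predicate, the Python analogue of the
# generated JS check i => i !== null.
Validator = Callable[[Any], bool]

def null_validator(value: Any) -> bool:
    """Reject None values: the 'null-checked' guarantee."""
    return value is not None

def validate_all(value: Any, validators: list[Validator]) -> bool:
    """Run a value through every validator in the chain."""
    return all(check(value) for check in validators)
```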

How it works

The application is an agentic pipeline that uses the following strategy:

  • First, an architect agent uses the user's prompt to plan the task, decide how many agents to use and which kinds, and define success criteria and tests.
  • Then the planned agents are built and executed. Each agent builds its own part of the application, working on its own assigned files. Agents communicate with each other, sharing the functions, classes, and methods from their respective files.
  • After an agent finishes, its code is reviewed; if something is wrong, the agent retries.
  • Finally, the application-specific unit tests are executed. We have not yet implemented retries on unit-test failure, but the results tell us what works as expected and what needs improvement.
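The review-and-retry step above can be sketched as a plain Python loop. This is a minimal illustration only: the agent and reviewer internals are stubbed, and MAX_RETRIES is an assumed limit, not the project's actual value.

```python
from dataclasses import dataclass, field

MAX_RETRIES = 3  # assumed limit for illustration

@dataclass
class WorkerAgent:
    name: str
    files: list[str]
    # functions/classes shared by other agents' files
    shared_context: dict = field(default_factory=dict)

    def build(self) -> dict[str, str]:
        # In the real pipeline an LLM generates these files; stubbed here.
        return {f: f"# code for {f}\n" for f in self.files}

def review(code: dict[str, str]) -> bool:
    # Stub for the review step; in the real pipeline this is itself agent-driven.
    return all(src.strip() for src in code.values())

def run_agent(agent: WorkerAgent) -> dict[str, str]:
    """Build the agent's files, retrying while the review fails."""
    for _attempt in range(MAX_RETRIES):
        code = agent.build()
        if review(code):
            return code
    raise RuntimeError(f"{agent.name} failed review {MAX_RETRIES} times")
```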

How to run it

Go to our repository (main branch), open the backend/ folder, install the dependencies (you can use pyproject.toml or pipreqs), then run python test_cli.py --help to see the possible ways to execute it. The frontend is currently only integrated with an older, more "static" version of our agentic pipeline on the dev branch, so the CLI is the recommended entry point.

Current state of the project and future steps

  • The project is fully functional and usable.
  • The frontend integration only covers an older backend version on the dev branch.
  • The codebase needs refactoring (removing legacy code, deleting redundant files, improving the structure).

Inspiration

OneShot was inspired by the need for more robust, error-free software development, particularly preventing common issues like null references and ensuring comprehensive validation. The vision was to automate and enhance the software creation process through intelligent, cooperative agents, producing high-quality, 'null-checked' code from the ground up.

How we built it

The system is built around a sophisticated multi-agent architecture using Python and LangChain. At its core, it leverages the deepagents library for creating and managing autonomous agents. Each agent (e.g., PromptAnalyzerAgent, ResearchPlannerAgent, ArchitectAgent, DynamicWorkerAgent) is instantiated with a specific role and system prompt, enabling it to perform specialized tasks within the software development lifecycle. The pipeline orchestrates these agents sequentially, passing contextual information and generated artifacts between them. Robust error handling, including JSON serialization and deserialization, was critical for inter-agent communication.
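The sequential orchestration described above can be sketched as follows. This is a hedged simplification: the Agent class and role names mirror the agents listed in the text, but Agent.run is a stub where the real pipeline would invoke a deepagents deep_agent, and the JSON round-trip stands in for the serialization the text mentions.

```python
import json
from dataclasses import dataclass

@dataclass
class Agent:
    role: str
    system_prompt: str

    def run(self, context: dict) -> dict:
        # Stub: the real implementation wraps a deepagents deep_agent
        # driven by self.system_prompt.
        return {**context, f"{self.role}_output": f"artifact from {self.role}"}

PIPELINE = [
    Agent("prompt_analyzer", "Extract requirements from the user prompt."),
    Agent("research_planner", "Plan research for the requirements."),
    Agent("architect", "Design the implementation plan."),
    Agent("dynamic_worker", "Generate the code files."),
]

def run_pipeline(user_prompt: str) -> dict:
    context = {"prompt": user_prompt}
    for agent in PIPELINE:
        # Round-trip artifacts through JSON so each agent receives
        # plain, serializable data.
        context = json.loads(json.dumps(agent.run(context)))
    return context
```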

Challenges we ran into

The primary challenges involved:

  1. Integrating deepagents: Ensuring that all agents correctly instantiated and invoked the deep_agent pattern, especially during dynamic execution.
  2. Complex Data Flows: Managing the various data types (strings, dictionaries, lists, LangChain message objects) being passed between agents and ensuring proper JSON serialization and deserialization at each step. This led to errors like 'dict' object has no attribute 'strip' and 'list' object has no attribute 'strip'.
  3. LLM Output Consistency: Guiding the LLMs to consistently produce valid JSON and complete, working code without placeholders (e.g., # TODO comments or pass statements) in generated files.
  4. Error Handling and Fallbacks: Implementing robust error recovery mechanisms and fallback plans when LLM responses were malformed or unexpected.
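The 'strip' errors above come from calling string methods on non-string payloads. A minimal sketch of the kind of normalization that addresses challenges 2-4 (the function names here are illustrative, not the project's actual helpers):

```python
import json
from typing import Any

def to_text(payload: Any) -> str:
    """Coerce any agent output (str, dict, list, or message-like object)
    to text before calling string methods such as .strip()."""
    if isinstance(payload, str):
        return payload
    if hasattr(payload, "content"):  # LangChain messages expose .content
        return to_text(payload.content)
    if isinstance(payload, (dict, list)):
        return json.dumps(payload)
    return str(payload)

def parse_json_or_fallback(raw: Any, fallback: dict) -> dict:
    """Try to parse an LLM response as JSON; return a fallback plan
    when the response is malformed or unexpected."""
    text = to_text(raw).strip()
    # Strip the markdown code fences LLMs sometimes wrap around JSON.
    if text.startswith("```"):
        text = text.strip("`").removeprefix("json").strip()
    try:
        parsed = json.loads(text)
        return parsed if isinstance(parsed, dict) else fallback
    except json.JSONDecodeError:
        return fallback
```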

Accomplishments that we're proud of

We successfully built and debugged a functional multi-agent pipeline capable of transforming a text prompt into a software project. We are particularly proud of:

  • The seamless integration of the deepagents framework across multiple specialized agents.
  • Overcoming complex JSON parsing and serialization issues across diverse data structures.
  • Establishing a reliable process for AI agents to generate structured, production-ready code.

What we learned

We gained significant insights into:

  • The intricacies of designing and implementing multi-agent systems for complex tasks.
  • Best practices for integrating and orchestrating Large Language Models in a structured workflow.
  • Advanced strategies for robust error handling, especially concerning data serialization and unexpected LLM outputs.
  • The importance of meticulous prompt engineering to guide AI agents toward precise and complete code generation.

What's next for null.checked

See you next year!

Built With

  • fastapi
  • langchain
  • langgraph
  • react