Inspiration

First came foundation models, then AI agents built on top of them. Now we have agentic workflows: several agents with different specialized abilities working together to solve a complex problem. Existing tools for creating these workflows are hard to use despite their low-code or no-code, modular drag-and-drop designs. The more complex the problem, the more specialized components you have to add and connect by hand. These tools have a steep learning curve, and you have to bend your task to fit the tool set each application provides.

We want to change this paradigm of agentic workflow creation.

What it does

Bob, the agent builder, takes a single text prompt as input and breaks it down into the subtasks needed to solve the whole problem. The interface is the familiar one of blocks and connecting edges, but the entire graph is now generated automatically, and each block can be any agentic workflow, from something as simple as code generation or LLM reasoning to something as complex as computer use.

A global planning agent chooses the best tool for each subtask by weighing efficiency, the nature of the subtask, and its input and output requirements. The entire agentic workflow is generated hierarchically, with checks at each stage to ensure consistency and logical soundness. You just tell it what to do and it designs the best way to do it.

How we built it

We designed generalized JSON schemas to represent tasks and the links used for inter-task communication. A planning agent transforms the user prompt into a task object that follows this schema. We then iteratively pick the best tool for a task and, if required, break it down into simpler tasks. At each stage we keep track of the inputs and outputs that connect these tasks. The entire graph of agentic blocks is tracked with LangGraph. This way we can swap the planning model for any model of our choice: Perplexity, OpenAI, Gemini, or even custom fine-tuned models for your domain.
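As a rough illustration of the task schema and the pick-a-tool-or-split loop described above, here is a minimal Python sketch. The field names (`description`, `inputs`, `outputs`, `tool`, `subtasks`), the `decompose` function, and the toy `pick_tool` and `split` callables are all hypothetical stand-ins for the planning agent; the project's actual schemas and LangGraph wiring are not shown in this write-up.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import Callable, List, Optional


@dataclass
class Task:
    """Hypothetical task object; links are referenced by name."""
    description: str
    inputs: List[str] = field(default_factory=list)    # names of links this task reads
    outputs: List[str] = field(default_factory=list)   # names of links this task writes
    tool: Optional[str] = None                          # set when one tool can solve the task
    subtasks: List["Task"] = field(default_factory=list)


def decompose(task: Task,
              pick_tool: Callable[[Task], Optional[str]],
              split: Callable[[Task], List[Task]],
              depth: int = 0,
              max_depth: int = 3) -> Task:
    """Assign a tool if one fits; otherwise split and recurse on the subtasks."""
    tool = pick_tool(task)
    if tool is not None or depth >= max_depth:
        # Assumption: fall back to plain LLM reasoning at the depth limit.
        task.tool = tool or "llm_reasoning"
        return task
    task.subtasks = [decompose(t, pick_tool, split, depth + 1, max_depth)
                     for t in split(task)]
    return task


def pick_tool(task: Task) -> Optional[str]:
    # Toy keyword rule standing in for the planning model's choice.
    if "search" in task.description:
        return "llm_search"
    if "plot" in task.description:
        return "code_generation"
    return None


def split(task: Task) -> List[Task]:
    # Toy fixed split standing in for LLM-driven decomposition.
    return [
        Task("search for data", outputs=["data"]),
        Task("plot the data", inputs=["data"], outputs=["chart"]),
    ]


root = decompose(Task("report on GPU prices", outputs=["chart"]), pick_tool, split)
print(json.dumps(asdict(root), indent=2))
```

Because the planner is passed in as plain callables, replacing it with a different model only means supplying different `pick_tool` and `split` functions.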

The agents currently support tools such as computer use with Scrapybara, LLM-based search and planning, and code generation and deterministic execution. Tasks at leaf nodes start running immediately and wait for the outputs of other subtasks to be ready before taking them as inputs, which maximizes parallelism.
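The scheduling idea, where every leaf task starts at once and blocks only on the links it reads, can be sketched with asyncio roughly as follows. The `Link` class, `run_task`, and the three toy tasks are hypothetical names introduced for illustration, not the project's actual code.

```python
import asyncio


class Link:
    """Hypothetical single-value slot: consumers wait until the producer sets it."""

    def __init__(self):
        self._event = asyncio.Event()
        self._value = None

    def set(self, value):
        self._value = value
        self._event.set()

    async def get(self):
        await self._event.wait()
        return self._value

    def reset(self):
        # Clearing the value lets the same graph be run again.
        self._event.clear()
        self._value = None


async def run_task(name, fn, inputs, output, log):
    # Wait for every input link, then run the task body and publish its result.
    args = [await link.get() for link in inputs]
    log.append(name)
    output.set(fn(*args))


async def main():
    # Links are created inside the running event loop.
    links = {key: Link() for key in ("a", "b", "total")}
    log = []
    # All tasks are launched together; "sum" simply waits on its inputs.
    await asyncio.gather(
        run_task("sum", lambda a, b: a + b, [links["a"], links["b"]], links["total"], log),
        run_task("fetch_a", lambda: 2, [], links["a"], log),
        run_task("fetch_b", lambda: 3, [], links["b"], log),
    )
    return await links["total"].get(), log


total, log = asyncio.run(main())
print(total, log)
```

No explicit topological ordering is needed in this sketch: the dependency order falls out of which links each task awaits.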

Challenges we ran into

  1. Architecting the design of an agentic workflow: Making a consistent schema for tasks, tools, and links was challenging. We wanted to build something that could decompose any task while being both universal and compact. We iterated through several designs before landing on one that made sense and could accurately and comprehensively capture our problem structure.
  2. The problem of splitting tasks into subtasks: We employed different heuristics, prompt engineering, and chain-of-thought techniques to ensure that the task decomposition was just right. Initially we ran into situations where the subtasks were either too many and too simple or, at the other extreme, too similar to the parent task.
  3. Managing inter-task communication: We used asyncio to ensure that we could spawn tasks as soon as we reached the leaf nodes of a subtask while also waiting for their dependencies to be ready. Moreover, we designed the links so that the whole agentic workflow can be reused by just resetting the link values.
  4. Handling error checking for smaller tasks to prevent single points of failure: We run error checking for JSON schema validation and other kinds of structured output, and re-prompt the LLM until a valid output is generated. This way the whole workflow doesn't fail when a single subtask fails.
  5. Building a flexible front-end: We wanted to build a familiar block interface while also giving users the power of an entire agent builder inside each block. We can modify the contents of a task block to regenerate the agentic workflow with this new component while the parents in the tree are unaffected. The users can also manually connect blocks that they have constructed and blocks can be made to solve any problem, simple or complex, with the desired I/O format.
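The validate-and-re-prompt loop from item 4 might look something like the sketch below. The required keys, the `generate_valid` function, the `max_attempts` cap, and the stubbed `call_llm` are assumptions made for illustration; the write-up does not say how many retries are allowed or what the real schema checks are.

```python
import json


def validate_task(obj):
    """Return a list of problems; an empty list means the object is acceptable."""
    if not isinstance(obj, dict):
        return ["top-level value must be an object"]
    errors = []
    # Hypothetical required fields and their expected Python types.
    for key, typ in (("description", str), ("inputs", list), ("outputs", list)):
        if not isinstance(obj.get(key), typ):
            errors.append(f"'{key}' must be a {typ.__name__}")
    return errors


def generate_valid(call_llm, prompt, max_attempts=3):
    """Call the model until its output parses and validates, feeding errors back."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        raw = call_llm(prompt + feedback)
        try:
            obj = json.loads(raw)
            errors = validate_task(obj)
        except json.JSONDecodeError as exc:
            errors = [f"invalid JSON: {exc.msg}"]
        if not errors:
            return obj, attempt
        # Tell the model what was wrong so the next attempt can correct it.
        feedback = "\nPrevious output was rejected: " + "; ".join(errors)
    raise ValueError(f"no valid output after {max_attempts} attempts")


# Stub model: fails twice, then returns a valid task object.
replies = iter([
    "not json",
    '{"description": "x"}',
    '{"description": "x", "inputs": [], "outputs": ["y"]}',
])
task, attempts = generate_valid(lambda prompt: next(replies), "Describe the task as JSON.")
print(task, attempts)
```

Capping the number of attempts is a design choice of this sketch: it turns an endlessly failing subtask into an explicit error that the caller can handle.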

Accomplishments that we're proud of

We are blown away by what this system is capable of. We exceeded our expectations by putting together several new technologies, including computer-use agents (Scrapybara), LangChain, and structured output and reasoning with LLMs, and by building and managing inter-task communication in an agentic workflow. We strongly believe that automating agentic workflow creation is the future.

What's next for Bob: The Agent Builder

There are several features we want to add to our agentic workflow:

  1. EigenLayer verification for agent computations on Web3
  2. Streaming user input during task computation and dynamically creating user input fields during evaluation
  3. Integrating existing vertical AI SaaS tools so that our agentic system can pick specialized components, reducing the number of subtask decompositions needed
  4. Integrating this with user edits from the front-end for dynamically regenerating the computation graph
