Inspiration

This is not another AI Agent that can help you manage your Google calendar, summarize your Zoom meeting notes, or some simple ChatGPT wrapper that you’ve already seen countless times.

It seems like 99% of AI agents can't do much more than manage people's Google Calendar. Yes, there's a huge gap between the strong reasoning capabilities of rapidly evolving LLMs and the APIs/tools that actually finish the last mile. ToolsGuru's mission is to bridge that gap by unlocking unlimited use cases with API generation on the fly, while your agent is reasoning.

What it does

All the AI agents out there right now contain a list of predefined plugins you can use and actions you can execute with them, and they operate only within that realm.

What if you had an agent with unbounded possibilities? What if it could do literally anything you ask it to, without you having to manually configure anything?

We built ToolsGuru to follow the First Principle. You can ask it to do anything, and it resolves every action it needs to perform by, you guessed it... generating it!

Let's say you want to order food from UberEats. You're too lazy to scroll through all the restaurants on your phone, so you tell ToolsGuru to do it. There is no UberEats search API to call, since Uber hasn't released one to the public. ToolsGuru simply uses Selenium to log in to UberEats and reverse engineers the network traffic to generate an API! UberEats search API, delivered within 5 minutes. It's as easy as that.

You can repeat this process for any action or workflow you want to perform on the internet, and ToolsGuru will learn it and save it for later use. So the next time you want to order food on UberEats, the agent gets it done from natural language alone.

How we built it

We built it using Integuru and PICA. For the API generation part, we track the network traffic, cookies, and everything else as the user navigates in Selenium to the final destination (clicking a button, or just ending at a desired state). Then we reverse engineer the dependencies needed to make that final request by building a DAG. Finally, we use an LLM to trace the DAG back to the very beginning, where the auth token/cookies are the only thing we need to call the API.
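To make the DAG idea concrete, here is a minimal sketch, not the actual Integuru implementation. It assumes each captured request has already been reduced to the dynamic values it sends (`needs`) and the ones its response returns (`provides`). The `CapturedRequest` class, the function names, and the example request names are all hypothetical.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # Python 3.9+


@dataclass
class CapturedRequest:
    """One request/response pair recorded in the browser (hypothetical shape)."""
    name: str
    needs: set = field(default_factory=set)     # dynamic values the request sends
    provides: set = field(default_factory=set)  # dynamic values in its response


def build_dependency_graph(requests, session_values):
    """Map each request to the earlier requests whose responses supply its inputs."""
    producers = {}  # value -> name of the first request that returned it
    graph = {}
    for req in requests:  # requests are assumed to be in capture order
        deps = set()
        for value in req.needs:
            if value in session_values:
                continue  # supplied directly by the auth token/cookies
            if value in producers:
                deps.add(producers[value])
        graph[req.name] = deps
        for value in req.provides:
            producers.setdefault(value, req.name)
    return graph


def call_order(graph, target):
    """Return only the requests needed to reach `target`, dependencies first."""
    needed, stack = set(), [target]
    while stack:
        node = stack.pop()
        if node not in needed:
            needed.add(node)
            stack.extend(graph[node])
    subgraph = {name: graph[name] for name in needed}
    return list(TopologicalSorter(subgraph).static_order())


# Made-up capture of a search flow; only the session cookie comes from login.
captured = [
    CapturedRequest("load_home", {"session_cookie"}, {"csrf_token"}),
    CapturedRequest("get_location", {"session_cookie", "csrf_token"}, {"place_id"}),
    CapturedRequest("track_analytics", {"session_cookie"}, set()),
    CapturedRequest("search", {"csrf_token", "place_id"}, set()),
]
graph = build_dependency_graph(captured, session_values={"session_cookie"})
order = call_order(graph, "search")
```

Here `order` comes out as `["load_home", "get_location", "search"]`: the analytics call is dropped because nothing on the path to `search` depends on it, and the only external input left is the session cookie.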

For the integration with PICA, we simply wrap the whole process in a FastAPI service. We created two native tools within PICA: APIGenerationTool, which creates the prompt used to generate the required API, and DeployTool, which deploys the generated tool/API back to PICA so it can be used within the same chat session!
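The generate-then-deploy loop can be sketched without any framework. This is only an illustration of the idea, not PICA's or FastAPI's API: `ToolRegistry`, `build_generation_prompt`, and `deploy_tool` are hypothetical stand-ins for what APIGenerationTool and DeployTool do, and the "generated" source is hardcoded here instead of coming from an LLM.

```python
class ToolRegistry:
    """In-memory stand-in for the set of tools available in a chat session."""

    def __init__(self):
        self._tools = {}

    def deploy(self, name, func):
        self._tools[name] = func

    def call(self, name, **kwargs):
        if name not in self._tools:
            raise KeyError(f"tool {name!r} has not been deployed")
        return self._tools[name](**kwargs)


def build_generation_prompt(goal, site):
    """Roughly what an APIGenerationTool might produce (hypothetical wording)."""
    return (
        f"Generate a Python function that performs '{goal}' on {site} "
        "using only the captured requests and the user's auth cookies."
    )


def deploy_tool(registry, name, source):
    """Roughly what a DeployTool might do: load generated code and register it."""
    namespace = {}
    # exec() on generated code is unsafe outside a sandbox; shown only for brevity.
    exec(source, namespace)
    registry.deploy(name, namespace[name])


registry = ToolRegistry()
prompt = build_generation_prompt("search restaurants", "UberEats")

# Hardcoded stand-in for LLM output; a real tool would replay captured requests.
generated_source = '''
def search_restaurants(query):
    return {"query": query, "results": []}
'''

deploy_tool(registry, "search_restaurants", generated_source)
result = registry.call("search_restaurants", query="ramen")
```

The point of the design is that the registry is mutated at runtime, so a tool generated in the middle of a conversation is callable a moment later in that same session.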

Challenges we ran into

A lot of websites still can't be reverse engineered (or we just need to get better lol). It is very hard to get authenticated on the first try; sometimes it takes 5 tries to get it working. But basically we only need 1 success, and then we can happily add the API to our tool library.

It was also hard to integrate the whole process with PICA at first. Because the tool is generated, it took a lot of prompt tuning to make DeployTool work in one shot. In the end we got there by using an LLM to directly rewrite the FastAPI backend as well as the tools router in PICA, which made the deployment process as smooth as possible.
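The "we only need 1 success" approach amounts to a bounded retry loop that saves the first working result. A minimal sketch, assuming a cap of 5 tries; the function names, the `tool_library` dict, and the simulated failure are all made up for illustration.

```python
def generate_with_retries(attempt, max_tries=5):
    """Call `attempt(try_number)` until it succeeds or `max_tries` is reached."""
    last_error = None
    for try_number in range(1, max_tries + 1):
        try:
            return attempt(try_number), try_number
        except Exception as exc:  # a real version would catch narrower errors
            last_error = exc
    raise RuntimeError(f"no success after {max_tries} tries") from last_error


tool_library = {}  # hypothetical store of APIs that worked at least once


def flaky_login_and_generate(try_number):
    # Stand-in for one login + generation run; fails on the first two tries.
    if try_number < 3:
        raise ConnectionError("authentication failed")
    return "generated search API"


api, tries_used = generate_with_retries(flaky_login_and_generate)
tool_library["ubereats_search"] = api  # saved once, reused from then on
```

In this simulated run the third try succeeds, so the loop stops early and the result is stored for reuse.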

Accomplishments that we're proud of

We were able to unlock a lot of interesting use cases.

  • Generated an Uber/Lyft comparison API that lets us compare prices for getting to the same destination
  • Generated an UberEats search API so you don't have to spend 15 minutes a day struggling with what to eat tonight
  • Generated a Steam games lookup API so you'll never be bored over a long weekend
  • Unlocked CRAZY potential for the agent space. Until now, agent design has been top-down: each agent is created for a predetermined purpose, for example a tax agent or a calendar management agent. With ToolsGuru's on-the-fly API generation, we can forget about top-down design and focus on what LLM reasoning models are good at. If we don't have an API, just generate it! That really unleashes what agents can do for us, i.e. agents with ToolsGuru truly follow the First Principle and self-evolve faster.

What we learned

Reverse engineering is fun. Reasoning models are powerful. First Principle always wins.

What's next for ToolsGuru

Make the whole process more robust. Scale it so tool generation is more reliable. Reduce the error rate when generating new APIs on the fly. Build a better auth process.
