Inspiration

At my workplace, we frequently run into issues with large log files, and we don't have the time or resources to analyze them. I was motivated to find a solution to this problem, and ChatGPT came to the rescue!

What it does

QueryPlugin lets you specify any public URL and run natural language queries against the data in it!

How we built it

We leveraged open source projects such as the ChatGPT retrieval plugin and the AI Sidekick project, which automates data connection with external data sources, to simplify the implementation. At its heart, QueryPlugin uses vector embeddings and similarity search to find answers to user queries.
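To give a feel for the embedding-plus-similarity-search idea, here is a minimal sketch in Python. It is not our actual implementation: in the real project the embeddings come from a model and the search runs in a vector database, whereas this toy version assumes plain word counts as the "embedding" and an in-memory list as the "index". The function names (embed, cosine, top_k) and the sample log lines are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query as (score, chunk) pairs."""
    q = embed(query)
    scored = [(cosine(q, embed(chunk)), chunk) for chunk in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Hypothetical log lines standing in for data fetched from a public URL.
log_lines = [
    "2023-05-01 12:00:01 ERROR database connection timeout",
    "2023-05-01 12:00:05 INFO user login succeeded",
    "2023-05-01 12:00:09 WARN disk usage at 91 percent",
]

results = top_k("database connection timeout", log_lines, k=1)
for score, line in results:
    print(f"{score:.2f}  {line}")
```

Swapping embed for a call to a real embedding model and top_k for a vector database query gives the general shape of the pipeline; the ranking logic stays the same.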

Challenges we ran into

  1. Vector database unavailability
  2. Pinecone index creation delays
  3. Poor Weaviate documentation
  4. Unpredictable ChatGPT query generation behavior (had to reinvent the wheel)
  5. Spent a lot of time setting up the data connection (should have used Sidekick right from the start!)

Accomplishments that we're proud of

We have been able to connect ChatGPT to any public data source and unleash its power on the whole wide web! The possibilities are boundless! In particular, we are targeting log analysis.

What we learned

  1. The vector database data connection process sucks big time! We had to use a lot of workarounds to get things going.
  2. Leverage open source projects as much as possible! We didn't have to waste 6-7 hours setting up data connectors ourselves.
  3. We learned how to make ChatGPT send queries to the plugin effectively. This is not covered in the documentation, so we had to find our own way to make it reliable.

What's next for QueryPlugin

  1. Integrating more data sources such as local files
  2. Modifying the vector search logic to make queries more efficient
  3. Scaling up the system so it can handle larger log files
  4. Allowing more flexibility in writing natural language queries to the plugin

Built With

  • fastapi
  • openai
  • pinecone
  • python
  • replit
  • sidekick
  • uvicorn
  • weaviate
  • zilliz