Inspiration

All of us work with data every day, and we know how painful it can be to get quick insights.

_Think of those who don't know how to query their own company's data._

With Datazz, even a non-technical person can query a database and find insights using plain natural language.

Beyond building a simple GPT wrapper, we implemented a variety of prompt and result engineering techniques to improve the quality and accuracy of the generated queries.

Overall, our passion for data, the power of GPT/Copilot, and the creativity of Midjourney inspired us to develop an analytical tool that lives directly in your messaging platform and complements traditional BI platforms.

What it does

Datazz converts natural language to SQL queries, enabling users to produce analytical insights more easily.

We preprocess the user's prompt to identify the relevant parts of the schema, then format those parts for inclusion in the prompt sent to the model.

We also include example queries for the given SQL dialect to improve accuracy.
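The preprocessing and example-query steps can be sketched roughly as follows. This is a minimal illustration and not Datazz's actual code: the `Table` and `Example` types, the keyword-overlap heuristic in `relevantTables`, and the comment-style prompt layout in `buildPrompt` are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Table is a hypothetical, simplified view of one table in a saved schema.
type Table struct {
	Name    string
	Columns []string
}

// Example pairs a question with a known-good query in the target dialect.
type Example struct {
	Question string
	SQL      string
}

// tokens lowercases s, splits it on anything that is not a letter or digit,
// and also records a crude singular form ("orders" -> "order").
func tokens(s string) map[string]bool {
	words := strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
	set := make(map[string]bool, len(words))
	for _, w := range words {
		set[w] = true
		if len(w) > 3 && strings.HasSuffix(w, "s") {
			set[strings.TrimSuffix(w, "s")] = true
		}
	}
	return set
}

// relevantTables keeps the tables whose name or column names share at least
// one word with the question (an assumed heuristic, chosen for simplicity).
func relevantTables(question string, schema []Table) []Table {
	asked := tokens(question)
	var picked []Table
	for _, t := range schema {
		names := tokens(t.Name + " " + strings.Join(t.Columns, " "))
		for w := range names {
			if asked[w] {
				picked = append(picked, t)
				break
			}
		}
	}
	return picked
}

// buildPrompt formats the dialect, the selected tables, and the example
// queries as SQL comments, and ends with "SELECT" so that a completion
// model continues the query.
func buildPrompt(dialect string, tables []Table, examples []Example, question string) string {
	var b strings.Builder
	fmt.Fprintf(&b, "-- Dialect: %s\n-- Schema:\n", dialect)
	for _, t := range tables {
		fmt.Fprintf(&b, "-- %s(%s)\n", t.Name, strings.Join(t.Columns, ", "))
	}
	for _, ex := range examples {
		fmt.Fprintf(&b, "-- Question: %s\n%s\n", ex.Question, ex.SQL)
	}
	fmt.Fprintf(&b, "-- Question: %s\nSELECT", question)
	return b.String()
}

func main() {
	schema := []Table{
		{Name: "customers", Columns: []string{"id", "name", "country"}},
		{Name: "orders", Columns: []string{"id", "customer_id", "total"}},
		{Name: "products", Columns: []string{"id", "title", "price"}},
	}
	examples := []Example{
		{Question: "How many customers are there?", SQL: "SELECT COUNT(*) FROM customers;"},
	}
	question := "How many orders did each customer place?"
	fmt.Println(buildPrompt("PostgreSQL", relevantTables(question, schema), examples, question))
}
```

Trimming the schema this way keeps the prompt short for large databases, at the cost of missing tables that the question refers to only indirectly.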

We implemented Datazz as a Discord bot to make it accessible to anyone.

How we built it

  • Datazz is a Go-based application that can connect to any SQL database, then retrieve and save that database’s schema.
  • When a user enters a natural language prompt, Datazz combines it with its knowledge of the database: it identifies which parts of the schema are relevant to the requested query and includes them in a high-quality prompt.
  • Datazz then sends the engineered prompt to GPT-3’s text-davinci-003 model and returns the generated SQL query along with a table of results for the user’s original request.
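As an illustration of the schema step in the first bullet, the sketch below groups column rows into tables and serializes them as JSON for saving. It is a sketch under stated assumptions: the `ColumnRow`, `Column`, and `Table` types and the `groupSchema` helper are hypothetical, the JSON layout is invented for the example, and the `information_schema.columns` query only applies to databases that expose that view (for example PostgreSQL and MySQL). The query is shown for context and is not executed here, so no database driver is needed to run the example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// columnsQuery lists columns on databases that expose information_schema.
// It is shown for context only; a real tool would likely also filter by
// table_schema to skip system tables.
const columnsQuery = `SELECT table_name, column_name, data_type
FROM information_schema.columns
ORDER BY table_name, ordinal_position`

// ColumnRow is one row of the result of columnsQuery (hypothetical type).
type ColumnRow struct {
	Table, Column, Type string
}

// Column and Table describe the saved schema (assumed JSON layout).
type Column struct {
	Name string `json:"name"`
	Type string `json:"type"`
}

type Table struct {
	Name    string   `json:"name"`
	Columns []Column `json:"columns"`
}

// groupSchema collects rows into tables, keeps column order within each
// table, and sorts tables by name so the saved file is deterministic.
func groupSchema(rows []ColumnRow) []Table {
	byName := map[string]*Table{}
	for _, r := range rows {
		t, ok := byName[r.Table]
		if !ok {
			t = &Table{Name: r.Table}
			byName[r.Table] = t
		}
		t.Columns = append(t.Columns, Column{Name: r.Column, Type: r.Type})
	}
	tables := make([]Table, 0, len(byName))
	for _, t := range byName {
		tables = append(tables, *t)
	}
	sort.Slice(tables, func(i, j int) bool { return tables[i].Name < tables[j].Name })
	return tables
}

func main() {
	// In-memory stand-in for rows scanned from the database.
	rows := []ColumnRow{
		{"orders", "id", "integer"},
		{"orders", "total", "numeric"},
		{"customers", "id", "integer"},
		{"customers", "name", "text"},
	}
	out, err := json.MarshalIndent(groupSchema(rows), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out)) // this JSON could be written to disk and reloaded later
}
```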

Sometimes Datazz needs a few tries before it gets the query the way you want it, but with a slight nudge in the right direction, it will answer your questions with ease!
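Because a first attempt is not always usable, one plausible result-side check is to clean the model's completion and reject anything that is not a single read-only statement before running it. The `extractSelect` helper below is hypothetical and deliberately naive: it cuts at the first semicolon (so a semicolon inside a string literal would truncate the query) and only checks the leading keyword, which makes it a sanity check rather than a security boundary.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// extractSelect trims a raw completion, keeps only the first statement,
// and returns an error unless that statement starts with SELECT.
func extractSelect(completion string) (string, error) {
	sql := strings.TrimSpace(completion)
	// Drop anything the model wrote after the first statement.
	if i := strings.Index(sql, ";"); i >= 0 {
		sql = strings.TrimSpace(sql[:i])
	}
	if sql == "" {
		return "", errors.New("empty completion")
	}
	if !strings.HasPrefix(strings.ToUpper(sql), "SELECT") {
		return "", fmt.Errorf("refusing non-SELECT statement: %q", sql)
	}
	return sql, nil
}

func main() {
	for _, completion := range []string{
		"  SELECT id, total FROM orders WHERE total > 100;  ",
		"DROP TABLE orders",
	} {
		sql, err := extractSelect(completion)
		if err != nil {
			// A caller could ask the user to rephrase, or retry the prompt.
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("accepted:", sql)
	}
}
```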

Challenges we ran into

Parsing and saving the data, prompt engineering, integrating into Discord, and ensuring quality query performance.

Accomplishments that we're proud of

A performant text-to-SQL Discord bot with moderately accurate query results.

What we learned

Large language models are powerful, but they require fine-tuning for optimal performance.

What's next for Datazz

  • Expand the query capabilities
  • Support NoSQL databases (MongoDB, Cassandra)
  • Implement a Slack bot for Datazz-core

