RAGatha Chatbox

Inspiration

Our previous experience with OpenAI's ChatGPT inspired us to build our project on OpenAI's models.

What it does

The Retrieval-Augmented Generation (RAG) Chatbot is an OpenAI-powered program that answers user questions based on the ICDS ROAR user guide. When the user types in a question, the chatbot searches the loaded text of the user guide, retrieves the relevant passages, and passes them to an OpenAI model to generate an answer.
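The retrieve-then-generate flow described above can be sketched roughly as follows. This is only an illustration, not our actual code: the project relied on LangChain, OpenAI, and Chroma, whereas this sketch uses a plain word-count cosine similarity as a stand-in for the embedding search, and it stops at building the prompt instead of calling a model. The function names and the placeholder passages are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counter objects)."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, k=2):
    """Return up to k chunks that share words with the query, best match first."""
    query_vec = Counter(tokenize(query))
    scored = [(cosine_similarity(query_vec, Counter(tokenize(c))), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks with no overlap at all so they never reach the prompt.
    return [chunk for score, chunk in scored[:k] if score > 0]


def build_prompt(query, retrieved):
    """Combine the retrieved passages and the question into one prompt string."""
    context = "\n".join(f"- {c}" for c in retrieved)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )


# Placeholder passages, NOT real text from the user guide.
chunks = [
    "Placeholder passage: batch jobs are submitted through a job scheduler.",
    "Placeholder passage: storage quotas limit how much data each user can keep.",
    "Placeholder passage: users connect to the cluster with an SSH client.",
]

if __name__ == "__main__":
    question = "How do I submit a batch job?"
    # In the real chatbot this prompt would be sent to an OpenAI model.
    print(build_prompt(question, retrieve(question, chunks)))
```

In the real application, the retrieval step would be handled by a vector store and the final prompt would be sent to the language model. The sketch only shows how the two stages fit together.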

How we built it

First, we copied the text from the ICDS ROAR user guide into a .txt file. Since the user guide included multiple images and diagrams, we used ChatGPT to generate a detailed text description of each image and added these to the .txt file in place of the images themselves. Then, we created a GUI-based chatbot application using Python's Tkinter module and used the LangChain library to work with OpenAI for document retrieval and language processing.
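As a rough illustration of the text preparation step, the sketch below swaps image markers for written descriptions and then splits the text into overlapping chunks for retrieval. The `[IMAGE: name]` marker format, the function names, and the chunk sizes are all assumptions made for this example. In our project the descriptions were pasted in by hand and the splitting was left to LangChain.

```python
import re


def inline_image_descriptions(text, descriptions):
    """Replace each [IMAGE: name] marker with its text description."""
    def substitute(match):
        name = match.group(1).strip()
        # Leave the marker untouched if no description was written for it.
        return descriptions.get(name, match.group(0))

    return re.sub(r"\[IMAGE:([^\]]+)\]", substitute, text)


def split_into_chunks(text, chunk_size=200, overlap=50):
    """Split text into character chunks where neighbours share `overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():
            chunks.append(piece)
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    raw = "Placeholder intro. [IMAGE: fig1] Placeholder outro."
    prepared = inline_image_descriptions(
        raw, {"fig1": "(Diagram: placeholder description.)"}
    )
    for chunk in split_into_chunks(prepared, chunk_size=40, overlap=10):
        print(repr(chunk))
```

The overlap keeps a sentence that straddles a chunk boundary visible in both neighbouring chunks, which is a common reason to split this way.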

Challenges we ran into

The first challenge was processing the diagrams in the ROAR user guide. We considered loading a .rtf or .pdf file so that the original images could be loaded into the program, but we determined that it would be more efficient to convert the images to text descriptions once, so that the RAG would only have to search through textual information and would not have to process each image for every new query. Another challenge was

Accomplishments that we're proud of

We are proud of creating a RAG chatbot without any prior knowledge of RAG before this hackathon. In addition, we are proud of our perseverance through various challenges and a great amount of debugging.

What we learned

We learned about RAGs, which use large language models to complete smaller tasks based on specific knowledge bases. They are particularly useful for enterprises and institutions that want to perform tasks based on company-specific or private information. In our case, the knowledge base was the ICDS ROAR user guide. It was fascinating to learn about the process of creating a RAG: loading the text files, storing the information in databases, transforming it into vectors, and choosing among the retriever types. Although the project used only one type of retriever and one type of embedding, knowing the other options will be useful for future projects and hackathons. Additionally, we learned to use Tkinter to create an interactive GUI.

Later on, to make our chatbox more sophisticated, we looked into implementing BM25 to make its responses quicker and more helpful. We found that while it would have sped up our responses by fractions of a second, it introduced a separate problem entirely. Text is measured in tokens, and with the OpenAI model we were using, we had a maximum of 4,000 tokens for five dollars. Going past 4,000 tokens would have cost more, which we felt wasn't worth it for improvements on such a small scale.
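For readers unfamiliar with BM25, here is a minimal pure-Python sketch of the scoring formula (the Okapi BM25 variant with a smoothed IDF). It is not what we ran during the hackathon. The function names, the parameter defaults `k1=1.5` and `b=0.75`, and the placeholder documents are assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, documents, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    # Each distinct query term is counted once.
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue
            # Rare terms get a larger weight than common ones.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Term frequency saturates, and long documents are penalised via b.
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


# Placeholder documents, NOT real text from the user guide.
documents = [
    "placeholder text about job submission and queues",
    "placeholder text about storage quota and file limits",
    "placeholder text about connecting with ssh",
]

if __name__ == "__main__":
    for doc, score in zip(documents, bm25_scores("storage quota", documents)):
        print(f"{score:.3f}  {doc}")
```

Because BM25 works on word counts alone, it needs no embedding calls at query time. The passages it returns still count toward the model's token limit once they are placed in the prompt.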

What's next for RAG Chatbot HackPSU

We plan to redevelop the RAG Chatbot using Streamlit, a Python framework for web app development.

Built With

chroma
langchain
openai-api
python
tkinter

Submitted to

HackPSU Fall 2024

Created by

I worked on the research side of the project, providing extra insight into what to do and what not to do.

literallylexx
I worked on the back end and helped with the front end as well. I did a lot of debugging and optimization for the code. These contributions help to ensure ease of use for the users.

Aryan Xavier
I researched the RAG process and the technologies behind it: LangChain and the OpenAI API. I was not responsible for the coding, but learning what was actually happening under the hood will be helpful for future projects.

Tina Z
I contributed to formatting the handbook into a .txt file, including converting images into verbal form using generative AI.

Leona Chen
I contributed to the front end aspect of this project by creating the GUI for users to interact with the chatbot. It was one of my first experiences coding front end, specifically with tkinter, so I am very proud of myself for getting as far as I did with this project.

Chaoping Li

Updates

Leona Chen started this project — Oct 13, 2024 12:00 PM EDT
