Ctrl-Phi | Devpost

Inspiration

Ctrl + F is one of the most prominent keybinds for efficient learning. Whether it's a Wikipedia page or a historical document, the ability to search through texts in a matter of seconds is a huge time-saver for anyone trying to learn.

However, Ctrl + F only works when you know the exact text to search for. What if you don't remember the exact wording of a section? Or what if you have a question about the text that you want answered? We wanted a better way to search for content, so we set out to create it.

What it does

With Ctrl-Phi, we integrate AI with the Ctrl + F you know and love to make searching through web pages conversational. Simply press Ctrl + Shift + ., type in any query, like you would with ChatGPT, and Ctrl-Phi will scroll you to the part of the page you're looking for.

How we built it

The frontend was built with Next.js and TailwindCSS, using select MaterialUI components. The Playground tab allows users to experiment with Ctrl-Phi using sets of example text, or any text of their own. It highlights matches and scrolls to them dynamically within the text boxes, and renders its results from the response returned by our FastAPI backend.

To analyze text, we engineered a custom Search Agent. This agent uses an LLM alongside agentic AI logic and prompting patterns (based on the principles of "ReAct: Synergizing Reasoning and Acting in Language Models") to find directly matching text for a user's query. The Search Agent can be used with any LLM (thanks to custom LLM and LLMConfig abstractions) and features token-based chunking to overcome context-window limitations. It also handles errors, validates responses, and relies on reprompting strategies to reduce hallucination.
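The validate-and-reprompt idea can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `complete` method, the `LLMConfig` fields, the prompt wording, and the `FakeLLM` stand-in are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class LLMConfig:
    """Hypothetical settings bundle; the field names are assumptions."""
    context_window: int = 4096
    max_retries: int = 2


class LLM(Protocol):
    """Any object with a complete() method can back the agent."""
    def complete(self, prompt: str) -> str: ...


def find_passage(llm: LLM, config: LLMConfig, text: str, query: str) -> Optional[str]:
    """Ask for a verbatim quote; reprompt when the quote is not in the text."""
    prompt = (
        "Quote, word for word, the passage of TEXT that best answers QUERY.\n"
        f"QUERY: {query}\nTEXT: {text}"
    )
    for _ in range(config.max_retries + 1):
        answer = llm.complete(prompt).strip()
        # Validation: accept the answer only if it is a real substring of the text.
        if answer and answer in text:
            return answer
        # Reprompt: tell the model what went wrong and ask again.
        prompt += (
            f"\nYour previous answer {answer!r} does not appear in TEXT. "
            "Reply with an exact quote only."
        )
    return None


class FakeLLM:
    """Stand-in model for demonstration: hallucinates once, then quotes correctly."""
    def __init__(self) -> None:
        self.calls = 0

    def complete(self, prompt: str) -> str:
        self.calls += 1
        return "all people are equal" if self.calls == 1 else "all men are created equal"
```

Because `find_passage` depends only on the `LLM` protocol, swapping in a different model means writing one small adapter class rather than changing the agent.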

Challenges we ran into

Minimizing hallucinations was a major challenge for us. Because such hallucinations are an inherent limitation of LLMs, we built tooling to validate output and minimize errors. We also relied on prompting techniques to address this issue.
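One way such output validation could work is to check that the passage the model returns really occurs in the page, and to recover its position for scrolling and highlighting. The sketch below is an assumption about the approach, not our exact tooling; the function name `locate_quote` and the choice to ignore case and whitespace differences are illustrative.

```python
import re
from typing import Optional, Tuple


def locate_quote(source: str, quote: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of quote in source, or None if it is absent.

    Runs of whitespace are treated as interchangeable and case is ignored,
    so a quote the model re-wrapped or re-capitalized still matches.
    """
    words = quote.split()
    if not words:
        return None
    # Escape each word so punctuation is matched literally, then allow
    # any whitespace run between consecutive words.
    pattern = r"\s+".join(re.escape(word) for word in words)
    match = re.search(pattern, source, flags=re.IGNORECASE)
    return match.span() if match else None
```

A `None` result signals a likely hallucination, and the returned offsets can be passed straight to whatever code highlights the match.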

Another major problem we encountered was the context-window limitation that applies to all LLMs. Because we have to supply the reference text to the LLM for evaluation, larger texts may exceed the token limits of many models, meaning that no response is returned. To overcome this issue, we used tokenizers to implement chunking of the reference text. Moreover, we created LLM abstractions for modularity so that different LLMs and context windows could be used.
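Token-based chunking can be sketched like this. The whitespace tokenizer is a stand-in assumption (a real model tokenizer counts subword tokens, and rejoining with spaces does not preserve the original whitespace), and the function name and `overlap` parameter are hypothetical.

```python
from typing import Callable, List


def chunk_by_tokens(
    text: str,
    max_tokens: int,
    overlap: int = 0,
    tokenize: Callable[[str], List[str]] = str.split,
) -> List[str]:
    """Split text into chunks of at most max_tokens tokens.

    Consecutive chunks share `overlap` tokens so that a passage straddling
    a chunk boundary still appears whole in at least one chunk.
    """
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be at least 0 and less than max_tokens")
    tokens = tokenize(text)
    step = max_tokens - overlap
    chunks: List[str] = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + max_tokens]))
        # Stop once the current chunk has reached the end of the text.
        if start + max_tokens >= len(tokens):
            break
    return chunks
```

Each chunk can then be sent to the model separately, with `max_tokens` chosen to leave room in the context window for the prompt and the query.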

Accomplishments that we're proud of

For roughly the first 8 hours of the hackathon, we worked largely in parallel on the Chrome extension, the website, the large language models, or our own API. Despite having very different focuses, we all managed to stay on track, discuss big-picture ideas as a team, and use pair programming to help one another get unstuck. We're very proud of how our diverse set of tooling came together to form a cohesive project in such a short timeframe.

What we learned

In addition to learning a lot about each of our respective specializations, we also learned a lot about how to delegate responsibilities and integrate each other's work. We felt that the scope of our project was reasonable, although we should have considered that certain technologies, like LLMs, are far harder to debug than others.

What's next for Ctrl-Phi

The first advancement would be the ability to scan and sift through PDFs, as many textbooks and educational resources are in the form of PDFs. Second would be the ability to scan and find images on a webpage. And finally, querying videos and jumping to a desired timestamp would be arguably the most distinct and time-saving use case.

Built With

caddy
chrome
css
fastapi
html
javascript
nextjs
pydantic
poetry
python
tailwind

Submitted to

HooHacks 2024

Created by

I designed the entire website, using Next.js and TailwindCSS. Designing the playground page was especially challenging, as I had to build the logic to highlight and scroll within an MUI textbox from scratch! I think the result is a frontend that is streamlined, pretty, and easy to use. I also helped out my teammates whenever I could, whether with the Chrome extension or LLM debugging. Overall, it was a really fun project to work on, and I believe it is genuinely useful.

Kieran Khan
I engineered and implemented the Search Agent, which allowed us to intelligently analyze texts based on sentiment from a query using an LLM. I also worked extensively on the backend API, which runs the text evaluation when called.

Dillan Khurana
I worked on developing the Chrome extension with JavaScript, HTML, and CSS. I also assisted with scraping web pages, handling that data, and many quality-of-life changes.

Eric Wolpert
Helped develop the initial prototype for the FastAPI portion. Set up hosting/DNS and helped add features and polish to the Chrome extension.

Taylor South

Updates

Eric Wolpert started this project — Mar 24, 2024 07:13 AM EDT
