Inspiration
As academic writers who have written both for passion and professionally in journals like The Dartmouth's Journal of Sciences, we've experienced first-hand how long it takes to find real answers to complicated research questions by sifting through countless papers. When we saw InterSystems' magical tech at their workshop yesterday, we realized that this would be the perfect application of it.
What it does
Newton is a search engine built for research papers. It sifts through tens of the most relevant papers when you ask it a question, ranks them based on their authority & relevance, and gives you a detailed, cited answer based on all of their content with a consensus score so you know where academic consensus lies.
Newton can also serve as an AI tutor in its Live Talk mode, where it interacts with you to answer questions about papers and visualize panes of information for you.
How we built it
- For the RAG answering and citation system, we couldn't have built this without InterSystems' IRIS technology, which lets us embed all article and media content as vectors that we can then query with natural language, and even use to build lists of in-article references for citations.
- For the research paper extraction, we use arXiv's research API (2.4M+ technical papers), with the option to use DFSEO's Google Scholar API for a broader set of sources.
- To rank the articles based on their authority, we fine-tuned an LLM on criteria such as authorship, relevance, and journal credibility.
- To synthesize an answer, we use OpenAI's GPT-4 in JSON mode to create a main output along with key facts and figures and a rule-based consensus score, so the user always knows where consensus lies.
- For the realtime Live Talk, we use OpenAI's brand-new Realtime API, and a combination of Whisper and TTS when manual mode is enabled.
- To get the most relevant diagrams & media, we use DFSEO's Google SERP API.
- The frontend is vanilla React with a collection of libraries for fonts, animations, and formatting.
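To make the retrieval and consensus steps above a bit more concrete, here is a minimal, self-contained sketch in plain Python. It is not our actual IRIS code: IRIS handles the vector storage and similarity search for us, and the names `cosine_similarity`, `top_k`, and `consensus_score`, the toy two-dimensional vectors, and the "supports / contradicts / neutral" stance labels are all assumptions made up for illustration. The consensus rule shown (share of supporting papers among those that take a side) is just one plausible rule, not necessarily the one Newton uses.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=3):
    """Return the k chunks whose vectors are most similar to the query vector."""
    scored = [(cosine_similarity(query_vec, c["vector"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:k]]


def consensus_score(stances):
    """Hypothetical rule: fraction of 'supports' among papers that take a side.

    Returns None when no paper supports or contradicts the claim.
    """
    decided = [s for s in stances if s in ("supports", "contradicts")]
    if not decided:
        return None
    return sum(1 for s in decided if s == "supports") / len(decided)


# Toy data standing in for embedded paper chunks (illustrative only).
chunks = [
    {"paper": "A", "vector": [1.0, 0.0], "stance": "supports"},
    {"paper": "B", "vector": [0.9, 0.1], "stance": "supports"},
    {"paper": "C", "vector": [0.0, 1.0], "stance": "contradicts"},
    {"paper": "D", "vector": [0.7, 0.7], "stance": "neutral"},
]

relevant = top_k([1.0, 0.0], chunks, k=3)
score = consensus_score([c["stance"] for c in relevant])
```

In a real pipeline the stance labels would come from the LLM's structured output and the vectors from an embedding model; the point here is only that the consensus number can be computed by a simple, inspectable rule after retrieval.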
Challenges we ran into
- Since we're relatively new to working with SQL databases and RAG, setting up the IRIS database was initially challenging and, unashamedly, required us to consistently seek help from the wonderful InterSystems team on call and on campus.
- We faced a myriad of challenges in learning websockets to correctly integrate the realtime voice API, and in creating a custom flow for manual mode, which required a completely different chain.
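For readers curious what a separate manual-mode chain might look like, here is a rough sketch of one turn as three steps: speech to text, text to answer, answer to speech. The function `manual_turn` and the stub callables are hypothetical placeholders; in the real app these steps would be backed by Whisper, the answering model, and a TTS service, none of which are called here.

```python
from typing import Callable, Tuple


def manual_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    answer: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Tuple[str, str, bytes]:
    """Run one manual-mode turn: transcribe the audio, answer it, speak the reply."""
    question = transcribe(audio)
    reply = answer(question)
    speech = synthesize(reply)
    return question, reply, speech


# Stub implementations so the sketch runs without any external service.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_answer(question: str) -> str:
    return "You asked: " + question


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


question, reply, speech = manual_turn(
    b"What is RAG?", fake_transcribe, fake_answer, fake_synthesize
)
```

Passing the three steps in as callables keeps the chain independent of any particular provider, which also makes it easy to test without network access.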
Accomplishments that we're proud of
- Newton is genuinely helpful! If nothing else, we made this for ourselves and might as well use it ourselves for our most pressing queries.
- We persevered through all our challenges, no matter how bleak things looked at times, and managed to make Newton a fully functional web app.
What we learned
- Using RAG and vectorization generally through InterSystems' platform
- Bringing a full-stack web app from concept to reality in a quick sprint
- Learning and using new technologies in rapid succession like websockets, SQL databases, and multi-layer integration
What's next for NewtonAI
- We're looking to publish this online as an open-source project, share it with knowledge-hungry communities to get their feedback on the platform, and improve it significantly with their help.