Inspiration
Navigating the world of ML research is daunting: the pace of new discoveries keeps accelerating. We wanted to see if LLMs could help us organize scientific research into a more legible interface.
What it does
Codex uses LLMs to parse and index the open body of ML research into a rich knowledge graph. Given a topic such as "3D segmentation" or "speculative decoding", Codex queries this knowledge graph to construct an on-the-fly summary of the topic, with links to relevant datasets, benchmarks, and models.
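As a toy illustration of the kind of lookup Codex performs, here is a miniature knowledge graph queried with networkx; the node names, edge kinds, and `related` helper are made up for this sketch and are not our actual schema or API.

```python
import networkx as nx

# Build a two-edge toy graph: a topic node linked to a paper and a benchmark.
g = nx.DiGraph()
g.add_edge("speculative decoding", "example-paper-1", kind="discussed_in")
g.add_edge("speculative decoding", "example-benchmark-1", kind="benchmarked_by")

def related(graph: nx.DiGraph, topic: str, kind: str) -> list[str]:
    """Return the neighbors of a topic node connected by a given edge kind."""
    return [n for n in graph.successors(topic)
            if graph.edges[topic, n]["kind"] == kind]

print(related(g, "speculative decoding", "benchmarked_by"))  # ['example-benchmark-1']
```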
How we built it
We finetuned a Mistral-7B variant that takes in any arXiv paper and extracts structured JSON containing its key findings, techniques, and topics. We ran this model over 50k CS papers to build a rich knowledge graph of concepts, which our agent uses to generate a Wikipedia-style page for any concept you want.
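A minimal sketch of what the extraction step looks like, assuming a hypothetical checkpoint path and prompt format; our real finetuned model, prompt, and JSON schema differ.

```python
import json
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "path/to/finetuned-mistral-7b"  # hypothetical checkpoint, for illustration

tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, device_map="auto")

def extract(paper_text: str) -> dict:
    """Ask the finetuned model for a structured summary of one paper."""
    prompt = (
        "Extract the key findings, techniques, and topics from this paper "
        "as JSON with keys 'findings', 'techniques', 'topics'.\n\n"
        f"{paper_text}\n\nJSON:"
    )
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True).to(model.device)
    output = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    completion = tokenizer.decode(output[0][inputs["input_ids"].shape[1]:],
                                  skip_special_tokens=True)
    return json.loads(completion)  # fails loudly if the model drifts off-schema
```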
Challenges we ran into
Resolving mentions of the same topic across different papers was much harder than expected: the same concept often appears under different names, abbreviations, and spellings from one paper to the next.
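One standard way to attack this, sketched here with sentence-transformers and scikit-learn, is to embed mention strings and merge near-duplicates by cosine distance; the model choice and threshold below are illustrative, not our exact pipeline.

```python
from sentence_transformers import SentenceTransformer
from sklearn.cluster import AgglomerativeClustering  # scikit-learn >= 1.2

mentions = ["KD", "knowledge distillation", "Knowledge Distillation (KD)",
            "speculative decoding", "speculative sampling"]

model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(mentions, normalize_embeddings=True)

# Merge mentions whose embeddings sit within a cosine-distance threshold.
clusters = AgglomerativeClustering(
    n_clusters=None, distance_threshold=0.4,
    metric="cosine", linkage="average",
).fit_predict(embeddings)

for label, mention in zip(clusters, mentions):
    print(label, mention)
```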
Accomplishments that we're proud of
Our finetuned model is extremely cost-effective: it processed 50,000 papers totaling ~250M tokens for roughly $20, or about $0.08 per million tokens. Parsing the same corpus with GPT-4 or Mistral Large would have cost more than 100x as much.
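A quick back-of-the-envelope check of that claim; the GPT-4 price below is an assumed list price (roughly $30 per million input tokens at the time), not a measured figure.

```python
tokens = 250_000_000
ours_usd = 20.0
gpt4_per_million = 30.0  # assumed GPT-4 input-token price, USD

print(ours_usd / (tokens / 1e6))                        # ~$0.08 per million tokens
print((tokens / 1e6) * gpt4_per_million)                # ~$7,500 for the same corpus
print((tokens / 1e6) * gpt4_per_million / ours_usd)     # ~375x more expensive
```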
What we learned
We learned a lot about eliciting reliable structured outputs from LLMs and about finetuning models.
What's next for Codex
We want to index all of arXiv and improve our mention-resolution pipeline.
Built With
- huggingface
- mistral
- mistral-7b
- python
- vite

