Problem

There are more than 7,000 rare diseases, and reaching an accurate diagnosis for one of them takes about 4-5 years on average.

One of the main challenges is the huge biomedical corpus that needs to be analyzed for each of these patients.

This project tries to address this challenge with LLMs. We aim to develop an LLM-based tool that assists clinicians in diagnosing these patients.

Solution

We want to create the "Google" for genetic diseases.

To narrow down the list of diseases, we have developed a 3-step process:

  1. Semantic Similarity: Implementing an embedding search within a vast biomedical corpus that includes diseases and symptoms.
  2. Ranking System: Utilizing Cohere's rerank function to rank the top 100 diseases returned by the search and then selecting the top 10.
  3. Prompt Engineering: Developing a chatbot that will inquire about symptoms to refine the search down to one disease.
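
The first two steps above can be sketched as a retrieve-then-rerank pipeline. This is only a minimal illustration, not the project's actual code: the function names, the toy 3-dimensional vectors, and the `score_fn` callback are all hypothetical. In the real tool the vectors would presumably come from OpenAI's text embeddings API and the relevance scores from Cohere's rerank product; here both are replaced by local stand-ins so the sketch runs without API keys. The chatbot step is not covered.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def embedding_search(
    query_vec: Vector, corpus: Dict[str, Vector], top_k: int = 100
) -> List[Tuple[str, float]]:
    """Step 1: return the top_k corpus entries most similar to the query."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def rerank(
    candidates: List[str], score_fn: Callable[[str], float], top_n: int = 10
) -> List[str]:
    """Step 2: reorder candidates by a relevance score and keep the top_n.

    score_fn is a stand-in (assumption) for an external reranking service.
    """
    return sorted(candidates, key=score_fn, reverse=True)[:top_n]


def shortlist(
    query_vec: Vector,
    corpus: Dict[str, Vector],
    score_fn: Callable[[str], float],
    top_k: int = 100,
    top_n: int = 10,
) -> List[str]:
    """Chain both steps: broad embedding search, then a narrower rerank."""
    candidates = [name for name, _ in embedding_search(query_vec, corpus, top_k)]
    return rerank(candidates, score_fn, top_n)


# Toy data (made up for illustration): disease name -> embedding vector.
TOY_CORPUS: Dict[str, Vector] = {
    "disease_a": [1.0, 0.0, 0.0],
    "disease_b": [0.9, 0.1, 0.0],
    "disease_c": [0.0, 1.0, 0.0],
    "disease_d": [0.0, 0.0, 1.0],
}
TOY_QUERY: Vector = [1.0, 0.05, 0.0]

# Made-up relevance scores standing in for a reranker's output.
TOY_RELEVANCE: Dict[str, float] = {
    "disease_a": 0.2,
    "disease_b": 0.9,
    "disease_c": 0.5,
    "disease_d": 0.1,
}

if __name__ == "__main__":
    print(shortlist(TOY_QUERY, TOY_CORPUS, TOY_RELEVANCE.get, top_k=3, top_n=2))
```

Splitting the work this way keeps the expensive relevance scoring limited to a short candidate list, which matches the "top 100, then top 10" narrowing described in the steps.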

Tech Stack Used

  • OpenAI's text embeddings API
  • Model: OpenAI gpt-3.5-turbo
  • Cohere's rerank product
  • HTML, CSS, JavaScript, Python, Flask
