Inspiration

I tried Neo4j knowledge graphs and the Pinecone vector database separately with LangChain and wanted to combine the two approaches.

What it does

Loads documents (such as Wikipedia pages or notes from a directory) and stores them in both a relationship knowledge graph and a vector store index.
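As a rough illustration of the idea, here is a minimal sketch in plain Python of storing each document in two places at once. It is not the project's actual code: `HybridStore`, `embed`, and `cosine` are hypothetical names, the word-count "embedding" stands in for a real embedding model, and the triples are supplied by hand rather than extracted automatically.

```python
import math
import re
from collections import Counter, defaultdict


def embed(text):
    """Toy embedding: lowercase word counts. A real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class HybridStore:
    """Hypothetical container holding a vector index and a triple store side by side."""

    def __init__(self):
        self.chunks = []                # list of (text, embedding) pairs
        self.graph = defaultdict(list)  # subject -> [(relation, object), ...]

    def add_document(self, text, triples):
        # Unstructured side: keep the raw text together with its embedding.
        self.chunks.append((text, embed(text)))
        # Structured side: keep the (subject, relation, object) triples.
        for subject, relation, obj in triples:
            self.graph[subject].append((relation, obj))

    def similar(self, query, k=1):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.chunks, key=lambda c: cosine(q, c[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

    def neighbors(self, subject):
        """Return the (relation, object) pairs recorded for a subject."""
        return list(self.graph.get(subject, []))


store = HybridStore()
store.add_document(
    "Neo4j is a graph database that stores nodes and relationships.",
    [("Neo4j", "IS_A", "graph database")],
)
store.add_document(
    "Pinecone is a vector database used for similarity search over embeddings.",
    [("Pinecone", "IS_A", "vector database")],
)
```

Calling `store.similar("similarity search over embeddings")` ranks the stored text by similarity, while `store.neighbors("Neo4j")` reads relationships directly from the graph side.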

How we built it

With Python and llama-index.

Challenges we ran into

  • Frequent rate-limit errors from the LLM and embedding model APIs
  • Graph databases used more memory and ran more slowly than vector databases in our experience
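One common way to cope with rate-limit errors is to retry with exponential backoff. The sketch below is an assumption about how that could look, not what the project does: `RateLimitError` and `call_with_backoff` are hypothetical names, and a real client library would raise its own exception type.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for whatever error the API client raises on HTTP 429."""


def call_with_backoff(fn, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponentially growing waits."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise  # give up after the last allowed retry
            # Wait base_delay * 2^attempt seconds, plus up to 10% random jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Wrapping each LLM or embedding request as `call_with_backoff(lambda: client_call(...))`, where `client_call` is whatever request function is in use, keeps the retry logic in one place.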

Accomplishments that we're proud of

Attempted to combine a vector database and a knowledge graph to support both structured and unstructured data, then used Retrieval Augmented Generation (RAG) to ground an LLM and reduce hallucinations.

What we learned

Learned that vector databases and knowledge graphs are advantageous in their own ways. Vector databases are effective on unstructured data with the help of embeddings and vector similarity. Knowledge graphs are strong at modeling relationships between structured data entities and domain-specific knowledge.

Depending on the task and query, one can be more beneficial than the other. For example, a question like "list down the books written by X" is better suited to a knowledge graph.
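To see why, here is a small sketch of that question answered from relationship triples. The data and the `objects_of` helper are made up for illustration, with "X" kept as the placeholder author from the example.

```python
# Hypothetical triples, using the placeholder author "X" from the example question.
triples = [
    ("X", "WROTE", "Book A"),
    ("X", "WROTE", "Book B"),
    ("Y", "WROTE", "Book C"),
    ("X", "BORN_IN", "City Z"),
]


def objects_of(subject, relation, triples):
    """Return every object linked to subject by relation, in insertion order."""
    return [o for s, r, o in triples if s == subject and r == relation]


books_by_x = objects_of("X", "WROTE", triples)
print(books_by_x)  # ['Book A', 'Book B']
```

The lookup returns the complete list by following one relationship type, whereas a top-k similarity search over text chunks returns only the closest matches and can miss some of the books.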

What's next for Graph Query

  • Use a local LLM (ollama)
  • Reduce API calls per minute to avoid being rate limited
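For the second item, one possible approach is a client-side throttle that caps calls within a rolling window. This is only a sketch under that assumption: `MinuteThrottle` is a hypothetical name, `max_calls` must be at least 1, and the actual per-minute limit depends on the API provider.

```python
import time
from collections import deque


class MinuteThrottle:
    """Allow at most max_calls calls in any rolling window of `period` seconds."""

    def __init__(self, max_calls, period=60.0, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock
        self.sleep = sleep
        self.stamps = deque()  # start times of recent calls, oldest first

    def wait(self):
        """Block until another call is allowed, then record it."""
        now = self.clock()
        # Forget calls that have left the rolling window.
        while self.stamps and now - self.stamps[0] >= self.period:
            self.stamps.popleft()
        if len(self.stamps) >= self.max_calls:
            # Sleep until the oldest recorded call leaves the window.
            self.sleep(self.period - (now - self.stamps[0]))
            now = self.clock()
            self.stamps.popleft()
        self.stamps.append(now)
```

Creating one `MinuteThrottle` per API and calling its `wait()` method before each LLM or embedding request spaces the calls out before the provider has to reject them.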
