CodeAncestry

AI-powered tool that transforms undocumented code into multi-layered explanations, preserving developer knowledge across generations.

Problem Statement

Developers inherit undocumented legacy codebases and spend 60%+ of their time trying to understand "why" decisions were made. Comments are outdated or missing, original developers have left, and critical tribal knowledge is lost forever. This knowledge gap costs companies millions in productivity and creates barriers for junior developers trying to contribute to mature projects.

Tech Stack

  • Frontend: Lovable (React)
  • Authentication: GitHub OAuth
  • Backend: Python FastAPI
  • AI: Gemini API (analysis)
  • Data: Snowflake Database
  • Vector Embedding: Snowflake Cortex
  • Security: 1Password

Features

1) GitHub Authentication

  • GitHub OAuth login
  • Secure user authentication (1Password)
  • Company-specific repo access (only repos you own / can access)
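
As a rough illustration of the login step, the sketch below builds the URL a user would be redirected to in order to start GitHub's OAuth web flow. The function name `build_authorize_url`, the default scopes, and the redirect URI are assumptions made for this example, not details taken from the project.

```python
from __future__ import annotations

import secrets
from urllib.parse import urlencode

# GitHub's documented authorization endpoint for the OAuth web flow.
GITHUB_AUTHORIZE_URL = "https://github.com/login/oauth/authorize"


def build_authorize_url(
    client_id: str,
    redirect_uri: str,
    scope: str = "repo read:user",  # assumed scopes; adjust to what the app needs
) -> tuple[str, str]:
    """Return (url, state). Store `state` in the session and compare it on callback."""
    state = secrets.token_urlsafe(16)  # random value to guard against CSRF
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return f"{GITHUB_AUTHORIZE_URL}?{urlencode(params)}", state
```

The client secret never appears in this URL; it is only needed later, when the backend exchanges the returned code for an access token, which is presumably where a secrets manager such as 1Password would come in.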

2) Repository Selection + Analysis

  • User selects a GitHub repo to analyze
  • Stores repo metadata (owner, full name, commit date, etc.)

3) Commit Ingestion

  • Fetches commits from GitHub on demand for the selected repo
  • Stores the commit history in the database (Snowflake)
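
A minimal sketch of the ingestion step, assuming commits come from GitHub's REST endpoint `GET /repos/{owner}/{repo}/commits`. It flattens one item of that response into a row ready for insertion. The helper name `commit_to_row` and the column names are hypothetical; the project's actual Snowflake schema is not described here.

```python
from typing import Any, Dict


def commit_to_row(repo_full_name: str, item: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten one commit object from the GitHub REST API into a flat row.

    Column names are illustrative only.
    """
    commit = item.get("commit") or {}
    author = commit.get("author") or {}  # can be missing or null for some commits
    return {
        "repo_full_name": repo_full_name,
        "sha": item["sha"],
        "author_name": author.get("name"),
        "authored_at": author.get("date"),  # ISO 8601 string as returned by GitHub
        "message": commit.get("message", ""),
        "url": item.get("html_url"),
    }
```

Keeping the mapping in a pure function like this makes it easy to test without network access; the HTTP call and the database insert would wrap around it.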

4) Commit “AI Summaries”

  • Automatically generates an AI summary of each commit
  • Uses diff/code context (not just the commit message)
  • Stores the AI summary back in the database
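
One way the summary prompt could be assembled from both the commit message and the diff is sketched below. The prompt wording, the function name `build_summary_prompt`, and the truncation limit are assumptions for illustration; the call to the model itself is left out.

```python
def build_summary_prompt(message: str, diff: str, max_diff_chars: int = 6000) -> str:
    """Combine commit message and diff into one prompt, truncating very long diffs."""
    diff_text = diff[:max_diff_chars]
    if len(diff) > max_diff_chars:
        # Tell the model the diff is incomplete so it does not over-claim.
        diff_text += "\n[diff truncated]"
    return (
        "Summarize what this commit changes and, where the diff makes it clear, why.\n\n"
        f"Commit message:\n{message.strip()}\n\n"
        f"Diff:\n{diff_text}"
    )
```

The returned string would then be sent to the model, and the response stored alongside the commit row.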

5) Embeddings + Semantic Indexing (Snowflake Cortex)

  • Generates semantic embeddings using Snowflake Cortex
  • Uses the Cortex-generated commit analysis as the embedding text source
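
The embedding step could be expressed as a single SQL statement using Snowflake's `SNOWFLAKE.CORTEX.EMBED_TEXT_768` function, as in the sketch below. The table and column names (`COMMITS`, `AI_SUMMARY`, `SUMMARY_EMBEDDING`) and the choice of the `snowflake-arctic-embed-m` model are assumptions; the project may use different ones.

```python
import re

# Only allow plain identifiers, since they are interpolated into the SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_embedding_sql(
    table: str = "COMMITS",                # hypothetical table name
    text_col: str = "AI_SUMMARY",          # hypothetical column holding the commit analysis
    vec_col: str = "SUMMARY_EMBEDDING",    # hypothetical VECTOR(FLOAT, 768) column
    model: str = "snowflake-arctic-embed-m",
) -> str:
    """Build an UPDATE that embeds every row that has text but no vector yet."""
    for name in (table, text_col, vec_col):
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    if "'" in model:
        raise ValueError("model name must not contain quotes")
    return (
        f"UPDATE {table} "
        f"SET {vec_col} = SNOWFLAKE.CORTEX.EMBED_TEXT_768('{model}', {text_col}) "
        f"WHERE {vec_col} IS NULL AND {text_col} IS NOT NULL"
    )
```

Restricting the update to rows without a vector keeps the statement safe to re-run after each ingestion batch.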

6) RAG Question Answering (Grounded AI)

  • Builds context from retrieved commits
  • Sends the context and question to a Cortex LLM (COMPLETE)
  • Returns:
    • final answer
    • relevant commits used as sources
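
The flow above can be sketched roughly as follows. The LLM call is injected as a plain callable (`complete`) so the example stays runnable without a Snowflake connection; in the real system it would presumably wrap `SNOWFLAKE.CORTEX.COMPLETE`. The function names, prompt wording, and the `sha`/`summary` keys are assumptions for illustration.

```python
from typing import Callable, Dict, List


def build_context(commits: List[Dict[str, str]], max_commits: int = 5) -> str:
    """Format the top retrieved commits as a numbered context block."""
    lines = []
    for i, c in enumerate(commits[:max_commits], start=1):
        lines.append(f"[{i}] {c['sha'][:7]}: {c['summary']}")
    return "\n".join(lines)


def answer_question(
    question: str,
    retrieved: List[Dict[str, str]],
    complete: Callable[[str], str],  # stand-in for the Cortex COMPLETE call
    max_commits: int = 5,
) -> Dict[str, object]:
    """Return the model's answer together with the commits used as sources."""
    used = retrieved[:max_commits]
    prompt = (
        "Answer the question using only the commit summaries below. "
        "Cite commits by their bracketed number.\n\n"
        f"Commits:\n{build_context(used, max_commits)}\n\n"
        f"Question: {question}"
    )
    return {"answer": complete(prompt), "sources": [c["sha"] for c in used]}
```

Returning the source SHAs next to the answer is what lets the frontend show which commits grounded the response.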

7) Query Classifier (Gemini)

  • Uses Gemini to classify each user query as semantic, temporal, or hybrid
  • Routes the query to the matching workflow based on that classification
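
A minimal sketch of the routing logic, assuming the classifier returns a free-text label. The classifier is passed in as a callable so the example runs without the Gemini API. Falling back to the hybrid workflow when the label is unrecognized is an assumption made here, not something the project states, and the names `normalize_label` and `route_query` are hypothetical.

```python
from typing import Callable, Dict, Tuple

VALID_LABELS = ("semantic", "temporal", "hybrid")


def normalize_label(raw: str) -> str:
    """Clean up a model-produced label; fall back to 'hybrid' (assumed default)."""
    label = raw.strip().lower().strip(".\"'")
    return label if label in VALID_LABELS else "hybrid"


def route_query(
    question: str,
    classify: Callable[[str], str],              # stand-in for the Gemini classifier call
    handlers: Dict[str, Callable[[str], str]],   # one workflow per label
) -> Tuple[str, str]:
    """Classify the question, then run the workflow registered for that label."""
    label = normalize_label(classify(question))
    return label, handlers[label](question)
```

For example, a question like "what changed last week?" would ideally be labeled temporal and handled by a date-filtered query, while "why was caching removed?" would go through embedding search.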
