Transform any code repository into easy-to-understand interactive documentation through Git history and AI analysis.
🌐 Project Website | ⚡ flash-linear-attention Demo | 📖 verl Demo | 🔥 Megatron-LM Demo | 🦙 LLaMA-Factory Demo | ✏️ EasyEdit Demo | 🚀 nano-vllm Demo | 🎯 mini-sglang Demo
English | 简体中文
A toolchain for deeply understanding code repositories. It analyzes Git history, uses AI to interpret code, generates hierarchical documentation, and creates an interactive website that helps you easily understand any complex codebase.
- Visual Analysis: Generate repository structure heatmaps showing file modification frequency
- Smart Statistics: Analyze code size, modification distribution, and token counts
- AI Interpretation: Use Gemini 3 Pro Preview to generate easy-to-understand code explanations
- Hierarchical Docs: Recursively generate README files for each directory (bottom-up)
- Interactive Website: Read the Docs style static website with file tree navigation
understand-everything/
├── scripts/ # Core scripts (named by execution order)
│ ├── s0_find_snapshots.py # Find curriculum learning snapshots
│ ├── s1_curriculum_pipeline.py # Curriculum learning pipeline
│ ├── s2_explain_files.py # AI interprets code files
│ ├── s3_generate_readme.py # Generate hierarchical READMEs
│ └── s4_website.py # Generate interactive website
├── utils/ # Utility scripts
│ ├── s0_add_timestamps.py # Add timestamps
│ ├── s1_repo_heatmap_tree.py # Generate repo structure heatmap
│ ├── s2_analyze_stats.py # Analyze statistics
│ └── utils.py # Common utility functions
├── repo/ # Repositories to analyze (ignored via .gitignore)
├── output/ # All generated output (ignored via .gitignore)
│ └── <repo_name>/
│ ├── explain/ # AI interpretation markdown
│ └── website/ # Static website
└── pyproject.toml # Project configuration
```bash
# Create a virtual environment
uv venv --seed .venv --python 3.12
source .venv/bin/activate

# Install the project in editable mode
uv pip install -e .
```

Set environment variables (for the Gemini API):

```bash
export OPENAI_API_KEY="your-api-key"
export OPENAI_BASE_URL="your-openai-base-url"
```

Assuming you want to analyze `repo/your-project`:
```bash
# Step 1: AI interprets files (generates explanations)
python scripts/s2_explain_files.py repo/your-project --workers 8 --percent 100

# Step 2: Generate hierarchical READMEs (bottom-up summarization)
python scripts/s3_generate_readme.py repo/your-project

# Step 3: Generate the interactive website (final output)
python scripts/s4_website.py repo/your-project
```

Optional utility scripts:
```bash
# Generate a repo heatmap (visualize modification frequency)
python utils/s1_repo_heatmap_tree.py repo/your-project

# Analyze statistics (understand code scale)
python utils/s2_analyze_stats.py repo/your-project
```

Start a local server to view the website:
```bash
cd output/your-project/website-<date>
python -m http.server 8000
# Open http://localhost:8000 in your browser
```

Function: Uses Gemini 3 Pro Preview to generate an easy-to-understand explanation for each file
Features:
- Async concurrent processing; `--workers N` sets the concurrency (default 16)
- Supports `--top N` or `--percent N` to select which files to interpret
- Automatically skips already interpreted files (use `--force` to regenerate)
- Uses `tqdm` to show a real-time progress bar
- Prompt optimized for a "step-by-step explanation" style
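The concurrency behavior described above can be sketched with plain `asyncio` (this is an illustration, not the script's actual implementation; `explain_file` stands in for the real Gemini API call):

```python
import asyncio

# Hedged sketch of the pattern: a semaphore caps in-flight API calls at the
# --workers value, and files that already have an explanation are skipped
# unless --force is set.

async def explain_file(path: str) -> str:
    await asyncio.sleep(0)  # placeholder for the real API round-trip
    return f"explanation of {path}"

async def explain_all(paths, already_done, workers=16, force=False):
    sem = asyncio.Semaphore(workers)

    async def run_one(path):
        if path in already_done and not force:
            return path, "skipped"        # incremental: reuse prior output
        async with sem:                   # at most `workers` concurrent calls
            return path, await explain_file(path)

    return dict(await asyncio.gather(*(run_one(p) for p in paths)))

results = asyncio.run(
    explain_all(["a.py", "b.py", "c.py"], already_done={"b.py"}, workers=2)
)
```

The semaphore bounds concurrency without any worker-pool bookkeeping, which is why a single `--workers` flag is enough to tune throughput against API rate limits.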
Usage:

```bash
python scripts/s2_explain_files.py <repo_path> [options]

# Interpret all files with 8 workers
python scripts/s2_explain_files.py repo/your-project --workers 8 --percent 100

# Interpret the top 50% of files
python scripts/s2_explain_files.py repo/your-project --percent 50

# Force regeneration
python scripts/s2_explain_files.py repo/your-project --percent 100 --force
```

Output: `output/<repo_name>/explain-<date>/*.md`
Function: Recursively generates a summary README for each folder (bottom-up)
Features:
- Starts from the deepest folders and summarizes layer by layer upward
- Subfolders are represented by their READMEs, files by their interpretations
- If the combined content exceeds 200K tokens, it is proportionally truncated
- Uses easy-to-understand prompts for summarization
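The bottom-up order described above can be sketched as a depth-first sort (illustrative only; the script's internals may differ):

```python
from pathlib import PurePosixPath

# Summarize the deepest directories first, so every child README already
# exists by the time its parent directory is summarized.

def bottom_up_order(dirs):
    return sorted(dirs, key=lambda d: len(PurePosixPath(d).parts), reverse=True)

order = bottom_up_order(["src", "src/core", "src/core/ops", "docs"])
# "src/core/ops" is processed before "src/core", which precedes "src"
```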
Usage:

```bash
python scripts/s3_generate_readme.py <repo_path> [options]

# Example
python scripts/s3_generate_readme.py repo/your-project

# Force regeneration
python scripts/s3_generate_readme.py repo/your-project --force
```

Output: Generates a README.md in each folder of the interpretation directory
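The proportional truncation mentioned in the features can be sketched as follows (a simplification: this counts characters, whereas the real tool counts tokens via tiktoken):

```python
# When the combined input exceeds the budget (200K tokens in the source),
# every piece is cut back in proportion to its share of the total, so no
# single child dominates the summary prompt.

def truncate_proportionally(pieces, budget):
    total = sum(len(p) for p in pieces)
    if total <= budget:
        return pieces                    # nothing to do
    ratio = budget / total               # same fraction kept from each piece
    return [p[: int(len(p) * ratio)] for p in pieces]

trimmed = truncate_proportionally(["a" * 1200, "b" * 800], budget=1000)
# each piece keeps half: 600 "a"s and 400 "b"s
```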
Function: Generates a Read the Docs style static website
Features:
- Collapsible file tree navigation on the left, fixed indentation alignment
- Click folder to show README summary
- Click file to show AI interpretation + source code (with syntax highlighting)
- Supports all file types (.py, .cu, .cpp, .h, .md, etc.)
- Shows hidden files (except .git directory)
- Uses Prism.js for code highlighting
- Responsive design, mobile friendly
Usage:

```bash
python scripts/s4_website.py <repo_path> [options]

# Example
python scripts/s4_website.py repo/your-project
```

Output:
- `output/<repo_name>/website/index.html`
- `output/<repo_name>/website/styles.css`
- `output/<repo_name>/website/app.js`
- `output/<repo_name>/website/sources/` - Source code
- `output/<repo_name>/website/explanations/` - Interpretations (HTML)
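A static site like this typically precomputes the file-tree data that the left-hand navigation (`app.js`) consumes. A hypothetical sketch, following the features above (hidden files are kept, the `.git` directory is excluded; the real generator's data format is an assumption here):

```python
import json

# Build a nested dict from a flat list of repo-relative paths.
def build_tree(paths):
    root = {"name": ".", "type": "dir", "children": {}}
    for p in paths:
        parts = p.split("/")
        if parts[0] == ".git":           # never show the .git directory
            continue
        node = root
        for part in parts[:-1]:          # walk/create intermediate dirs
            node = node["children"].setdefault(
                part, {"name": part, "type": "dir", "children": {}}
            )
        node["children"][parts[-1]] = {"name": parts[-1], "type": "file"}
    return root

tree = build_tree(["src/app.py", ".git/config", ".hidden", "README.md"])
print(json.dumps(tree, indent=2))
```

Shipping this as a single JSON file is what lets the site run from any static server (`python -m http.server`) with no backend.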
Successfully analyzed open source projects:
- flash-linear-attention (468 files) - Triton implementation of efficient linear attention mechanisms
- verl (1100+ files) - ByteDance's large model reinforcement learning framework
- Megatron-LM (1330+ files) - NVIDIA's large-scale Transformer training framework
- LLaMA-Factory (405 files) - One-stop LLM fine-tuning framework supporting 100+ models
- EasyEdit (834 files) - Knowledge editing framework with 28+ editing methods
- nano-vllm (26 files) - Lightweight vLLM implementation with PagedAttention & Continuous Batching
- mini-sglang (103 files) - Lightweight LLM serving framework with Tensor Parallelism & HiCache
- Python 3.12+
- GitPython - Git repository operations
- Matplotlib - Heatmap visualization
- NumPy - Numerical computation
- Tiktoken - Token counting
- OpenAI SDK - Gemini API calls
- Markdown - Markdown → HTML conversion
- Prism.js - Code syntax highlighting
- TQDM - Progress bar display
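The OpenAI SDK entry above reaches Gemini because the two environment variables from the setup section point it at an OpenAI-compatible endpoint. A stdlib-only sketch of the request that implies (the model name `gemini-3-pro-preview` is an assumption, not taken from the source):

```python
import os

# Assemble the OpenAI-compatible chat request by hand, using the same two
# environment variables the project's setup instructions export.
def build_chat_request(prompt):
    base_url = os.environ.get("OPENAI_BASE_URL", "https://example.invalid/v1")
    api_key = os.environ.get("OPENAI_API_KEY", "sk-placeholder")
    url = f"{base_url.rstrip('/')}/chat/completions"
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    body = {
        "model": "gemini-3-pro-preview",  # assumed model identifier
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, headers, body

url, headers, body = build_chat_request("Explain this file step by step.")
```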
- Minimalism: Each script focuses on one thing, code is clean and clear
- Clear Order: scripts are named by execution order (s0 → s1 → … → s4)
- Interruptible: Each step runs independently, supports incremental updates
- Concurrent & Efficient: Async processing, supports multiple workers
MIT License
- Gemini 3 Pro Preview - Powerful code understanding capabilities
- Claude Code - Excellent programming assistant
