Startup proposal — RAG judging layer

Python tooling that evaluates startup pitch PDFs against investor-style criteria using RAG (LangChain + OpenAI embeddings + FAISS), with an optional Streamlit UI for interactive review, scoring, and Gemini-powered rewrites. Analytics records can be written to Snowflake when configured.

Repository layout

| Path | Role |
| --- | --- |
| `RAG-layer-of-startup-judging/` | Main application: Streamlit UI, RAG pipelines, criteria/scoring logic |
| `RAG-layer-of-startup-judging/frontend.py` | Streamlit entrypoint |
| `RAG-layer-of-startup-judging/extract_proposal.py` | CLI: analyze a proposal PDF vs criteria PDFs |
| `RAG-layer-of-startup-judging/scoring.py` | CLI: score analysis JSON and suggest refinements |
| `RAG-layer-of-startup-judging/startup_rag.py` | Startup-doc RAG + Snowflake insert helpers |
| `RAG-layer-of-startup-judging/criteria_rag.py` | Criteria PDF RAG helpers |
| `test_snowflake_connection.py` | Local Snowflake connectivity smoke test (adjust paths/credentials for your environment) |
| `requirements.txt` | Python dependencies (install from repository root) |

Prerequisites

- Python 3.10+ recommended
- OpenAI API key for embeddings and baseline chat (gpt-3.5-turbo in RAG chains)
- For the UI rewrite flows: Gemini API key (GEMINI_API_KEY)
- Optional: ElevenLabs API key for text-to-speech in the UI
- Optional: Snowflake account + key-pair auth for startup_rag logging

Setup

Clone or copy this repo and go to the repository root (the directory that contains requirements.txt).

Create a virtual environment and install dependencies:

    python -m venv .venv
    source .venv/bin/activate   # Windows: .venv\Scripts\activate
    pip install -r requirements.txt

Create a .env file in RAG-layer-of-startup-judging/ and set the variables below. Never commit real secrets; this repo's .gitignore excludes .env. Example variables:

| Variable | Purpose |
| --- | --- |
| `OPENAI_API_KEY` | Required for embeddings/RAG and proposal extraction |
| `GEMINI_API_KEY` | Required for Gemini features in `frontend.py` |
| `GEMINI_MODEL_REWRITE`, `GEMINI_MODEL_PROJECTION`, `GEMINI_MODEL_*_FALLBACKS` | Optional model overrides |
| `ELEVENLAB_API_KEY`, `VOICE_ID` | Optional TTS |
| `FAST_REWRITE` | Optional; set to `1` / `true` / `yes` for faster rewrite behavior |
| `SNOWFLAKE_USER`, `SNOWFLAKE_ACCOUNT`, `SNOWFLAKE_PRIVATE_KEY_FILE`, `SNOWFLAKE_WAREHOUSE`, `SNOWFLAKE_DATABASE`, `SNOWFLAKE_SCHEMA`, `SNOWFLAKE_ROLE` | Optional Snowflake logging |

Criteria PDFs — The UI and extract_proposal.py expect criteria documents such as marketing, angel, technical, and VC judging PDFs in RAG-layer-of-startup-judging/ (names referenced in frontend.py / extract_proposal.py). Add any missing files so default paths resolve.
Snowflake key — Point SNOWFLAKE_PRIVATE_KEY_FILE at your .p8 private key file on disk.
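
Before launching the app, it can help to confirm that the variables from the table are actually visible to Python. The following is a minimal sketch, not code from this repository: the helper names `missing_required` and `snowflake_configured` are hypothetical, it only inspects the process environment (it does not parse `.env` itself), and treating every `SNOWFLAKE_*` variable as mandatory for logging is an assumption.

```python
import os
from typing import Mapping

# Variable names copied from the table above.
REQUIRED = ("OPENAI_API_KEY", "GEMINI_API_KEY")
SNOWFLAKE = (
    "SNOWFLAKE_USER", "SNOWFLAKE_ACCOUNT", "SNOWFLAKE_PRIVATE_KEY_FILE",
    "SNOWFLAKE_WAREHOUSE", "SNOWFLAKE_DATABASE", "SNOWFLAKE_SCHEMA",
    "SNOWFLAKE_ROLE",
)

def missing_required(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED if not env.get(name)]

def snowflake_configured(env: Mapping[str, str] = os.environ) -> bool:
    """Assumption: Snowflake logging needs every SNOWFLAKE_* variable set."""
    return all(bool(env.get(name)) for name in SNOWFLAKE)
```

Calling `missing_required()` with no arguments checks the current environment; passing a plain dict makes the helpers easy to exercise without touching real secrets.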

When running the app or CLI scripts, use the app directory:

    cd RAG-layer-of-startup-judging

Run the frontend (Streamlit)

The web UI is launched with Streamlit. Activate your virtual environment (if you use one), go to the app folder, then start the app:

    cd RAG-layer-of-startup-judging
    streamlit run frontend.py --server.port 8502 --server.address 127.0.0.1

From the repository root in one line:

    cd RAG-layer-of-startup-judging && streamlit run frontend.py --server.port 8502 --server.address 127.0.0.1

For local runs, open http://localhost:8502 in your browser.

For remote hosting via SSH port-forwarding, keep this tunnel running on your local machine:

    ssh -i ~/.ssh/do_startupcheck -o IdentitiesOnly=yes -N -L 8502:127.0.0.1:8502 deploy@138.197.143.191

Then open http://localhost:8502 locally. http://127.0.0.1:8502/ works as well (same destination). Ensure OPENAI_API_KEY and GEMINI_API_KEY are set (for example via RAG-layer-of-startup-judging/.env) for full functionality.

Rewrite difference `%` (original vs rewritten text)

In the frontend rewrite panel, the app shows:

Total Content Change (%) (overall_percent_change)
Sentence-level change (%) (sentence_changes[].percent_change)

Both values are generated by Gemini as a lexical/phrase-overlap estimate (not a measure of semantic quality) and are rounded to whole numbers:

- 0% -> wording is almost identical
- 100% -> wording is fully rewritten

It is a rewrite magnitude indicator, not a direct quality score.

How the metrics are computed in practice:

Sentence-level change (sentence_changes[].percent_change)
- For each sentence pair (original_sentence vs rewritten_sentence), the rewrite-difference step compares wording overlap (tokens/phrases), not semantic intent.
- More shared wording -> lower %; less shared wording -> higher %.
- The result is rounded to a whole number in [0, 100].
- change_label is then assigned from that percentage as a bucket (low, medium, high).
Total content change (overall_percent_change)
- Computed from the same sentence-card set used above.
- Represents aggregate rewrite magnitude across the full rewritten section (not just one sentence).
- Displayed in UI as Total Content Change.
- If the app runs in fast mode, or summary generation is skipped or fails, this value can be -1 internally; the UI then falls back to showing 0%.
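
The app delegates this estimate to Gemini, so the exact method is not visible from this README. Purely to illustrate what a wording-overlap percentage means, the sketch below swaps in a local approximation built on Python's `difflib.SequenceMatcher` over lowercase word tokens. The function name `lexical_change_percent` is hypothetical, and its numbers should not be expected to match Gemini's output.

```python
import re
from difflib import SequenceMatcher

def _tokens(text: str) -> list[str]:
    """Split text into lowercase word tokens, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())

def lexical_change_percent(original: str, rewritten: str) -> int:
    """Approximate rewrite magnitude as 100 * (1 - token overlap ratio)."""
    matcher = SequenceMatcher(None, _tokens(original), _tokens(rewritten))
    ratio = matcher.ratio()  # 1.0 means the token sequences are identical
    percent = round(100 * (1.0 - ratio))
    return max(0, min(100, percent))  # keep the result in [0, 100]
```

More shared wording yields a higher overlap ratio and therefore a lower percentage, which is the same direction the README describes for the Gemini-generated value.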

How to read common ranges:

Low change (0-25%): light edits, phrasing tweaks, minor clarifications
Medium change (26-60%): noticeable restructuring and added specificity
High change (61-100%): major rewrite, often new structure and stronger claims
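
The ranges above translate directly into a small lookup. The sketch below is an illustration, not the app's code: `change_label` and `display_percent` are hypothetical names, it assumes the `change_label` buckets use exactly the thresholds listed in this README, and the handling of negative values mirrors the UI fallback described earlier for -1.

```python
def change_label(percent: int) -> str:
    """Map a whole-number change percentage to low / medium / high."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be in [0, 100]")
    if percent <= 25:
        return "low"
    if percent <= 60:
        return "medium"
    return "high"

def display_percent(overall_percent_change: int) -> str:
    """Render the overall value; negative sentinels such as -1 show as 0%."""
    return f"{max(overall_percent_change, 0)}%"
```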

Examples:

- Original: "We will grow quickly."
  Rewrite: "We project 12% MoM growth for 9 months, driven by a 28% paid-conversion funnel."
  Likely range: high change (large lexical replacement + added concrete detail)
- Original: "Our market is big."
  Rewrite: "Our TAM is $4.2B, SAM is $620M, and we target a $48M SOM in 3 years."
  Likely range: medium-high change
- Original: "We have traction."
  Rewrite: "We have traction with 1,240 MAU and 18% week-4 retention."
  Likely range: medium change

CLI: analyze and score a proposal

Still from RAG-layer-of-startup-judging/:

    # Produce structured analysis JSON
    python extract_proposal.py --proposal-pdf /path/to/pitch.pdf --save-json analysis.json

    # Score using saved analysis (avoids path quirks when chaining modules)
    python scoring.py --analysis-json analysis.json --save-json score_output.json

For advanced options (multiple criteria PDFs, custom startup-failure PDF), see:

    python extract_proposal.py --help
    python scoring.py --help
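
The two-step analyze-then-score flow can also be driven from Python when batch-processing several pitches. This is a minimal sketch under stated assumptions: `build_pipeline_commands` and `run_pipeline` are hypothetical helpers, only the flags shown in this section are used, and `cwd` assumes the caller starts from the repository root. Because the scripts run inside the app directory, pass an absolute path for the proposal PDF.

```python
import subprocess
import sys

def build_pipeline_commands(proposal_pdf, analysis_json="analysis.json",
                            score_json="score_output.json"):
    """Return the two CLI invocations from this section as argument lists."""
    analyze = [sys.executable, "extract_proposal.py",
               "--proposal-pdf", proposal_pdf, "--save-json", analysis_json]
    score = [sys.executable, "scoring.py",
             "--analysis-json", analysis_json, "--save-json", score_json]
    return [analyze, score]

def run_pipeline(proposal_pdf, cwd="RAG-layer-of-startup-judging"):
    """Run both steps in order; check=True raises on the first failure."""
    for command in build_pipeline_commands(proposal_pdf):
        subprocess.run(command, cwd=cwd, check=True)
```

Keeping command construction separate from execution makes it possible to inspect or log the exact invocations before anything is run.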

Scoring computation

scoring.py computes proposal quality on a 0-10 scale in two parts:

Criteria completeness score (0-10)
- filled contributes full credit
- partial contributes half credit
- missing contributes no credit
- Formula:
  - completeness_ratio = (filled + 0.5 * partial) / total_criteria
  - completeness_score = 10.0 * completeness_ratio
Failure-risk penalty (0-3)
- From failure_mistakes_check:
  - yes counts as 1.0
  - unclear counts as 0.5
  - no counts as 0.0
- Formula:
  - penalty_ratio = (present_yes + 0.5 * present_unclear) / total_failures
  - failure_penalty = min(3.0, 3.0 * penalty_ratio)

Final score

raw_score = completeness_score - failure_penalty
score_out_of_10 = clamp(raw_score, 0.0, 10.0) (rounded to 2 decimals)
Investor-ready threshold: score_out_of_10 >= 7.0

The same scoring logic is also applied per investor criteria source in scores_by_investor.
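
The formulas above fit in a few lines of Python. The sketch below is a restatement of the README's arithmetic, not the contents of scoring.py: the function names `score_proposal` and `is_investor_ready` are hypothetical, and returning a ratio of 0 when a category has no items is an assumption made to avoid division by zero.

```python
def score_proposal(filled, partial, missing, yes, unclear, no):
    """Return (completeness_score, failure_penalty, score_out_of_10)."""
    total_criteria = filled + partial + missing
    total_failures = yes + unclear + no
    # Assumption: an empty category contributes a ratio of 0 instead of raising.
    completeness_ratio = (
        (filled + 0.5 * partial) / total_criteria if total_criteria else 0.0
    )
    penalty_ratio = (
        (yes + 0.5 * unclear) / total_failures if total_failures else 0.0
    )
    completeness_score = 10.0 * completeness_ratio
    failure_penalty = min(3.0, 3.0 * penalty_ratio)
    raw_score = completeness_score - failure_penalty
    # Clamp to [0, 10], then round to 2 decimals as described above.
    score_out_of_10 = round(max(0.0, min(10.0, raw_score)), 2)
    return completeness_score, failure_penalty, score_out_of_10

def is_investor_ready(score_out_of_10, threshold=7.0):
    """Apply the investor-ready threshold from this README."""
    return score_out_of_10 >= threshold
```

For instance, `score_proposal(6, 2, 2, 2, 2, 4)` gives a completeness score of 7.0, a penalty of 1.125, and a final score of 5.88, which falls below the 7.0 threshold.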

Worked example

Suppose one proposal has:

Criteria evaluation: filled=6, partial=2, missing=2 (total 10)
Failure checks: yes=2, unclear=2, no=4 (total 8)

Then:

completeness_ratio = (6 + 0.5 * 2) / 10 = 0.70
completeness_score = 10 * 0.70 = 7.00
penalty_ratio = (2 + 0.5 * 2) / 8 = 0.375
failure_penalty = 3 * 0.375 = 1.125
raw_score = 7.00 - 1.125 = 5.875
score_out_of_10 = 5.88 (rounded to 2 decimals)

Verdict: below the 7.0 threshold, so refinement actions are recommended.

Score difference `%` for `scoring.py` (before vs after rewrite)

scoring.py outputs absolute scores (score_out_of_10). To quantify improvement between an original and rewritten version, compare two runs:

Run scoring on original proposal -> score_before
Run scoring on rewritten proposal -> score_after

Use:

Absolute delta: delta = score_after - score_before
Relative uplift %: uplift_pct = (delta / max(score_before, 0.01)) * 100

Interpretation:

Positive % -> improved investor-readiness score
0% -> no measurable score change
Negative % -> rewrite likely weakened criterion coverage or risk profile

Examples:

- before=5.88, after=7.35: delta=+1.47, uplift≈+25.0% -> substantial improvement, now above threshold (7.0)
- before=7.20, after=7.56: delta=+0.36, uplift≈+5.0% -> moderate improvement from already-strong baseline
- before=6.40, after=6.40: delta=0.00, uplift=0.0% -> rewrite changed wording but not measurable scoring factors
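
The delta and uplift formulas can be checked with a short helper. This is a sketch for comparing two scoring.py runs by hand; `score_uplift` is a hypothetical name and is not part of the repository.

```python
def score_uplift(score_before: float, score_after: float) -> tuple[float, float]:
    """Return (delta, uplift_pct) for two score_out_of_10 values."""
    delta = score_after - score_before
    # The 0.01 floor avoids division by zero when the original scored 0.
    uplift_pct = (delta / max(score_before, 0.01)) * 100
    return delta, uplift_pct
```

Note that the 0.01 floor makes the uplift very large when the original score is 0, so in that case the absolute delta is the more informative number.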

Other utilities

upload.py — Upload a PDF to an HTTP endpoint (python upload.py file.pdf https://example.com/upload).
commentaire.py — Gemini-backed commentary helpers (requires GEMINI_API_KEY and google-generativeai).
test_elevenlabs.py — Quick ElevenLabs API check.

Security notes

Do not commit .env, private keys (.p8), or API keys. Use .env.example (without secrets) if you want to document variable names for collaborators.
