Chuckyecheesy/StartupCheck

Startup proposal — RAG judging layer

Python tooling that evaluates startup pitch PDFs against investor-style criteria using RAG (LangChain + OpenAI embeddings + FAISS), with an optional Streamlit UI for interactive review, scoring, and Gemini-powered rewrites. Analytics records can be written to Snowflake when configured.

Repository layout

Path | Role
--- | ---
RAG-layer-of-startup-judging/ | Main application: Streamlit UI, RAG pipelines, criteria/scoring logic
RAG-layer-of-startup-judging/frontend.py | Streamlit entrypoint
RAG-layer-of-startup-judging/extract_proposal.py | CLI: analyze a proposal PDF vs criteria PDFs
RAG-layer-of-startup-judging/scoring.py | CLI: score analysis JSON and suggest refinements
RAG-layer-of-startup-judging/startup_rag.py | Startup-doc RAG + Snowflake insert helpers
RAG-layer-of-startup-judging/criteria_rag.py | Criteria PDF RAG helpers
test_snowflake_connection.py | Local Snowflake connectivity smoke test (adjust paths/credentials for your environment)
requirements.txt | Python dependencies (install from repository root)

Prerequisites

  • Python 3.10+ recommended
  • OpenAI API key for embeddings and baseline chat (gpt-3.5-turbo in RAG chains)
  • For the UI rewrite flows: Gemini API key (GEMINI_API_KEY)
  • Optional: ElevenLabs API key for text-to-speech in the UI
  • Optional: Snowflake account + key-pair auth for startup_rag logging

Setup

  1. Clone or copy this repo and go to the repository root (the directory that contains requirements.txt).

  2. Create a virtual environment and install dependencies:

    python -m venv .venv
    source .venv/bin/activate   # Windows: .venv\Scripts\activate
    pip install -r requirements.txt
  3. Copy environment variables into .env in RAG-layer-of-startup-judging/ (never commit real secrets; this repo’s .gitignore excludes .env). Example variables:

    Variable | Purpose
    --- | ---
    OPENAI_API_KEY | Required for embeddings/RAG and proposal extraction
    GEMINI_API_KEY | Required for Gemini features in frontend.py
    GEMINI_MODEL_REWRITE, GEMINI_MODEL_PROJECTION, GEMINI_MODEL_*_FALLBACKS | Optional model overrides
    ELEVENLAB_API_KEY, VOICE_ID | Optional TTS
    FAST_REWRITE | Optional; set to 1 / true / yes for faster rewrite behavior
    SNOWFLAKE_USER, SNOWFLAKE_ACCOUNT, SNOWFLAKE_PRIVATE_KEY_FILE, SNOWFLAKE_WAREHOUSE, SNOWFLAKE_DATABASE, SNOWFLAKE_SCHEMA, SNOWFLAKE_ROLE | Optional Snowflake logging

  4. Criteria PDFs — The UI and extract_proposal.py expect criteria documents such as marketing, angel, technical, and VC judging PDFs in RAG-layer-of-startup-judging/ (names referenced in frontend.py / extract_proposal.py). Add any missing files so default paths resolve.

  5. Snowflake key — Point SNOWFLAKE_PRIVATE_KEY_FILE at your .p8 private key file on disk.
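
Before launching anything, it can help to confirm that the variables listed in step 3 are visible to the process. The following is a minimal sketch, not part of the repository: `missing_vars` and `fast_rewrite_enabled` are hypothetical helper names, the required/optional split simply mirrors the wording in the variable list, and treating `FAST_REWRITE` case-insensitively is an assumption. It reads the process environment only and does not parse `.env` itself.

```python
import os
from typing import Iterable, List, Mapping

# Names copied from the variable list in step 3.
REQUIRED_VARS = ("OPENAI_API_KEY", "GEMINI_API_KEY")
SNOWFLAKE_VARS = (
    "SNOWFLAKE_USER", "SNOWFLAKE_ACCOUNT", "SNOWFLAKE_PRIVATE_KEY_FILE",
    "SNOWFLAKE_WAREHOUSE", "SNOWFLAKE_DATABASE", "SNOWFLAKE_SCHEMA",
    "SNOWFLAKE_ROLE",
)


def missing_vars(names: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or blank in env (hypothetical helper)."""
    return [name for name in names if not env.get(name, "").strip()]


def fast_rewrite_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """True when FAST_REWRITE is 1 / true / yes (case-insensitivity is an assumption)."""
    return env.get("FAST_REWRITE", "").strip().lower() in {"1", "true", "yes"}


print("Missing required variables:", missing_vars(REQUIRED_VARS))
print("Missing Snowflake variables (optional):", missing_vars(SNOWFLAKE_VARS))
print("Fast rewrite enabled:", fast_rewrite_enabled())
```

Passing a plain dictionary as `env` makes the helpers easy to exercise without touching real credentials.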

When running the app or CLI scripts, use the app directory:

    cd RAG-layer-of-startup-judging

Run the frontend (Streamlit)

The web UI is launched with Streamlit. Activate your virtual environment (if you use one), go to the app folder, then start the app:

    cd RAG-layer-of-startup-judging
    streamlit run frontend.py --server.port 8502 --server.address 127.0.0.1

From the repository root in one line:

    cd RAG-layer-of-startup-judging && streamlit run frontend.py --server.port 8502 --server.address 127.0.0.1

For local runs, open http://localhost:8502 in your browser.

For remote hosting via SSH port-forwarding, keep this tunnel running on your local machine:

    ssh -i ~/.ssh/do_startupcheck -o IdentitiesOnly=yes -N -L 8502:127.0.0.1:8502 deploy@138.197.143.191

Then open http://localhost:8502 locally. http://127.0.0.1:8502/ works as well (same destination). Ensure OPENAI_API_KEY and GEMINI_API_KEY are set (for example via RAG-layer-of-startup-judging/.env) for full functionality.
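
If the page does not load, a quick way to tell "nothing is listening" from a browser problem is to test the TCP port directly. This is a small sketch using only the standard library; `port_is_open` is a hypothetical helper, and it only checks that something accepts connections on the port, not that it is the Streamlit app.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# 8502 is the port used in the commands above (local run or SSH tunnel).
print("Something is listening on 127.0.0.1:8502:", port_is_open("127.0.0.1", 8502))
```

The same check works for both the local run and the SSH tunnel, since both expose the app on local port 8502.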

Rewrite difference % (original vs rewritten text)

In the frontend rewrite panel, the app shows:

  • Total Content Change (%) (overall_percent_change)
  • Sentence-level change (%) (sentence_changes[].percent_change)

Both values are generated by Gemini as lexical/phrase-overlap estimates (not measures of semantic quality) and are rounded to whole numbers:

  • 0% -> wording is almost identical
  • 100% -> wording is fully rewritten

They indicate rewrite magnitude and are not a direct quality score.

How the metrics are computed in practice:

  • Sentence-level change (sentence_changes[].percent_change)

    • For each sentence pair (original_sentence vs rewritten_sentence), the rewrite-difference step compares wording overlap (tokens/phrases), not semantic intent.
    • More shared wording -> lower %; less shared wording -> higher %.
    • The result is rounded to a whole number in [0, 100].
    • change_label is then assigned from that percentage as a bucket (low, medium, high).
  • Total content change (overall_percent_change)

    • Computed from the same sentence-card set used above.
    • Represents aggregate rewrite magnitude across the full rewritten section (not just one sentence).
    • Displayed in UI as Total Content Change.
    • If the app runs in fast mode, or if summary generation is skipped or fails, this value can be -1 internally; the UI then falls back to showing 0%.
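
To build intuition for "more shared wording -> lower %", here is a rough local stand-in. It is not what the app does: the app asks Gemini for the estimate, whereas this sketch uses `difflib.SequenceMatcher` over lowercase word tokens, and `lexical_change_percent` is a hypothetical name. Treat its numbers as illustrative only; they will not match Gemini's output.

```python
import re
from difflib import SequenceMatcher


def lexical_change_percent(original: str, rewritten: str) -> int:
    """Estimate wording change as a whole number in [0, 100] (illustrative stand-in)."""
    # Compare word tokens, ignoring case and punctuation.
    a = re.findall(r"\w+", original.lower())
    b = re.findall(r"\w+", rewritten.lower())
    # ratio() is 1.0 for identical token sequences and 0.0 when nothing matches.
    similarity = SequenceMatcher(None, a, b, autojunk=False).ratio()
    return max(0, min(100, round(100 * (1.0 - similarity))))


pairs = [
    ("We will grow quickly.", "We will grow quickly."),
    ("We have traction.", "We have traction with 1,240 MAU and 18% week-4 retention."),
]
for original, rewritten in pairs:
    print(lexical_change_percent(original, rewritten), "% change:", rewritten)
```

Because the measure is purely lexical, a rewrite that keeps the meaning but swaps every word still scores close to 100.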

How to read common ranges:

  • Low change (0-25%): light edits, phrasing tweaks, minor clarifications
  • Medium change (26-60%): noticeable restructuring and added specificity
  • High change (61-100%): major rewrite, often new structure and stronger claims
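
The bucketing and the UI fallback described above can be sketched as follows. The function names are hypothetical, and it is an assumption that `change_label` uses exactly the range boundaries listed here; the app's own thresholds live in its source.

```python
def change_label(percent: int) -> str:
    """Map a whole-number change percentage to low / medium / high."""
    if not 0 <= percent <= 100:
        raise ValueError(f"percent must be in [0, 100], got {percent}")
    if percent <= 25:
        return "low"
    if percent <= 60:
        return "medium"
    return "high"


def display_overall_change(overall_percent_change: int) -> str:
    """Format Total Content Change, showing 0% for the internal -1 sentinel."""
    shown = 0 if overall_percent_change < 0 else overall_percent_change
    return f"{shown}%"


for value in (12, 40, 85):
    print(value, "->", change_label(value))
print("fallback:", display_overall_change(-1))
```

Keeping the sentinel handling in one display function avoids a -1 ever being bucketed as if it were a real percentage.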

Examples:

  • Original: "We will grow quickly."
    Rewrite: "We project 12% MoM growth for 9 months, driven by a 28% paid-conversion funnel."
    Likely range: high change (large lexical replacement + added concrete detail)

  • Original: "Our market is big."
    Rewrite: "Our TAM is $4.2B, SAM is $620M, and we target a $48M SOM in 3 years."
    Likely range: medium-high change

  • Original: "We have traction."
    Rewrite: "We have traction with 1,240 MAU and 18% week-4 retention."
    Likely range: medium change

CLI: analyze and score a proposal

Still from RAG-layer-of-startup-judging/:

    # Produce structured analysis JSON
    python extract_proposal.py --proposal-pdf /path/to/pitch.pdf --save-json analysis.json

    # Score using saved analysis (avoids path quirks when chaining modules)
    python scoring.py --analysis-json analysis.json --save-json score_output.json

For advanced options (multiple criteria PDFs, custom startup-failure PDF), see:

    python extract_proposal.py --help
    python scoring.py --help
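
To run both steps from a script, the two commands can be chained with `subprocess`. This is a sketch under a few assumptions: `build_commands` and `run_pipeline` are hypothetical helpers, only the flags shown above are used, and `APP_DIR` assumes the script is started from the repository root.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

APP_DIR = Path("RAG-layer-of-startup-judging")  # assumed relative to the repo root


def build_commands(proposal_pdf: str,
                   analysis_json: str = "analysis.json",
                   score_json: str = "score_output.json") -> List[List[str]]:
    """Return the analyze and score commands as argument lists."""
    py = sys.executable  # reuse the interpreter of the active virtual environment
    return [
        [py, "extract_proposal.py", "--proposal-pdf", str(proposal_pdf),
         "--save-json", analysis_json],
        [py, "scoring.py", "--analysis-json", analysis_json,
         "--save-json", score_json],
    ]


def run_pipeline(proposal_pdf: str, app_dir: Path = APP_DIR) -> None:
    """Run both steps inside the app directory, stopping at the first failure."""
    # Resolve the PDF path first, because the commands run with cwd=app_dir.
    pdf = Path(proposal_pdf).resolve()
    for cmd in build_commands(str(pdf)):
        subprocess.run(cmd, cwd=app_dir, check=True)


# Example call (not executed here): run_pipeline("/path/to/pitch.pdf")
```

With `check=True`, a non-zero exit from `extract_proposal.py` raises `CalledProcessError`, so `scoring.py` never runs against a stale `analysis.json`.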

Scoring computation

scoring.py computes proposal quality on a 0-10 scale in two parts:

  1. Criteria completeness score (0-10)

    • filled contributes full credit
    • partial contributes half credit
    • missing contributes no credit
    • Formula:
      • completeness_ratio = (filled + 0.5 * partial) / total_criteria
      • completeness_score = 10.0 * completeness_ratio
  2. Failure-risk penalty (0-3)

    • From failure_mistakes_check:
      • yes counts as 1.0
      • unclear counts as 0.5
      • no counts as 0.0
    • Formula:
      • penalty_ratio = (present_yes + 0.5 * present_unclear) / total_failures
      • failure_penalty = min(3.0, 3.0 * penalty_ratio)

Final score

  • raw_score = completeness_score - failure_penalty
  • score_out_of_10 = clamp(raw_score, 0.0, 10.0) (rounded to 2 decimals)
  • Investor-ready threshold: score_out_of_10 >= 7.0

The same scoring logic is also applied per investor criteria source in scores_by_investor.
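
The formulas above translate directly into a few small functions. This is a sketch of the arithmetic, not the code in scoring.py: the function names are hypothetical, it assumes each total is the sum of its three counts, and returning 0.0 for an empty total is an assumption made to avoid division by zero.

```python
def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def completeness_score(filled: int, partial: int, missing: int) -> float:
    """Criteria completeness on a 0-10 scale."""
    total = filled + partial + missing
    if total == 0:
        return 0.0  # assumption: no criteria means no credit
    return 10.0 * (filled + 0.5 * partial) / total


def failure_penalty(yes: int, unclear: int, no: int) -> float:
    """Failure-risk penalty on a 0-3 scale."""
    total = yes + unclear + no
    if total == 0:
        return 0.0  # assumption: no failure checks means no penalty
    return min(3.0, 3.0 * (yes + 0.5 * unclear) / total)


def score_out_of_10(filled: int, partial: int, missing: int,
                    yes: int, unclear: int, no: int) -> float:
    raw = completeness_score(filled, partial, missing) - failure_penalty(yes, unclear, no)
    return round(clamp(raw, 0.0, 10.0), 2)


def investor_ready(score: float) -> bool:
    return score >= 7.0


score = score_out_of_10(filled=6, partial=2, missing=2, yes=2, unclear=2, no=4)
print(score, investor_ready(score))  # 5.88 False
```

Because the penalty is capped at 3, a proposal with every criterion filled can never score below 7.0, which is exactly the investor-ready threshold.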

Worked example

Suppose one proposal has:

  • Criteria evaluation: filled=6, partial=2, missing=2 (total 10)
  • Failure checks: yes=2, unclear=2, no=4 (total 8)

Then:

  • completeness_ratio = (6 + 0.5 * 2) / 10 = 0.70
  • completeness_score = 10 * 0.70 = 7.00
  • penalty_ratio = (2 + 0.5 * 2) / 8 = 0.375
  • failure_penalty = 3 * 0.375 = 1.125
  • raw_score = 7.00 - 1.125 = 5.875
  • score_out_of_10 = 5.88 (rounded to 2 decimals)

Verdict: below the 7.0 threshold, so refinement actions are recommended.

Score difference % for scoring.py (before vs after rewrite)

scoring.py outputs absolute scores (score_out_of_10). To quantify improvement between an original and rewritten version, compare two runs:

  1. Run scoring on original proposal -> score_before
  2. Run scoring on rewritten proposal -> score_after

Use:

  • Absolute delta: delta = score_after - score_before
  • Relative uplift %: uplift_pct = (delta / max(score_before, 0.01)) * 100

Interpretation:

  • Positive % -> improved investor-readiness score
  • 0% -> no measurable score change
  • Negative % -> rewrite likely weakened criterion coverage or risk profile

Examples:

  • before=5.88, after=7.35
    delta=+1.47, uplift≈+25.0% -> substantial improvement, now above threshold (7.0)

  • before=7.20, after=7.56
    delta=+0.36, uplift≈+5.0% -> moderate improvement from already-strong baseline

  • before=6.40, after=6.40
    delta=0.00, uplift=0.0% -> rewrite changed wording but not measurable scoring factors
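
The delta and uplift formulas can be checked against the examples above with a short script. This is a minimal sketch; `score_delta` and `uplift_pct` are hypothetical names, and the scores are passed in directly instead of being read from the scoring.py output files.

```python
def score_delta(score_before: float, score_after: float) -> float:
    """Absolute change in score_out_of_10 between two runs."""
    return score_after - score_before


def uplift_pct(score_before: float, score_after: float) -> float:
    """Relative change in percent; the 0.01 floor avoids division by zero."""
    return score_delta(score_before, score_after) / max(score_before, 0.01) * 100.0


for before, after in [(5.88, 7.35), (7.20, 7.56), (6.40, 6.40)]:
    delta = score_delta(before, after)
    uplift = uplift_pct(before, after)
    print(f"before={before:.2f} after={after:.2f} delta={delta:+.2f} uplift={uplift:+.1f}%")
```

Note that the 0.01 floor makes the uplift very large when `score_before` is 0, so the absolute delta is the more meaningful number for near-zero baselines.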

Other utilities

  • upload.py — Upload a PDF to an HTTP endpoint (python upload.py file.pdf https://example.com/upload).
  • commentaire.py — Gemini-backed commentary helpers (requires GEMINI_API_KEY and google-generativeai).
  • test_elevenlabs.py — Quick ElevenLabs API check.

Security notes

  • Do not commit .env, private keys (.p8), or API keys. Use .env.example (without secrets) if you want to document variable names for collaborators.
