An NBA player performance anomaly detector and front-office scouting platform.
Identifies statistical deviations in NBA game logs to flag sudden performance shifts (breakouts, slumps, usage rate changes) and to separate true momentum from statistical noise. Built to simulate an analytics-driven general manager evaluating trade prospects and free agency signings.
The platform uses a decoupled architecture: a Python/FastAPI backend and a React/Next.js frontend. The backend fetches historical game logs from the external nba_api, caches the results in a local PostgreSQL database, and engineers core features (usage rate, true shooting, game score). An IsolationForest model runs on the recent temporal window to identify statistically significant outliers. The features are then converted into Z-scores, from which regression-to-the-mean probabilities are calculated. This mathematical context is injected into the prompt for the Google Gemini LLM, which acts as a front-office analyst and generates a natural-language scouting report.
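The outlier and Z-score steps described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `window` and `contamination` defaults, and the choice to fit on older games and score only the recent window are all assumptions. The true shooting helper uses the standard formula PTS / (2 * (FGA + 0.44 * FTA)).

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def true_shooting(pts, fga, fta):
    """True shooting percentage: PTS / (2 * (FGA + 0.44 * FTA))."""
    return pts / (2.0 * (fga + 0.44 * fta))


def flag_recent_anomalies(features, window=10, contamination=0.05, seed=0):
    """Score the most recent `window` games against the older baseline games.

    features: array-like of shape (n_games, n_features), oldest game first.
    Returns (is_outlier, z_scores), both covering the recent window only.
    Hypothetical helper: the real pipeline may fit and score differently.
    """
    features = np.asarray(features, dtype=float)
    if features.shape[0] <= window:
        raise ValueError("need more games than the window size")
    baseline, recent = features[:-window], features[-window:]

    # Fit on the baseline so that "normal" is defined by past games only.
    model = IsolationForest(contamination=contamination, random_state=seed)
    model.fit(baseline)
    is_outlier = model.predict(recent) == -1  # predict() returns -1 for outliers

    # Per-feature Z-scores of recent games relative to the baseline.
    mean = baseline.mean(axis=0)
    std = baseline.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # avoid division by zero for constant features
    z_scores = (recent - mean) / std
    return is_outlier, z_scores
```

Each row of `features` would hold one game's engineered values (for example usage rate, true shooting, game score); the returned Z-scores are the kind of numeric context that could then be passed to the LLM prompt.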
- Backend: Python 3.11, FastAPI, SQLAlchemy
- Data / ML: Pandas, scikit-learn (IsolationForest), numpy
- LLM: Google gemini-2.5-flash API via google-genai SDK
- Database: PostgreSQL (Dockerized)
- Frontend: Next.js (App Router), TypeScript, TailwindCSS, Framer Motion, Recharts
- End-to-end machine learning pipeline utilizing IsolationForest to flag true statistical game anomalies and filter out typical variance.
- Automated contextual LLM generation, grounding gemini-2.5-flash in purely mathematical deviations (Z-scores) to limit hallucination and output concrete trade analysis.
- Z-score caching and DB intelligence mapping to prevent duplicate API generation and limit cloud costs.
- Aggressive data ingestion bypass utilizing custom browser headers and exponential backoff to circumvent stats.nba.com rate limits.
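The grounding and caching ideas in the list above could look something like the sketch below. Everything here is hypothetical: the function and class names, the prompt wording, and the hashing scheme are illustrative assumptions, the in-memory dict stands in for a Postgres table, and the `generate` callable stands in for whatever call the project makes to Gemini.

```python
import hashlib
import json


def zscore_cache_key(player_id, z_scores, precision=2):
    """Stable hash of a player's rounded Z-scores, usable as a lookup key."""
    rounded = {name: round(value, precision) for name, value in z_scores.items()}
    payload = json.dumps({"player_id": player_id, "z": rounded}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_scouting_prompt(player_name, z_scores):
    """Build a prompt that exposes only the numeric deviations to the model."""
    lines = [f"- {name}: z = {value:+.2f}" for name, value in sorted(z_scores.items())]
    return (
        "You are an NBA front-office analyst. Using ONLY the Z-scores below, "
        f"write a short scouting report on {player_name}.\n" + "\n".join(lines)
    )


class ReportCache:
    """Return a stored report when the same Z-scores were already analyzed."""

    def __init__(self, generate):
        self._generate = generate  # callable: prompt string -> report string
        self._store = {}  # stand-in for a database table keyed by hash

    def get_report(self, player_id, player_name, z_scores):
        key = zscore_cache_key(player_id, z_scores)
        if key not in self._store:
            prompt = build_scouting_prompt(player_name, z_scores)
            self._store[key] = self._generate(prompt)
        return self._store[key]
```

Keying the cache on the rounded Z-scores rather than on the player alone means a new report is generated only when the underlying numbers actually change, which is one way to avoid repeated LLM calls for identical inputs.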
# Clone the repository
git clone git@github.com:Techdude01/NBAnomaly.git
cd NBAnomaly
# Setup Database
make up
# Set up Python virtual environment and dependencies
python3 -m venv venv
source venv/bin/activate
pip install -r backend/requirements.txt
# Configure Environment Variables
# Add a .env file linking POSTGRES_URL and your GEMINI_API_KEY
cp backend/.env.example backend/.env
# Setup Frontend dependencies
cd frontend
npm install

If you have Docker installed, skip the manual environment setup and just run:

make docker-up

This builds and pulls all services (Postgres, FastAPI, Next.js) and maps them to their respective ports. Use make down to kill the stack safely.
# From root
source venv/bin/activate
make dev
# New terminal
cd frontend
npm run dev

Navigate to http://localhost:3000 to interact with the dashboard.
The native nba_api package relies heavily on endpoints tied to stats.nba.com, which enforces sudden, strict rate limiting and bot-protection timeouts during bulk data fetching. Initially, fetching a player's full career game logs sequentially would crash the container with a TimeoutError. To work around this without paying for a commercial sports data provider, I abandoned the core nba_api/PlayerGameLog abstraction entirely. Instead, I reverse-engineered the raw API requests and built a custom requests wrapper that sends browser-mimicking headers (such as Sec-Ch-Ua) and retries with an exponential backoff algorithm. This lets the ingestion sequence write into Postgres reliably while masking the bot signature.
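A wrapper along those lines might be sketched as below. This is an assumption-laden illustration rather than the repository's code: the header values, the function name, and the retry defaults are all hypothetical, and `session` is expected to be any object with a `requests.Session`-style `get` method.

```python
import random
import time

# Hypothetical browser-like headers; real values would be copied from an
# actual browser session and may need to change over time.
BROWSER_HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
    ),
    "Sec-Ch-Ua": '"Chromium";v="124", "Not-A.Brand";v="99"',
    "Sec-Ch-Ua-Mobile": "?0",
    "Sec-Ch-Ua-Platform": '"Linux"',
    "Accept": "application/json, text/plain, */*",
    "Referer": "https://www.nba.com/",
}


def fetch_with_backoff(session, url, params=None, max_retries=5, base=1.0,
                       cap=60.0, timeout=10.0, sleep=time.sleep,
                       jitter=random.random):
    """GET `url` as JSON, retrying with exponentially growing delays.

    The delay before retry n (0-based) is min(cap, base * 2**n) plus jitter.
    `sleep` and `jitter` are injectable so the retry logic can be tested.
    """
    last_error = None
    for attempt in range(max_retries + 1):
        try:
            response = session.get(
                url, params=params, headers=BROWSER_HEADERS, timeout=timeout
            )
            if response.status_code == 200:
                return response.json()
            last_error = RuntimeError(f"HTTP {response.status_code}")
        except OSError as exc:
            # Covers the built-in TimeoutError and ConnectionError; requests'
            # own exceptions also derive from OSError.
            last_error = exc
        if attempt < max_retries:
            sleep(min(cap, base * 2 ** attempt) + jitter())
    raise RuntimeError(f"gave up after {max_retries + 1} attempts") from last_error
```

With the requests library installed, this would be called as `fetch_with_backoff(requests.Session(), url, params=...)`. The random jitter spreads retries out so that several workers do not hit the server again at the same instant.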