Inspiration
We wanted a way to quantify “who’s most likely to podium” before a race starts and to explain why. Racing fans, teams and commentators benefit from probability-based reasoning and interactive “what-if” scenario simulation (e.g., change start position, change expected pace). PodiumCast combines racing telemetry/history with ensemble ML to produce actionable probabilities plus factor importance so predictions are explainable.
What it does
- Loads historical race CSVs (Sebring, Road America) and produces: win probability, podium (top-3) probability, top-5/top-10 probabilities, expected finishing position, and a DNF risk estimate.
- Provides confidence intervals and a factor analysis dashboard showing which features (starting pos, past track history, pace consistency, etc.) moved the probability most.
- Offers an interactive scenario simulator — change inputs (e.g., starting position) and re-run predictions to see new odds.
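The what-if loop behind the scenario simulator can be sketched as follows. This is a minimal illustration only: the toy logistic model, its coefficients, and the feature names `start_pos` and `pace_gap_s` are assumptions made for the example, not PodiumCast's fitted models or real feature set.

```python
import math
from typing import Dict

# Hypothetical coefficients for a toy logistic model (assumption for this sketch,
# NOT values fitted by PodiumCast).
COEFFS = {"start_pos": -0.35, "pace_gap_s": -1.2}
INTERCEPT = 1.0


def podium_probability(features: Dict[str, float]) -> float:
    """Toy podium probability: logistic function of a linear score."""
    z = INTERCEPT + sum(COEFFS[name] * features[name] for name in COEFFS)
    return 1.0 / (1.0 + math.exp(-z))


def what_if(baseline: Dict[str, float], **changes: float) -> Dict[str, float]:
    """Re-run the prediction with some inputs overridden and report the shift."""
    scenario = {**baseline, **changes}  # copy, so the baseline stays untouched
    before = podium_probability(baseline)
    after = podium_probability(scenario)
    return {"before": before, "after": after, "delta": after - before}


# Example: what happens if a driver starts 3rd instead of 8th?
baseline = {"start_pos": 8, "pace_gap_s": 0.4}
result = what_if(baseline, start_pos=3)
print(f"podium odds: {result['before']:.3f} -> {result['after']:.3f}")
```

The same before/after difference, computed one input at a time, is one simple way to show which factor moved the probability most.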
How I built it
Data ingestion from structured race CSVs → feature engineering (50+ features) → models (multiple predictors: podium predictor, win probability model, DNF risk model, position predictor) → ensemble aggregator → FastAPI backend → React + Vite frontend dashboard with a 3D podium visualization and scenario controls. Architecture diagram and workflow are in the README.
A compact ensemble representation (used conceptually) is \( P_{\text{ensemble}} = \sum_{i} w_i P_i \), where \( P_i \) are the model probabilities and \( w_i \) are the ensemble weights, normalized so that \( \sum_{i} w_i = 1 \). This lets us combine specialist models (win, DNF, finishing-position) into final odds.
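As a rough sketch of that weighted combination, assuming each model has already been converted to a probability for the same event (here, one driver's podium finish), the aggregation could look like the code below. The function name `ensemble_probability` and the example numbers are hypothetical.

```python
from typing import Sequence


def ensemble_probability(probs: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted average of per-model probabilities for one driver and one outcome."""
    if len(probs) != len(weights):
        raise ValueError("probs and weights must have the same length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Normalize so the weights sum to 1, which keeps the result in [0, 1].
    return sum((w / total) * p for w, p in zip(weights, probs))


# Illustrative podium probabilities for one driver from three models
# (made-up numbers, not project output).
model_probs = [0.60, 0.50, 0.70]
model_weights = [2.0, 1.0, 1.0]  # un-normalized; the function rescales them
podium_prob = ensemble_probability(model_probs, model_weights)
print(f"ensemble podium probability: {podium_prob:.2f}")
```

Normalizing inside the function means the weights can be tuned freely (for example, from validation scores) without the output ever leaving the valid probability range.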
Challenges we ran into
Feature design. Extracting robust, interpretable features from messy CSVs (different races/tracks) required many edge-case rules.
Small dataset size. Motorsport datasets are often small — that forced careful validation and conservative regularization to avoid overfitting.
Calibration. Making predicted probabilities match observed outcome frequencies (probability calibration) took extra validation, including isotonic and Platt-style calibration checks.
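One way to run such a calibration check on held-out predictions is to compare predicted probabilities with observed outcome rates per bin. The sketch below uses only the standard library and made-up data; the bin count and helper names are assumptions, and it measures calibration rather than performing the isotonic or Platt fitting itself.

```python
from typing import List, Sequence, Tuple


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def reliability_bins(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Return (mean predicted, observed rate, count) for each non-empty bin."""
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        sums[idx][0] += p
        sums[idx][1] += y
        sums[idx][2] += 1
    return [(s[0] / s[2], s[1] / s[2], s[2]) for s in sums if s[2] > 0]


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 5
) -> float:
    """Count-weighted average gap between predicted and observed rates."""
    n = len(probs)
    return sum(
        count / n * abs(mean_p - rate)
        for mean_p, rate, count in reliability_bins(probs, outcomes, n_bins)
    )


# Made-up held-out podium predictions and results (1 = finished on the podium).
held_out_probs = [0.1, 0.1, 0.9, 0.9]
held_out_results = [0, 0, 1, 1]
print(brier_score(held_out_probs, held_out_results))
print(expected_calibration_error(held_out_probs, held_out_results))
```

With a small dataset, each bin holds few races, so these numbers are noisy; using few bins and cross-validated predictions helps keep the check meaningful.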
Accomplishments that I'm proud of
- End-to-end pipeline from raw CSV to live interactive predictions.
- Factor analysis view that helps users understand why odds change when a parameter is modified.
- Scenario simulator that recalculates probabilities on the fly in the UI.
What I learned
- Thoughtful feature engineering and simple ensembles often beat complex black-box models on small racing datasets.
- Exposing model uncertainty (confidence intervals) dramatically increases user trust.
- Building UI controls to let users test “what-if” scenarios is extremely effective at surfacing model strengths & limits.
What's next for PodiumCast
- Add more tracks and seasons to improve model robustness.
- Experiment with time-series models (per-lap modelling) and stronger calibration.
- Add live data integrations (race weekend telemetry) and a lightweight API key service for team integrations.
Built With
- fastapi
- numpy
- pandas
- python
- react
- scikit-learn
- vite