⛽ FuelCast: The AI Energy Price Oracle

Project Name: FuelCast: AI Energy Price Oracle

Summary: FuelCast is a full-stack predictive analytics dashboard that forecasts U.S. retail gasoline prices using a non-linear machine learning model, designed to soften the economic shock of energy price volatility. Our XGBoost model reduced prediction error (RMSE) by 45% relative to a SARIMAX baseline, making it a practical tool for consumer and business budgeting.

🌍 UN Sustainable Development Goal Alignment

Primary Focus: SDG 7: Affordable and Clean Energy

Target Addressed: Target 7.1 (Ensure universal access to affordable, reliable, and modern energy services).

Impact: Volatile fuel prices are a major threat to economic stability and affordability for small businesses and low-income households. With a mean absolute error (MAE) of 8.5 cents per gallon, our model supports proactive budgeting and hedging against price spikes, making necessary transportation energy more economically reliable and accessible.

💡 Inspiration

Traditional time-series models such as SARIMAX struggle on volatile commodity markets because they assume linear relationships. We call this the "Linear Trap", and escaping it was our inspiration. We knew that the retail gas price is a downstream effect of the global crude oil price, but with a complex time lag and a non-linear market reaction. We set out to build a system that could identify and exploit this hidden lagged relationship.

🛠️ What it does

FuelCast is a predictive dashboard that turns historical price data into a retail fuel price forecast and shows how that forecast compares with a statistical baseline.

Prediction: Forecasts U.S. Regular Gasoline prices for the near-term future.

Visualization: Displays predictions from two models (SARIMAX and XGBoost) against the actual price curve on a Next.js dashboard using Recharts.

Interpretation: Provides clear Feature Importance scores from the XGBoost model, explaining why the price is moving (e.g., confirming the influence of crude oil).

🔨 How we built it

Data Ingestion: We sourced weekly U.S. Retail Gas Prices (Target) and daily WTI Crude Oil Futures (Exogenous Feature) from EIA-sourced datasets.

Feature Engineering: We transformed the time series by creating lagged features, most critically the 4-Week Lag of the Crude Oil price, along with Seasonality features (month, day-of-year).

Modeling Pipeline: We trained two models for comparison:

Baseline: SARIMAX, a linear statistical model.

Challenger: XGBoost Regressor, a non-linear ensemble model, trained on our engineered features.

Full-Stack Deployment: We saved the trained models and designed the front-end (Next.js/TypeScript/Tailwind) to fetch data from a planned FastAPI Python service.
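The data alignment and feature engineering steps above could be sketched with pandas roughly as follows. This is a minimal illustration, not the project's actual code: the function name `build_features`, the column names, the Monday week anchor (`W-MON`), the use of a weekly mean, and the synthetic prices are all assumptions made for the example.

```python
import numpy as np
import pandas as pd


def build_features(gas_weekly: pd.Series, wti_daily: pd.Series,
                   lag_weeks: int = 4) -> pd.DataFrame:
    """Align daily WTI with weekly gas prices, then add lag and seasonality features."""
    # Assumption: weekly gas prices are stamped on Mondays, so daily crude
    # prices are averaged into weekly bins that end on Monday.
    wti_weekly = wti_daily.resample("W-MON").mean().rename("wti")

    # Keep only the weeks present in both series.
    df = pd.DataFrame({"gas_price": gas_weekly}).join(wti_weekly, how="inner")

    # Crude oil price from `lag_weeks` weeks earlier (4 by default).
    df[f"wti_lag_{lag_weeks}w"] = df["wti"].shift(lag_weeks)

    # Calendar features that let a tree model pick up seasonality.
    df["month"] = df.index.month
    df["day_of_year"] = df.index.dayofyear

    # The first `lag_weeks` rows have no lagged value yet, so drop them.
    return df.dropna()


# Synthetic stand-in data (2024-01-01 is a Monday); not real prices.
days = pd.date_range("2024-01-01", periods=70, freq="D")
wti_daily = pd.Series(np.linspace(70.0, 80.0, len(days)), index=days)
weeks = pd.date_range("2024-01-01", periods=10, freq="W-MON")
gas_weekly = pd.Series(np.linspace(3.0, 3.5, len(weeks)), index=weeks)

features = build_features(gas_weekly, wti_daily)
print(features.head())
```

The resulting frame has one row per week, so it can be passed to any tabular regressor. Dropping the first rows after the shift keeps the model from training on missing lag values.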

🚧 Challenges we ran into

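The Negative $R^2$: Our primary challenge was proving the "Linear Trap." The SARIMAX baseline model's $R^2$ score was -1.4241. A negative $R^2$ means the model's squared error on the test set was larger than that of a constant forecast equal to the mean of the actual values, so the baseline did worse than simply predicting the average price. This prompted an immediate pivot to the machine learning approach.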

Data Alignment: Merging the daily WTI data with the weekly retail gas price data required careful resampling and checking that the time indices aligned correctly before prediction.

SARIMAX Instability: The auto_arima function struggled with model convergence, highlighting the fragility of statistical methods when facing noisy, volatile data.
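To make the negative $R^2$ concrete, here is a small sketch of the standard definition, $R^2 = 1 - SS_{res} / SS_{tot}$, in plain Python. The prices and forecasts are made up for illustration and are unrelated to our actual results; the helper name `r2_score` is our own.

```python
def r2_score(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    # Squared error of the forecast being evaluated.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of a constant forecast equal to the mean of the actuals.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up weekly prices in dollars per gallon.
actual = [3.0, 3.2, 3.4, 3.6]

# Forecasting the mean every week scores approximately zero.
mean_forecast = [3.3, 3.3, 3.3, 3.3]

# A forecast with the right trend but the wrong level scores below zero,
# because its errors are larger than those of the mean forecast.
biased_forecast = [3.6, 3.8, 4.0, 4.2]

print(r2_score(actual, mean_forecast))
print(r2_score(actual, biased_forecast))
```

A score below zero therefore does not indicate a bug in the metric; it indicates a forecast whose errors exceed those of the flat mean line.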

✅ Accomplishments that we're proud of

Error Reduction: Achieving a 45% reduction in prediction error by reducing the RMSE from $0.2264$ (SARIMAX) to $0.1238$ (XGBoost).

MAE Score: Delivering an average absolute prediction error (MAE) of just 8.5 cents per gallon. This is a business-ready metric for hedging and planning.

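Causal Validation: Using the XGBoost Feature Importance scores to show that the 4-Week Crude Oil Lag is the model's dominant predictor of retail price movement. Feature importance measures predictive contribution rather than proving causation, but the result is consistent with crude oil prices driving retail prices after a lag.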
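The 45% figure follows directly from the two RMSE values reported above. The sketch below checks that arithmetic and spells out the usual definitions of RMSE and MAE; the helper names are our own for illustration, and only the two RMSE values come from our results.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error: penalizes large misses more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean absolute error: the average size of a miss, in the target's units."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def relative_reduction(baseline_error, challenger_error):
    """Fraction by which the challenger lowers the baseline's error."""
    return (baseline_error - challenger_error) / baseline_error


# RMSE values reported above: SARIMAX baseline vs. XGBoost challenger.
reduction = relative_reduction(0.2264, 0.1238)
print(f"RMSE reduction: {reduction:.1%}")  # about 45%
```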

🧠 What we learned

We learned that in finance and commodity prediction, the modeling choice is secondary to Feature Engineering. By converting the time-series problem into a supervised learning problem and feeding the XGBoost model specific, lagged causal features, we achieved better performance than we got from relying on a time-series model's built-in temporal structure alone.

🚀 What's next for FuelCast

EIA Integration: Replace the historical Kaggle data load with a direct, live integration with the EIA Public API for automated weekly updates.

Cloud Deployment: Deploy the trained XGBoost model via the FastAPI service to a scalable cloud platform (e.g., Google Cloud Run).

Monetization & SDG Expansion: Expand the dashboard into an API service for regional logistics companies to optimize their supply chain costs based on the forecast, directly enhancing affordability for the transport sector.
