Inspiration
The night we were brainstorming project ideas, Artem was booking tickets and remembered how he once missed a one-hour connection. Kate recalled a trip where, because she had booked separate airline tickets, she spent most of her layover waiting for her luggage and barely made her next flight. We all had the same thought: if only we could predict such difficulties, we would book different tickets and never “gamble” like this again. Now, with “Sky Gamble,” you can.
What it does
Sky Gamble is a web application that predicts the risk of flight delays and suggests whether you should book a different flight. You can either upload all your tickets and let the app autofill your itinerary using the OpenAI API, or enter your details manually. Our model is a Random Forest classifier, trained with scikit-learn and imblearn on a dataset of 20 million flights from the Bureau of Transportation Statistics, that predicts the delay of each flight in your itinerary. It achieves 81% accuracy in classifying the predicted delay into one of five classes, and it also calculates the expected value of the delay. The app then calculates the time remaining on each of your connections and estimates whether you can make each one, along with the probability of success.
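To illustrate how five class probabilities can turn into an expected delay and a connection probability, here is a minimal sketch. It is not our production code: the class boundaries, the representative minutes per class, and the names `DELAY_CLASSES`, `expected_delay`, and `connection_success_probability` are all assumptions made for this example.

```python
# Hypothetical delay classes: (label, representative delay in minutes).
# The real class boundaries used by the model are not shown here.
DELAY_CLASSES = [
    ("on time", 0),
    ("minor", 15),
    ("moderate", 45),
    ("long", 90),
    ("severe", 180),
]


def expected_delay(probs):
    """Expected delay in minutes, given one probability per delay class."""
    if len(probs) != len(DELAY_CLASSES):
        raise ValueError("expected one probability per delay class")
    if abs(sum(probs) - 1.0) > 1e-6:
        raise ValueError("class probabilities must sum to 1")
    return sum(p * minutes for p, (_, minutes) in zip(probs, DELAY_CLASSES))


def connection_success_probability(probs, layover_minutes, min_transfer_minutes):
    """Probability that the inbound delay still leaves enough time to transfer.

    A connection counts as made when the representative delay of the class
    does not exceed the slack (layover minus minimum transfer time).
    """
    slack = layover_minutes - min_transfer_minutes
    return sum(
        p for p, (_, minutes) in zip(probs, DELAY_CLASSES) if minutes <= slack
    )


# Example class probabilities, e.g. from a classifier's predict_proba output.
probs = [0.5, 0.2, 0.15, 0.1, 0.05]
print(expected_delay(probs))
print(connection_success_probability(probs, layover_minutes=60, min_transfer_minutes=30))
```

With these made-up probabilities the expected delay comes out to about 27.75 minutes, and a 60-minute layover with a 30-minute minimum transfer time is made only when the delay falls in one of the first two classes.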
How we built it
Half of our team worked on the model, while the other half focused on website UI/UX, front-end, and back-end development using Next.js and Django, though we all helped each other along the way. First, we developed the design, front-end, and back-end of the home page in parallel with the prediction models, which proved to be the most challenging and educational part. Once the model was finalized, we built the front-end and back-end of the results page and implemented the calculation of total delays and success percentages.
Challenges we ran into
The most challenging part was selecting the best model type for the task and polishing it. To achieve the best possible accuracy, we trained models using four different approaches: MLP (Multilayer Perceptron), GBDT (Gradient Boosted Decision Trees), Random Forest, and Vector Embeddings with K-nearest neighbors search. Each of them went through an enormous number of adjustments before we finally selected the Random Forest model for our project.
Accomplishments that we're proud of
We are proud of all the models we created, as each of them contributed to the development of our most accurate model. Notably, the Vector Embeddings approach was the most sophisticated to build, as it required efficient indexing (we used the Hierarchical Navigable Small World (HNSW) graph algorithm for this task). We are also proud of the design of our website, which stands out among many other applications in terms of user experience. But most of all, we are proud of how we worked as a team at every stage of development.
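For readers unfamiliar with the embeddings approach, the sketch below shows the basic idea of K-nearest neighbors classification over embedding vectors. Note that it uses exact brute-force search, not HNSW: an HNSW index returns approximately the same neighbors far faster on large datasets, but is too involved for a short example. The function names and toy data are assumptions for illustration only.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(query, embeddings, labels, k=3):
    """Predict a delay class by majority vote among the k most similar flights.

    This scans every stored embedding (exact search). An approximate index
    such as HNSW would replace this scan when the dataset is large.
    """
    ranked = sorted(
        zip(embeddings, labels),
        key=lambda pair: cosine_similarity(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy 2-dimensional embeddings of past flights with their delay classes.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.8, 0.2]]
labels = [0, 0, 2, 2, 0]
print(knn_predict([1.0, 0.05], embeddings, labels, k=3))
```

Exact search like this costs one similarity computation per stored flight for every query, which is why an approximate index becomes necessary at the scale of millions of flights.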
What we learned
This experience was amazing and unforgettable for all four of us. The modeling team learned how various model types work and which type is best suited to specific scenarios. While researching past flight delay prediction models, we found several research papers that we learned from. Remarkably, in the end our model outperformed all of those models, and did so across a greater variety of airports. Meanwhile, our web-development team learned about the OpenAI API and prompt engineering.
What's next for Sky Gamble
For now, Sky Gamble only works for flights within the U.S., so in the future the dataset used for our model can be expanded to include international flights. Other improvements we are considering include predicting the probability of flight cancellation, incorporating weather data into our calculations, and improving the model's overall accuracy.
Built With
- django
- machine-learning
- next.js
- numpy
- openai
- pandas
- prompt-engineering
- scikit-learn
