Inspiration
Learning that over 1.47 billion people (approximately 20% of the world's population!) have been impacted in some manner by floods in the past decade shocked us. Upon doing more research, we found that approximately 60,000 people die from floods every year, and countries pay over 200 billion dollars in damages from the aftermath. Seeing people lose their lives to natural disasters no one can control, we believe these numbers wouldn't be as big if a proactive system was developed to forecast floods BEFORE they happen.
What it does
We created machine learning models that make early forecasts of whether or not floods will occur. These models are deployed on the cloud and constantly update a Firebase database of issued flood warnings across the world. We also created an Android app and a web app for users to receive flood warnings before a flood happens, and to visualize flood warnings and land susceptibility to floods on a dashboard interface.
- The Android app takes your current GPS location, and if a flood is forecast in the next few days, it notifies all users within that local area a few days in advance to seek shelter and stay safe. It also allows users to see flood warnings in cities across the world and view a global list of recent floods in real-time.
- The web application has a dashboard that shows flood warnings for the user's location and a list of floods in surrounding areas. An interactive map shows floods in real-time, colored by severity. You can click on areas of the map to see the susceptibility map and flood warning info for that region.
How we built it
There aren't public datasets that pair flood incidents with the weather and terrain features present when they occurred (precipitation, proximity to water sources, etc.), so we had to create our own dataset. We used flood incidences (date and location) from the NOAA database, which contained over 100,000 flood instances. We collected negative examples by sampling random latitude/longitude/time combinations where floods did not happen. For this list of incidences, we then collected data for features indicative of floods:
- Precipitation
- Proximity to dams and reservoirs
- Forest loss
- Road/infrastructure presence
- Rock type
- Proximity to water bodies
Our sources included:
- OpenStreetMap (road/infrastructure presence)
- Global Forest Change (forest loss)
- IMERG (precipitation)
- Global Dam Watch (dams and reservoirs)
- LP DAAC (water bodies)
We dealt with 150+ GB of data, taking hours to process for each feature.
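To make the negative-sampling step above concrete, here is a minimal sketch of how no-flood examples can be drawn: random coordinates and dates, rejecting any sample that falls too close in space and time to a known flood. The function name, thresholds, and date range are illustrative assumptions, not our exact pipeline code.

```python
import random
from datetime import datetime, timedelta

def sample_negatives(flood_incidents, n, seed=0):
    """Sample n (lat, lon, date) points with no nearby recorded flood.

    flood_incidents: list of (lat, lon, datetime) tuples for known floods.
    The ~1 degree / 7 day rejection window is an illustrative choice.
    """
    rng = random.Random(seed)
    start = datetime(2010, 1, 1)  # assumed decade-long study window
    negatives = []
    while len(negatives) < n:
        lat = rng.uniform(-60.0, 70.0)   # roughly the populated latitudes
        lon = rng.uniform(-180.0, 180.0)
        date = start + timedelta(days=rng.randrange(365 * 10))
        # Reject candidates near a recorded flood in both space and time
        near_flood = any(
            abs(lat - f_lat) < 1.0 and abs(lon - f_lon) < 1.0
            and abs((date - f_date).days) < 7
            for f_lat, f_lon, f_date in flood_incidents
        )
        if not near_flood:
            negatives.append((lat, lon, date))
    return negatives
```

Each sampled point then gets the same feature lookups (precipitation, water-body proximity, etc.) as the positive examples.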
We used the scikit-learn library to create our KNN, SVC, and Random Forest models, then trained them on the dataset we created. Our Random Forest model obtained the highest test accuracy at 97%, with a recall (detection rate) of 97.6%. This means it correctly forecasted 97.6% of the floods in the test set.
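A minimal sketch of this training setup is below, with synthetic data standing in for our real flood dataset (the feature columns mirror the ones we collected, and the toy label rule is purely illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, recall_score

rng = np.random.default_rng(42)
n = 2000
# Columns stand in for: precipitation, dam proximity, forest loss,
# road presence, rock type, water-body proximity
X = rng.random((n, 6))
# Toy label rule: heavy rain close to a water body -> flood
y = ((X[:, 0] > 0.6) & (X[:, 5] < 0.4)).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)
pred = model.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred))
print("recall:  ", recall_score(y_test, pred))
```

On the real dataset, recall mattered most to us: a missed flood (false negative) is far more costly than a false alarm.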
We used the Firebase Realtime Database to store previous floods within users' local areas, along with a global database for exploring recent floods worldwide. Users aren't limited to their local area; they can check the flood status of other areas through a simple search.
The Android app was created in Android Studio; it uses the user's current location to display relevant information and monitors the Firebase database for flood warnings in their city. The website was created with React and also uses Firebase to display flood data and warnings. The Google Maps API, combined with our Realtime Database, powers an interactive global map showing floods happening across the world in real-time.
Challenges we ran into
One of the largest challenges was collecting data for our models to train on. As mentioned, there aren't publicly accessible flood datasets containing weather or terrain-related information at the time of the flood. So we had to research which features were indicators of floods, then determine which were feasible to collect within our time constraints.
Another issue was the data collection itself, because we had to deal with large amounts of data - over 150 GB! Processing it was difficult and took up most of our time in this project. We had multiple scripts running simultaneously for hours to collect the data for 10,000 incidences. Then we had to process this data and compile it together into our final dataset of 10,000 flood and non-flood incidences.
Accomplishments that we're proud of
Despite data collection being the hardest part of the project, we are proud to say we achieved approximately 97% testing accuracy in predicting floods. This is likely because our dataset is so extensive in both features and flood incidents.
We accomplished multiple tasks in a limited time period (warranting a total lack of sleep):
- Dataset compilation
- Model creation
- Web App
- Mobile App
We were also able to apply algorithms from our Computer Science class, such as binary search, to find the closest water body to a flood incident given a list of water bodies sorted by latitude/longitude. To compute that distance, we had to dust off math we thought we would never need again, like the Pythagorean theorem, to approximate the distance between two points given their latitudes and longitudes.
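The nearest-water-body lookup can be sketched as follows. The helper names and the candidate-band width are illustrative assumptions; the distance uses a flat-earth (Pythagorean) approximation over a latitude-sorted list, as described above:

```python
import bisect
import math

EARTH_RADIUS_KM = 6371.0

def approx_distance_km(lat1, lon1, lat2, lon2):
    """Pythagorean (flat-earth) approximation; fine for short distances."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)  # shrink lon by latitude
    dy = math.radians(lat2 - lat1)
    return EARTH_RADIUS_KM * math.hypot(dx, dy)

def nearest_water_body(lat, lon, bodies):
    """bodies: (lat, lon) tuples sorted by latitude.

    Binary search jumps to the latitude neighborhood, then only a band of
    nearby candidates is checked instead of scanning the whole list.
    """
    i = bisect.bisect_left(bodies, (lat, lon))
    lo, hi = max(0, i - 50), min(len(bodies), i + 50)  # assumed band width
    return min(bodies[lo:hi],
               key=lambda b: approx_distance_km(lat, lon, b[0], b[1]))
```

The band-width trade-off: a wider band is slower but safer when many water bodies share similar latitudes at very different longitudes.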
Reading from the Firebase Database typically drops a lot of frames on an Android application, but this time we optimized our reads. Instead of reading every Firebase entry and searching for the one matching the city provided by the user, we set a listener on a reference to that specific city, which cut the frames dropped on the Android app from 90 to 15.
What we learned
- We learned about writing to Firebase from Python.
- We learned more about React and got more experience by working on the web app
- Some of us learned about using Firebase in Android applications.
- We got more experience in data processing. The data we compiled for our features came in all sorts of formats (OSM, XML, TIF, HDF5, ...)
- We got experience with the QGIS desktop app for processing large-scale geographical data
- CSS is worse than Assembly
What's next for GFFS
At the moment, GFFS forecasts floods before they happen, but floods aren't the only natural disaster that takes lives. We want to minimize the impact of natural disasters around the world, and we can work toward that by extending our forecasting approach to more types of natural disasters.


