I came across this idea while digging deeper into the third challenge of the Acorn Talents Hack, which was called Smart Infrastructure.
My regression model predicts the number of homeless people in a given year based on the data I found. These predictions could then be given to the appropriate level of government to inform decisions about the number and location of shelters.
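To make the idea concrete, here is a minimal sketch of a year-to-count regression, assuming a simple straight-line (ordinary least-squares) fit. The function names `fit_line` and `predict` are hypothetical, and the numbers are placeholders for illustration only, not real counts or the exact model used in this project.

```python
def fit_line(years, counts):
    """Ordinary least-squares fit of counts = slope * year + intercept."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    # Spread of the years, and how years and counts vary together.
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(year, slope, intercept):
    """Predicted count for a given year from the fitted line."""
    return slope * year + intercept


# Placeholder values for illustration only, NOT real homelessness counts.
years = [2013, 2014, 2015, 2016, 2017]
counts = [100, 110, 120, 130, 140]

slope, intercept = fit_line(years, counts)
forecast = predict(2018, slope, intercept)
print(f"slope per year: {slope:.1f}, forecast for 2018: {forecast:.0f}")
```

A straight line is the simplest choice for a sketch like this; a real data set may call for a different model if the trend is clearly not linear.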
I first performed data mining and profiling on data from government institutions and other reliable sources found online. After collecting the data, I cleaned and organized it so that it more accurately depicted trends, which I could later confirm with the help of a regression model.
Finding data on homeless people was a big struggle, since it is generally neither a) recorded nor b) released to the public. Additionally, any data that I did encounter was either in a PDF or displayed as graphs in a PDF, both of which are equally hard to read.
Despite the lack of data, I pushed forward and found some reliable data that I could draw trends from and feed into a regression model. Because there is almost no data on the homeless population, I had to write a script that generates 'fake', but not random, data that closely aligns with the figures in numerous articles about homeless populations.
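One way such 'fake', but not random, data could be produced is by filling the gaps between the few years that do have a reported figure. The sketch below assumes linear interpolation between those anchor years. It is not necessarily how the original script worked, the function name `interpolate_counts` is hypothetical, and the anchor values are placeholders rather than figures from any article.

```python
def interpolate_counts(anchors):
    """Fill every year between known anchor years by linear interpolation.

    anchors: dict mapping year -> reported count (hypothetical inputs).
    Returns a dict with one entry per year from the first to the last anchor.
    """
    years = sorted(anchors)
    filled = {}
    for start, end in zip(years, years[1:]):
        # Constant yearly change between two neighbouring anchors.
        step = (anchors[end] - anchors[start]) / (end - start)
        for year in range(start, end):
            filled[year] = round(anchors[start] + step * (year - start))
    # The final anchor year is not covered by the loop above.
    filled[years[-1]] = anchors[years[-1]]
    return filled


# Placeholder anchors for illustration only, NOT real reported counts.
anchors = {2010: 1000, 2014: 1200, 2016: 1500}
series = interpolate_counts(anchors)
for year in sorted(series):
    print(year, series[year])
```

Because the output is fully determined by the anchors, the generated series is reproducible and always passes through the reported figures, which is what separates it from random data.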
I gained a lot of insight into what data science actually is, and I learned a lot about how to handle the challenges that data scientists and analysts face on a day-to-day basis.
Upcoming steps include building an elegant and robust full-stack website where users can view the data I found and the predictions I made. Additionally, I would like to post reports based on my findings to this website in order to inform the city of Toronto.