<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:cc="http://cyber.law.harvard.edu/rss/creativeCommonsRssModule.html">
    <channel>
        <title><![CDATA[Stories by Jessica Ayodele on Medium]]></title>
        <description><![CDATA[Stories by Jessica Ayodele on Medium]]></description>
        <link>https://medium.com/@jess-analytics?source=rss-77c6fba84503------2</link>
        <image>
            <url>https://cdn-images-1.medium.com/fit/c/150/150/1*LKu0c1v_S8vEUrLuHWh3lg.png</url>
            <title>Stories by Jessica Ayodele on Medium</title>
            <link>https://medium.com/@jess-analytics?source=rss-77c6fba84503------2</link>
        </image>
        <generator>Medium</generator>
        <lastBuildDate>Tue, 19 May 2026 01:10:40 GMT</lastBuildDate>
        <atom:link href="https://medium.com/@jess-analytics/feed" rel="self" type="application/rss+xml"/>
        <webMaster><![CDATA[yourfriends@medium.com]]></webMaster>
        <atom:link href="http://medium.superfeedr.com" rel="hub"/>
        <item>
            <title><![CDATA[Data Analyst Interview Questions Pt. 1]]></title>
            <description><![CDATA[<div class="medium-feed-item"><p class="medium-feed-image"><a href="https://jess-analytics.medium.com/data-analyst-interview-questions-pt-1-fe7e1cf397d9?source=rss-77c6fba84503------2"><img src="https://cdn-images-1.medium.com/max/2600/0*ZyACuo4TQLnjBUvw" width="5760"></a></p><p class="medium-feed-snippet">40+ Questions I&#x2019;ve been asked in interviews since 2021</p><p class="medium-feed-link"><a href="https://jess-analytics.medium.com/data-analyst-interview-questions-pt-1-fe7e1cf397d9?source=rss-77c6fba84503------2">Continue reading on Medium »</a></p></div>]]></description>
            <link>https://jess-analytics.medium.com/data-analyst-interview-questions-pt-1-fe7e1cf397d9?source=rss-77c6fba84503------2</link>
            <guid isPermaLink="false">https://medium.com/p/fe7e1cf397d9</guid>
            <category><![CDATA[excel]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[sql]]></category>
            <category><![CDATA[data-analytics]]></category>
            <category><![CDATA[data-interviews]]></category>
            <dc:creator><![CDATA[Jessica Ayodele]]></dc:creator>
            <pubDate>Sun, 06 Aug 2023 14:25:28 GMT</pubDate>
            <atom:updated>2023-08-06T14:25:28.864Z</atom:updated>
        </item>
        <item>
            <title><![CDATA[15 Volunteer Organizations to gain Tech and Data skills]]></title>
            <description><![CDATA[<div class="medium-feed-item"><p class="medium-feed-image"><a href="https://jess-analytics.medium.com/15-volunteer-organizations-to-gain-tech-and-data-skills-bf6a068514b1?source=rss-77c6fba84503------2"><img src="https://cdn-images-1.medium.com/max/2600/0*zPSMhG-gjDiYi7WF" width="6000"></a></p><p class="medium-feed-snippet">Are you looking to gain real world Tech experience while learning and before landing a job?</p><p class="medium-feed-link"><a href="https://jess-analytics.medium.com/15-volunteer-organizations-to-gain-tech-and-data-skills-bf6a068514b1?source=rss-77c6fba84503------2">Continue reading on Medium »</a></p></div>]]></description>
            <link>https://jess-analytics.medium.com/15-volunteer-organizations-to-gain-tech-and-data-skills-bf6a068514b1?source=rss-77c6fba84503------2</link>
            <guid isPermaLink="false">https://medium.com/p/bf6a068514b1</guid>
            <category><![CDATA[ui]]></category>
            <category><![CDATA[data-analytics]]></category>
            <category><![CDATA[software-development]]></category>
            <category><![CDATA[digital-marketing]]></category>
            <category><![CDATA[data-science]]></category>
            <dc:creator><![CDATA[Jessica Ayodele]]></dc:creator>
            <pubDate>Sun, 18 Sep 2022 19:46:21 GMT</pubDate>
            <atom:updated>2022-09-18T20:33:52.232Z</atom:updated>
        </item>
        <item>
            <title><![CDATA[Analysis of New York City Motor Vehicles Collisions]]></title>
            <link>https://medium.com/data-science/analysis-of-new-york-city-motor-vehicles-collisions-927da110dfc7?source=rss-77c6fba84503------2</link>
            <guid isPermaLink="false">https://medium.com/p/927da110dfc7</guid>
            <category><![CDATA[data-science]]></category>
            <category><![CDATA[editors-pick]]></category>
            <category><![CDATA[hands-on-tutorials]]></category>
            <category><![CDATA[tableau]]></category>
            <category><![CDATA[data-analysis]]></category>
            <dc:creator><![CDATA[Jessica Ayodele]]></dc:creator>
            <pubDate>Thu, 01 Jul 2021 05:26:05 GMT</pubDate>
            <atom:updated>2021-07-02T19:33:01.964Z</atom:updated>
            <content:encoded><![CDATA[<h4><a href="https://towardsdatascience.com/tagged/hands-on-tutorials">Hands-on Tutorials</a></h4><h4><em>A Data Analyst interview case study using Google BigQuery and Tableau</em></h4><p>In my last article, I spoke about my transition into Data Analytics and how I recently landed a full-time Data Analyst position. Throughout the month of April ’21, I was breezing in and out of interviews with various North American companies. For some of these companies, I had to take Excel, SQL, or Python tests, while a few others had me work on case studies. In this article, I will walk you through one such case study, which I passed, and my approach to tackling the problem.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*mC7mXxomdLjIOutV" /><figcaption>Photo by <a href="https://unsplash.com/@withluke?utm_source=medium&amp;utm_medium=referral">Luke Stackpoole</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><h3>The Task</h3><p>First, case studies are a way for companies to test core skills before considering you for advanced interview stages. For this case study, I was tasked with analysing the New York City Motor Vehicles Collision dataset in Google BigQuery from Jan 2014 to Dec 2017 and providing recommendations to reduce the occurrence of accidents in Brooklyn, a borough in New York<em>. 
</em>The entire dataset currently has over 1.7 million records from 2012 to date and can be accessed <a href="https://console.cloud.google.com/bigquery?project=brooklyn-collisions-311918&amp;ws=!1m5!1m4!4m3!1sbigquery-public-data!2snew_york_mv_collisions!3snypd_mv_collisions&amp;d=new_york_mv_collisions&amp;p=bigquery-public-data&amp;t=nypd_mv_collisions&amp;page=table">here</a>.</p><p><em>Side note: Google BigQuery has several public datasets that are updated periodically and can be used to build projects for your portfolio.</em></p><h3>My Approach</h3><p>My first instinct was to search the web for articles related to the task because “there’s nothing new under the sun”. I found earlier articles that were useful in developing my approach. A summary of my approach is shown in the image below.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*gF3899DOUP-eIQZNJs-NCg.png" /><figcaption>My Case Study Approach (Image by Author)</figcaption></figure><h3>First Steps</h3><p>Here are a few tips and steps you can use when approaching future case studies.</p><ul><li><strong><em>Understanding the task:</em></strong> This matters for any case study because it keeps your analysis from going off-point. It is important to follow the instructions first before going the extra mile. In this case study, I almost missed the instruction in the brief to analyse only 2014–2017 data.</li><li><strong>Prepping the Data</strong>: Identifying the primary key and checking for duplicates and null values should be a no-brainer when exploring your dataset. Also look out for fields that might be relevant to your analysis, so you do not end up importing irrelevant fields into your Business Intelligence tool. This is where SQL came in handy.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/724/1*pkaATK0p-rAoIpaNXOIblQ.png" /><figcaption>Checking for Duplicates in Google BigQuery (P.S. 
“Query returned no results” is a good thing in this case.)</figcaption></figure><h3>Deep Dive</h3><p>To analyse the dataset, I used <em>Tableau Public</em> for two reasons: I wanted to create an interactive dashboard, and Tableau was one of the skills mentioned in the job description. Exploring the dataset gave me ideas for key features to analyse in depth. Some are highlighted below while others can be explored in the final dashboard.</p><ul><li><em>Collision Analysis:</em> This revealed the top causes of collisions and fatalities. We can see here that most fatalities were caused by <em>Driver Inattention/Distraction.</em></li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*FqSbwlHuVP3WGLSNV6X28A.png" /><figcaption>Top 7 Collision Contributing factors by fatalities (Image by Author using Tableau)</figcaption></figure><ul><li><em>Time Series Analysis:</em> This revealed which times of day and days of the week had the most collisions. We can see from the chart below that most collisions occurred during rush hour <em>(4PM–5PM)</em>. We also see significant numbers in the early hours of the day.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/826/1*t-NaP3KK9hWr6i_rdad3fQ.png" /><figcaption>Collisions Time Series Analysis (Image by Author using Tableau)</figcaption></figure><ul><li><em>Fatality Analysis:</em> This revealed that pedestrians were killed more often than other road users whenever collisions occurred.</li></ul><figure><img alt="" src="https://cdn-images-1.medium.com/max/698/1*zq8T7X-ZcT2SZKS8G78WCQ.png" /><figcaption>Total Annual Fatalities by road users (Image by Author using Tableau)</figcaption></figure><h3>Bringing it all together</h3><p>Using the insights gathered from my analysis, I prepared a slide deck to provide recommendations. 
An additional tip is to ensure any recommendation you provide is backed up by your analysis<em>, not prior knowledge.</em> Also, most companies give anywhere from a few hours to 5 business days to complete a case study. If you have more time, try not to rush through it.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*e1AOA1gFeZa0rxlp0HhBug.png" /><figcaption>Recommendations provided based on my analysis (Image by Author)</figcaption></figure><p>The final submission for this case study was a <em>slide deck</em> and <em>dashboard</em>. The latter was an add-on because this was a major tech company, and they loved it :). A preview of the interactive dashboard is shown below. I designed the background in Figma, and the rest of the magic happened in Tableau.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/1*zoNPwW36JxvPPE-3f3KYmg.png" /><figcaption>Final Tableau dashboard (Image by Author)</figcaption></figure><h3>Relevant Links</h3><ul><li><a href="https://tabsoft.co/3qGoC2B">Tableau Dashboard</a></li><li><a href="https://jess-analytics.com/resources">Final Slide deck</a></li><li><a href="https://www.linkedin.com/in/jessicauwoghiren">LinkedIn Profile</a></li></ul><hr><p><a href="https://medium.com/data-science/analysis-of-new-york-city-motor-vehicles-collisions-927da110dfc7">Analysis of New York City Motor Vehicles Collisions</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium.</p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[BRIDGERTON: An analysis of Netflix’s most-streamed TV series]]></title>
            <description><![CDATA[<div class="medium-feed-item"><p class="medium-feed-image"><a href="https://medium.com/data-science/bridgerton-an-analysis-of-netflixs-most-streamed-tv-series-c4c9e2926397?source=rss-77c6fba84503------2"><img src="https://cdn-images-1.medium.com/max/2600/0*tG8IIorra8Sq0IfO" width="5989"></a></p><p class="medium-feed-snippet">An analysis of over 300,000 tweets on the Bridgerton TV series using NLP techniques in Python &amp; Tableau</p><p class="medium-feed-link"><a href="https://medium.com/data-science/bridgerton-an-analysis-of-netflixs-most-streamed-tv-series-c4c9e2926397?source=rss-77c6fba84503------2">Continue reading on TDS Archive »</a></p></div>]]></description>
            <link>https://medium.com/data-science/bridgerton-an-analysis-of-netflixs-most-streamed-tv-series-c4c9e2926397?source=rss-77c6fba84503------2</link>
            <guid isPermaLink="false">https://medium.com/p/c4c9e2926397</guid>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[tableau]]></category>
            <category><![CDATA[nlp]]></category>
            <category><![CDATA[bridgerton]]></category>
            <category><![CDATA[python]]></category>
            <dc:creator><![CDATA[Jessica Ayodele]]></dc:creator>
            <pubDate>Fri, 29 Jan 2021 20:41:46 GMT</pubDate>
            <atom:updated>2021-04-08T04:43:11.338Z</atom:updated>
        </item>
        <item>
            <title><![CDATA[The Year 2020: Analyzing Twitter Users’ Reflections using NLP]]></title>
            <description><![CDATA[<div class="medium-feed-item"><p class="medium-feed-image"><a href="https://medium.com/data-science/the-year-2020-analyzing-twitter-users-reflections-using-nlp-3afdfdf2f68e?source=rss-77c6fba84503------2"><img src="https://cdn-images-1.medium.com/max/2600/0*LPztxZpQxpqxAYA8" width="7952"></a></p><p class="medium-feed-snippet">A Sentiment Analysis Project using Python and Tableau</p><p class="medium-feed-link"><a href="https://medium.com/data-science/the-year-2020-analyzing-twitter-users-reflections-using-nlp-3afdfdf2f68e?source=rss-77c6fba84503------2">Continue reading on TDS Archive »</a></p></div>]]></description>
            <link>https://medium.com/data-science/the-year-2020-analyzing-twitter-users-reflections-using-nlp-3afdfdf2f68e?source=rss-77c6fba84503------2</link>
            <guid isPermaLink="false">https://medium.com/p/3afdfdf2f68e</guid>
            <category><![CDATA[nlp]]></category>
            <category><![CDATA[python]]></category>
            <category><![CDATA[tableau]]></category>
            <category><![CDATA[sentiment-analysis]]></category>
            <category><![CDATA[machine-learning]]></category>
            <dc:creator><![CDATA[Jessica Ayodele]]></dc:creator>
            <pubDate>Wed, 30 Dec 2020 18:50:26 GMT</pubDate>
            <atom:updated>2021-04-08T04:42:38.015Z</atom:updated>
        </item>
        <item>
            <title><![CDATA[Analysis of Toronto Neighbourhoods using Machine Learning]]></title>
            <link>https://medium.com/data-science/analysis-of-toronto-neighbourhoods-using-machine-learning-291b942578f2?source=rss-77c6fba84503------2</link>
            <guid isPermaLink="false">https://medium.com/p/291b942578f2</guid>
            <category><![CDATA[data-analytics]]></category>
            <category><![CDATA[data-visualization]]></category>
            <category><![CDATA[clustering]]></category>
            <category><![CDATA[machine-learning]]></category>
            <category><![CDATA[data-science]]></category>
            <dc:creator><![CDATA[Jessica Ayodele]]></dc:creator>
            <pubDate>Fri, 27 Nov 2020 01:19:46 GMT</pubDate>
            <atom:updated>2021-01-12T08:02:45.357Z</atom:updated>
            <content:encoded><![CDATA[<h4>A New Immigrant’s Guide to settling in the City of Toronto</h4><h3>Introduction</h3><p>When I began this project, I came across a news article that read <em>“Canada to welcome 1.2 million immigrants by 2023” </em><a href="https://www.aljazeera.com/news/2020/10/30/canada-aims-to-bring-in-over-1-2-immigrants-over-next-3-years">[1]</a>. This made me excited for the millions of people looking for a pathway to Canada since I recently relocated here. A 2020 U.S. News ranking placed Canada as the 2nd best country in the world, so it is not a surprise that every year, thousands of people choose to migrate here <a href="https://www.usnews.com/news/best-countries/best-immigrants">[2]</a>. Aside from having a stable economy and many growth opportunities, Canada has offered many immigrants a new home. In 2019, Canada opened its borders to <strong>341,000</strong> people with 35% of them settling in the City of Toronto <a href="https://www.cicnews.com/2020/02/canada-broke-another-record-by-welcoming-341000-immigrants-in-2019-0213697.html?_gl=1*7xjcrn*_ga*YW1wLTVFcnozdmhPVFpfTmtraWpTWnRaSG5tRmdGQjQ4RS1jQWxOS2Frc1ZTVVVGVDlqdGg5MWZneVE1Tzk0eW9OWUY.#gs.lqzc9s">[3]</a>. Hence, it is safe to say that the City of Toronto is a top destination for most new immigrants.</p><h3><strong>Problem Statement</strong></h3><p>The City of Toronto has 140 neighbourhoods spanning 6 districts. As a new immigrant, a vital question to answer is <em>“What neighbourhood do I settle in?”. 
</em>The aim of this project is to group Toronto neighbourhoods in order of desirability using Machine Learning and Data Visualization techniques.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/1024/0*3MG2Lgak4WMM9lcS" /><figcaption>Photo by <a href="https://unsplash.com/@matthewlai?utm_source=medium&amp;utm_medium=referral">Matthew Lai</a> on <a href="https://unsplash.com?utm_source=medium&amp;utm_medium=referral">Unsplash</a></figcaption></figure><h3><strong>Basis</strong></h3><p>There are several factors to consider when settling down in any location. For this project, I performed my analysis using the following criteria:</p><ul><li>Total number of Essential Venues in each neighbourhood</li><li>Primary and Secondary Benchmarks: Primary benchmarks considered were Unemployment rate, Crime rate and COVID-19 rates for each neighbourhood while the Secondary benchmark was housing price for a one-bedroom apartment in each neighbourhood.</li></ul><h3>Data Description</h3><p>Most of the datasets were obtained from the City of Toronto Open Data Portal. Other datasets were scraped from the web. 
They include:</p><ul><li><a href="http://open.toronto.ca/dataset/neighbourhoods/"><em>Neighbourhood Boundaries Map (GeoJSON)</em></a>: This file contains standard geospatial data and was critical for map visualizations</li><li><a href="https://www.toronto.ca/home/covid-19/covid-19-latest-city-of-toronto-news/covid-19-status-of-cases-in-toronto/"><em>COVID-19 dataset for Toronto</em></a><em>: Total cases </em>as of October 22nd, 2020</li><li><a href="https://open.toronto.ca/dataset/neighbourhood-crime-rates/"><em>Crime rates dataset for Toronto Neighbourhoods</em></a><em>: </em>for the Years 2014 to 2019</li><li><a href="https://open.toronto.ca/dataset/neighbourhood-profiles/"><em>Neighbourhood Profiles/Census dataset</em></a><em>:</em> Based on data collected by Statistics Canada in the last Census campaign held in 2016</li><li><a href="https://www.zumper.com/rent-research/toronto-on"><em>Housing rental prices</em></a><em>:</em> Contains median rental prices per neighbourhood</li></ul><h3>Methodology</h3><p>The Python libraries used in this project were NumPy, pandas, GeoPandas, Plotly, scikit-learn, Requests and GeoPy. 
All visualizations were built with the Plotly library because its charts are interactive and require few lines of code.</p><p>The GitHub repo for this project can be found <a href="https://github.com/jess-data/Coursera_Capstone">here</a> while the Jupyter notebook can be viewed <a href="https://nbviewer.jupyter.org/github/jess-data/Coursera_Capstone/blob/main/Analysis%20of%20Toronto%20Neighbourhoods%20using%20ML.ipynb">here</a>.</p><p>The main steps for this project are summarized in the flowchart below:</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/679/1*cKSHGISwjLU0-3TS08ZT_Q.png" /><figcaption>Project Flowchart</figcaption></figure><h3><strong>Data Exploration</strong></h3><blockquote>The Interactive Charts and Maps in the rest of this article are best viewed using a <strong>Computer or tablet</strong></blockquote><h4><strong>Exploring Venues in City of Toronto</strong></h4><p>First, I obtained the top 100 venues in each neighbourhood by sending requests to the Foursquare API. A total of 2118 venues and 291 unique venue categories were returned.</p><p>Using <em>One-hot encoding</em>, I converted the venue categories to numerical values for each neighbourhood to carry out further analysis. The total number of essential venues such as restaurants, schools, train stations, malls etc. was computed for each neighbourhood. From the Sunburst chart below, we can see all 6 Toronto districts and their respective neighbourhoods. The neighbourhoods are displayed in proportion to the total number of essential venues present in them. 
<strong>Click/Tap</strong> on the chart to explore further.</p><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fplotly.com%2F%7Ejess-data%2F43.embed%3Fautosize%3Dtrue&amp;display_name=Plotly&amp;url=https%3A%2F%2Fchart-studio.plotly.com%2F%7Ejess-data%2F43%2F&amp;image=https%3A%2F%2Fchart-studio.plotly.com%2Fstatic%2Fwebapp%2Fimages%2Fplotly-logo.8d56a320dbb8.png&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;type=text%2Fhtml&amp;schema=plotly" width="600" height="400" frameborder="0" scrolling="no"><a href="https://medium.com/media/5296cbaf173a147962b6e8c731bf8886/href">https://medium.com/media/5296cbaf173a147962b6e8c731bf8886/href</a></iframe><p><strong><em>Quick Facts Check: </em></strong><em>There are more coffee shops and restaurants in Toronto than there are neighbourhoods, with over 900 restaurants across the city</em></p><h4><strong>Exploring Toronto Neighbourhoods using Primary benchmarks</strong></h4><p>After cleaning up the individual datasets for the <em>primary benchmarks</em>, I merged them into one Pandas dataframe as shown below.</p><figure><img alt="" src="https://cdn-images-1.medium.com/max/565/1*YwpTk_OrsHRQTa1lRuiKuA.png" /></figure><p>The dataframe was then converted to the interactive bubble chart below. Crime rates are represented by the <em>bubble size</em>. 
<strong>Click/Tap </strong>on the legend on the Bubble chart to isolate a district and explore further.</p><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fplotly.com%2F%7Ejess-data%2F55.embed%3Fautosize%3Dtrue&amp;display_name=Plotly&amp;url=https%3A%2F%2Fchart-studio.plotly.com%2F%7Ejess-data%2F55%2F&amp;image=https%3A%2F%2Fchart-studio.plotly.com%2Fstatic%2Fwebapp%2Fimages%2Fplotly-logo.8d56a320dbb8.png&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;type=text%2Fhtml&amp;schema=plotly" width="600" height="400" frameborder="0" scrolling="no"><a href="https://medium.com/media/b7e0e9ac209ebb0c9d95c79cfe3ed319/href">https://medium.com/media/b7e0e9ac209ebb0c9d95c79cfe3ed319/href</a></iframe><p><strong><em>Quick Stats Check:</em></strong><em> The average unemployment rate is 8.3%. The average number of crimes committed per 100,000 people is 1378, and 1 in 100 persons had contracted COVID-19 as of October 2020.</em></p><h3><strong>Machine Learning</strong></h3><h4><strong>Clustering Toronto Neighbourhoods</strong></h4><p>A clustering algorithm,<em> “k-means”</em>, was used to group the neighbourhoods in order of desirability for new immigrants. <em>k-means</em> is an Unsupervised Machine Learning algorithm that groups data points so that neighbourhoods with similar values fall into the same cluster.</p><h4><strong>Steps for Clustering Toronto Neighbourhoods</strong></h4><p>The steps below were used to segment the neighbourhoods:</p><ol><li>Determine the optimum number of clusters using the “Elbow” method</li><li>Group neighbourhoods using <em>total number of essential venues</em>. These <em>essential venues</em> included places such as Schools, Train stations, Restaurants, Banks, Shopping Malls, Bus Stations etc. This resulted in 3 distinct neighbourhood clusters and the outcome was represented in the final Choropleth map as “Venue Density”</li><li>Group neighbourhoods using the <em>primary benchmarks</em>: Unemployment, Crime and COVID-19 rates. 
The result of this clustering attempt is shown below</li><li>Group the neighbourhoods in the “Low” cluster from Step 3 using the <em>secondary benchmark i.e. Housing prices</em></li></ol><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fplotly.com%2F%7Ejess-data%2F53.embed%3Fautosize%3Dtrue&amp;display_name=Plotly&amp;url=https%3A%2F%2Fchart-studio.plotly.com%2F%7Ejess-data%2F53%2F&amp;image=https%3A%2F%2Fchart-studio.plotly.com%2Fstatic%2Fwebapp%2Fimages%2Fplotly-logo.8d56a320dbb8.png&amp;key=a19fcc184b9711e1b4764040d3dc5c07&amp;type=text%2Fhtml&amp;schema=plotly" width="600" height="400" frameborder="0" scrolling="no"><a href="https://medium.com/media/acb88ea31bc065f81e868dad53a76ffa/href">https://medium.com/media/acb88ea31bc065f81e868dad53a76ffa/href</a></iframe><h3>Results</h3><p>The outcome of the clustering steps above was used to rank the neighbourhoods into four categories. Neighbourhoods that belonged to the <em>Mid &amp; High</em> clusters in Step 3 were named as the <strong><em>Least desirable</em></strong> while those with <em>Low, Mid and High</em> housing prices in Step 4 were named as <strong><em>Most Desirable, Desirable and Semi-Desirable </em></strong>respectively. 
The final neighbourhood desirability index was made into a choropleth map below using Plotly library.</p><iframe src="https://cdn.embedly.com/widgets/media.html?src=https%3A%2F%2Fdatapane.com%2Fu%2Fosas%2Freports%2Fmy-plot%2Fembed%2F&amp;display_name=Datapane&amp;url=https%3A%2F%2Fdatapane.com%2Fu%2Fosas%2Freports%2Fmy-plot%2F&amp;image=https%3A%2F%2Fstorage.googleapis.com%2Fdatapane-files-prod%2Fpublic%2F10254e35-47bb-474c-b0ad-e3d86e9ec7f2.png%3FExpires%3D1606801219%26GoogleAccessId%3Dgcs-files%2540datapane-env-prod.iam.gserviceaccount.com%26Signature%3DlzVWn952rlKFo3lojg1CI06PNUazhz0RorhFNsKo1wcHQN7BCD2QcZq%252BkNNtdVZprpZQBUnDOHewrhzYzHRDqQzB%252FRWrg%252BVqh7SFQRuvQmx5PWGUuGXWcNpe4HBXoYv9O6njekcGaxb68Liqj8ohbHZx9CZ4GFWBHMDknOzpH8%252Bfw1%252FQe55rE21RkRfiBdcRiVW34Je05AglYlZAXEqkxRbZdcEf8HCrdumu93JE6tO1fBigVusExxUhBPehXKp1IB1dFD35vGICpLC8qYIGIKIC8fOD9wfOmCf04siQZSVVehjn9f%252BmFwUV5w5nmC%252FPFw%252BP7Or%252F7tMjDTP8ZSQRSg%253D%253D&amp;key=d04bfffea46d4aeda930ec88cc64b87c&amp;type=text%2Fhtml&amp;schema=datapane" width="800" height="625" frameborder="0" scrolling="no"><a href="https://medium.com/media/7e0631382ca5a6caf804470b808a1a88/href">https://medium.com/media/7e0631382ca5a6caf804470b808a1a88/href</a></iframe><h3><strong>Conclusion</strong></h3><p>From the results, we can make the following deductions:</p><ul><li>Only <strong>10%</strong> of Toronto neighbourhoods have high venue density with <em>Mount Pleasant West, Church-Yonge Corridor, Yonge-St. Clair and Bay Street Corridor</em> taking the lead</li><li><strong>Most Desirable Neighbourhoods:</strong> Consider neighbourhoods in <em>Scarborough area </em>if searching for less pricey apartments. Other neighbourhoods to consider are <em>Banbury Don-Mills </em>and <em>Annex </em>in North York and York districts respectively</li><li><strong>Looking for Entertainment:</strong> Look no further than <em>Downtown Toronto</em> which is also known as the Entertainment District. 
This area was classified <strong><em>Semi-desirable</em></strong> owing to the higher housing prices. However, if you’re looking for fun and have the $$$, it is a great place to settle in</li><li><strong>Presence of Essential Venues:</strong> If you are keen on proximity to essential venues, the neighbourhoods to consider, which are also in the <strong><em>Desirable</em></strong> category, are <em>Mount Pleasant West, Yonge-St. Clair and Greenwood-Coxwell</em></li><li><strong>Avoid if you Can:</strong> Most neighbourhoods in the North-Western region of Toronto i.e. the <em>Etobicoke</em> district were classified as the <strong><em>Least desirable</em></strong> due to the high crime and COVID-19 rates in those neighbourhoods. It is also interesting that this region is home to Jane and Finch, which is a <em>“red” </em>neighbourhood.</li></ul><h3><strong>References</strong></h3><p>All references used for this project have been hyperlinked within the write-up. For the complete Python code in the Jupyter Notebook, the GitHub repo with the dataset, and my social media pages, please use the links below:</p><ul><li><a href="https://nbviewer.jupyter.org/github/jess-data/Coursera_Capstone/blob/main/Analysis%20of%20Toronto%20Neighbourhoods%20using%20ML.ipynb">Jupyter Notebook</a></li><li><a href="https://github.com/jess-data/Coursera_Capstone/blob/main/Report%20for%20Analysis%20of%20Toronto%20neighbourhoods.pdf">Full Report</a></li><li><a href="https://github.com/jess-data/Analysis-of-Toronto-Neighbourhoods">GitHub Repository</a></li><li><a href="https://jess-analytics.com/">Personal Website</a></li><li><a href="https://www.linkedin.com/in/jessicauwoghiren/">LinkedIn</a></li><li><a href="https://twitter.com/jessica_xls">Twitter</a></li><li><a href="https://linktr.ee/DataTechSpace">DataTech Space Community</a></li></ul><hr><p><a href="https://medium.com/data-science/analysis-of-toronto-neighbourhoods-using-machine-learning-291b942578f2">Analysis of Toronto Neighbourhoods using Machine Learning</a> was originally published in <a href="https://medium.com/data-science">TDS Archive</a> on Medium.</p>]]></content:encoded>
        </item>
    </channel>
</rss>