{"id":36261,"date":"2022-10-31T08:52:54","date_gmt":"2022-10-31T08:52:54","guid":{"rendered":"https:\/\/www.askpython.com\/?p=36261"},"modified":"2022-10-31T08:52:56","modified_gmt":"2022-10-31T08:52:56","slug":"predictive-model-in-python","status":"publish","type":"post","link":"https:\/\/www.askpython.com\/python\/examples\/predictive-model-in-python","title":{"rendered":"Building a Predictive Model in Python"},"content":{"rendered":"\n<p>Today we are going to learn how to create a predictive model in Python, an essential topic in Machine Learning and Data Science. Before going deeper, we need to understand what predictive analysis is.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What is Predictive Analysis?<\/h2>\n\n\n\n<p>Predictive analysis is a field of Data Science that involves making predictions about future events. We can make predictions about new data or about the coming days, and train a machine to do the same. We use various statistical techniques to analyze present data or observations and predict future outcomes.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Why should we use Predictive Analysis?<\/h2>\n\n\n\n<p>If done correctly, predictive analysis can provide several benefits. Some key reasons for choosing predictive analysis are as follows.<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Immediate Feedback System<ul><li>It helps build a better marketing strategy.
<\/li><li>For a marketing service or any business, we can get an idea of whether people like it, how much they like it, and above all which extra features they want added.<\/li><li>It identifies the current trend.<\/li><\/ul><\/li><li>Optimization<ul><li>We can optimize our predictions as well as the upcoming strategy using predictive analysis.<\/li><li>It is similar to modification.<\/li><li>It involves a comparison between past, present, and upcoming strategies.<\/li><\/ul><\/li><li>Better strategy<ul><li>We end up with a better strategy by using this immediate feedback system and optimization process.<\/li><li>It also provides multiple strategies to choose from.<\/li><li>It leads to better decisions.<\/li><\/ul><\/li><li>Risk Reduction<ul><li>Even when we are not familiar with optimization or do not have a feedback system, we can still use predictive analysis for risk reduction.<\/li><li>It is best suited for newcomers.<\/li><\/ul><\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Predictive Analysis use cases<\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Churn Prevention<ul><li>It allows us to predict whether a person is going to stay with our service or not, and whether he\/she is satisfied or not.<\/li><li>We get an analysis based on customer usage. Using that, we can provide offers and get to know what customers really want.<\/li><\/ul><\/li><li>Quality Assurance<ul><li>We can understand how customers feel about using our service by providing forms, interviews, etc.<\/li><li>We learn what people actually want and how opinions differ from person to person.<\/li><li>We learn which new features need to be added and under what circumstances.<\/li><\/ul><\/li><li>Risk Modelling<ul><li>It allows us to estimate the extent of the risks involved, 
so that we can invest accordingly.<\/li><li>It involves analyzing current strategies and predicting future ones.<\/li><\/ul><\/li><li>Sales Forecasting<ul><li>It shows how sales are going under the present strategies and how they are likely to go in the upcoming days.<\/li><li>It involves analyzing this and creating organized data.<\/li><\/ul><\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Steps involved in Predictive Analysis<\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>Problem Definition<ul><li>It aims to determine what our problem is, that is, the main question we need to resolve and predict for.<\/li><li>Every later step of predictive analysis needs to be based on this problem definition.<\/li><\/ul><\/li><li>Data Gathering<ul><li>We collect data from multiple sources and combine it in order to analyze it and create our model.<\/li><\/ul><\/li><li>Data cleaning<ul><li>We look for missing values and for values that are not important. They need to be removed.<\/li><li>We need to improve the quality of the model by cleaning the data in this way.<\/li><li>We need to remove outliers, that is, values beyond the boundary level.<\/li><\/ul><\/li><li>Data Analysis<ul><li>It involves managing the gathered data.<\/li><li>Managing the data refers to checking whether the data is well organized or not.<\/li><\/ul><\/li><li>Modeling<ul><li>This step involves feeding the finalized, organized data to our machine and training it using the chosen algorithm.<\/li><\/ul><\/li><li>Model Testing<ul><li>We need to test whether the machine is working up to the mark or not.<\/li><li>We need to compare the predicted values with the actual output values.<\/li><li>The comparison is scored within a range of 0 to 1, where 0 refers to 0% and 1 refers to 100%.<\/li><\/ul><\/li><li>Deployment<ul><li>Once our model is created and performs well, that is, it reaches a satisfactory accuracy score, we need to deploy it for market use.<\/li><\/ul><\/li><\/ul>\n\n\n\n<h2 
class=\"wp-block-heading\">Applications of Predictive Analysis<\/h2>\n\n\n\n<p>Predictive analysis can build future projections that help in many areas of business, such as:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Pricing<\/li><li>Demand Planning<\/li><li>Campaign Management<\/li><li>Customer Acquisition<\/li><li>Budgeting and Forecasting<\/li><li>Fraud detection<\/li><li>Promotions<\/li><\/ul>\n\n\n\n<h2 class=\"wp-block-heading\">Creating our Predictive model (Example)<\/h2>\n\n\n\n<p>Let us try a demo of predictive analysis using Google Colab, taking a dataset collected from a banking campaign for a specific offer. We analyze the data to find out whether customers are going to avail of the offer or not, based on a sample of interviews.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table><thead><tr><th>index<\/th><th>sl<\/th><th>age<\/th><th>job<\/th><th>marital_status<\/th><th>education<\/th><th>default<\/th><th>balance<\/th><th>housing<\/th><th>loan<\/th><th>contact<\/th><th>day<\/th><th>month<\/th><th>duration<\/th><th>campaign<\/th><th>pdays<\/th><th>previous<\/th><th>poutcome<\/th><th>y<\/th><\/tr><\/thead><tbody><tr><td>0<\/td><td>0<\/td><td>30<\/td><td>unemployed<\/td><td>married<\/td><td>primary<\/td><td>no<\/td><td>1787<\/td><td>no<\/td><td>no<\/td><td>cellular<\/td><td>19<\/td><td>oct<\/td><td>79<\/td><td>1<\/td><td>A<\/td><td>0<\/td><td>unknown<\/td><td>no<\/td><\/tr><tr><td>1<\/td><td>1<\/td><td>3<\/td><td>services<\/td><td>married<\/td><td>secondary<\/td><td>no<\/td><td>4789<\/td><td>yes<\/td><td>yes<\/td><td>cellular<\/td><td>11<\/td><td>may<\/td><td>220<\/td><td>1<\/td><td>339<\/td><td>4<\/td><td>failure<\/td><td>no<\/td><\/tr><tr><td>2<\/td><td>2<\/td><td>35<\/td><td>management<\/td><td>single<\/td><td>tertiary<\/td><td>no<\/td><td>1350<\/td><td>yes<\/td><td>no<\/td><td>cellular<\/td><td>16<\/td><td>apr<\/td><td>185<\/td><td>1<\/td><td>330<\/td><td>1<\/td><td>failure<\/td><td>no<\/td><\/tr><tr><td>3<\/td><td>3<\/td><td>30<\/td><td>management<\/td>
<td>married<\/td><td>tertiary<\/td><td>no<\/td><td>1476<\/td><td>yes<\/td><td>yes<\/td><td>unknown<\/td><td>3<\/td><td>jun<\/td><td>199<\/td><td>4<\/td><td>4<\/td><td>0<\/td><td>unknown<\/td><td>no<\/td><\/tr><tr><td>4<\/td><td>4<\/td><td>59<\/td><td>blue_collar<\/td><td>married<\/td><td>secondary<\/td><td>no<\/td><td>0<\/td><td>yes<\/td><td>no<\/td><td>unknown<\/td><td>5<\/td><td>may<\/td><td>226<\/td><td>1<\/td><td>A<\/td><td>0<\/td><td>unknown<\/td><td>no<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n#importing modules\nimport pandas as pd\nimport numpy as np\nfrom sklearn.model_selection import train_test_split\nfrom sklearn.metrics import accuracy_score\nfrom sklearn.linear_model import LogisticRegression\n\nimport seaborn as sns\nimport matplotlib.pyplot as plt\n%matplotlib inline\n\ndata = pd.read_csv(&quot;\/dataset.csv&quot;, delimiter = &quot;,&quot;, header = &quot;infer&quot;)\ndata.head()\n<\/pre><\/div>\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"173\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-1-1024x173.png\" alt=\"\" class=\"wp-image-36273\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-1-1024x173.png 1024w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-1-300x51.png 300w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-1-768x130.png 768w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-1-1536x260.png 1536w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-1.png 1594w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nsns.pairplot(data)\n<\/pre><\/div>\n\n\n<figure class=\"wp-block-image 
aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"789\" height=\"822\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-2.png\" alt=\"Capture 2\" class=\"wp-image-36275\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-2.png 789w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-2-288x300.png 288w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-2-768x800.png 768w\" sizes=\"auto, (max-width: 789px) 100vw, 789px\" \/><figcaption>Capture 2<\/figcaption><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# numeric_only skips the text columns, which cannot be correlated\ndata.corr(numeric_only = True)\n<\/pre><\/div>\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"880\" height=\"344\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-3.png\" alt=\"Capture 3\" class=\"wp-image-36276\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-3.png 880w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-3-300x117.png 300w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-3-768x300.png 768w\" sizes=\"auto, (max-width: 880px) 100vw, 880px\" \/><figcaption>Capture 3<\/figcaption><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nsns.heatmap(data.corr(numeric_only = True), annot = True)\n<\/pre><\/div>\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"676\" height=\"414\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-4.png\" alt=\"Capture 4\" class=\"wp-image-36277\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-4.png 676w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-4-300x184.png 300w\" sizes=\"auto, (max-width: 676px) 100vw, 676px\" 
\/><figcaption>Capture 4<\/figcaption><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ndata.dtypes\n\n# sl                 int64\n# age                int64\n# job               object\n# marital_status    object\n# education         object\n# default           object\n# balance            int64\n# housing           object\n# loan              object\n# contact           object\n# day                int64\n# month             object\n# duration           int64\n# campaign           int64\n# pdays             object\n# previous           int64\n# poutcome          object\n# y                 object\n# dtype: object\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# one-hot encode the listed categorical columns\ndata_new = pd.get_dummies(data, columns=&#x5B;&#039;marital_status&#039;, &#039;education&#039;, &#039;default&#039;, &#039;housing&#039;, &#039;loan&#039;, &#039;contact&#039;, &#039;month&#039;, &#039;poutcome&#039;])\n# encode the target column as 1 (yes) and 0 (no)\ndata_new&#x5B;&#039;y&#039;] = data_new&#x5B;&#039;y&#039;].map({&#039;yes&#039;: 1, &#039;no&#039;: 0})\ndata_new.dtypes\n<\/pre><\/div>\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"422\" height=\"653\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-5.png\" alt=\"Capture 5\" class=\"wp-image-36280\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-5.png 422w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-5-194x300.png 194w\" sizes=\"auto, (max-width: 422px) 100vw, 422px\" \/><figcaption>Capture 5<\/figcaption><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nprint(data.shape)\n# (5, 18)\ndata.education.unique()\n# array(&#x5B;&#039;primary&#039;, &#039;secondary&#039;, &#039;tertiary&#039;], dtype=object)\npd.crosstab(index = data&#x5B;&quot;education&quot;], columns = 
data&#x5B;&quot;y&quot;])\n<\/pre><\/div>\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"293\" height=\"215\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-6.png\" alt=\"Capture 6\" class=\"wp-image-36281\"\/><figcaption>Capture 6<\/figcaption><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ndata.education.value_counts().plot(kind = &quot;barh&quot;)\n<\/pre><\/div>\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"674\" height=\"352\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-7.png\" alt=\"Capture 7\" class=\"wp-image-36282\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-7.png 674w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-7-300x157.png 300w\" sizes=\"auto, (max-width: 674px) 100vw, 674px\" \/><figcaption>Capture 7<\/figcaption><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ndata_y = pd.DataFrame(data_new&#x5B;&#039;y&#039;])\ndata_x = data_new.drop(&#x5B;&#039;y&#039;], axis = 1)\nprint(data_y.columns)\nprint(data_x.columns)\n<\/pre><\/div>\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img loading=\"lazy\" decoding=\"async\" width=\"1024\" height=\"154\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-8-1024x154.png\" alt=\"Capture 8\" class=\"wp-image-36283\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-8-1024x154.png 1024w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-8-300x45.png 300w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-8-768x115.png 768w, https:\/\/www.askpython.com\/wp-content\/uploads\/2022\/10\/Capture-8.png 1292w\" sizes=\"auto, 
(max-width: 1024px) 100vw, 1024px\" \/><figcaption>Capture 8<\/figcaption><\/figure>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nx_train, x_test, y_train, y_test = train_test_split(data_x, data_y, test_size = 0.3, random_state = 2, stratify = data_y)\nprint(x_train.shape)\nprint(x_test.shape)\nprint(y_train.shape)\nprint(y_test.shape)\n<\/pre><\/div>\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n#OUTPUT FOR THE ABOVE CODE\n(3, 27)\n(2, 27)\n(3, 1)\n(2, 1)\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\">Summary<\/h2>\n\n\n\n<p>Today we covered predictive analysis and tried a demo using a sample dataset. We hope you followed along with our code snippets. You can try other datasets as well. Do visit again for more exciting topics.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Today we are going to learn how to create a predictive model in Python, an essential topic in Machine Learning and Data Science. Before going deeper, we need to understand what predictive analysis is. What is Predictive Analysis? 
[&hellip;]<\/p>\n","protected":false},"author":47,"featured_media":36287,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[],"class_list":["post-36261","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-examples"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/36261","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/users\/47"}],"replies":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/comments?post=36261"}],"version-history":[{"count":0,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/36261\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media\/36287"}],"wp:attachment":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media?parent=36261"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/categories?post=36261"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/tags?post=36261"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}