{"id":12157,"date":"2021-01-30T05:59:00","date_gmt":"2021-01-30T05:59:00","guid":{"rendered":"https:\/\/www.askpython.com\/?p=12157"},"modified":"2021-02-08T08:46:10","modified_gmt":"2021-02-08T08:46:10","slug":"ridge-regression","status":"publish","type":"post","link":"https:\/\/www.askpython.com\/python\/examples\/ridge-regression","title":{"rendered":"Ridge Regression in Python"},"content":{"rendered":"\n<p>Hello, readers! Today, we would be focusing on an important aspect in the concept of Regression &#8212; <strong>Ridge Regression in Python<\/strong>, in detail.<\/p>\n\n\n\n<p>So, let us get started!!<\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Understanding Ridge Regression<\/h2>\n\n\n\n<p>We all are aware that, <a aria-label=\"Linear Regression (opens in a new tab)\" href=\"https:\/\/www.askpython.com\/python\/examples\/linear-regression-in-python\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"rank-math-link\">Linear Regression<\/a> estimates the best fit line and predicts the value of the target numeric variable. That is, it predicts a relationship between the independent and dependent variables of the dataset. <\/p>\n\n\n\n<p>It finds the coefficients of the model via the defined technique during the prediction.<\/p>\n\n\n\n<p>The issue with Linear Regression is that the calculated coefficients of the model variables can turn out to become a large value which in turns makes the model sensitive to inputs. Thus, this makes the model very unstable.<\/p>\n\n\n\n<p><strong>This is when Ridge Regression comes into picture!<\/strong><\/p>\n\n\n\n<p>Ridge regression also known as, <code>L2 Regression<\/code> adds a penalty to the existing model. It adds penalty to the loss function which in turn makes the model have a smaller value of coefficients. That is, it shrinks the coefficients of the variables of the model that do not contribute much to the model itself.<\/p>\n\n\n\n<p>It penalizes the model based on the <strong>Sum of Square Error(SSE)<\/strong>. Though it penalizes the model, it prevents it from being excluded from the model by letting them have towards zero as a value of coefficients.<\/p>\n\n\n\n<p>To add, a hyper-parameter called lambda is included into the L2 penalty to have a check at the weighting of the penalty values.<\/p>\n\n\n\n<p>In a Nutshell, ridge regression can be framed as follows:<\/p>\n\n\n\n<p><strong>Ridge = loss + (lambda * l2_penalty)<\/strong><\/p>\n\n\n\n<p>Let us now focus on the implementation of the same!<\/p>\n\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Ridge Regression in Python &#8211; A Practical Approach<\/h2>\n\n\n\n<p>In this example, we will be working on the Bike Rental Count dataset. You can find the dataset <a href=\"https:\/\/github.com\/Safa1615\/BIKE-RENTAL-COUNT\/blob\/master\/day.csv\" class=\"rank-math-link\" target=\"_blank\" rel=\"noopener\">here<\/a>!<\/p>\n\n\n\n<p>At first, we load the dataset into the Python environment using <code>read_csv()<\/code> function. Further, we split the data using <a aria-label=\"train_test_split()  (opens in a new tab)\" href=\"https:\/\/www.askpython.com\/python\/examples\/split-data-training-and-testing-set\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"rank-math-link\">train_test_split() <\/a>function.<\/p>\n\n\n\n<p>Then we define the error metrics for the model Here we have made use of <a href=\"https:\/\/www.askpython.com\/python\/examples\/mape-mean-absolute-percentage-error\" target=\"_blank\" aria-label=\"MAPE  (opens in a new tab)\" rel=\"noreferrer noopener\" class=\"rank-math-link\">MAPE <\/a>as an error metric.<\/p>\n\n\n\n<p>At last, we apply Ridge regression to the model using <code>Ridge()<\/code> function.<\/p>\n\n\n\n<p><strong>Example:<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport os\nimport pandas\n\n#Changing the current working directory\nos.chdir(&quot;D:\/Ediwsor_Project - Bike_Rental_Count&quot;)\nBIKE = pandas.read_csv(&quot;day.csv&quot;)\n\nbike = BIKE.copy()\ncategorical_col_updated = &#x5B;&#039;season&#039;,&#039;yr&#039;,&#039;mnth&#039;,&#039;weathersit&#039;,&#039;holiday&#039;]\nbike = pandas.get_dummies(bike, columns = categorical_col_updated)\n#Separating the depenedent and independent data variables into two dataframes.\nfrom sklearn.model_selection import train_test_split \nX = bike.drop(&#x5B;&#039;cnt&#039;],axis=1) \nY = bike&#x5B;&#039;cnt&#039;]\nX_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=.20, random_state=0)\n\nimport numpy as np\ndef MAPE(Y_actual,Y_Predicted):\n    mape = np.mean(np.abs((Y_actual - Y_Predicted)\/Y_actual))*100\n    return mape\n\nfrom sklearn.linear_model import Ridge\nridge_model = Ridge(alpha=1.0)\nridge=ridge_model.fit(X_train , Y_train)\nridge_predict = ridge.predict(X_test)\nRidge_MAPE = MAPE(Y_test,ridge_predict)\nprint(&quot;MAPE value: &quot;,Ridge_MAPE)\nAccuracy = 100 - Ridge_MAPE\nprint(&#039;Accuracy of Ridge Regression: {:0.2f}%.&#039;.format(Accuracy))\n<\/pre><\/div>\n\n\n<p><strong>Output:<\/strong><\/p>\n\n\n\n<p>Using Ridge (L2) penalty, we have received an accuracy of 83.3%<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nMAPE value:  16.62171367018922\nAccuracy of Ridge Regression: 83.38%.\n<\/pre><\/div>\n\n\n<hr class=\"wp-block-separator\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>By this, we have come to the end of this topic. Feel free to comment below, in case you come across any question.<\/p>\n\n\n\n<p>For more such posts related to Python, Stay tuned with us.<\/p>\n\n\n\n<p>Till then, Happy Learning!! \ud83d\ude42<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hello, readers! Today, we would be focusing on an important aspect in the concept of Regression &#8212; Ridge Regression in Python, in detail. So, let us get started!! Understanding Ridge Regression We all are aware that, Linear Regression estimates the best fit line and predicts the value of the target numeric variable. That is, it [&hellip;]<\/p>\n","protected":false},"author":4,"featured_media":12189,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[],"class_list":["post-12157","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-examples"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/12157","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/comments?post=12157"}],"version-history":[{"count":0,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/12157\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media\/12189"}],"wp:attachment":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media?parent=12157"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/categories?post=12157"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/tags?post=12157"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}