{"id":57363,"date":"2023-12-22T13:17:32","date_gmt":"2023-12-22T13:17:32","guid":{"rendered":"https:\/\/www.askpython.com\/?p=57363"},"modified":"2025-04-10T20:50:43","modified_gmt":"2025-04-10T20:50:43","slug":"keras-loss-functions","status":"publish","type":"post","link":"https:\/\/www.askpython.com\/python-modules\/keras-loss-functions","title":{"rendered":"A Complete Guide to Keras Loss Functions"},"content":{"rendered":"\n<p><em>Deep learning is one of the most active areas of technology, with recent developments such as deepfakes and autonomous vehicles. In this post, we will look at one of the crucial elements of building deep learning models: the loss function.<\/em><\/p>\n\n\n\n<p>You are all done building your model and want to test whether it is working as expected. Loss functions help in such scenarios, where you want to check how close the model&#8217;s results are to the expected outputs.<\/p>\n\n\n\n<p>In other words, a loss function (also called an objective function or a cost function) measures the difference between the actual and predicted values of a model. Loss functions also help to assess the model&#8217;s performance and how well the model fits the training data.<\/p>\n\n\n\n<p>Loss functions play an important role in <a href=\"https:\/\/www.askpython.com\/python\/examples\/backpropagation-in-python\" data-type=\"post\" data-id=\"27766\">backpropagation<\/a>, where the gradient of the loss with respect to the model&#8217;s parameters is propagated backward through the network so that the weights can be updated to reduce the loss.<\/p>\n\n\n\n<p>Through this article, we will understand loss functions thoroughly and focus on the types of loss functions available in the Keras library. 
<\/p>\n\n\n\n<p><a href=\"https:\/\/www.askpython.com\/python\/examples\/deep-learning-algorithms-2023\" data-type=\"post\" data-id=\"48070\">Learn about the popular deep-learning algorithms here!<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">What Is a Loss Function?<\/h2>\n\n\n\n<p>A loss function, as the name suggests, calculates the loss, that is, the difference between the model&#8217;s predicted values and the actual target values. When training a model, our main objective is to minimize this loss to obtain an optimized model.<\/p>\n\n\n\n<p>During training, the weights and biases of a deep learning model are updated repeatedly to minimize this loss.<\/p>\n\n\n\n<p>The general form of a loss function, or cost function, is shown below.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"486\" height=\"120\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2023\/12\/image-7.png\" alt=\"Loss Function\" class=\"wp-image-57383\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2023\/12\/image-7.png 486w, https:\/\/www.askpython.com\/wp-content\/uploads\/2023\/12\/image-7-300x74.png 300w\" sizes=\"auto, (max-width: 486px) 100vw, 486px\" \/><figcaption class=\"wp-element-caption\">Loss Function <\/figcaption><\/figure>\n\n\n\n<p>J is the loss function, w<sup>T<\/sup> is the transposed weight vector being trained, and b is the bias applied to the network. y<sup>^<\/sup> is the predicted value and y is the actual value. Coming to the topic at hand, let us take a look at the loss functions the Keras library has to offer.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Keras Loss Functions<\/h2>\n\n\n\n<p>The Keras library provides a Pythonic, high-level interface for building deep learning models, which can also be deployed to smartphones and the web. It is open source and offers many building blocks for deep learning. 
It has an extensive set of loss functions for different use cases.<\/p>\n\n\n\n<p>This article covers two types of losses, probabilistic and regression, each providing a variety of losses.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Probabilistic Losses<\/h3>\n\n\n\n<p>Probabilistic losses can be used for both regression and classification tasks. These losses suit models whose output is a probability or a probability distribution.<\/p>\n\n\n\n<p>These are the available probabilistic losses. Each can be used in both class and function form.<\/p>\n\n\n\n<blockquote class=\"wp-block-quote is-layout-flow wp-block-quote-is-layout-flow\">\n<p>You might notice that the types of loss are repeated. That is because each loss is available both as a class and as a function. While they serve the same purpose, the class form and the function form differ in their names: classes use CamelCase (BinaryCrossentropy), while functions use snake_case (binary_crossentropy).<\/p>\n<\/blockquote>\n\n\n\n<ul class=\"wp-block-list\">\n<li>BinaryCrossentropy class<\/li>\n\n\n\n<li>CategoricalCrossentropy class<\/li>\n\n\n\n<li>SparseCategoricalCrossentropy class<\/li>\n\n\n\n<li>Poisson class<\/li>\n\n\n\n<li>binary_crossentropy function<\/li>\n\n\n\n<li>categorical_crossentropy function<\/li>\n\n\n\n<li>sparse_categorical_crossentropy function<\/li>\n\n\n\n<li>poisson function<\/li>\n\n\n\n<li>KLDivergence class<\/li>\n\n\n\n<li>kl_divergence function<\/li>\n<\/ul>\n\n\n\n<p>Let us see the usage of each loss function.<\/p>\n\n\n\n<h4 class=\"wp-block-heading\">BinaryCrossentropy class<\/h4>\n\n\n\n<p>The binary <a href=\"https:\/\/www.askpython.com\/python-modules\/numpy\/cross-entropy-in-python\" data-type=\"post\" data-id=\"52277\">cross entropy<\/a> loss computes the cross entropy between the true labels and the predicted labels. 
It can be used for classification problems that have a binary label (0 or 1).<\/p>\n\n\n\n<p>Let us see an example of using this loss function.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport tensorflow as tf\n\n# True binary labels and raw model outputs (logits, not probabilities)\ny_true = &#x5B;0, 1, 1, 0]\ny_pred = &#x5B;-18.6, 0.51, 2.94, -12.8]\n# from_logits=True tells Keras that y_pred holds logits\nbce = tf.keras.losses.BinaryCrossentropy(from_logits=True)\nbce(y_true, y_pred).numpy()\n<\/pre><\/div>\n\n\n<p>There are two lists: the actual values (y_true) and the predicted values (y_pred). Because the predictions are raw logits rather than probabilities, the loss is created with from_logits=True. The loss object is stored in the variable bce, which is then called to calculate the loss between the predicted and actual values.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"895\" height=\"178\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2023\/12\/image-8.png\" alt=\"Binary Cross entropy loss class\" class=\"wp-image-57394\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2023\/12\/image-8.png 895w, https:\/\/www.askpython.com\/wp-content\/uploads\/2023\/12\/image-8-300x60.png 300w, https:\/\/www.askpython.com\/wp-content\/uploads\/2023\/12\/image-8-768x153.png 768w\" sizes=\"auto, (max-width: 895px) 100vw, 895px\" \/><figcaption class=\"wp-element-caption\">Binary Cross entropy loss class<\/figcaption><\/figure>\n\n\n\n<p>In the same way, the binary cross entropy function can be called using the following syntax.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ntf.keras.losses.binary_crossentropy(\n    y_true, y_pred, from_logits=False, label_smoothing=0.0, axis=-1\n)\n<\/pre><\/div>\n\n\n<h4 class=\"wp-block-heading\">CategoricalCrossentropy class<\/h4>\n\n\n\n<p>The categorical cross-entropy loss is used when there are multiple class labels. 
The class labels must be provided in one-hot encoded form, which means each label is a vector containing a 1 at the position of the true class and 0 everywhere else.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport tensorflow as tf\n\n# One-hot labels: instance 1 is class 2, instance 2 is class 3\ny_true = &#x5B;&#x5B;0, 1, 0], &#x5B;0, 0, 1]]\n# Predicted scores per class for each instance\ny_pred = &#x5B;&#x5B;0.05, 0.95, 0], &#x5B;0.1, 0.8, 0.95]]\ncce = tf.keras.losses.CategoricalCrossentropy()\ncce(y_true, y_pred).numpy()\n<\/pre><\/div>\n\n\n<p>There are two instances and three classes, where the first instance belongs to the second class, and the second instance belongs to the third class. The y_pred array gives the predicted score of each instance for each class. Keras rescales each row to sum to 1 before computing the loss, so the second row is treated as a probability distribution even though its entries add up to more than 1.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"811\" height=\"176\" src=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2023\/12\/image-9.png\" alt=\"Categorical Crossentropy class\" class=\"wp-image-57397\" srcset=\"https:\/\/www.askpython.com\/wp-content\/uploads\/2023\/12\/image-9.png 811w, https:\/\/www.askpython.com\/wp-content\/uploads\/2023\/12\/image-9-300x65.png 300w, https:\/\/www.askpython.com\/wp-content\/uploads\/2023\/12\/image-9-768x167.png 768w\" sizes=\"auto, (max-width: 811px) 100vw, 811px\" \/><figcaption class=\"wp-element-caption\">Categorical Crossentropy class<\/figcaption><\/figure>\n\n\n\n<p>The categorical cross entropy function can be called from the Keras framework as below.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ntf.keras.losses.categorical_crossentropy(\n    y_true, y_pred, from_logits=False, label_smoothing=0.0, axis=-1\n)\n<\/pre><\/div>\n\n\n<h4 class=\"wp-block-heading\">SparseCategoricalCrossentropy class<\/h4>\n\n\n\n<p>This class is used when the labels are plain integers rather than one-hot encoded vectors (for example, 0, 1, 2). 
In this case, only the y_true variable changes compared with the categorical cross-entropy example: it holds one integer class index per instance instead of a one-hot vector.<\/p>\n\n\n\n<p>The function can be called from Keras in the same way.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ntf.keras.losses.sparse_categorical_crossentropy(\n    y_true, y_pred, from_logits=False, axis=-1, ignore_class=None\n)\n<\/pre><\/div>\n\n\n<h4 class=\"wp-block-heading\">Poisson class<\/h4>\n\n\n\n<p>The Poisson loss is used mainly when predicting count data. It suits regression tasks such as predicting the number of customers who purchase a product.<\/p>\n\n\n\n<p>The <strong>Poisson class<\/strong> and the poisson function can be called using the following syntax:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# Poisson class\ntf.keras.losses.Poisson(reduction=&quot;auto&quot;, name=&quot;poisson&quot;)\n# poisson function\ntf.keras.losses.poisson(y_true, y_pred)\n<\/pre><\/div>\n\n\n<h4 class=\"wp-block-heading\">KL Divergence Loss<\/h4>\n\n\n\n<p>In general, the Kullback-Leibler divergence measures how one probability distribution differs from another. The KL divergence loss class and function compute the KL divergence between the actual and predicted values.<\/p>\n\n\n\n<p>The KL loss is calculated element-wise as follows, and the result is then summed over the last axis:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nloss = y_true * log(y_true \/ y_pred)\n<\/pre><\/div>\n\n\n<p>The KL divergence class and function can be called in a similar way to the other losses. 
<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n# KLDivergence class\ntf.keras.losses.KLDivergence(reduction=&quot;auto&quot;, name=&quot;kl_divergence&quot;)\n# kl_divergence function\ntf.keras.losses.kl_divergence(y_true, y_pred)\n<\/pre><\/div>\n\n\n<h3 class=\"wp-block-heading\">Regression Losses<\/h3>\n\n\n\n<p>Regression losses are used for regression problems, which typically predict a continuous numerical value.<\/p>\n\n\n\n<p>Like the probabilistic losses, the regression losses can be used in both class and function form.<\/p>\n\n\n\n<p>These are some of the loss functions Keras provides for regression tasks.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>MeanSquaredError class or mean_squared_error function<\/li>\n\n\n\n<li>MeanAbsoluteError class or mean_absolute_error function<\/li>\n\n\n\n<li>MeanAbsolutePercentageError class or mean_absolute_percentage_error function<\/li>\n\n\n\n<li>MeanSquaredLogarithmicError class or mean_squared_logarithmic_error function<\/li>\n\n\n\n<li>CosineSimilarity class or cosine_similarity function<\/li>\n<\/ul>\n\n\n\n<p>These functions use a syntax similar to that of the probabilistic losses.<\/p>\n\n\n\n<p><strong>For example,<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ntf.keras.losses.mean_squared_error(y_true, y_pred)\n<\/pre><\/div>\n\n\n<p><a href=\"https:\/\/www.askpython.com\/resources\/regression-error-metrics\" data-type=\"post\" data-id=\"56303\">The popular regression loss functions and error metrics are explained here<\/a><\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>To recap, we have discussed what loss functions are and looked at the types of loss functions available in the Keras library. The right choice of loss function depends on the use case and on the variable being predicted. 
<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">References<\/h2>\n\n\n\n<p><a href=\"https:\/\/keras.io\/api\/losses\/\" target=\"_blank\" rel=\"noopener\">Keras Documentation<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Deep Learning is one of the most happening technologies with recent developments such as deepfake and autonomous vehicles. In this post, we will understand the crucial elements of building these deep learning models. You are all done building your model and want to test if the model is working as expected. Loss Functions are of [&hellip;]<\/p>\n","protected":false},"author":55,"featured_media":64142,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-57363","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python-modules"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/57363","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/users\/55"}],"replies":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/comments?post=57363"}],"version-history":[{"count":0,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/57363\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media\/64142"}],"wp:attachment":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media?parent=57363"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/categories?post=57363"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/tags?post=57363"}],"curies":[{"name":"wp",
"href":"https:\/\/api.w.org\/{rel}","templated":true}]}}