{"id":2289,"date":"2020-01-06T15:10:08","date_gmt":"2020-01-06T15:10:08","guid":{"rendered":"https:\/\/www.askpython.com\/?p=2289"},"modified":"2023-02-16T19:57:20","modified_gmt":"2023-02-16T19:57:20","slug":"python-encode-and-decode-functions","status":"publish","type":"post","link":"https:\/\/www.askpython.com\/python\/string\/python-encode-and-decode-functions","title":{"rendered":"Python encode() and decode() Functions"},"content":{"rendered":"\n<p>Python&#8217;s <code>encode<\/code> and <code>decode<\/code> methods are used to encode and decode the input string, using a given encoding. Let us look at these two functions in detail in this article.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-text-color has-background has-vivid-green-cyan-background-color has-vivid-green-cyan-color\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Encode a given String<\/h2>\n\n\n\n<p>We use the <code>encode()<\/code> method on the input string, which every string object has.<\/p>\n\n\n\n<p><strong>Format<\/strong>:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ninput_string.encode(encoding, errors)\n<\/pre><\/div>\n\n\n<p>This encodes <code>input_string<\/code> using <code>encoding<\/code>, where <code>errors<\/code> decides the behavior to be followed if, by any chance, the encoding fails on the string.<\/p>\n\n\n\n<p><code>encode()<\/code> will result in a sequence of <code>bytes<\/code>.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ninp_string = &#039;Hello&#039;\nbytes_encoded = inp_string.encode()\nprint(type(bytes_encoded))\n<\/pre><\/div>\n\n\n<p>This results in an object of <code>&lt;class 'bytes'&gt;<\/code>, as expected:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n&lt;class &#039;bytes&#039;&gt;\n<\/pre><\/div>\n\n\n<p>The type of encoding to be 
followed is specified by the <code>encoding<\/code> parameter. There are various character encoding schemes, of which <strong>UTF-8<\/strong> is the one Python uses by default.<\/p>\n\n\n\n<p>Let us look at the <code>encoding<\/code> parameter using an example.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\na = &#039;This is a simple sentence.&#039;\n\nprint(&#039;Original string:&#039;, a)\n\n# Encodes to utf-8 by default\na_utf = a.encode()\n\nprint(&#039;Encoded string:&#039;, a_utf)\n<\/pre><\/div>\n\n\n<p><strong>Output<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nOriginal string: This is a simple sentence.\nEncoded string: b&#039;This is a simple sentence.&#039;\n<\/pre><\/div>\n\n\n<p><strong>NOTE<\/strong>: As you can observe, we have encoded the input string in the UTF-8 format. The output looks almost identical, but it is now prefixed with a <code>b<\/code>. This means that the string has been converted to a sequence of bytes, which is how data is stored on any computer. As bytes! 
<\/p>\n\n\n\n<p>The underlying bytes are not human-readable; Python only displays them as the original characters for readability, prefixed with a <code>b<\/code> to denote that the value is not a string but a sequence of bytes.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-text-color has-background has-vivid-green-cyan-background-color has-vivid-green-cyan-color\"\/>\n\n\n\n<h3 class=\"wp-block-heading\">Handling errors<\/h3>\n\n\n\n<p>The <code>errors<\/code> parameter accepts several values, some of which are listed below:<\/p>\n\n\n\n<figure class=\"wp-block-table aligncenter\"><table class=\"\"><tbody><tr><td><strong>Value of <code>errors<\/code><\/strong><\/td><td><strong>Behavior<\/strong><\/td><\/tr><tr><td><code>strict<\/code><\/td><td><strong>Default<\/strong> behavior, which raises a <code>UnicodeEncodeError<\/code> on failure (<code>UnicodeDecodeError<\/code> when decoding).<\/td><\/tr><tr><td><code>ignore<\/code><\/td><td><strong>Ignores<\/strong> un-encodable characters, dropping them from the result.<\/td><\/tr><tr><td><code>replace<\/code><\/td><td><strong>Replaces<\/strong> <em>each<\/em> un-encodable character with a question mark (<code>?<\/code>).<\/td><\/tr><tr><td><code>backslashreplace<\/code><\/td><td><strong>Inserts<\/strong> a backslash escape sequence (such as <code>\\xNN<\/code> or <code>\\uNNNN<\/code>) in place of each un-encodable character.<\/td><\/tr><\/tbody><\/table><\/figure>\n\n\n\n<p>Let us look at the above concepts using a simple example. 
We will consider an input string in which not all characters can be encoded in ASCII (such as <code>\u00f6<\/code>):<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\na = &#039;This is a bit m\u00f6re c\u00f6mplex sentence.&#039;\n\nprint(&#039;Original string:&#039;, a)\n\nprint(&#039;Encoding with errors=ignore:&#039;, a.encode(encoding=&#039;ascii&#039;, errors=&#039;ignore&#039;))\nprint(&#039;Encoding with errors=replace:&#039;, a.encode(encoding=&#039;ascii&#039;, errors=&#039;replace&#039;))\n<\/pre><\/div>\n\n\n<p><strong>Output<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nOriginal string: This is a bit m\u00f6re c\u00f6mplex sentence.\nEncoding with errors=ignore: b&#039;This is a bit mre cmplex sentence.&#039;\nEncoding with errors=replace: b&#039;This is a bit m?re c?mplex sentence.&#039;\n<\/pre><\/div>\n\n\n<hr class=\"wp-block-separator has-text-color has-background has-vivid-green-cyan-background-color has-vivid-green-cyan-color\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Decoding a Stream of Bytes<\/h2>\n\n\n\n<p>Similar to encoding a string, we can decode a stream of bytes to a string object, using the <code>decode()<\/code> method, which every <code>bytes<\/code> object has.<\/p>\n\n\n\n<p>Format:<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nencoded = input_string.encode()\n# Using decode()\ndecoded = encoded.decode(decoding, errors)\n<\/pre><\/div>\n\n\n<p>Since <code>encode()<\/code> converts a string to bytes, <code>decode()<\/code> simply does the reverse.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nbyte_seq = b&#039;Hello&#039;\ndecoded_string = byte_seq.decode()\nprint(type(decoded_string))\nprint(decoded_string)\n<\/pre><\/div>\n\n\n<p><strong>Output<\/strong><\/p>\n\n\n<div 
class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\n&lt;class &#039;str&#039;&gt;\nHello\n<\/pre><\/div>\n\n\n<p>This shows that <code>decode()<\/code> converts bytes to a Python string.<\/p>\n\n\n\n<p>As with <code>encode()<\/code>, the <code>decoding<\/code> argument (the method&#8217;s <code>encoding<\/code> parameter) specifies the encoding from which the byte sequence is decoded. The <code>errors<\/code> parameter determines the behavior if the decoding fails, and accepts the same values as for <code>encode()<\/code>.<\/p>\n\n\n\n<hr class=\"wp-block-separator has-text-color has-background has-vivid-green-cyan-background-color has-vivid-green-cyan-color\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Importance of encoding<\/h2>\n\n\n\n<p>Since the result of encoding or decoding depends on the format used, we must be careful to use the same format in both directions. Using the wrong format produces the wrong output and can raise errors.<\/p>\n\n\n\n<p>The snippet below shows why the encoding and decoding formats must match. <\/p>\n\n\n\n<p>The first decoding is incorrect, as it tries to decode as ASCII a byte sequence that was encoded in the UTF-8 format. 
The second one is correct since the encoding and decoding formats are the same.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\na = &#039;This is a bit m\u00f6re c\u00f6mplex sentence.&#039;\n\nprint(&#039;Original string:&#039;, a)\n\n# Encoding in UTF-8\nencoded_bytes = a.encode(&#039;utf-8&#039;, &#039;replace&#039;)\n\n# Trying to decode via ASCII, which is incorrect\ndecoded_incorrect = encoded_bytes.decode(&#039;ascii&#039;, &#039;replace&#039;)\ndecoded_correct = encoded_bytes.decode(&#039;utf-8&#039;, &#039;replace&#039;)\n\nprint(&#039;Incorrectly Decoded string:&#039;, decoded_incorrect)\nprint(&#039;Correctly Decoded string:&#039;, decoded_correct)\n<\/pre><\/div>\n\n\n<p><strong>Output<\/strong><\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nOriginal string: This is a bit m\u00f6re c\u00f6mplex sentence.\nIncorrectly Decoded string: This is a bit m\ufffd\ufffdre c\ufffd\ufffdmplex sentence.\nCorrectly Decoded string: This is a bit m\u00f6re c\u00f6mplex sentence.\n<\/pre><\/div>\n\n\n<hr class=\"wp-block-separator has-text-color has-background has-vivid-green-cyan-background-color has-vivid-green-cyan-color\"\/>\n\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>In this article, we learned how to use the <code>encode()<\/code> and <code>decode()<\/code> methods to encode an input string and decode an encoded byte sequence. <\/p>\n\n\n\n<p>We also learned how both methods handle failures in encoding\/decoding via the <code>errors<\/code> parameter. 
Keep in mind that encoding is not encryption: it only converts text to bytes and back, which is needed when reading or writing files, sending data over a network, or calling other byte-oriented APIs.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">References<\/h2>\n\n\n\n<ul class=\"wp-block-list\"><li>JournalDev article on encode-decode<\/li><\/ul>\n\n\n\n<hr class=\"wp-block-separator has-text-color has-background has-vivid-green-cyan-background-color has-vivid-green-cyan-color\"\/>\n","protected":false},"excerpt":{"rendered":"<p>Python&#8217;s encode and decode methods are used to encode and decode the input string, using a given encoding. Let us look at these two functions in detail in this article. Encode a given String We use the encode() method on the input string, which every string object has. Format: This encodes input_string using encoding, where [&hellip;]<\/p>\n","protected":false},"author":7,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[3],"tags":[],"class_list":["post-2289","post","type-post","status-publish","format-standard","hentry","category-string"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/2289","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/comments?post=2289"}],"version-history":[{"count":0,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/2289\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media?parent=2289"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/categories?post=2289"},{"taxonomy":"post_tag","embeddable":true,"hr
ef":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/tags?post=2289"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}