{"id":16323,"date":"2021-05-19T17:22:40","date_gmt":"2021-05-19T17:22:40","guid":{"rendered":"https:\/\/www.askpython.com\/?p=16323"},"modified":"2021-05-19T17:22:52","modified_gmt":"2021-05-19T17:22:52","slug":"speech-recognition","status":"publish","type":"post","link":"https:\/\/www.askpython.com\/python-modules\/speech-recognition","title":{"rendered":"Python Speech Recognition Module &#8211; A Complete Introduction"},"content":{"rendered":"\n<p>Hey there! Today let&#8217;s learn about converting speech to text using the <code>speech_recognition<\/code> library in Python. So let&#8217;s begin!<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Introduction to Speech Recognition<\/h2>\n\n\n\n<p>Speech recognition is the automatic recognition of human speech by a machine, and it is one of the most important tasks in building applications like Alexa or Siri.<\/p>\n\n\n\n<p>Python has several libraries that support speech recognition. We will be using the <code>speech_recognition<\/code> library because it is simple and easy to learn.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Importing Speech Recognition Module<\/h2>\n\n\n\n<p>The first step, as always, is to import the required libraries. In this case, we only need to import the <code>speech_recognition<\/code> library. 
<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport speech_recognition as SR\n<\/pre><\/div>\n\n\n<p>If the statement raises an error, you might need to install the library first using the <code>pip<\/code> command. The package is published on PyPI as <code>SpeechRecognition<\/code>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\">Implementing Speech Recognition in Python<\/h2>\n\n\n\n<p>To convert speech from our audio to text, we need the <code>Recognizer<\/code> class from the <code>speech_recognition<\/code> module. We create an object of this class, which provides all the methods needed for further processing.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">1. Loading Audio<\/h3>\n\n\n\n<p>Before we continue, we&#8217;ll need to download an audio file. The one I used to get started is a speech from Emma Watson which can be found <a aria-label=\" (opens in a new tab)\" href=\"https:\/\/youtu.be\/UjI_bspcUHA\" target=\"_blank\" rel=\"noreferrer noopener\" class=\"rank-math-link\">here<\/a>.<\/p>\n\n\n\n<p>We downloaded the audio file and converted it to <code>wav<\/code> format because the library can read WAV files directly. Make sure you save it in the same folder as your Python file.<\/p>\n\n\n\n<p>To load the audio we will be using the <code>AudioFile<\/code> class. Used in a <code>with<\/code> statement, it opens the file and gives us an <code>AudioFile<\/code> instance called <code>source<\/code> from which the audio can be read.<\/p>\n\n\n\n<p>Inside the <code>with<\/code> block, we do the following things:<\/p>\n\n\n\n<ol class=\"wp-block-list\"><li>Every audio recording contains some <code>noise<\/code>, which we account for using the <code>adjust_for_ambient_noise<\/code> method. It calibrates the recognizer&#8217;s energy threshold from the start of the audio (the first second by default). 
<\/li><li>The <code>record<\/code> method reads the audio from the source and stores it in a variable (an <code>AudioData<\/code> object) to be used later on.<\/li><\/ol>\n\n\n\n<p>The complete code to load the audio is shown below.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; gutter: true; title: ; notranslate\" title=\"\">\nimport speech_recognition as SR\n\n# Create the recognizer object\nSR_obj = SR.Recognizer()\n\n# Open the WAV file as an audio source\ninfo = SR.AudioFile(&#039;speech.wav&#039;)\nwith info as source:\n    # Calibrate for ambient noise (reads the first second of audio by default)\n    SR_obj.adjust_for_ambient_noise(source)\n    # Read at most 100 seconds of audio from the source\n    audio_data = SR_obj.record(source, duration=100)\n<\/pre><\/div>\n\n\n<p>Here we have also passed a parameter called <code>duration<\/code>, because recognizing speech in a longer audio takes a lot more time. So we will only be taking the first 100 seconds of the audio.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">2. Reading data from audio<\/h3>\n\n\n\n<p>Now that we have successfully loaded the audio, we can invoke the <code>recognize_google()<\/code> method to recognize any speech in the audio.<\/p>\n\n\n\n<p>The method sends the audio to an online service, so it can take several seconds depending on your internet connection speed. After processing, the method returns the most likely transcription that it was able to recognize from the first 100 seconds.<\/p>\n\n\n\n<p>The code for the same is shown below.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; gutter: true; title: ; notranslate\" title=\"\">\nimport speech_recognition as SR\n\nSR_obj = SR.Recognizer()\n\ninfo = SR.AudioFile(&#039;speech.wav&#039;)\nwith info as source:\n    SR_obj.adjust_for_ambient_noise(source)\n    audio_data = SR_obj.record(source, duration=100)\n\n# Returns the recognized text as a string.\n# In a script, wrap this call in print() to display the result.\nSR_obj.recognize_google(audio_data)\n<\/pre><\/div>\n\n\n<p>The output is a run of sentences from the audio, and it turns out to be fairly good. The accuracy can be improved with additional functions, but for now it covers the basic functionality. 
<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: plain; title: ; notranslate\" title=\"\">\n&quot;I was appointed 6 months and I have realised for women&#039;s rights to often become synonymous with man heating if there is one thing I know it is that this has to stop someone is by definition is the belief that men and women should have equal rights and opportunities is the salary of the economic and social policy of the success of a long time ago when I was 8 I was confused sinkhole but I wanted to write the play Aise the width on preparing for the 14 isostasy sacralized elements of the media 15 my girlfriend Statue of Liberty sports team because they don&#039;t want to pay monthly 18 18 Mai Mela friends were unable to express their feelings I decided that I am business analyst at the seams and complicated to me some recent research has shown me feminism has become&quot;\n<\/pre><\/div>\n\n\n<h2 class=\"wp-block-heading\">Conclusion<\/h2>\n\n\n\n<p>Congratulations! In this tutorial, you learned how to recognize speech from audio and display it on your screen.<\/p>\n\n\n\n<p>I would also like to mention that speech recognition is a very deep and vast topic, and what we have learned here barely scratches the surface of the whole subject.<\/p>\n\n\n\n<p>Thank you for reading!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Hey there! Today let&#8217;s learn about converting speech to text using the speech recognition library in Python. So let&#8217;s begin! 
Introduction to Speech Recognition Speech recognition is defined as the automatic recognition of human speech and is recognized as one of the most important tasks when it comes to making applications like Alexa [&hellip;]<\/p>\n","protected":false},"author":28,"featured_media":16345,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[2],"tags":[],"class_list":["post-16323","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-python-modules"],"blocksy_meta":[],"_links":{"self":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/16323","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/users\/28"}],"replies":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/comments?post=16323"}],"version-history":[{"count":0,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/posts\/16323\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media\/16345"}],"wp:attachment":[{"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/media?parent=16323"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/categories?post=16323"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.askpython.com\/wp-json\/wp\/v2\/tags?post=16323"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}