The Voice in the Machine: Building Computers That Understand Speech (Mit Press) Paperback – March 23, 2012
Stanley Kubrick's 1968 film 2001: A Space Odyssey famously featured HAL, a computer with the ability to hold lengthy conversations with his fellow space travelers. More than forty years later, we have advanced computer technology that Kubrick never imagined, but we do not have computers that talk and understand speech as HAL did. Is it a failure of our technology that we have not gotten much further than an automated voice that tells us to “say or press 1”? Or is there something fundamental in human language and speech that we do not yet understand deeply enough to be able to replicate in a computer? In The Voice in the Machine, Roberto Pieraccini examines six decades of work in science and technology to develop computers that can interact with humans using speech and the industry that has arisen around the quest for these technologies. He shows that although the computers today that understand speech may not have HAL's capacity for conversation, they have capabilities that make them usable in many applications today and are on a fast track of improvement and innovation.
Pieraccini describes the evolution of speech recognition and speech understanding processes from waveform methods to artificial intelligence approaches to statistical learning and modeling of human speech based on a rigorous mathematical model—specifically, Hidden Markov Models (HMM). He details the development of dialog systems, the ability to produce speech, and the process of bringing talking machines to the market. Finally, he asks a question that only the future can answer: will we end up with HAL-like computers or something completely unexpected?
- Print length: 354 pages
- Language: English
- Publisher: The MIT Press
- Publication date: March 23, 2012
- Dimensions: 7 x 0.8 x 9 inches
- ISBN-10: 0262533294
- ISBN-13: 978-0262533294
Frequently purchased items
Communicative AI: A Critical Introduction to Large Language Models (Paperback)
Artificial Unintelligence: How Computers Misunderstand the World (Mit Press) (Paperback)
Last Words: Large Language Models and the AI Apocalypse, Paul Kockelman (Paperback)
Designing Agentive Technology: AI That Works for People (Paperback)
Mind Children: The Future of Robot and Human Intelligence (Paperback)
These Strange New Minds: How AI Learned to Talk and What It Means (Hardcover)
Product details
- Item Weight : 1.3 pounds
- Best Sellers Rank: #5,159,662 in Books
- #257 in Speech & Audio Processing
- #1,368 in Speech
About the author

Since March 2018 I have been a director of engineering at Google in Zurich, Switzerland.
I have been in speech technology research and business for more than 30 years. Prior to joining Google, I led the team that built the conversational capability of Jibo, a startup aiming to commercialize the first consumer social robot. In 2012 I was the director of the International Computer Science Institute (ICSI) in Berkeley, CA, an independent research institution affiliated with the University of California at Berkeley. Before that I was the Chief Technology Officer of SpeechCycle, a company specializing in advanced spoken human-machine interaction systems for enterprise customer care (yes, those annoying "please tell me the reason you are calling about" computers that prevent you from talking to human operators when you need them). Trying to make those annoying computers better, I led an effort to develop new technology that let them learn from their own mistakes and improve the quality of their interactions with customers.
Before SpeechCycle, between 2003 and 2005, I managed a speech research team at IBM T.J. Watson Research, in Yorktown Heights, NY, and prior to that, between 1999 and 2003, I was at SpeechWorks International, which is now known as Nuance, today's largest worldwide computer speech company.
The turning point in my computer speech research career was when, in 1988, I joined AT&T Bell Laboratories (later known as AT&T Laboratories). There I worked with some of the most influential scientists in computer speech, such as Larry Rabiner and Bishnu Atal. I arrived at Bell Laboratories from Italy, where in the 1980s I was a researcher at CSELT, the laboratories of the national Italian telephone company.
Over this time I have written, as an author or co-author, about 150 scientific papers and articles in the fields of speech recognition, spoken language understanding and dialog, multimodal interaction, and machine learning. I am best known for my original contributions to statistical methods for spoken language understanding and reinforcement learning for spoken dialog systems.
My first book, "The Voice in the Machine", published by MIT Press in 2012, narrates the story of 60 years of computer speech technology evolution in a way that is accessible to general scientific readers. My second book, "AI Assistants", published by MIT Press in 2021, also for a general audience, looks at the recent development of human-machine voice interaction after Siri, Alexa, and Google Assistant were introduced and new technologies, such as deep learning, dramatically changed the way computers recognize and understand human speech.
Customer reviews
- 5 star: 77%
- 4 star: 23%
- 3 star: 0%
- 2 star: 0%
- 1 star: 0%
Top reviews from the United States
- Reviewed in the United States on August 1, 2012
Format: Hardcover
I enjoyed reading this book! It is a comprehensive description of the evolution of speech technologies, focused on the major research results and the changes of direction the technology has taken in recent decades. The last chapter is about the advent of Siri and what may happen in the near future. Reading the book, you will encounter many protagonists with their anecdotes, ideas, and achievements.
I see two main categories of people who might benefit greatly from reading this book. The first are those not involved in the evolution of speech technologies; the second are the insiders, who were involved either in research or at any level, even non-technical, in the speech industry. For the former, the book explains how a complex technology evolves in reality, with all the roadblocks, turns, and steep paths, while the author puts all his effort into explaining very complex engineering problems without formulas or technicalities, using simple and enlightening analogies and examples. The book will help them understand what is behind Siri, Google Voice, or any other speaking machine. For the latter, the professionals of voice science and industry, it is very interesting to see how the author assembles a map of past and current technology, the motivations and forces behind it, and shows how all the pieces fit together in the technological landscape in which they are currently engaged. For them it is like stepping out for a minute to gain a vantage point and a different perspective.
I belong to the second category because I spent 20 years in R&D at the research lab in Italy where Roberto Pieraccini took his first steps, and then I was deeply involved in the newborn speech industry.
A last bit of advice is for readers who would like to move from the author's examples to more technical reading. I found the Notes section very interesting, like a book inside the book. You can read it from top to bottom, and you will find there some formulas, pointers to the literature, and complementary thoughts.
Now I'll eagerly await a sequel from Roberto Pieraccini that looks forward instead of backward, but I strongly suggest reading this marvelous book now.
- Reviewed in the United States on May 24, 2015
A very good high-level overview. The earlier parts of the book provide more implementation details than the later parts, which tend to gloss over the details in favor of recounting history.
- Reviewed in the United States on October 4, 2013
Format: Hardcover
Roberto Pieraccini's The Voice in the Machine is a phenomenal read. I found myself enjoying every single page. The writing is clear, precise, personal, folksy, with entertaining anecdotes. As I got closer to the end of the book, I became sadder and sadder, realizing that the time when I would be entertained and educated by Roberto was drawing to a close.
Mostly about developments in the speech recognition field (for completeness, Pieraccini has one chapter on Text-to-Speech), it's a very well-written, comprehensive survey of the history and current developments in speech technology.
It covers everything from the earliest attempts, through all the government-sponsored ARPA speech recognition challenges, to recent commercial deployments. The book would well serve as a reference for a college course or just for leisure reading: it's the best example I've ever seen of a book that explains concepts behind complex math, intuitively, without using a single equation. Roberto's writing style could almost be called poetic. It definitely conveys the passion behind the science. You must get this!
- Reviewed in the United States on November 19, 2012
Format: Hardcover
This book deserves 6, 7, 8 stars. It takes a technical subject and does a really good job of showing the essence of the issues involved.
Be aware that the target audience is NEITHER
  - people who already understand computer speech technology (unless perhaps they want to learn some history) NOR
  - the intellectually lazy. This is a difficult subject, and to get the most out of it, you will occasionally have to close the book and think about what you have just read.
But assuming you are in this target audience (you're an engineer in another field, a physicist, an astronomer, basically someone curious about the world around you) and want to learn the basic history, ideas, successes, and failures of computer speech understanding, I have never come across a book close to as good as this.
I only wish there were a comparable book in similar fields like computer vision, or computer translation.
- Reviewed in the United States on June 15, 2012
Format: Hardcover
As someone who has been in the speech industry for quite some time, I can tell you this book is a terrific starting point for business people and students alike. Pieraccini's great anecdotes are what make this so enjoyable. Whether it's HAL 9000 or Victor Hugo he is employing to convey his point, the author makes learning enjoyable.
- Reviewed in the United States on June 9, 2012
Format: Hardcover
Having just completed a course in NLP, I was looking for an introduction to speech processing in order to prep for more advanced reading on the subject. Pieraccini's book was just what I needed.
The author starts out by describing in convincing detail why human speech is so complex and difficult to understand, and to recreate in a lab or a commercial setting. He then goes on to describe early attempts inspired by AI, eventually arriving at statistical approaches that are the basis of most modern speech processing systems.
I like the book in its broad coverage, and while I do realize that the book is not aimed at techies, I'd have appreciated a little more coverage of HMMs and EM.
At a handful of places, there are some editing oversights that are simply disappointing for a book from a writer of this caliber (Ch. 5: "...De Mori, who pursued a brilliant carrier first at McGill..." -- career, not carrier).
Nonetheless, the book is a good read for someone interested in this technology.
Top reviews from other countries
- RODI, reviewed in the United Kingdom on February 16, 2013
5.0 out of 5 stars: Excellent reading, gets inside the abilities of machine recognition and the basics behind the technology
Format: Hardcover (Verified Purchase)
Some basic knowledge of speech recognition principles is needed to get the maximum from the book.
However, even a novice will learn much from the excellent and structured presentation of such a complex technology.





