Speaking with Computer Generated Voices
While creating Yet Another Choose Your Own Adventure, I took a detour into the world of computer-generated voices. As Chris Maury noted in his post, the quality of algorithmically generated voices is improving, and I wanted to learn more about the state of the industry.
Pop quiz! Think of a computer-generated voice you heard recently. Apple’s Siri? Portal’s GLaDOS? The Mac OS say command?
While GLaDOS is voiced by an actual human, say and Siri both have their roots in a company called Nuance, which merged with Kurzweil’s ScanSoft in 2005. Siri reportedly relies on speech technology licensed from Nuance, and Samantha, one of the Nuance voices that ships with Mac OS, sounds much like the original US Siri voice. You can hear more sample voices from Nuance, and the general quality is quite good. I didn’t want to pick Siri as the Choose Your Own Adventure voice because she has become a bit too cliché. Commercial applications were either prohibitively expensive or lacked the proper API for a weekend hack, so I dug into some of the research communities around text-to-speech (TTS) to find open source solutions.