Speech: The Next Frontier for Improving the Customer Experience
Posted by Donald A. DePalma on August 24, 2007 in the following blogs: Translation and Localization, Technology, Interpreting

Focusing on "Speech Technology at the Tipping Point," this week's 17th annual SpeechTEK conference in New York drew a combined total of about 2,000 session delegates and exhibition-hall visitors (with a better buyer-to-supplier ratio than the usual language industry conference). Contending that speech is ready for prime time in call centers, search, and other consumer- and employee-facing applications, the organizers called on Malcolm Gladwell to underscore "The Tipping Point" in a luncheon keynote.

Driving the move to automated speech interactions are mobile devices, developing markets where language competence may be more oral than written, and, of course, the ever-rising expense of employing enough human call center operators. For most mobile uses and many customer service applications, voice interaction would be more efficient than the eight fingers and one thumb that typed this posting.

How close are we to a voice-guided future? Closer than a few years ago, but we're not there yet, especially when the utterance isn't in English. We took briefings with various providers and listened in on several presentations that blended speech recognition, call centers, and sometimes machine translation. In most cases, our interlocutors were aware of the importance of providing a customer experience in the language of their prospects, even if current technology does not yet allow them to do so. That's good news for anyone trying to improve the experience of their customers around the globe.
  • Mike Cohen, one of the founders of the company that became speech recognition technology provider Nuance, kicked off the conference by describing his work at Google. He said that the search company’s mission is to organize all the world’s information and make it universally accessible and useful. He emphasized that Google means ALL information, which requires making speech indexable and searchable from any device. His first product is GOOG411 (+1.800.466.4411), which lets you search for and connect to a business in the States (and get an SMS with details and a map link if you're on a mobile phone). Cohen told us that North American English is all it can handle today, but that the technology could process other languages.

  • Fil Alleva, general manager of Microsoft's Speech Components group, outlined Vista's support for speech. He focused on how the technology increased accessibility for those with disabilities and improved productivity for enthusiasts. Looking beyond the AmerEnglish market, Alleva told us that his speech module supports both U.S. and U.K. English, German, French, Spanish, Japanese, and both simplified and traditional Chinese.
How does all this stuff fit together? Industry specifications help.
  • Nuance's Daniel Burnett outlined the W3C's work on SSML (Speech Synthesis Markup Language) to provide a standards-based platform for voice browsing. The 1.1 update will broaden support for Arabic, Cantonese, Hindi, Japanese, Korean, Mandarin, and Russian (the W3C lacks expertise in some of these languages, so if you have any free time...). The 1.1 team would also like to improve how SSML deals with word boundaries, phonetic alphabets other than the IPA (not India Pale Ale), tones, parts of speech, and text containing multiple languages (think Spanglish and Hinglish); the first sketch after this list shows what such markup looks like.

  • IBM's Jan Kleindienst outlined a just-completed research project that integrated disparate academic speech recognition and machine translation engines across Europe with UIMA (Unstructured Information Management Architecture). Then Hannah Grap described Language Weaver's module for integrating speech with its MT engine, focusing on military applications and doctor/patient interactions. Elsewhere on the speech-plus-MT front, IBM showcased its UIMA-enabled TALES (Translingual Automatic Language Exploitation System) for video capture, speech-to-text conversion, MT into English, and information extraction for foreign-language broadcasts and websites; the second sketch below shows the shape of that pipeline.
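
Since SSML is just XML, the kind of mixed-language prompt the 1.1 group is wrestling with is easy to sketch. The snippet below is a minimal illustration built with Python's standard library, not anything blessed by the working group; the element names (speak, voice, phoneme) come from the SSML 1.0 recommendation, and whether a given synthesizer actually honors the xml:lang switch or the IPA string is precisely the open question.

# A minimal SSML sketch: one prompt that switches from English to Spanish
# and pins down a pronunciation in IPA. Element names follow the W3C
# SSML 1.0 recommendation; which voices a synthesizer maps these to varies.
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"  # serializes as xml:lang

speak = ET.Element("speak", {
    "version": "1.0",
    "xmlns": "http://www.w3.org/2001/10/synthesis",
    XML_LANG: "en-US",
})

s = ET.SubElement(speak, "s")
s.text = "Press one to continue in English."

# Mid-document language switch -- the case SSML 1.1 wants to handle better.
es = ET.SubElement(speak, "voice", {XML_LANG: "es-MX"})
es.text = "Presione dos para continuar en español."

# An explicit IPA pronunciation for a word the engine might otherwise mangle.
ph = ET.SubElement(speak, "phoneme", {"alphabet": "ipa", "ph": "təˈmeɪtoʊ"})
ph.text = "tomato"

print(ET.tostring(speak, encoding="unicode"))

The 1.1 draft adds a dedicated lang element for exactly this kind of inline switch, so the voice-change workaround above shouldn't be needed forever.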
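And for flavor, here is what a TALES-style chain amounts to when reduced to plain functions. Every name here is invented for illustration; IBM's actual system composes equivalent components as UIMA analysis engines rather than Python calls.

# Hypothetical sketch of a TALES-style pipeline: speech-to-text, MT into
# English, then information extraction. All names are invented stand-ins.
from dataclasses import dataclass, field

@dataclass
class Segment:
    source_lang: str
    transcript: str = ""
    translation: str = ""
    entities: list = field(default_factory=list)

def transcribe(audio: bytes, lang: str) -> str:
    """Stand-in for a speech recognition engine."""
    return "<ASR output>"

def translate(text: str, source_lang: str, target_lang: str = "en") -> str:
    """Stand-in for a machine translation engine."""
    return "<MT output>"

def extract_entities(text: str) -> list:
    """Stand-in for an information-extraction annotator."""
    return []

def process_broadcast(audio: bytes, lang: str) -> Segment:
    seg = Segment(source_lang=lang)
    seg.transcript = transcribe(audio, lang)           # speech-to-text
    seg.translation = translate(seg.transcript, lang)  # MT into English
    seg.entities = extract_entities(seg.translation)   # extraction
    return seg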
The bottom line: the basic technology for speech is steadily evolving, and integration with other applications is happening through both proprietary models like Google's and industry-wide standards from the W3C and elsewhere, though none of it is perfect yet. The multilingual component of speech needs even more work. As domestic non-Anglophone populations grow, automatically and intelligently directing a Spanish-speaking caller in the U.S. or a Hindi speaker in India to the right service representative will become a pressing issue for call centers trying to improve the overall customer experience.
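
What would that routing look like? A deliberately small sketch, with entirely hypothetical names: identify_language() stands in for a spoken-language identification engine, which is the genuinely hard part that today's mostly monolingual recognizers don't yet solve.

# Hypothetical sketch of language-based call routing; identify_language()
# stands in for a spoken-language ID engine, and the queue names are made up.
QUEUES = {"en": "english_support", "es": "spanish_support", "hi": "hindi_support"}
DEFAULT_QUEUE = "english_support"

def identify_language(greeting_audio: bytes) -> str:
    """Stand-in for a real spoken-language identification engine."""
    return "es"  # pretend the caller opened in Spanish

def route_call(greeting_audio: bytes) -> str:
    lang = identify_language(greeting_audio)
    # Route to a staffed queue for the caller's language, or fall back
    # to the default queue rather than dropping the call.
    return QUEUES.get(lang, DEFAULT_QUEUE)

print(route_call(b""))  # -> spanish_support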

 

Keywords: Interpreting technologies, Machine interpretation, Machine translation, Translation, Translation technologies

  