Global Watchtower
Common Sense Advisory Blogs
Microsoft’s Machine Translation Finds its Voice – and its Face
Posted by Nataly Kelly on March 13, 2012 in the following blogs: Technology


Microsoft is in the headlines for translation again. This time, it’s developed the Monolingual TTS, a new prototype for spoken language translation that actually personalizes the voice of the user. Are we entering the Star Trek era? The short answer is no. While these developments may seem mystical and magical to the average person, they’re exactly what researchers (and even sci-fi fans) have long been anticipating.

Debuted by Microsoft at the 2012 TechFest, the new system can “listen” to a voice, process words in Spanish or Mandarin, and vocalize the results in the target language. How does it work? The system requires about an hour’s worth of training data to customize the voice and make it sound like your own. It breaks up the voice into 5-millisecond portions and then remixes the sounds in order to train the program. One hour might sound like a lot, but it really isn’t much, when one considers how long it can take to train commonplace dictation software tools.
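
To get a feel for what "5-millisecond portions" means in practice, here is a minimal sketch of the slicing step only. It is not Microsoft's implementation: the function name `split_into_frames`, the 16 kHz sample rate, and the use of a plain list of samples are all assumptions made for illustration, and the sketch says nothing about how the system remixes the pieces afterward.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms=5):
    """Cut a sequence of audio samples into consecutive frames of frame_ms milliseconds.

    Hypothetical helper for illustration; the last frame may be shorter
    than the others if the audio does not divide evenly.
    """
    frame_len = sample_rate_hz * frame_ms // 1000  # samples per frame
    if frame_len <= 0:
        raise ValueError("sample rate too low for the requested frame length")
    return [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]


# One second of silence at an assumed 16 kHz sample rate.
one_second = [0.0] * 16000
frames = split_into_frames(one_second, sample_rate_hz=16000)

frames_per_second = len(frames)            # 200 frames, each 80 samples long
frames_per_hour = frames_per_second * 3600  # 720,000 frames in an hour of audio
print(frames_per_second, len(frames[0]), frames_per_hour)
```

At 5 milliseconds per slice, one second of speech yields 200 pieces, so an hour of training data comes to roughly 720,000 of them, which helps explain why an hour can be enough to work with.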

Voice customization is something we’ve written about before, and Microsoft’s new development certainly offers great potential. Not only can voice customization be used to recreate your own voice, but as we’ve pointed out, it can be used to restore voices of others – including loved ones who have passed on. Spooky? Maybe not. Consider the potential for revitalizing education. For example, how much more compelling might a history lesson about the 1860s be if it were told in the actual voice of someone from 1860? Whenever an hour’s worth of recorded voice data is available, it should be possible.

A personalized voice isn’t the only snazzy feature of Microsoft’s new translation technology. The system can give the illusion that you’re fluently speaking a foreign language that you haven’t mastered. How? By using an avatar with your own face, in the form of an animated 3-D image. The lips and facial movements are synchronized in order to give the appearance that the user is speaking another language. Now, all we need is a screen that can attach to our faces, add a few motion sensors, and voilà, we all suddenly look like hyperpolyglots.

Well, not exactly. It will take time before any system can manage spoken language translation for dozens of languages with any true level of quality. Right now, even the underlying text-to-text translation is far from perfect for many language combinations. Quality is improving, though. Statistical machine translation models learn from large amounts of data, and because more and more content is being created, quality should keep getting better as time goes on.

However, in the realm of translation, quality is difficult to define. Machine translation is suitable for several purposes, most notably translating large volumes of information for gisting and translating support documentation created using controlled authoring tools. Does Microsoft’s announcement, or Google’s ongoing work in machine translation for that matter, represent a game changer for the translation market? Hardly. Machine translation represents just a tiny fraction of the US$31 billion language services market. So far, what we’ve seen is not that machine translation replaces human work, but that it actually generates more demand for human translation. Ray Kurzweil agrees that human translation will not simply disappear due to technological advances.

That said, pay close attention to what Google and Microsoft are doing. These companies have enormous research budgets and some of the world’s greatest brain power on their payrolls. Google has had a “voice hunter” scouring the globe collecting voice samples of people speaking all kinds of languages, in a wide variety of accents. Google’s Translate app now offers handwriting recognition. These developments may be happening somewhat quietly, but major progress is being made. Communication has always been about more than just words. It’s about inflection, tone of voice, context, culture, and even personal communication style.

With the latest efforts of companies like Microsoft and Google, when it comes to translation, things are definitely starting to move beyond just words. In fact, they’re getting personal.

 


Keywords: Machine translation, Speech technology, Translation, Translation technologies

  
Copyright © 2013 Common Sense Advisory, Inc. All Rights Reserved.