Common Sense Advisory Blogs
Microsoft’s Machine Translation Finds its Voice – and its Face
Microsoft is in the headlines for translation again. This time, the company has developed Monolingual TTS, a new prototype for spoken language translation that delivers the translated speech in the user’s own voice. Are we entering the Star Trek era? The short answer is no. While these developments may seem mystical and magical to the average person, they’re exactly what researchers (and even sci-fi fans) have long been anticipating.
Debuted by Microsoft at the 2012 TechFest, the new system can “listen” to a voice, process words in Spanish or Mandarin, and vocalize the results in the target language. How does it work? The system requires about an hour’s worth of training data to customize the voice and make it sound like your own. It breaks up the voice into 5-millisecond portions and then remixes the sounds in order to train the program. One hour might sound like a lot, but it really isn’t much when one considers how long it can take to train commonplace dictation software tools.
Voice customization is something we’ve written about before, and Microsoft’s new development certainly offers great potential. Not only can voice customization be used to recreate your own voice, but as we’ve pointed out, it can be used to restore voices of others – including loved ones who have passed on. Spooky? Maybe not. Consider the potential for revitalizing education. For example, how much more compelling might a history lesson about the 1860s be if it were told in the actual voice of someone from 1860? Whenever an hour’s worth of recorded voice data is available, it should be possible.
A personalized voice isn’t the only snazzy feature of Microsoft’s new translation technology. The system can give the illusion that you’re fluently speaking a foreign language that you haven’t mastered. How? By using an avatar with your own face, in the form of an animated 3-D image. The lips and facial movements are synchronized in order to give the appearance that the user is speaking another language. Now, all we need is a screen that can attach to our faces, add a few motion sensors, and voilà, we all suddenly look like hyperpolyglots.
Well, not exactly. It will take time before any system can manage spoken language translation for dozens of languages with any true level of quality. Right now, even the underlying text-to-text translation is far from perfect for many language combinations. Quality is improving because statistical machine translation models get better as they are trained on larger amounts of data, and more and more content is being created. As time goes on, quality should keep getting better.
However, in the realm of translation, quality is difficult to define. Machine translation is suitable for several purposes – most notably translating large volumes of information for gisting, translating support documentation created using controlled authoring tools, and so on. Still, does Microsoft’s announcement – or Google’s ongoing work in machine translation for that matter – represent a game changer for the translation market? Hardly. Machine translation represents just a tiny fraction of the US$31 billion language services market. So far, what we’ve seen is not that machine translation replaces human work, but that it actually generates more demand for human translation. Ray Kurzweil agrees that human translation will not simply disappear due to technological advances.
That said, pay close attention to what Google and Microsoft are doing. These companies have enormous research budgets and some of the world’s greatest human brain power on their payrolls. Google has had a “voice hunter” scouring the globe collecting voice samples of people speaking all kinds of languages – and with a variety of accents. Google’s Translate app now offers handwriting recognition. These developments may be happening somewhat quietly, but major progress is being made. Communication has always been about more than just words. It’s about inflection, tone of voice, context, culture, and even personal communication style.
With the latest efforts of companies like Microsoft and Google, when it comes to translation, things are definitely starting to move beyond just words. In fact, they’re getting personal.
Keywords: Machine translation, Speech technology, Translation, Translation technologies