Long overshadowed by Google in machine translation (MT), Microsoft has begun to make the MT capabilities of Bing more visible. In April it announced the Translator app for Windows Phone, supporting text, voice, and video input for 39 languages when connected to the web. Today Microsoft officially released Microsoft Translator Hub, the latest step in the evolution of its MT software, which we first profiled in 2006. This product allows organizations to train the Bing statistical MT engine to their domain and lexicon through three components:
- The Translator API (application programming interface) exposes functions such as translation, language detection, sentence breaking, and text-to-speech to calling programs. It supports collaborative weighting that lets users review alternate translations based on their role and defined rules. Microsoft Researcher Chris Wendt told us that the API first became available in September 2011 as a partner-ready product with support and documentation. The company says that it has more than 10,000 commercial users of the API. It is free for use up to two million characters per month, but incurs fees for higher volumes and commercial users.
- The Widget application interfaces between the API and any application or website where it’s embedded. The Widget collects corrections and votes on alternative translations from users, administers users, manages bulk editing, and highlights pending edits.
- The Translator Hub itself offers a generic engine trained on Microsoft’s data, but allows users to add a layer of customization for their organization’s style and domain. This “layered tenancy” manages multiple translation models and weights them against data from both Microsoft and the customer. The result is an engine that gives individual organizations a lot of control through formal inputs like terminology and the less formal approach of crowdsourcing. By contrast, Google Translate does not let users train the engine. Finally, the Hub supports a continuous loop of incremental updates to the MT engine, supplemented by periodic complete rebuilds for quality and performance.
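To make the idea of weighting a generic model against a customer-trained layer concrete, here is a minimal sketch. It is an illustration only and not Microsoft's implementation: the `Candidate` class, the two score fields, the `pick_translation` function, and the `custom_weight` parameter are all hypothetical names, and the simple linear blend is an assumption about how such weighting could work.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str
    generic_score: float  # hypothetical score from the baseline (generic) model
    custom_score: float   # hypothetical score from the customer-trained layer


def pick_translation(candidates, custom_weight=0.7):
    """Return the candidate with the highest blended score.

    The blend is a plain linear interpolation, chosen here for
    illustration; the source does not describe the actual formula.
    """
    if not 0.0 <= custom_weight <= 1.0:
        raise ValueError("custom_weight must be between 0 and 1")

    def blended(c):
        return custom_weight * c.custom_score + (1.0 - custom_weight) * c.generic_score

    return max(candidates, key=blended)


# Made-up candidates and scores for a single source sentence.
candidates = [
    Candidate("Restart the computer", generic_score=0.80, custom_score=0.40),
    Candidate("Reboot the device", generic_score=0.60, custom_score=0.90),
]
best = pick_translation(candidates)
print(best.text)
```

With the weight tilted toward the customer layer, the candidate that matches the organization's preferred terminology wins; setting `custom_weight=0.0` falls back to the generic model's choice.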
The Translator Hub product grew out of a rush project to support relief efforts in Haiti following the January 2010 earthquake. The Bing team faced the problem of building a statistical machine translation (SMT) engine overnight, something that was not easily done at the time. It took them five days to ship Haitian Creole, and that was good enough for responders. Microsoft then decided to formalize the process it had used, which resulted in the API and the trainable engine.
The company recently added support for Hmong, an effort that Microsoft Group Program Manager Chris Wendt said was driven almost completely by speakers of that language. He told us Microsoft named it the “hub” because it allows public and private groups to involve their extended communities in training the engine. Efforts that support less commonly used or endangered languages provide a socially responsible way for technology companies to showcase their wares. Commercial buyers, of course, will engage their communities of translators, translation agencies, employees, partners, customers, and prospects in their extended demand and supply chains to improve the quality of their translation.
Wendt said that a variety of commercial companies already use the Hub. For example, Amazon, Facebook, and Twitter have integrated it into their websites. Technology users including Microsoft itself, Autodesk, Intel, and PLYMedia built it into their workflows and use the Hub to customize their translations. Globalization software vendors such as Atril (Déjà Vu), Kilgray (MemoQ), SDL (Trados), Welocalize (GlobalSight), and XTM International (XTM) have connected the Hub to their products, while Clay Tablet employs the engine to demonstrate cascading workflow in its translation broker that integrates content and translation management systems.
Interestingly, language service provider (LSP) Lionbridge replaced IBM’s real-time translation engine with Microsoft’s Bing Translator in its year-old GeoFluent software-as-a-service offering. In describing the change, Lionbridge highlighted Bing’s training set of millions of already translated sentences for better baseline quality and the Hub’s ability to integrate the client’s terminology. The former reason underscores the benefit of MT solutions that leverage vast amounts of content managed by developers such as Google or Microsoft, while the latter emphasizes the benefit of customization based on the ultimate user’s training materials (see “Trends in Machine Translation,” Oct11).
Microsoft’s MT engine has been under development for a decade, but undercover for most of that time. With the release of the Hub, it challenges Google Translate with: 1) an engine that buyers can customize to their domain or company needs versus a Google-fits-all model; 2) the layered tenancy approach instead of the multi-tenant model that discourages commercial users; and 3) its status as a fully supported product. These features, combined with competitive concerns that keep companies such as Facebook at arm’s length from Google, translate into strong advantages for Microsoft against its competitor in the cloud. Microsoft’s traditional reach into enterprises plus its cloud-based platform broaden the Hub’s reach, improving the company’s position in the MT arena. The company still faces the big marketing challenge of getting the “crowd” in various language communities involved in training their baseline engines.