By now, most language professionals have seen claims that neural machine translation (NMT)
is delivering results as good – or almost as good – as human translation. If these claims – which have been repeated in the mainstream tech press without much examination – are accurate, it is only a matter of time before human translators are out of work.
Based on our ongoing examination of the role of machine translation in facilitating multilingual communication, we find that these claims are overblown. In many respects, NMT represents a significant improvement over state-of-the-art statistical machine translation, but it has not closed the gap with human translation. It will make MT more acceptable and help address the huge mismatch between language needs and supply, but we predict that it will simultaneously increase the value of human translators for the high-value tasks where they excel.
Google is among the first large-scale implementers of NMT in a production environment. Prior to 2016, artificial intelligence (AI) researchers saw NMT as something that would come, but largely treated it as a bit of “pie in the sky.” All this changed in 2016. Microsoft actually beat Google to launch
with NMT by several months, but its deployment received minimal attention, whereas Google’s press release
immediately garnered the sort of love-struck press attention
that language technology almost never receives.
The discrepancy in attention is due, in part, to Google’s savvy media strategy, which tied in with growing attention to AI
and delivered seemingly easy-to-digest claims that – on their surface – promised world-changing results. By contrast, the more realistic and measured press releases from Microsoft
– which announced its shift to NMT around the same time that Google was making its announcements – did not appeal to the imagination in the same way.
What particularly caught on with the press was Google’s internal finding that NMT-generated target text was almost indistinguishable from human translation
when graded by human reviewers, and that NMT had reduced translation errors by 55–85%. The latter claim relied on unclear math that the company’s own charts – which showed substantial but incremental improvement – did not seem to support: