Global Watchtower
Common Sense Advisory Blogs
French Language Technology Company and Systran Develop Web Terminology Builder for Machine Translation
Posted by Donald A. DePalma on May 31, 2005 in the following blogs: Translation and Localization


When polled, everyone involved in multilingual content management agrees that terminology management is essential -- for effective communication, for protecting certain types of intellectual property, and for optimized searching both inside and outside the organization. Yet the various processes associated with multilingual terminology management (searching, extracting, validating, formatting and delivering) are still time-consuming, expensive (Lingway's Bernard Normier reckons a "term" can cost between two and five dollars to create), and therefore obvious targets for a technology fix. There are numerous efforts underway to make existing terminology banks or bases interoperable, using a standard such as TBX. But these mainstream terminology bases have usually been handcrafted (so they tend to be systematically out of date) and are piecemeal (they have neither adequate multilingual coverage, nor adequate subject matter timeliness).

Lingway's Terminology Builder solution is based on the idea that terminology can be quickly mined from multilingual corpora using any computing method that aligns terms with their cross-language equivalents. First, extract a list of likely terms from a technical domain corpus containing documents in various languages, using any combination of statistical and linguistic analysis to pull plausible noun phrases and verb-object groups from the source-language documents. Then use these candidate terms as search queries against the "other" language texts in the corpus, supplying a simple word-for-word translation to see what's there. The other-language terms that occur most frequently in the results of that search are considered the best candidates for the target-language terminology. Finally, format the multilingual lists to populate Systran's dictionaries.
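To make the mining loop concrete, here is a minimal Python sketch of the general idea. It is not Lingway's implementation: a plain bigram frequency count stands in for the statistical and linguistic analysis, and the stopword list, the seed dictionary, the sample sentences, and every function name are assumptions made up for illustration. The output is an ordinary Python dict, not Systran's dictionary format.

```python
import re
import unicodedata
from collections import Counter

# Tiny illustrative stopword list (English and French); an assumption.
STOPWORDS = {"the", "a", "of", "is", "and", "to", "in",
             "le", "la", "les", "de", "du", "des", "un", "une", "est", "et"}

def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    text = unicodedata.normalize("NFC", text.lower())
    return re.findall(r"[^\W\d_]+", text)

def ngrams(tokens, n):
    """Return all contiguous n-token tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def candidate_terms(docs, n=2, min_count=2):
    """Step 1: frequent stopword-free n-grams serve as likely terms."""
    counts = Counter()
    for doc in docs:
        for gram in ngrams(tokenize(doc), n):
            if not any(word in STOPWORDS for word in gram):
                counts[gram] += 1
    return [gram for gram, c in counts.most_common() if c >= min_count]

def best_target_term(term, seed_dict, target_docs, max_len=3):
    """Steps 2 and 3: translate word for word, search the target texts,
    and keep the most frequent phrase containing every translated word."""
    wanted = {seed_dict[w] for w in term if w in seed_dict}
    if not wanted:
        return None
    counts = Counter()
    for doc in target_docs:
        tokens = tokenize(doc)
        for n in range(len(wanted), max_len + 1):
            for gram in ngrams(tokens, n):
                edges_ok = gram[0] not in STOPWORDS and gram[-1] not in STOPWORDS
                if edges_ok and wanted <= set(gram):
                    counts[gram] += 1
    return counts.most_common(1)[0][0] if counts else None

def build_glossary(source_docs, target_docs, seed_dict):
    """Pair each source candidate with its best target-language match."""
    glossary = {}
    for term in candidate_terms(source_docs):
        match = best_target_term(term, seed_dict, target_docs)
        if match is not None:
            glossary[" ".join(term)] = " ".join(match)
    return glossary

# Made-up sample corpus and seed dictionary.
source_docs = ["The translation memory stores segments.",
               "A translation memory is reused by translators."]
target_docs = ["La mémoire de traduction stocke des segments.",
               "Une mémoire de traduction est réutilisée."]
seed_dict = {"translation": "traduction", "memory": "mémoire"}

print(build_glossary(source_docs, target_docs, seed_dict))
```

Even at this toy scale the sketch shows why the search step matters: the word-for-word translation only supplies the query words, and the target-language texts supply the actual phrasing, including word order and connecting words that the seed dictionary does not contain.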

This solution takes an approach similar to the recently announced Google project to provide "translation" (that is, to dig out existing multilingual equivalents from the vast mine of texts on the web by seeking various formal and statistical parallels). Lingway adds some linguistic processing to dig down to the structures beneath the strings, and it produces translations of terms rather than of texts. We can expect to see more and more use of knowledge/text/content mining for various multilingual applications. But the technology won't solve endemic questions in terminology management, especially where terms are considered part of an organization's semantic DNA.

First, high-quality term bases for translation automation built this way will always depend on the right mix of available subject matter data in the relevant languages. This means that building the right corpus may prove more important than the speed and cost benefits resulting from the terminology itself. Second, organizations will still need a validation process that secures buy-in from many departments and individuals.

We're still more than a few years away from putting a squishy fish in our ears -- or an omniscient linguistic server on our networks -- for fully automated and rhetorically compelling translation. But each and every step like this one from Lingway -- and similar ones from MultiCorpora and KCSL -- will get us closer.

 

Keywords: Localization, Translation

  
Copyright © 2011 Common Sense Advisory, Inc. All Rights Reserved.