Back in 2008, we wrote about Language Death and Why It Matters, lamenting the loss of “repositories of information and understanding that took thousands of years to gather.” In 2009, Google began a project to collect and preserve audio, video, and other digital artifacts of at-risk languages before they disappear altogether. Famously, the company has a stated mission “to organize the world’s information and make it universally accessible and useful.”
Today, Google released the results of its most ambitious language preservation project to date. The Endangered Languages Project houses information about more than 3,000 of the roughly 7,000 languages still extant. Data for the project was provided by the Catalogue of Endangered Languages (ELCat), produced by the University of Hawai'i at Manoa and The Institute for Language Information and Technology (The Linguist List) at Eastern Michigan University. The site also lists the many supporting organizations, now formalized as the Alliance for Linguistic Diversity. The Endangered Languages Project provides an online resource for language activists around the world and a reference point for sharing the latest updates on lesser-used and minority languages.
Thank goodness. The world is now losing two languages per month. As endangered language expert K. David Harrison explains, “No culture has a monopoly on human genius” (see “A Digital Love Letter to a Dying Language”).
Google manager Siobhán Ní Chonchúir told the Irish Times that the project is “an important step to preserve what the elders know and to give the next generation a chance to learn from their heritage.” We could not agree more. As we have written in the past, “Of the 6,912 known living human languages, only 2,261 have writing systems.” Therefore, audio and video are tremendously important in studying and conserving these languages. In the coming week, Common Sense Advisory will release updated figures for the Availability Quotient and e-GDP of online languages. This year’s numbers show a new power being exerted by a “long tail effect” of languages.
What does this news mean for Google? The company already provides core web applications in over 120 world languages. Wikipedia supports active content creation in roughly 280 languages and dialects. By collecting information on over 3,000 human tongues, http://www.endangeredlanguages.com will represent a living treasure-trove of data and human culture for students, anthropologists, cultural curators, and the speakers themselves – and, last but not least, for nearby peoples who may want to better understand their neighbors. Many of Google’s own initiatives stand to benefit from this information, too.
In recent research and posts about linguistically diverse areas – such as The Need for Translation in Africa and India Contemplates a Billion Web Users – we’ve demonstrated both the humanitarian and the economic benefits of translation. South America remains another language hotspot; for instance, Bolivia alone hosts greater language diversity today than the entire European continent. The many benefits of translation extend not only to the native speakers of those languages, but to the government and non-governmental organizations serving those populations, to investors and for-profit entities seeking to build new markets, and to language service providers looking to extend into new geographies.
In addition to these economic and social benefits, this announcement from Google reminds us of the benefit to all humankind of cultural diversity and survival. Throughout history, language minority groups – and their people – have often been lost in the shuffle. Dominant world languages have long bullied smaller ones spoken by communities with less power, fewer resources, or less proclivity for destruction. When someone asks, “Are endangered languages worth saving?” what they are asking is, “Do these people really matter?” Today’s announcement reveals Google’s simple answer: “Yes.” To all of us.