In science fiction, the Encyclopedia Galactica is a compendium of a galaxy’s worth of knowledge.
Wikipedia isn’t quite there yet—for one, we’ve barely left Earth. However, that doesn’t mean we aren’t trying to put together a planet’s worth of knowledge and ensure that all of its inhabitants can learn from it in their own languages.
That’s where our content translation tool comes in. The tool simplifies translating Wikipedia articles into different languages by automating many of the tedious steps inherent in manually translating Wikipedia’s articles. As of last month, more than half a million articles have been created with the help of the content translation tool since its introduction.
Why did we build this tool? It’s simple: translating Wikipedia’s knowledge into new languages can help reduce our knowledge gap. For example, English-speaking users can access more than five million articles. Speakers of Bengali, a language of 260 million people, have access to a mere 75,000.
The content translation tool has not been easy to construct, maintain, or grow. We’ve spent four years building and fine-tuning it to fit Wikipedia’s decentralized model, ensuring that it works across Wikimedia’s many individual wikis, which have grown and developed their own custom infrastructure to suit their local contexts.
More recently, the Wikimedia Foundation has worked to modernize the content translation tool with a rich-text interface and improve its output by utilizing artificial intelligence. The tool now provides users with an initial machine translation for them to improve prior to publishing, and incorporates safeguards which help ensure that all untouched machine translations are reviewed. We’ve partnered with Google to ensure that these automatically translated items are as high-quality as possible, and we’ve ensured that none of our users’ data is being passed to Google in the process.
We like what we’re seeing from the newly improved content translation tool. In the last year, it has been used to translate nearly 150,000 articles, a year-over-year increase of more than twenty percent. Moreover, our data shows that translated articles are less likely to be deleted than articles created from scratch.
What’s next for the tool? As part of the Wikimedia Foundation’s medium-term plan, released earlier this year, the Foundation’s language team will be focusing on two key areas: ensuring that the full suite of translation tools can be used by Wikimedia wikis in as many languages as possible, and expanding the kinds of contributions that the content translation tool supports. They’ll take on the former before tackling the latter, starting with the Malayalam, Bengali, Tagalog, Javanese, and Mongolian languages.
Pau Giner, Lead UX Designer, Product Design
For more information about our content translation tool’s history and future, please see an expanded blog post on Wikimedia Space, the new site built for news, questions, and conversations within the Wikimedia movement.