Rosetta Stone

Many science fiction stories provide for a ‘universal translator’, often using it as a convenient plot device to quickly allow individuals from two or more species to communicate from their first words.

Unfortunately, we here on Earth-prime haven’t yet developed such a thing, and that’s why translation across hundreds of language Wikipedias is so important: it lowers the cost of spreading knowledge across the world, as it allows multilingual editors to reuse efforts made by other volunteer editors to cover a topic.

To facilitate this process, we here at the Wikimedia Foundation developed a content translation tool that helps Wikipedia editors to easily translate articles. Content translation simplifies translating Wikipedia articles into different languages by automating many of the boring steps of the traditional translation process.[1]

In early April, content translation reached a new milestone: more than 300,000 articles have been created since the tool was released three years ago, making this a good time to reflect on the impact of the tool and discuss future plans.

Content translation video made for the 100,000th translation. Video by Victor Grigas/Wikimedia Foundation, CC BY-SA 3.0.

Thanks to the editors working with content translation, many topics are now available in new languages, making knowledge easier to access for more people around the world. As the tool has been adopted, the article creation rate has increased to the point where more than 400 new articles were being translated per day last month, or one new article every 3.5 minutes.
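As a rough check of that rate, the interval between articles is simply the number of minutes in a day divided by the daily article count. The sketch below is illustrative only: the function name is made up for this example, and the inputs are the rounded figures quoted above rather than exact statistics.

```python
MINUTES_PER_DAY = 24 * 60  # 1,440 minutes


def minutes_between_articles(articles_per_day: float) -> float:
    """Average number of minutes between two new articles at a given daily rate."""
    return MINUTES_PER_DAY / articles_per_day


# At exactly 400 articles per day, a new article appears every 3.6 minutes.
print(minutes_between_articles(400))  # 3.6

# One article every 3.5 minutes corresponds to roughly 411 articles per day,
# which is consistent with "more than 400".
print(round(MINUTES_PER_DAY / 3.5))  # 411
```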

During these three years, we have received quite a bit of feedback about how the tool has helped Wikipedia editors in different communities. Many editors appreciate that the tool automates most of the manual steps they had to do before, and lets them focus on creating quality content. Expert translators from the Medical Translation Task Force, for example, estimated their productivity increased 17% with content translation, helping them to expand Wikipedia’s coverage of vaccine information faster than before.

In addition to reducing the content gap across languages, another goal of content translation was to generate quality content. Content translation allows editors to reuse the effort that editors from other communities put into the source article (finding images, adding references, and reviewing the content), which often leads to a better initial version of the article than starting from scratch. Our measurements indicate that, in most languages, the deletion rate of articles created with content translation is lower than that of new articles started from scratch. For example, Spanish had the highest number of translations in 2017. In that same year, fewer than 10% of the articles started with content translation were deleted, compared with a 52% deletion rate for new articles created without the tool in the same period.

We want to share more details on how the tool has been adopted by different communities and on our plans for the future, which include a new version of the tool.

Part of the daily editor toolset on many wikis

Content translation has become part of the daily routine in many Wikipedia communities, accounting for a significant portion of the articles created on those wikis. For example, on the Tamil Wikipedia, 13% of the articles created since content translation was released have been created with this tool.

Tamil is a language spoken by 70 million people, but the Wikipedia in their language has fewer than 150,000 articles. Tamil speakers can now read in their language about Mexican rag dolls, coronation ceremonies, long hair, or more than seven thousand other topics created with content translation.

Tamil Wikipedia articles of different lengths, all created with content translation. Screenshots via the indicated articles on the Tamil Wikipedia, CC BY-SA 3.0. The enclosed images may be under different licenses.

On the Catalan Wikipedia, one of the tool’s early adopters, content translation has been used to start 19% of the articles created since the tool first became available. Other communities, such as the French Wikipedia, have also used content translation to create a significant number of articles (in their case, more than 24,000), but the larger size of that Wikipedia means that those represent only a small percentage of its total article production. However, on many other Wikipedias, large and small, content translation has not yet become a common tool for creating new articles. You can check the stats page on any Wikipedia for an overview, or query our APIs for a deeper analysis of the data about published content.
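For readers who want to try the API route, here is a minimal sketch of how such a query URL might be built. It rests on assumptions not stated in this post: that the wiki exposes a list module named `cxpublishedtranslations` through the standard MediaWiki `api.php` endpoint, and that it accepts `from`, `to`, and `limit` parameters. Please check the API help page of your wiki before relying on any of these names. The helper function itself is hypothetical.

```python
from urllib.parse import urlencode


def published_translations_url(source: str, target: str, limit: int = 10) -> str:
    """Build a query URL for translations published from `source` into `target`.

    The module name and its parameters are assumptions; verify them against
    the target wiki's API documentation.
    """
    params = {
        "action": "query",
        "list": "cxpublishedtranslations",  # assumed module name
        "from": source,   # assumed parameter: source language code
        "to": target,     # assumed parameter: target language code
        "limit": limit,   # assumed parameter: maximum number of results
        "format": "json",
    }
    return f"https://{target}.wikipedia.org/w/api.php?{urlencode(params)}"


# Example: translations from English into Catalan.
print(published_translations_url("en", "ca"))
```

The function only builds the URL, so it can be inspected or pasted into a browser before any request is made.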

The adoption of the tool is very different from language to language. Several factors may have an effect on this, including the availability of quality machine translation, the number of languages spoken by their editors, and the quantity of the content available in such languages.

Given the diversity of editors, languages and kinds of content, getting feedback about the use of the tool is essential. Please feel free to provide feedback about the use of content translation for your specific context on the project talk page.

The next version

Currently, the Language team is working on a new version of content translation. Version 2 will be a major refactoring and architectural update of the tool. The goal is to make content translation a solid and reliable translation tool that is aligned with the Wikimedia standards in technology and design, and that gives newcomers a great way to contribute.

The new version will include a more powerful editing surface based on Visual Editor, which will let us address many of the most frequently requested features. Reliable support for undo/redo and copy and paste will give translators more freedom to manipulate their content. In addition, tools to insert and edit templates, tables, multimedia, categories, and more advanced kinds of content will allow editors to improve their translations further with new content before publishing them.

Screenshot by Pau Giner/Wikimedia Foundation. Text from the English Wikipedia’s article on nasothek, CC BY-SA 3.0.

This is a large effort that will require rewiring many of the tools so that they work with the new editing surface. The plan is to gradually replace version 1 with version 2 in several stages. Backwards compatibility will make sure that content created by users during the transition period won’t be affected.

From now on, the focus for developing new features will be on version 2 in order to provide access to the improved version of the tool as soon as possible. The current version of content translation (version 1) will still get maintenance support to make sure that the tool is available for the users that rely on it on a daily basis.

A better experience for newcomers

We’ll take this opportunity to better align the designs in the new version to the Wikimedia design guidelines, and provide better support for new editors. We want to deliver a better experience for newcomers based on learnings from existing and new research on the experience of new editors.

Screenshot by Pau Giner/Wikimedia Foundation. Text from the English Wikipedia’s article on nasothek, CC BY-SA 3.0.

Currently, new editors often struggle with some of the error messages they get from the tool. For example, their translation may trigger a “spam” error because a link that already existed in the source article points to a website that is blocked on the target wiki. In such cases, it is not obvious to new users what is going on, where the problem is, or how to solve it. Guiding users through the process of reviewing their translation will help them improve the initial automatic translations further and produce a higher quality initial article as a result.
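To illustrate the kind of guidance meant here, the sketch below scans a translation for links whose domain appears on a blocklist and reports exactly which links are affected. It is a simplified, hypothetical example and not the tool's actual spam check: the function name, the plain domain-based blocklist, and the sample domains are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse


def find_blocked_links(text: str, blocked_domains: set) -> list:
    """Return (url, domain) pairs for links whose host is on the blocklist."""
    problems = []
    # Rough URL pattern: stop at whitespace and common closing delimiters.
    for url in re.findall(r'https?://[^\s\]\)<>"]+', text):
        host = (urlparse(url).hostname or "").lower()
        for domain in blocked_domains:
            # Match the domain itself or any of its subdomains.
            if host == domain or host.endswith("." + domain):
                problems.append((url, domain))
                break
    return problems


sample = (
    "See [http://spam.example.com/page the site] and "
    "https://en.wikipedia.org/wiki/Nasothek for details."
)
for url, domain in find_blocked_links(sample, {"example.com"}):
    print(f"The link {url} points to {domain}, which is blocked on this wiki.")
```

Reporting the specific link and the blocked domain tells the translator where the problem is and what to remove or replace, instead of leaving them with a generic error.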

We believe that with these improvements we can make translation a great way to start contributing to Wikipedia. You can check the project page for the new version to learn more about it, and use the discussion page to provide any feedback.

Pau Giner, Senior User Experience Designer, Audiences Design
Wikimedia Foundation

Notes

  1. Traditionally, translating Wikipedia articles required much manual effort, including moving back and forth across tabs, copying and pasting from external language tools (e.g. dictionaries), reformatting content, and rewriting links to point to the right place.

Photo credits

Rosetta Stone

© Hans Hillewaert

CC BY-SA 4.0
