Where did that paragraph go? This software change helps volunteers hold up Wikipedia’s high quality

High-quality, reliable content is at the heart of Wikipedia, but maintaining it requires a massive collaborative effort. Recent technical improvements have made the process easier.

One key to Wikipedia’s high quality is a system of mutual checks, based on the fact that every version of every page is stored and accessible. Thousands of community members review the latest edits of others in order to find errors or inconsistencies, comparing each new version of a page to older page versions and checking if the new content complies with guidelines for citation, style, orthography and more. Edits are also scrutinized for subjectivity, copyright violations or vandalism. If a problem is found, it usually gets corrected within minutes.

Wikipedia editors find out which articles may need monitoring in a variety of ways. Many logged-in contributors save the pages they’re interested in to their personal watchlists and routinely check them for a quick overview of changes made to those pages. Furthermore, each language version of Wikipedia has a recent changes page, a ticker that shows the latest edits to all of its pages. Users who closely monitor this page can use filters to view the types of edits they’re interested in, such as changes by unregistered users. Editors can also review all edits a particular user has made, or directly examine the version history of a specific Wikipedia page.

A widely used tool for comparing versions of a Wikipedia page is the wikitext diff, a two-column view that shows an older version of a page on one side and the newer version on the other side. The tool displays the two versions in the wikitext markup and highlights differences between them with a color code.

Screenshot of a Wikipedia "diff."
A simple wikitext diff: In the newer version, some text was removed (highlighted in yellow) and some other text added (highlighted in blue).

In the past, however, comparing page versions was often hard and time-consuming. Due to a technical limitation, whenever a part of a text was simply moved to another position on the page, it was displayed as if it had been removed and some other text had been added. Even worse, there was no easy way to see whether someone had changed the text that had been moved. As a result, Wikipedia editors had to spend time checking whether a text had been moved or removed, and then more time identifying changes between the different versions.

Screenshot of the Wikipedia "diff" resulting from moving an entire paragraph.
A chunk of text was moved. Can you spot the changes inside?

We wanted to create a wikitext diff view that would show both moved text chunks and the changes inside them. What might sound like a simple change was actually a very delicate task, for two reasons. First, changes to the diff code can affect the speed of the MediaWiki software. Second, detecting moved pieces of text isn’t trivial: how much can a paragraph change and still qualify as the same, moved piece of text?
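
To make the second question concrete, here is a minimal sketch in Python of one way moved paragraphs could be detected. It is not the wikidiff code: it uses Python's standard difflib module, and the cutoff MOVE_THRESHOLD as well as the function names find_moved_paragraphs and word_changes are hypothetical, chosen only for illustration. A paragraph that disappears from one position counts as moved if a sufficiently similar paragraph appears elsewhere; the word-level changes inside it are then listed separately.

```python
from difflib import SequenceMatcher

# Hypothetical cutoff: how similar two paragraphs must be (0.0 to 1.0)
# to count as "the same, moved" text. The real value is a tuning decision.
MOVE_THRESHOLD = 0.6


def word_changes(old_par, new_par):
    """Return (old words, new words) pairs for every changed run of words."""
    a, b = old_par.split(), new_par.split()
    changes = []
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=a, b=b, autojunk=False).get_opcodes():
        if tag != "equal":
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


def find_moved_paragraphs(old, new, threshold=MOVE_THRESHOLD):
    """Pair paragraphs removed from `old` with similar paragraphs added in `new`."""
    removed, added = [], []
    for tag, i1, i2, j1, j2 in SequenceMatcher(a=old, b=new, autojunk=False).get_opcodes():
        if tag in ("delete", "replace"):
            removed.extend(range(i1, i2))
        if tag in ("insert", "replace"):
            added.extend(range(j1, j2))

    moves = []
    for i in removed:
        # Look for the most similar added paragraph that is still unmatched.
        best_j, best_ratio = None, threshold
        for j in added:
            ratio = SequenceMatcher(a=old[i].split(), b=new[j].split(), autojunk=False).ratio()
            if ratio >= best_ratio:
                best_j, best_ratio = j, ratio
        if best_j is not None:
            added.remove(best_j)
            moves.append((i, best_j, word_changes(old[i], new[best_j])))
    return moves


old = [
    "Intro.",
    "The smallest dog is merely ten centimetres tall.",
    "Dogs are mammals.",
    "Outro.",
]
new = [
    "Intro.",
    "Dogs are mammals.",
    "The smallest dog is only ten centimetres tall.",
    "Outro.",
]
moves = find_moved_paragraphs(old, new)
print(moves)  # [(1, 2, [('merely', 'only')])]
```

Raising the threshold makes the sketch stricter, so heavily edited paragraphs are reported as removed and added rather than moved. Lowering it risks pairing unrelated paragraphs. Comparing every removed paragraph with every added one also gets slow on long pages, which hints at why performance was a concern.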

The Technical Wishes team from Wikimedia Germany (Deutschland), the German Wikimedia chapter in Berlin, took on this task, supported by software teams from the Wikimedia Foundation. Our project aims to improve the software behind Wikipedia, so our developers dove deep into the wikidiff code and put a lot of effort into improving, fine-tuning and testing it.[1]

After lots of programming, testing and even more testing, the wikitext diff now clearly indicates moved text chunks with the help of little arrows, and highlights changes that were made within them. This change has been active on most Wikipedias for a few weeks now.

Screenshot of the Wikipedia "diff" resulting from moving an entire paragraph. Individual words changed in that paragraph are now highlighted for further inspection.
Now it’s clearly indicated that two paragraphs were moved and which text was changed within them.

The most recent news from the world of diffs is on your phone: As of this week, moved text chunks are shown correctly on mobile devices as well. To make this happen, the Wikimedia Foundation’s Reading Web team took our recent changes to the diff code and developed styles for them in the mobile view.

This is what the diff view now looks like on mobile devices. In this example, the paragraph “The smallest dog […]” was moved down on the page, and the word “merely” was replaced by “only”.

And last but not least, a similar technical improvement was released in early 2018 by the Wikimedia Foundation: The Visual Diff, a tool for users who prefer a visual view over wikitext, also shows changes in moved text chunks. The code behind it, however, is completely independent from the code of the wikitext diff.

We hope that all these improvements make life easier for many contributors and support them in their vital quality assurance work.

Johanna Strodt, Project Manager Communications
Wikimedia Germany (Deutschland)

Footnote

1. If you’re interested in our challenges and learnings, this post is for you.
