The anatomy of search: The root of the problem

A galloping overview: As we have done before, let’s get a bird’s-eye view of the parts of the search process: text comes in and gets processed and stored in a database (called an index); a user submits a query; documents that match the query are retrieved from the index, ranked based on how well they….

Wikimedia Foundation collaborates with two initiatives: Mozilla’s OSSN and TeachingOpenSource’s POSSE

We’re always looking for ways to strengthen the open source ecosystem. Over the past two months, the Developer Advocacy team at the Wikimedia Foundation collaborated with two open source initiatives: Mozilla’s Open Source Student Network (OSSN) and TeachingOpenSource.org’s Professors’ Open Source Software Experience (TOS and POSSE, respectively). OSSN is designed to bring more students into open….

You can trial content translation (version two!) right now

On International Translation Day, we are opening up early access to version two of the content translation tool, which simplifies the process of translating Wikipedia articles for Wikimedia’s volunteer translators. The tool was first released in January 2015, and more than 350,000 Wikipedia articles have been created with it since. Content translation’s second version, previewed last April, is a….

The anatomy of search: Variation under nature

A galloping overview: Let’s first get a bird’s-eye view of the parts of the search process: text comes in and gets processed and stored in a database (called an index); a user submits a query; documents that match the query are retrieved from the index, ranked based on how well they match the query, and….
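
The pipeline described in this overview (process text, store it in an index, take a query, retrieve and rank matches) can be sketched in a few lines of Python. This is only an illustrative toy, not how Wikimedia's search is implemented: the function names `tokenize`, `build_index`, and `search`, the whitespace tokenizer, and the count-based ranking are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately naive analyzer)."""
    return text.lower().split()


def build_index(docs):
    """Build an inverted index: token -> Counter mapping doc id to occurrence count."""
    index = defaultdict(Counter)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token][doc_id] += 1
    return index


def search(index, query):
    """Return ids of documents matching any query token, highest score first."""
    scores = Counter()
    for token in tokenize(query):
        # .get() avoids creating empty entries for unknown tokens.
        for doc_id, count in index.get(token, {}).items():
            scores[doc_id] += count
    # Rank by score (descending), breaking ties by document id.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ranked]


# Hypothetical sample documents, for illustration only.
docs = {
    "a": "the cat sat on the mat",
    "b": "the dog chased the cat and the other cat",
    "c": "birds sing at dawn",
}
index = build_index(docs)
print(search(index, "cat"))
```

A real search engine would replace the whitespace split with language-aware analysis and the raw counts with a proper relevance score; the sketch only shows how the stages connect.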

Introducing global preferences across Wikimedia wikis

There are a lot of Wikimedia wikis. Wikipedia is the best-known of them all, but there’s also Commons, Wikiquote, Wikisource, Wiktionary, and more. Also, each of these sites is available in multiple languages. Wikipedia, for example, has nearly 300 language editions. The newest launched just this week. Until this month, each wiki and language version….

See the completed Google Summer of Code and Outreachy projects from this year

Fifteen students from India, Israel, and Cameroon contributed over fifty thousand lines of code to Wikimedia projects this past summer under the mentorship and guidance of twenty-four mentors. Thirteen of those projects were conducted under the Google Summer of Code program, and one was managed under the Outreachy program. In addition, one student completed a….

Where did that paragraph go? This software change helps volunteers hold up Wikipedia’s high quality

One key to Wikipedia’s high quality is a system of mutual checks, based on the fact that every version of every page is stored and accessible. Thousands of community members review the latest edits of others in order to find errors or inconsistencies, comparing each new version of a page to older page versions and….

EventStreams updates: You can now find new events, composite streams, and historical timestamp subscription

Last year, we released the EventStreams service. This service allows anyone to subscribe to recent changes to Wikimedia data. At the time, we only had one stream of data available: RecentChanges. RecentChanges is a stream of Wikimedia change events (e.g. recent edits to pages in Japanese Wikipedia). External developers can consume this stream to create tools or….

The anatomy of search: A token of my affection

A galloping overview: To start, let’s get a bird’s-eye view of the parts of the search process: text comes in and gets processed and stored in a database (called an index); a user submits a query; documents that match the query are retrieved from the index, ranked based on how well they match the query,….

Evolving the MediaWiki platform: Why we replaced Tidy with a HTML5 parser

Three years ago, the Wikimedia Foundation's Parsing Team decided to replace Tidy, a tool to fix HTML errors, with an HTML5-based tool. Here's what we did in that time, and the complexities we faced in changing pieces of the technical infrastructure that powers Wikimedia wikis.
