A new platform to explore statistics about Wikimedia projects

Photo by SpaceX, CC0.

Wikistats 2 builds on the success of Wikistats, the project started more than 15 years ago by Erik Zachte. Wikistats has been the canonical source of statistics about the reach and impact of the Wikimedia movement for many years. It offered a quantitative mirror to the Wikimedia communities to reflect on their growth, gaps and strategic opportunities. It also provided one of the earliest public data sources for the study of large-scale peer production communities, and as such has been cited nearly a thousand times in the literature.

As detailed in Wikistats 2’s documentation, there are several noticeable changes in the new site’s design, but the biggest changes come on the backend. In this post, we’ll detail what changes you’ll see, and explain how to access the data programmatically.

What’s new? Pretty much everything … but the data!

The data-processing pipeline for the new Wikistats has been rebuilt from scratch. It uses open source distributed-computing technology such as Hadoop, Spark, Sqoop, and Hive to ingest and enhance project data, and loads a prepared version of the whole history of every project into Druid, a fast analytics server. Druid then serves sliced and diced subsets of the data through the Analytics Query Service, the MediaWiki external API for analytics data.

A brand new front end has also been designed and built on top of the new API. The dashboard brings a lot of information together, providing an easy way to get an overview of any project at a glance. More details can be found in the three sections of the dashboard, which are labeled Contributing, Reading and Content. The Contributing section is about edits and editors, the Reading section is about visited articles and unique devices, and the Content section contains article-level statistics.

You may notice that the data in Wikistats 2.0 is the same data that existed in Wikistats. For this alpha release, we decided to replicate the existing metrics. In doing so we had two goals in mind: we wanted to test this new dashboard against a time-tested one, and we also wanted to provide existing Wikistats users with statistics that closely match those they are familiar with. We succeeded relatively well at replicating the existing statistics.

How to access the data programmatically

You can access the same data that powers the new user interface by querying a RESTful API. The full documentation is available on this page, but we’ll walk you through some examples.

Let’s get the number of edits made every day in October 2017 for Wikipedia in Spanish:

There are two parameters in the above URL telling us about editor-types and page-types. The editor-types parameter lets you filter by anonymous users (anonymous), registered users declared as bots (group-bot), registered users not declared as bots but that we suspect are bots nonetheless (name-bot), and registered users that we think are legitimate humans (user). The page-types parameter is about content versus non-content pages. Content pages are located in the main namespace, while non-content pages are talk pages and pages in other special namespaces.
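As a rough illustration of how the editor-types and page-types filters fit into a request, here is a minimal Python sketch. It assumes the path layout of the public Wikimedia REST API (`/metrics/edits/aggregate/{project}/{editor-type}/{page-type}/{granularity}/{start}/{end}`). The base URL, the `all-editor-types`, `all-page-types`, `content` and `non-content` values, the date format, and the helper names `edits_url` and `fetch_json` are assumptions made for this example, so check them against the API documentation before relying on them.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Assumed base URL of the Wikimedia REST API metrics endpoints.
BASE_URL = "https://wikimedia.org/api/rest_v1/metrics"

# Filter values named in the text, plus assumed wildcard and page-type values.
EDITOR_TYPES = {"all-editor-types", "anonymous", "group-bot", "name-bot", "user"}
PAGE_TYPES = {"all-page-types", "content", "non-content"}


def edits_url(project, editor_type, page_type, granularity, start, end):
    """Build an edits-aggregate URL (the path layout is an assumption)."""
    if editor_type not in EDITOR_TYPES:
        raise ValueError(f"unknown editor type: {editor_type!r}")
    if page_type not in PAGE_TYPES:
        raise ValueError(f"unknown page type: {page_type!r}")
    parts = ["edits", "aggregate", project, editor_type, page_type,
             granularity, start, end]
    # Percent-encode each segment so it cannot break the path.
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


def fetch_json(url):
    """Fetch and decode a JSON response; requires network access."""
    request = Request(url, headers={"User-Agent": "wikistats-example/0.1"})
    with urlopen(request) as response:
        return json.load(response)


# Daily edits on Spanish Wikipedia for October 2017. Whether the end date
# is inclusive is something to confirm in the API documentation.
october_2017 = edits_url("es.wikipedia", "all-editor-types", "all-page-types",
                         "daily", "20171001", "20171031")
print(october_2017)
```

Calling `fetch_json(october_2017)` would send the request. Swapping `all-editor-types` for `anonymous` or `user` narrows the count to that group of editors.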

A second example: we want to find the number of human editors who have made more than 100 edits over the course of a month, for each month between January and July 2015, on the Commons project:

This request introduces a new parameter, named activity-level. It is defined for requests on editors and edited-pages, and lets you filter for specific levels of activity (1..4-edits, 5..24-edits, 25..99-edits, 100..-edits, or all-activity-levels for no filtering).
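A sketch of the editors request with an activity-level filter could look like the following. The path layout (`/metrics/editors/aggregate/{project}/{editor-type}/{page-type}/{activity-level}/{granularity}/{start}/{end}`), the `commons.wikimedia` project name, the `all-page-types` value, and the helper name `editors_url` are assumptions for illustration. The activity levels are written with the `..` range separator throughout.

```python
from urllib.parse import quote

# Assumed base URL of the Wikimedia REST API metrics endpoints.
BASE_URL = "https://wikimedia.org/api/rest_v1/metrics"

# Activity levels described in the text.
ACTIVITY_LEVELS = (
    "all-activity-levels",
    "1..4-edits",
    "5..24-edits",
    "25..99-edits",
    "100..-edits",
)


def editors_url(project, editor_type, page_type, activity_level,
                granularity, start, end):
    """Build an editors-aggregate URL (the path layout is an assumption)."""
    if activity_level not in ACTIVITY_LEVELS:
        raise ValueError(f"unknown activity level: {activity_level!r}")
    parts = ["editors", "aggregate", project, editor_type, page_type,
             activity_level, granularity, start, end]
    # Dots and hyphens are left as-is by quote(); slashes would be encoded.
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


# Human editors on Commons in the highest activity bucket, month by month,
# for January to July 2015. The end date is assumed to be exclusive here.
very_active = editors_url("commons.wikimedia", "user", "all-page-types",
                          "100..-edits", "monthly", "20150101", "20150801")
print(very_active)
```

Note that the `100..-edits` bucket is open-ended at the top, so it covers every editor at or above that threshold in a given month.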

And a last one, just for fun! Let’s say we want to find the number of pages visited by regular users (not bots) between December 2016 and January 2017 on the English-language edition of Wikipedia. You can see how to add dates below:
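One way to handle the dates is to format them from `datetime.date` objects rather than typing the timestamps by hand. The sketch below assumes that this request maps to a pageviews aggregate endpoint (`/metrics/pageviews/aggregate/{project}/{access}/{agent}/{granularity}/{start}/{end}`) with `YYYYMMDDHH` timestamps and an `agent` value of `user` to exclude bots. The endpoint choice, the `all-access` value, and the helper names `timestamp` and `pageviews_url` are assumptions, not something this post confirms.

```python
from datetime import date
from urllib.parse import quote

# Assumed base URL of the Wikimedia REST API metrics endpoints.
BASE_URL = "https://wikimedia.org/api/rest_v1/metrics"


def timestamp(day):
    """Format a date as YYYYMMDDHH with the hour fixed at 00 (assumed format)."""
    return day.strftime("%Y%m%d") + "00"


def pageviews_url(project, access, agent, granularity, start, end):
    """Build a pageviews-aggregate URL (the path layout is an assumption)."""
    parts = ["pageviews", "aggregate", project, access, agent, granularity,
             timestamp(start), timestamp(end)]
    return BASE_URL + "/" + "/".join(quote(part, safe="") for part in parts)


# Monthly page views by non-bot users on English Wikipedia,
# covering December 2016 and January 2017.
views = pageviews_url("en.wikipedia", "all-access", "user", "monthly",
                      date(2016, 12, 1), date(2017, 1, 31))
print(views)
```

Building the timestamps from `date` objects also rejects impossible dates, such as a 31st of February, before any request is sent.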

That’s it! Please let us know what you like or dislike about the new dashboard, and don’t hesitate to file bugs. This will help us graduate this alpha version to the beta stage.

Joseph Allemandou, Senior Software Engineer, Analytics
Wikimedia Foundation

