Can machine learning uncover Wikipedia’s missing “citation needed” tags?

One of the key mechanisms that allows Wikipedia to maintain its high quality is the use of inline citations. Through citations, readers and editors make sure that information in an article accurately reflects its source. As Wikipedia’s verifiability policy mandates, “material challenged or likely to be challenged, and all quotations, must be attributed to a…

How many Wikipedia references are available to read? We measured the proportion of open access sources across languages and topics.

Let’s say you’re planning a trip to a subtropical region and you want to learn about available vaccines for yellow fever. You look up the English Wikipedia article. You’re lucky to find a well-sourced section, with a wealth of references, many of them pointing to information from public health agencies and reputable news articles. Great!…

What are the ten most cited sources on Wikipedia? Let’s ask the data.

A new dataset of fifteen million records documents source usage in Wikipedia by identifier and across languages.

How we’re using machine learning to visually enrich Wikidata

Only 2.5 million of 45 million Wikidata items have an image attached. A new algorithm helps people find relevant and high-quality images to add to Wikidata items.
