Legal:Data retention guidelines

From Wikimedia Foundation Governance Wiki
Revision as of 00:47, 10 January 2014 by JVargas (WMF) (talk | contribs) (Changed Definitions section, and added definitions disclaimer.)

Introduction

Data is important. It is how we can learn and grow as an organization and a movement, and how we can help make the projects better for those who use them to create, learn, and share. At the same time, we are committed to keeping your private data "for the shortest possible time that is consistent with maintenance, understanding, and improving the Wikimedia Sites, and our obligations under applicable U.S. law" (quote from the Wikimedia Foundation Privacy Policy).

This document explains how we fulfill this commitment by describing our guidelines for data retention, system design, and ongoing auditing and maintenance. These guidelines are meant to be a living document: they will be updated over time to reflect current retention practices.

To what data do these guidelines apply?

These guidelines apply to all non-public data we collect from Wikimedia Sites covered by the Privacy Policy.

How long do we retain non-public data?

Unless otherwise indicated, we retain the following types of data for no more than the following periods of time:

| Data type | Source | Examples | Maximum retention period |
|---|---|---|---|
| Personal information | Collected from users | IP addresses of site visitors (operational data); IP addresses of A/B test subjects (analytical data) | After no more than 90 days, it will be deleted, aggregated, or anonymized |
| Personal information | Optionally provided by a user | Email address in account settings | Indefinitely |
| Non-personal information associated with a user account* | Collected from users | | Indefinitely |
| Non-personal information associated with a user account* | Optionally provided by a user | Logs of terms entered into the site's search box | After no more than 90 days, association with personal information will be deleted, aggregated, or anonymized |
| Non-personal information not associated with a user account* | Collected from users | | Indefinitely |

*For the purposes of this table, “user account” means username, user ID, or IP address.
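
As an illustration only, the retention periods listed above could be encoded as a lookup that a system consults to decide whether a record is overdue for handling. This is a minimal sketch, not a description of how any Wikimedia system works: the row labels (`"personal"`, `"collected"`, and so on), the `RetentionRule` type, and the `is_overdue` function are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass(frozen=True)
class RetentionRule:
    # Maximum age before action is required; None stands for "Indefinitely".
    max_age: Optional[timedelta]
    # What must be deleted, aggregated, or anonymized once max_age is exceeded.
    subject: str


NINETY_DAYS = timedelta(days=90)

# Hypothetical labels for the rows of the retention table: (data type, source).
RULES = {
    ("personal", "collected"): RetentionRule(NINETY_DAYS, "the data itself"),
    ("personal", "provided"): RetentionRule(None, ""),
    ("non_personal_with_account", "collected"): RetentionRule(None, ""),
    ("non_personal_with_account", "provided"): RetentionRule(
        NINETY_DAYS, "association with personal information"
    ),
    ("non_personal_without_account", "collected"): RetentionRule(None, ""),
}


def is_overdue(data_type: str, source: str,
               collected_at: datetime, now: datetime) -> bool:
    """Return True if the record is older than its maximum retention period."""
    rule = RULES[(data_type, source)]
    if rule.max_age is None:
        # Retained indefinitely, so it is never overdue.
        return False
    return now - collected_at > rule.max_age


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    old_ip_log = now - timedelta(days=91)
    print(is_overdue("personal", "collected", old_ip_log, now))
```

Note that for search logs associated with a user account, the rule targets the association with personal information rather than the logged terms themselves, which is why the sketch records a `subject` alongside each period.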

Definitions

For the purposes of these guidelines, "personal information" means information you provide us or information we collect from you that could be used to personally identify you. To be clear, while we do not necessarily collect all of the following types of information, we consider the following to be "personal information" if it is otherwise non-public:

(a) your real name, address, phone number, email address, password, identification number on government-issued ID, IP address, credit card number;
(b) when associated with one of the items in subsection (a), any sensitive data such as date of birth, gender, sexual orientation, racial or ethnic origins, medical conditions or disabilities, political affiliation, and religion; and
(c) any of the items in subsections (a) or (b) when associated with your user account.

Terms that are not defined in this document have the same meaning given to them in the Privacy Policy.

Exceptions to these guidelines

If we make exceptions to these guidelines, we will notify the community by describing the exception on this page.

Data may be retained in system backups for longer periods of time.

Audits for existing systems

These guidelines are based on practices that the Foundation has generally followed for many years, particularly the 90-day rule for IP addresses and similar personal information in our server logs. However, our older systems may not always comply with these new guidelines, particularly for personal information other than IP addresses. As a result, once these guidelines are adopted, WMF’s technology teams plan to audit our existing systems and bring them into compliance. Because of the size and scope of these systems, this audit will necessarily proceed gradually.

Design of new systems

In order to support these data retention periods and our overall privacy policy, new tools and systems implemented by the Foundation will be designed with privacy in mind. This will include:

  • inclusion of these data retention guidelines as requirements during the design process
  • legal consultation during the design and development process
  • inclusion of privacy considerations in the code review process

Ongoing handling of new information

Despite our best efforts in designing and deploying new systems, we may occasionally record personal information in a way that does not comply with these guidelines. When we discover such an oversight, we will promptly comply with the guidelines by deleting, aggregating, or anonymizing the information as appropriate.
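
To make the idea of aggregating concrete, the sketch below shows one possible way a cleanup step might handle log records that have passed the 90-day limit: it keeps recent records untouched and reduces older ones to per-day counts, so that no IP address survives. The guidelines do not specify how deletion, aggregation, or anonymization is carried out, so the record layout (`ip`, `timestamp`) and the `remediate` function are assumptions made purely for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# The 90-day maximum retention period for personal information such as IP addresses.
MAX_AGE = timedelta(days=90)


def remediate(records, now):
    """Keep recent records as-is; reduce expired ones to per-day counts.

    `records` is assumed to be a list of dicts with "ip" and "timestamp" keys
    (a hypothetical layout). Returns (kept_records, daily_counts).
    """
    kept = []
    daily_counts = Counter()
    for rec in records:
        if now - rec["timestamp"] > MAX_AGE:
            # Aggregate: only the date is counted; the IP address is discarded.
            daily_counts[rec["timestamp"].date()] += 1
        else:
            kept.append(rec)
    return kept, dict(daily_counts)


if __name__ == "__main__":
    now = datetime(2014, 6, 1, tzinfo=timezone.utc)
    # Addresses below come from ranges reserved for documentation.
    records = [
        {"ip": "192.0.2.1", "timestamp": now - timedelta(days=120)},
        {"ip": "192.0.2.2", "timestamp": now - timedelta(days=120)},
        {"ip": "198.51.100.7", "timestamp": now - timedelta(days=5)},
    ]
    kept, counts = remediate(records, now)
    print(len(kept), counts)
```

Aggregation is only one of the three options named above; an equivalent step could instead delete expired records outright or replace the identifying field with a non-identifying value.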