Legal talk:Data retention guidelines: Difference between revisions

From Wikimedia Foundation Governance Wiki

Revision as of 19:14, 21 February 2014


Comments from //Shell

The following discussion is closed: Archiving whole section now as they all appear resolved/set, please reopen if not. Will archive soon if still closed. Jalexander--WMF 00:26, 13 February 2014 (UTC) [reply]

Introduction

How long do we retain non-public data?

    • "After no more than 90 days..." I had to think twice about what it means. Would it be possible to say "After at most 90 days..."?
      Good suggestion. I've made the change. Mpaulson (WMF) (talk) 01:03, 10 January 2014 (UTC)[reply]
      Nice. //Shell 09:09, 10 January 2014 (UTC)[reply]
    • "Anonymized" What does this mean? Does it mean that it becomes very difficult to associate the data to a specific user, or that it's completely impossible? (Clarification: Especially for small projects, say 5 editors a normal day)
Hi //Shell, we have added some additional definitions and examples to the definitions section of the guidelines. Thanks for the comment! RPatel (WMF) (talk) 22:23, 3 February 2014 (UTC)[reply]
    • "Email address in account settings: Indefinitely" Does this mean that if I remove or change my email address, the old address will still be kept? Is that the meaning? Is it desirable? Not sure how to rephrase it to only be about the current email address.
I've explained how this happened, and proposed a solution, below. —LVilla (WMF) (talk) 00:48, 4 February 2014 (UTC)[reply]
    • "Non-personal information associated with a user account: Collected from user: Indefinitely" While the given examples seem okay, this category seems broad and that's particularly bad since the data is kept indefinitely. The given examples seem okay, since they're almost already public data (first edit, when a user has verified email, and whether the user edits through mobile are public data). E.g. the list of read articles is not public, but could be covered by this category.
      That's a fair point. It's hard to draw a line and nail down what almost public data means, but the goal of this section and the list of examples we provided is to try and characterize this category of data as much as possible, without providing an exhaustive list (which we can't do, as Michelle notes below). The bottom line is that we want to commit to retaining indefinitely the same kind of data about individual users that we would be comfortable sharing publicly. What makes this data subject to different terms than metadata collected and published when saving an edit is that it's passively collected and not explicitly released under Wikimedia's terms of use. So, in short: "user X registered an account on a mobile device", "user X edited a page via Visual Editor", or "user X was thanked by user Y for an edit s/he made" could all be considered examples of almost public data, as they don't disclose anything that falls within the definition of PII, whereas "list of articles read by user X" definitely does: we can't and we won't retain or release this data unless the user intentionally decides to do so. Maybe the best way to frame this distinction is to say that deciding whether almost public data could be publicly released is not a question settled on legal grounds but on whether it's appropriate and desirable (if needed, a decision could be made via a community consultation or an RFC). Michelle, is that an appropriate distinction? Hope this helps clarify what we're trying to do here; any suggestion to improve the language and terminology is welcome. DarTar (talk) 01:58, 31 January 2014 (UTC)[reply]
      DarTar: A list of read articles is not explicitly listed as "personal information", nor is it explicit in the "How long do we retain public data?" table. I realize that the reason for this might be that it's simply not saved and thus not relevant, but I'd like to see it mentioned somewhere what WMF considers a list of read articles to be. If you wish you could add a note like "currently not kept at all", but things may change, and this seems like a basic piece of information. //Shell 06:54, 11 February 2014 (UTC)[reply]
      Hi Shell! We talked about it a bit after Dario responded to you and decided to add it specifically to the table, so people would be clear about how long we retain that type of information. Hope that helps! Mpaulson (WMF) (talk) 23:59, 11 February 2014 (UTC)[reply]
      Great! //Shell 06:30, 12 February 2014 (UTC)[reply]
    • "Non-personal information associated with a user account: Optionally provided by a user: Logs of terms entered into the site's search box" I realize that "optional" here means that not every WM site visitor must search, but since it's a key part of any wiki it doesn't feel like I "optionally provided" it - I must do it to see the article I'm interested in (ignoring other search engines). No biggie, but feels a bit weird.
      I see your point here, Shell. We weren't sure how to best phrase the differentiation between information collected from the user and information provided by the user. We're open to suggestions though if you or anyone else has one. Mpaulson (WMF) (talk) 01:17, 10 January 2014 (UTC)[reply]
      Would it be possible to remove "optionally" and just say "Provided by a user"? //Shell 09:09, 10 January 2014 (UTC)[reply]
      I would be fine with doing that. I think we originally added "optionally" to more clearly distinguish that kind of data from data that is collected either automatically or actively by us. But obviously, if it makes it less clear rather than helping, we can remove it. Mpaulson (WMF) (talk) 14:32, 10 January 2014 (UTC)[reply]
      I was confused about search terms being optional, since they feel necessary to use the site, while the email address is usually mandatory, but in the Wikimedia case it's optional. So, I wouldn't mind adding back "optional" to the "personal information" one, but it's more consistent not to. //Shell 19:04, 10 January 2014 (UTC)[reply]
    • Do you intend to have most common data in this table, in the form of examples? It would be nice to see a complete list somewhere (though that might be asking too much).
      The table is meant to address broad categories of data so that we address the treatment of as much data as we can in these guidelines. That said, we are going to try to improve the table (and the exceptions section) with more examples over time as we refine our practices. Mpaulson (WMF) (talk) 01:21, 10 January 2014 (UTC)[reply]
      It would be nice to have as many examples as possible, so I could imagine that there was a long list in this table, but collapsed by default. //Shell 09:09, 10 January 2014 (UTC)[reply]
      I agree. The hope is that we will gradually expand the guidelines with more examples over time. I will talk to people internally and see what additional examples (if any) we can add now though. I imagine if the table gets unwieldy, we'll experiment with formatting so that it's as easy-to-read as we can make it. Mpaulson (WMF) (talk) 14:20, 10 January 2014 (UTC)[reply]
      Great. Since there are already examples that feel representative, it's not a big deal, but it'd be nice to eventually have an almost complete list. //Shell 19:04, 10 January 2014 (UTC)[reply]

Definition of personal information (good job!)

    • I can think of a couple more items to put in (b), though I'm not sure if it's necessary: (current) city (clarification: which is different/broader than address), marital status, family ties
      I added "marital and familial status" to the definition. I'm checking internally whether it makes sense to add current city. Mpaulson (WMF) (talk) 18:41, 10 January 2014 (UTC)[reply]
      I was thinking about city, since that's something you can "easily" get from an IP address, whereas a street address is not.
      Of course there's lots of other private information, but maybe it's unnecessary to add that, since I don't see how Wikimedia would get the info: income level/economic situation, level of education, profession, current job situation, hobbies/interests (though interests could be gleaned from what pages a user visits).
      There's also the user-agent info: OS/browser version, browser language(s), screen size etc. which websites almost never make public, but which could potentially uniquely identify a user over multiple websites[1]. //Shell 19:04, 10 January 2014 (UTC)[reply]
      We have added user-agent string to the definition of personal information, so that should be covered now. As for the other "private information" you mentioned earlier, I don't think that level of detail is necessary as the categories in (b) are meant to be illustrative examples of what we consider to be "sensitive information". Mpaulson (WMF) (talk) 22:51, 14 January 2014 (UTC)[reply]


Exceptions to these guidelines

    • "Data may be retained in system backups for longer periods of time." Is there any restriction on how long those backups can exist? Would it be possible, for instance, to delete, aggregate, or anonymize them after at most 5 years?
      Hi Shell. We have discussed your proposal at length internally and agree that it's a good idea. I will be adding corresponding language to the guidelines. Thank you so much for your suggestion. Mpaulson (WMF) (talk) 23:51, 11 February 2014 (UTC)[reply]
      Good. //Shell 06:31, 12 February 2014 (UTC)[reply]

Design of new systems

    • "inclusion of privacy considerations in the code review process". Would this be added to some checklist, or is it just a general guideline?

Great to see this stuff be explicit. //Shell 00:15, 10 January 2014 (UTC)[reply]

Hi, //Shell: The idea was "both", I think - we'd like all Mediawiki developers to be generally careful/sensitive about this, but Engineering would also like to add it to the formal code review guidelines the Mediawiki community has. You can see some discussion about what text to add to the guidelines over on mediawiki.org. —LVilla (WMF) (talk) 18:31, 30 January 2014 (UTC)[reply]
Ok. I'm just interested on an overview level regarding this issue and won't be following what exact protocols you decide on. Anyway, good! //Shell 06:42, 11 February 2014 (UTC)[reply]

General comment/response to Shell

Hi Shell! Thank you for taking the time to comment and help us improve these guidelines. Your suggestions are always helpful and greatly appreciated. We will respond in-line to your comments as we work through them. Mpaulson (WMF) (talk) 00:51, 10 January 2014 (UTC)[reply]
I've responded to your comments (and clarified a couple of things). //Shell 09:09, 10 January 2014 (UTC)[reply]

Indefinite retention of emails

The following discussion is closed: closing as it looks resolved/set, please reopen if not. Will archive soon if still closed. Jalexander--WMF 00:26, 13 February 2014 (UTC)[reply]

Why would emails be retained indefinitely? I would have expected that if an account gets "officially" closed, the user identifies under a new account and declares the old one as discontinued, or exercises their Right to Vanish, these are all scenarios where an email would not be kept on record forever. -- (talk) 08:03, 11 January 2014 (UTC)[reply]

That was because of a misunderstanding on my part; it does work as you'd expect and I had it wrong while drafting. I'm working on figuring out how to make the table more accurate (probably a new row for things that will be deleted when users delete them, like email) and will post above in Shell's thread when we've figured that out. —LVilla (WMF) (talk) 01:34, 23 January 2014 (UTC)[reply]
After trying to find the specific sub-section in Shell's list above, I realized it was just easier to post here and reopen this one :)
This category was always intended primarily for account settings. Since we aren't currently aware of other examples that would fit in this category as it was designed, we propose removing it and adding this row instead:
Data type | Origin | Examples | Maximum retention period
Personal information | Account settings | Email address | Until user deletes/changes the account setting.
Does that work/make sense? —LVilla (WMF) (talk) 00:47, 4 February 2014 (UTC)[reply]
Sounds good to me. :-) //Shell 06:32, 12 February 2014 (UTC)[reply]

Non-personal information associated with a user account (server logs)

This should include the contents of some HTTP headers, which may have privacy concerns, including:

  • Referer: the previous page visited, which may be on any other site (in my opinion if this is from another site, it is strictly private and can only be used as analytic data, only in aggregated forms by origin domain). Almost all browsers send this information by default (unless the user has installed a filtering plugin).
  • Accept-Language: the default language of the browser used, or the list of preferred languages defined in browser preferences. Some combinations of preferred languages may be very user-specific, notably if the languages are very uncommon in the country or region associated with the geolocated IP (e.g. Icelandic or Wolof selected by a user currently in locations like Monaco, Addis Ababa or Harbin, China).
  • User-Agent: and Accept: which identify precisely the type and version of the browser, and of its supported or installed plugins. This information is used by CheckUser admins trying to identify a user from their past navigation with the same browser installation when the IP address alone is not enough to establish that this is the same user. The exact configuration of these combinations of software versions may be unique to a user, notably when the user has installed some uncommon plugin (this includes media player extensions, or localized versions of security tools) or uses an uncommon browser for a specific platform.
  • X-Chrome-*: and similar custom HTTP headers defined by browsers or plugins (including antivirus tools). Some of these headers contain user IDs associated with the registration of the plugin or browser; this is very common for media players, for custom browsers embedded within game software or game consoles, for some smart TV sets or set-top boxes, and for some brands of mobile devices.
  • Via: and similar HTTP headers defined by proxies relaying the user's navigation. Some of these headers identify the originating user behind a non-anonymizing proxy. Frequently, they contain personal information such as an authorized user name registered on the proxy, the IP address of the connected user, a hardware identifier of a mobile device using a public hotspot, a user id assigned internally by the proxy or hotspot (for example in a McDonald's restaurant or in a train station), or session identifiers generated on those proxies or hotspots and locally associated with an identified user. That user's account may persist there for a long time, and the identifiers will be sent again each time the same user returns to the same location to use the hotspot with the same device or the same local user account. Generally these identifiers (and the full set of HTTP headers) may be requested by admins of these proxies or hotspots when they receive an alert that one of their users is using their service to abuse external sites such as Wikimedia.
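As a rough illustration of the header categories listed above, the following Python sketch shows one way a server could reduce request headers before writing them to a log. It is only a sketch under stated assumptions: the function name `sanitize_headers`, the drop lists, and the choice to keep only the referring domain and the primary language are hypothetical, and nothing here describes how Wikimedia's servers actually process or log headers.

```python
from urllib.parse import urlsplit

# Hypothetical policy for illustration only; not Wikimedia's configuration.
DROP_EXACT = {"via", "user-agent", "accept"}  # proxy and browser fingerprint headers
DROP_PREFIXES = ("x-",)                       # custom browser/plugin headers such as X-Chrome-*


def sanitize_headers(headers):
    """Return a reduced copy of the request headers that is safer to log."""
    reduced = {}
    for name, value in headers.items():
        key = name.lower()
        if key in DROP_EXACT or key.startswith(DROP_PREFIXES):
            continue  # dropped entirely
        if key == "referer":
            # Keep only the origin domain, never the full previous URL.
            reduced["referer-domain"] = urlsplit(value).hostname or ""
        elif key == "accept-language":
            # Keep only the primary language, without region or q-values.
            first_tag = value.split(",")[0].split(";")[0].strip()
            reduced["accept-language"] = first_tag.split("-")[0].lower()
        else:
            reduced[key] = value
    return reduced
```

Dropping a header outright is the simplest choice; whether a given header should instead be kept for a short period (for example for CheckUser purposes) is a policy question that this sketch does not try to answer.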

There are also:

  • Cookies: but they are defined by the visited site itself and should be subject to the policy about permanent or session cookies defined by the visited wikimedia sites (this includes cookies generated once the user logs on any Wikimedia site with SUL).
  • Data collected by JavaScript (or scripted plugins such as Flash and media codecs), which can collect other capabilities of the device (such as the display resolution) or its settings, and data sent to servers by dynamic HTTP requests generated by these scripts. Some of these scripts may also send regular "ping" events to show that the user is still connected to the same page. They could even track what the user is reading specifically in the page (for example when the user interacts with it to unhide a "rolled" box, or clicks on visible tabs to see other tabs). Some browser-side scripts may also respond to an incoming event from the server. This allows a site to know that the user has been active for a long time on one specific page; however, this data is sent in separate background HTTP requests, which do not always go to the same site as the visited one and which are logged separately on the queried server.
  • Data collected by media players for tracking the quality of connections for the delivery of streams. In some cases the media players will switch to use another stream.
  • Some media, such as video and audio, include timecodes that also allow the site to track which part of the media has been played, and how many times, by the user. When the user pauses the media, rewinds to repeat it, or skips some parts, the media server may know it.
  • DNS resolution requests and similar "site info" requests, including requests for TXT records checked by security tools, or for "finger" and "whois" info: not all of them come from an ISP; some may be sent directly to Wikimedia DNS servers by a plugin in the browser or by the browser itself (trying to assess the site). Some of these requests may be very user-specific if they test some aliased subdomain names within Wikimedia domains, or if they perform queries that are typically only performed by ISPs. Users may perform direct DNS requests to Wikimedia domains. In some cases the ISP may reveal information about the user for which it forwards the DNS resolution request, as part of the DNS query itself, in timely reproducible patterns of events. These requests do not reach a webserver but an infrastructure server managed by Wikimedia (but possibly hosted by a third-party domain hosting provider, operating with its own data retention and privacy policies).

More generally, this data includes everything that is stored by the webserver in the server logs, and it is much more than just the IP or the URL visited with its query parameters. Some webserver logs may add query parameters that are not present in the URL but were sent as POST data, and that may be converted by one of the front proxies used by Wikimedia sites into GET parameters present in the URL submitted to the backend server.

Note that there are logs stored in front proxies (including the various Squid instances connected to the public IP address) and logs stored by backend webservers. There may be filters in front proxies, and front proxies may anonymize part of these requests (notably requests whose cacheable results will be delivered to multiple users).

Server logs are subject to US law where it requires sites in the US to retain these logs for some period of time. All these logs are also used by CheckUser admins. verdy_p (talk) 00:53, 15 January 2014 (UTC)[reply]

Hi, Verdy:
Thanks for your detailed thinking on this. There are many different parts to this; let me try to respond in pieces:
User Agent information: We agree that UAs should be treated as personal information, and covered by this policy; that is why it is in the definition of PI :) We're already working on this, for example by filtering UAs in Labs and by working to sanitize them in Event Logging.
Other HTTP headers: I see your point about putting this in PI. We’re talking with analytics and ops about how best to handle them.
Cookies: These aren’t data stored on our servers so they aren't appropriate for this policy. They are instead addressed in the general privacy policy.
Other examples from site users: You listed a lot of other examples, such as data collected through javascript methods, and from hypothetical future media servers. Some of these we implicitly mention (EventLogging is a javascript-based tool); others are not. We’ll try to expand the list of examples over time, but ultimately, the examples are examples - they can’t be, and aren’t intended to be, a complete list. Instead, we’ll apply the general principles described here as situations come up.
Examples from outside the sites: DNS logs would be covered by this policy, since they are “services” as defined in the Privacy Policy, and don’t currently have a separate privacy policy. You’re correct to point out that in some circumstances those logs could be identifying.
US law and log retention: There may be some unusual circumstances where we're required to stop deleting logs (i.e., if we're sued and the logs have some data relevant to that) but as a general matter there are no US laws (federal or state) that require log retention.
Hope that helps clarify. —LVilla (WMF) (talk) 02:16, 4 February 2014 (UTC)[reply]
Further followup on DNS: the DNS tool we use doesn't log requests at all, only aggregate counts. Hope that helps. —LVilla (WMF) (talk) 20:08, 4 February 2014 (UTC)[reply]
@Verdy p: Final point on other HTTP headers: we talked about this after your question, and we realized that our approach was not quite right. As a result, we've made changes to the data retention policy and the privacy policy. I've discussed the changes in a lot more detail on the main privacy policy talk page. Thanks for raising the issue! —LVilla (WMF) (talk) 18:49, 13 February 2014 (UTC)[reply]
Thanks a lot for taking note about these issues and revisiting a few missing/unclear items.
However, this subject of metadata in server requests, as well as the integration of active components (like multimedia plugins), is not closed. As technologies continue to evolve, and browsers (or security suites) perform hidden background requests to many other third parties, we'll need to track it for a long time. The issue is more severe with components that are mandatory parts of the Internet architecture itself (notably DNS, IP routing data exchanges, finger, the PKI architecture and secure authentication key exchanges) and with other technologies supposed to mitigate this risk (such as DNT protocols). DNS is now the most attacked protocol (in terms of global network neutrality) by ISPs themselves (and all their third-party service providers).
I'm not even sure that the use of HTTPS now on Wikimedia will really improve privacy, or whether it will just help those who want to identify and track users... Even users of The Onion Network may find they can be tracked, even if the exchanged contents are encrypted: it will still be easy to track recent changes occurring in MediaWiki projects and correlate them with traffic initiated from one "anonymous site" whose authentication key may be indexed at its source and correlated with traffic reaching the public sites.
Maybe we publish too many things in Wikimedia public logs. We could mitigate this risk by reducing the precision of timestamps to only 5 minutes, and shuffling entries from multiple users so that their order of occurrence cannot be deduced. We should also probably hide part of the IP addresses of non-logged-in users, keeping only about 20 bits. We could also assign better "anonymous user names" to these IPs, for example by hashing the address with the time of creation of the user name and some secret data used at that time for a limited period and changed regularly: the server would issue new randomized data for each new period of time, for example once every week, encrypt the start time of that period with an encryption key owned only by the WMF, and then use that time-key as additional data alongside the IP address to generate a string hash used as the "public user name". We should better protect the privacy of IP users, notably because they may be logged out by accident (by expiration of their current login session), and so we should not reveal these IPs publicly (let's leave that possibility only to CheckUsers using server logs).
Note that the public username assigned to IP-only users (connected with IPv4 or IPv6), i.e. the encrypted user id generated as above (a unique but temporary id not lasting more than one week, so that admins can still block most abusers easily for one week), may take the form of a 128-bit IPv6 address allocated in a private IPv6 address block. It will not be routable on the Internet (except possibly via Wikimedia servers offering some routing to these users, using the privately stored secure mappings). This form would work with existing tools that parse IP users as those with a username that looks like an IP address.
And this should be investigated to make sure that there are no "black hats" exploiting them to track users back to their source, even if these black hats don't know exactly the route followed by this traffic.
For this I would advocate the development or support of very secure browsers which could hide the user's traffic directly from its source (Tor has this in its specific version of the Mozilla browser; but users are at risk when using any mobile device from famous brands, except possibly the rare mobile devices built on top of Linux OSes, such as Ubuntu Mobile) verdy_p (talk) 14:08, 19 February 2014 (UTC)[reply]
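The mitigations proposed in the comment above (5-minute timestamps, keeping only about 20 bits of an IP address, and a temporary public name shaped like a private IPv6 address) can be sketched in Python as follows. This is a minimal sketch and not a description of any MediaWiki or WMF implementation. All function names are hypothetical, and the encrypted "time-key" construction is replaced here by a plain HMAC keyed with a per-period secret, which is a swapped-in simplification. The `fd00::/8` unique local range stands in for the "private IPv6 address block".

```python
import hashlib
import hmac
import ipaddress
from datetime import datetime, timezone


def coarsen_timestamp(ts, minutes=5):
    """Round a datetime down to a multiple of `minutes` (minutes must divide 60)."""
    return ts.replace(minute=ts.minute - ts.minute % minutes, second=0, microsecond=0)


def truncate_ipv4(ip, keep_bits=20):
    """Keep only the top `keep_bits` bits of an IPv4 address, as a network string."""
    return str(ipaddress.ip_network(f"{ip}/{keep_bits}", strict=False))


def period_pseudonym(ip, period_secret):
    """Derive a temporary public name that looks like a private IPv6 address.

    `period_secret` (bytes) is assumed to be random, known only to the
    operator, and replaced at the start of each period (e.g. weekly).
    """
    packed = ipaddress.ip_address(ip).packed  # accepts IPv4 or IPv6 input
    digest = hmac.new(period_secret, packed, hashlib.sha256).digest()
    # Prefix byte 0xfd places the result in fd00::/8 (unique local addresses).
    return str(ipaddress.IPv6Address(b"\xfd" + digest[:15]))
```

With this design the same IP maps to the same name for as long as the secret is unchanged, so a block placed on the name keeps working during that period, and the name changes when the secret is rotated. Whoever holds the secret can recompute the mapping for any candidate IP, so the secret would need the same protection as the raw logs.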
Thanks for continuing the discussion, Verdy. Let me respond briefly:
Server-request metadata: I agree that there will be a lot of changes in the future. That's why I like the change we made in response to your earlier comments - instead of using a precise, defined list, we gave ourselves some flexibility so that we can do the right thing when new technologies arise. Thank you again for raising that - it is probably one of the most important changes we made in response to community feedback.
Public information: We do publish a lot of information, like precise timestamps. As we've discussed extensively in the main privacy policy discussion, changing these would be a huge change that would break many third-party tools that are quite important to the functioning of the site. So these changes could be made, but the discussion must be had at a technical and social level, with participation from WMF engineering, bot authors, and checkusers (among many other people). The privacy policy has had a lot of positive impact (ops, for example, is starting to improve logging already) but this is only a starting point - more extensive changes of that sort really have to happen separately.
Protecting IP users: I agree that this is important, but it will require deep technological changes to the site. It has to be discussed as a technical issue first, with a good strategy to address it, before we can write it into the privacy policy.
Browsers: I agree that it would be good if browsers and other related tools took privacy more seriously, but that's well outside the scope of what the Foundation can do at this time - we need to focus on what we can control.
Hope that helps explain the situation - thanks again for your serious comments on these important issues. -LuisV (WMF) (talk) 19:14, 21 February 2014 (UTC)[reply]

Advertising on projects

This discussion has been open since the 10th of January and is due to close in 4 days. However, it seems that, as of today, its existence has not been advertised on the French Wikipedia (correct me if I'm wrong). I see that as a problem, since those guidelines will affect all users of the projects of the Wikimedia Foundation...

Pleclown (talk) 10:46, 10 February 2014 (UTC)[reply]

Hi Pleclown. You are correct that the only notifications that went out were to wikimedia-l/WikimediaAnnounce and on the talk pages for the privacy policy and access policy. However, the Data Retention Guidelines are just that-- guidelines. Unlike a Policy, guidelines do not require a vote from the Board and can be amended at any time. We intend for it to be a living document and we welcome discussion about it even after the consultation period ends. Thanks for the input! RPatel (WMF) (talk) 23:08, 10 February 2014 (UTC)[reply]

Location information

This is part of the personal information definition, and it needs to be more specific. First, please revise to "...location information (if you have not posted it publicly)". In other words, personal information voluntarily provided on a WMF project by an individual can't really be treated in the same way as personal information that has not been publicly provided.

With respect again to location, when wearing my checkuser hat, I think we might need to be a bit more clear as to what would or would not fall into the "location" issue. Is naming the country giving away location? This comes up regularly when addressing sockpuppetry issues. Risker (talk) 16:44, 10 February 2014 (UTC)[reply]

Hi Risker! That's a good call. I've added your suggested phrasing accordingly. With regards to location, the privacy policy permits public disclosure of location information as long as it's properly aggregated or anonymized. Most of the time, a country-level identification would be sufficient to protect a user's identity. However, there may be rare cases where there is such a small community in a particular country that we would not feel comfortable releasing even country-level information. Does that make sense? Mpaulson (WMF) (talk) 19:07, 11 February 2014 (UTC)[reply]
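The point about small communities can be made concrete with a small-count suppression sketch in Python. It assumes a hypothetical input of (user id, country) pairs and a hypothetical threshold `min_users`; neither the function name nor the threshold value comes from the guidelines, and this is only one common way to aggregate location data before release.

```python
from collections import defaultdict


def aggregate_by_country(records, min_users=10):
    """Count distinct users per country, folding small groups into "other".

    `records` is an iterable of (user_id, country) pairs. Countries with
    fewer than `min_users` distinct users are not reported individually.
    """
    users_by_country = defaultdict(set)
    for user_id, country in records:
        users_by_country[country].add(user_id)

    result = {}
    folded = 0
    for country, users in users_by_country.items():
        if len(users) >= min_users:
            result[country] = len(users)
        else:
            folded += len(users)
    if folded:
        result["other"] = folded
    return result
```

Note that if only one small country is folded, the "other" bucket is still small and could itself be identifying, so a real release process would need an additional check on that bucket.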

Closing of the Consultation Period for the Data Retention Guidelines

The community consultation for the Data Retention Guidelines has closed as of 14 February 2014. We thank the community members who have participated in this discussion since the opening of the consultation on 09 January 2014 and have helped make the Guidelines better as a result. Although we are closing the community consultation, we welcome community members to continue the discussion. The Guidelines are intended to evolve and expand over time. You can read more about the consultation on the Wikimedia blog. Mpaulson (WMF) (talk) 00:02, 15 February 2014 (UTC)[reply]