Policy talk:Privacy policy: Difference between revisions

From Wikimedia Foundation Governance Wiki
::::{{ping|LVilla (WMF)}} are normal admins (sysops) exempt from this policy, or does that wording only apply to CU/OS/Stewards, who have [[OS|more]] [[CU|specific]] [[Access to nonpublic information policy|policies]]? [[User:PiRSquared17|PiRSquared17]] ([[User talk:PiRSquared17|talk]]) 21:53, 16 May 2014 (UTC)
:Hi [[User:Reguyla|Reguyla]] & [[User:PiRSquared17|PiRSquared17]]. Thank you for your comments and questions. We wanted to clarify why [[m:Privacy_policy#What_This_Privacy_Policy_Does_.26_Doesn.27t_Cover|administrative volunteers are excluded from the privacy policy]]. The privacy policy is meant to be an agreement between the Foundation and its users on how the Foundation will handle user data. The Foundation can’t control the actions of community members such as administrative volunteers, so we don’t include them under the privacy policy. However, administrative volunteers, including CheckUsers and Stewards are subject to the [[m:Access_to_nonpublic_information_policy|access to nonpublic information policy]] (access policy). Under the access policy, these volunteers must sign a [[m:Confidentiality_agreement_for_nonpublic_information|confidentiality agreement]] which requires them to treat any personal information that they handle according to the same standards outlined in the privacy policy. So, even though administrative volunteers are not included in the privacy policy, the access policy and the confidentiality agreement require them to follow the same rules set forth in the privacy policy. I hope that clears up any confusion. [[User:RPatel (WMF)|RPatel (WMF)]] ([[User talk:RPatel (WMF)|talk]]) 20:48, 20 May 2014 (UTC)
::The [[Access to nonpublic information policy]] does not apply to "normal" sysops who are not identified to the Wikimedia Foundation, but who may have access to some private data (deleted edits). [[User:PiRSquared17|PiRSquared17]] ([[User talk:PiRSquared17|talk]]) 23:07, 20 May 2014 (UTC)


== Typo ==

Revision as of 23:07, 20 May 2014


Shortcut:
T:P


What is changing?

Several comments below ask about what’s new in this draft as compared to the current privacy policy. To help new folks just joining the conversation, we have outlined the main changes in this box. But feel free to join the discussion about these changes here.

As a general matter, because the current privacy policy was written in 2008, it did not anticipate many technologies that we are using today. Where the current policy is silent, the new draft spells out to users how their data is collected and used. Here are some specific examples:

  1. Cookies: The current policy mentions the use of temporary session cookies and broadly states some differences in the use of cookies between mere reading and logged-in reading or editing. The FAQ in the new draft lists specific cookies that we use and specifies what they are used for and when they expire. The draft policy further clarifies that we will never use third-party cookies without permission from users. It also outlines other technologies that we may consider using to collect data like tracking pixels or local storage.
  2. Location data: Whereas the current policy does not address collection and use of location data, the draft policy spells out how you may be communicating the location of your device through GPS and similar technologies, meta data from uploaded images, and IP addresses. It also explains how we may use that data.
  3. Information we receive automatically: The current policy does not clearly explain that we can receive certain data automatically. The new draft explains that when you make requests to our servers you submit certain information automatically. It also specifies how we use this information to administer the sites, provide greater security, fight vandalism, optimize mobile applications, and otherwise make it easier for you to use the sites.
  4. Limited data sharing: The current policy narrowly states that user passwords and cookies shouldn’t be disclosed except as required by law, but doesn’t specify how other data may be shared. The new draft expressly lists how all data may be shared, not just passwords and cookies. This includes discussing how we share some data with volunteer developers, whose work is essential for our open source projects. It also includes providing non-personal data to researchers who can share their findings with our community so that we can understand the projects and make them better.
  5. Never selling user data: The current policy doesn’t mention this. While long-term editors and community members understand that selling data is against our ethos, newcomers have no way of knowing how our projects are different from most other websites unless we expressly tell them. The new draft spells out that we would never sell or rent their data or use it to sell them anything.
  6. Notifications: We introduced notifications after the current policy was drafted. So, unsurprisingly, it doesn’t mention them. The new draft explains how notifications are used, that they can sometimes collect data through tracking pixels, and how you can opt out.
  7. Scope of the policy: The current policy states its scope in general terms, and we want to be clearer about when the policy applies. The new draft includes a section explaining what the policy does and doesn’t cover in more detail.
  8. Surveys and feedback: The current policy doesn’t specifically address surveys and feedback forms. The new draft explains when we may use surveys and how we will notify you what information we collect.
  9. Procedures for updating the policy: The new draft specifically indicates how we will notify you if the policy needs to be changed. This is consistent with our current practice, but we want to make our commitment clear: we will provide advance notice for substantial changes to the privacy policy, allow community comment, and provide those changes in multiple languages.

This is of course not a comprehensive list of changes. If you see other changes that you are curious about, feel free to raise them and we will clarify the intent.

The purpose of a privacy policy is to inform users about what information is collected, how it is used, and whom it is shared with. The current policy did this well back when it was written, but it is simply outdated. We hope that with your help the new policy will address all the relevant information about use of personal data on the projects. YWelinder (WMF) (talk) 01:07, 6 September 2013 (UTC)



Handling our user data - an appeal

Preface (Wikimedia Deutschland)

For several months, there have been regular discussions on data protection and the way Wikimedia deals with it, in the German-speaking community – one of the largest non-English-speaking communities in the Wikimedia movement. Of course, this particularly concerns people actively involved in Wikipedia, but also those active on other Wikimedia projects.

The German-speaking community has always been interested in data protection. However, this particular discussion was triggered when the Deep User Inspector tool on Tool Labs nullified a long-respected agreement in the Toolserver, that aggregated personalized data would only be available after an opt-in by the user.

As the Wikimedia Foundation is currently reviewing its privacy policy and has requested feedback and discussion here by 15 January, Wikimedia Deutschland has asked the community to draft a statement. The text presented below was largely written by User:NordNordWest and signed by almost 120 people involved in German Wikimedia projects. It highlights the many concerns and worries of the German-speaking community, so we believe it can enhance the discussion on these issues. We would like to thank everyone involved.

This text was published in German simultaneously in the Wikimedia Deutschland-blog and in the Kurier, an analogue to the English "Signpost". This translation has been additionally sent as a draft to the WMF movement-blog.

(preface Denis Barthel (WMDE) (talk), 20.12.)

Starting position

The revelations by Edward Snowden and the migration of programs from the Toolserver to ToolLabs prompted discussions among the community on the subject of user data and how to deal with it. On the one hand, a diverse range of security features are available to registered users:

  • Users can register under a pseudonym.
  • The IP address of registered users is not shown. Only users with CheckUser permission can see IP addresses.
  • Users have a right to anonymity. This includes all types of personal data: names, age, background, gender, family status, occupation, level of education, religion, political views, sexual orientation, etc.
  • As a direct reaction to Snowden’s revelations, the HTTPS protocol has been used as standard since summer 2013 (see m:HTTPS), so that, among other things, it should no longer be visible from outside which pages are called up by which users and what information is sent by a user.

On the other hand, however, all of a user’s contributions are recorded with exact timestamps. Access to this data is available to everyone and allows the creation of user profiles. While the tools were running on the Toolserver, user profiles could only be created from aggregated data with the consent of the user concerned (opt-in procedure). This was because the Toolserver was operated by Wikimedia Deutschland and therefore subject to German data protection law, one of the strictest in the world. However, evaluation tools that were independent of the Foundation and any of its chapters already existed.

One example is Wikichecker, which, however, only concerns English-language Wikipedia. The migration of programs to ToolLabs, which means that they no longer have to function in accordance with German data protection law, prompted a survey of whether a voluntary opt-in system should still be mandatory for X!’s Edit Counter or whether opt-in should be abandoned altogether. The survey resulted in a majority of 259 votes for keeping opt-in, with 26 users voting for replacing it with an opt-out solution and 195 in favor of removing it completely. As a direct reaction to these results, a new tool – Deep User Inspector – was programmed to provide aggregated user data across projects without giving users a chance to object. Alongside basic numbers of contributions, the tool also provides statistics on, for example, the times on weekdays when a user was active, lists of voting behavior, or a map showing the location of subjects on which the user has edited articles. This aggregation of data allows simple inferences to be made about each individual user. A cluster of edits on articles relating to a certain region, for example, makes it possible to deduce where the user most probably lives.

Problems

Every user knows that user data is recorded every time something is edited. However, there is a significant difference between a single data set and the aggregated presentation of this data. Aggregated data means that the user’s right to anonymity can be reduced, or, in the worst case, lost altogether. Here are some examples:

  • A list of the times that a user edits often allows a deduction to be made as to the time zone where he or she lives.
  • From the coordinates of articles that a user has edited, it is generally possible to determine the user’s location even more precisely. It would be rare for people to solely edit area X, when in fact they came from area Y.
  • The most precise deductions can be made by analyzing the coordinates of a photo location, as it stands to reason that the user must have been physically present to take the photo.
  • Places of origin and photo locations can reveal information on the user’s means of transport (e.g. whether someone owns a car), as well as on his or her routes and times of travel. This makes it possible to create movement profiles on users who upload a large number of photos.
  • Time analyses of certain days of the year allow inferences to be drawn about a user’s family status. It is probable, for example, that those who tend not to edit during the school holidays are students, parents or teachers.
  • Assumptions on religious orientation can also be made if a user tends not to edit on particular religious holidays.
  • Foreign photo locations either reveal information about a user’s holiday destination, and therefore perhaps disclose something about his or her financial situation, or suggest that the user is a photographer.
  • If users work in a country or a company where editing is prohibited during working hours, they are particularly vulnerable if the recorded time reveals that they have been editing during these hours. In the worst-case scenario, somebody who wishes to harm the user and knows extra information about his or her life (which is not unusual if someone has been an editor for several years) could pass this information on to the user’s employer. Disputes within Wikipedia would thus be carried over into real life.
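The first example in the list above (deducing a time zone from edit times) can be illustrated with a minimal sketch. It assumes that the longest quiet stretch in a user's public edit history is sleep beginning around 23:00 local time. The function names, the 8-hour window and the assumed bedtime are hypothetical choices made for illustration; this is not how Deep User Inspector or any other existing tool is known to work.

```python
from collections import Counter
from datetime import datetime, timezone


def quietest_window_start(edit_times_utc, window_hours=8):
    """Return the UTC hour at which the quietest stretch of `window_hours` begins."""
    edits_per_hour = Counter(t.hour for t in edit_times_utc)

    def edits_in_window(start_hour):
        # Sum edits over a window that may wrap past midnight.
        return sum(edits_per_hour[(start_hour + i) % 24] for i in range(window_hours))

    # min() returns the earliest hour when several windows tie.
    return min(range(24), key=edits_in_window)


def guess_utc_offset(edit_times_utc, window_hours=8, assumed_bedtime=23):
    """Guess a whole-hour UTC offset.

    ASSUMPTION: the quiet stretch is sleep that starts at `assumed_bedtime`
    local time. Shift workers or night owls would break this guess.
    """
    start = quietest_window_start(edit_times_utc, window_hours)
    offset = (assumed_bedtime - start) % 24
    # Fold the result into the range -11..+12.
    return offset - 24 if offset > 12 else offset


# Hypothetical editor who is active from 07:00 to 21:59 UTC every day for a week.
sample_edits = [
    datetime(2013, 12, day, hour, 15, tzinfo=timezone.utc)
    for day in range(1, 8)
    for hour in range(7, 22)
]
print(guess_utc_offset(sample_edits))
```

The input here is nothing more than the public timestamps of a user's contributions, which is why the appeal treats the aggregated view, rather than any single edit, as the privacy problem.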

Suggestions

Wikipedia is the fifth most visited website in the world. The way it treats its users therefore serves as an important example to others. It would be illogical and ridiculous to increase user protection on the one hand but, on the other hand, to allow users’ right to anonymity to be eroded. The most important asset that Wikipedia, Commons and other projects have is their users. They create the content that has ensured these projects’ success. But users are not content, and we should make sure that we protect them. The Wikimedia Foundation should commit to making the protection of its registered users a higher priority and should take the necessary steps to achieve this. Similarly to the regulations for the Toolserver, it should first require an opt-in for all the tools on its own servers that compile detailed aggregations of user data. Users could do this via their personal settings, for example. Since Wikipedia was founded in 2001, the project has grown without any urgent need for these kinds of tools, and at present there seems to be no reason why this should change in the future. By creating free content, the community enables Wikimedia to collect the donations needed to run WikiLabs. That this should lead to users losing their right to anonymity, although the majority opposes this, is absurd. To ensure that user data are not evaluated on non-Wikimedia servers, the Foundation is asked to take the following steps:

  • Wikipedia dumps should no longer contain any detailed user information. The license only requires the name of the author and not the time or the day when they edited.
  • There should only be limited access to user data on the API.
  • It might be worth considering whether or not it is necessary or consistent with project targets to store and display the IP addresses of registered users (if they are stored), as well as precise timestamps that are accurate to the minute of all their actions. The time limit here could be how long it reasonably takes CheckUsers to make a query. After all, data that are not available cannot be misused for other purposes.
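The last suggestion, reducing timestamp precision, can be sketched as follows. The tiers follow the rough idea given by one signatory in the list below (second precision for today, minute for this week, hour for the last six weeks, day for the last twelve months, month before that). The function name is hypothetical, and "today" is approximated as "within the last 24 hours", which is an assumption of this sketch rather than part of the proposal.

```python
from datetime import datetime, timedelta, timezone


def coarsen_timestamp(ts, now):
    """Truncate `ts` to a precision that depends on how old the edit is."""
    age = now - ts
    if age < timedelta(days=1):
        # Recent edits: keep second precision.
        return ts.replace(microsecond=0)
    if age < timedelta(weeks=1):
        # This week: minute precision.
        return ts.replace(second=0, microsecond=0)
    if age < timedelta(weeks=6):
        # Last six weeks: hour precision.
        return ts.replace(minute=0, second=0, microsecond=0)
    if age < timedelta(days=365):
        # Last twelve months: day precision.
        return ts.replace(hour=0, minute=0, second=0, microsecond=0)
    # Older edits: month precision.
    return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


now = datetime(2013, 12, 20, 12, 0, 0, tzinfo=timezone.utc)
for edit in (
    datetime(2013, 12, 18, 9, 41, 27, tzinfo=timezone.utc),
    datetime(2013, 11, 30, 9, 41, 27, tzinfo=timezone.utc),
    datetime(2012, 5, 17, 9, 41, 27, tzinfo=timezone.utc),
):
    print(edit.isoformat(), "->", coarsen_timestamp(edit, now).isoformat())
```

Because every coarser bucket is a union of finer ones, truncation never reverses the order of two edits, but it can make them tie. A display would then need a secondary sort key such as the revision ID to keep the sequence stable.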

Original signatures

  1. Martina Disk. 21:28, 24. Nov. 2013 (CET)
  2. NNW 18:52, 26. Nov. 2013 (CET)
  3. ireas :disk: 19:23, 26. Nov. 2013 (CET)
  4. Henriette (Diskussion) 19:24, 26. Nov. 2013 (CET)
  5. Raymond Disk. 08:38, 27. Nov. 2013 (CET)
  6. Richard Zietz 22:18, 27. Nov. 2013 (CET)
  7. Alchemist-hp (Diskussion) 23:47, 27. Nov. 2013 (CET)
  8. Lencer (Diskussion) 11:54, 28. Nov. 2013 (CET)
  9. Smial (Diskussion) 00:09, 29. Nov. 2013 (CET)
  10. Charlez k (Diskussion) 11:55, 29. Nov. 2013 (CET)
  11. elya (Diskussion) 19:07, 29. Nov. 2013 (CET)
  12. Krib (Diskussion) 20:26, 29. Nov. 2013 (CET)
  13. Jbergner (Diskussion) 09:36, 30. Nov. 2013 (CET)
  14. TMg 12:55, 30. Nov. 2013 (CET)
  15. AFBorchertD/B 21:22, 30. Nov. 2013 (CET)
  16. Sargoth 22:06, 2. Dez. 2013 (CET)
  17. Hilarmont 09:27, 3. Dez. 2013 (CET)
  18. --Poldine - AHA 13:09, 3. Dez. 2013 (CET)
  19. XenonX3 – (RIP Lady Whistler) 13:11, 3. Dez. 2013 (CET)
  20. -- Ra'ike Disk. LKU WPMin 13:19, 3. Dez. 2013 (CET)
  21. --muns (Diskussion) 13:22, 3. Dez. 2013 (CET)
  22. --Hubertl (Diskussion) 13:24, 3. Dez. 2013 (CET)
  23. --Aschmidt (Diskussion) 13:28, 3. Dez. 2013 (CET)
  24. Anika (Diskussion) 13:32, 3. Dez. 2013 (CET)
  25. K@rl 13:34, 3. Dez. 2013 (CET)
  26. --DaB. (Diskussion) 13:55, 3. Dez. 2013 (CET) (Although I find the part about the dumps somewhat excessive.)
  27. --AndreasPraefcke (Diskussion) 14:05, 3. Dez. 2013 (CET) The part about the dumps in particular is important, and this information should not be displayed on the Wikipedia websites either. Roughly like this (not thought through in detail, just a rough idea): edits from today: shown to the second as before; edits from this week: to the minute; edits from the last six weeks: to the hour; edits from the last 12 months: to the day; edits before that: to the month. The order must of course be preserved. Edits and the pure reverts that follow them: removed from the database entirely.
    Despite the legitimate interest in data protection, one should not forget that this kind of date/time truncation is a double-edged sword. Imports of revision histories on the one hand and copyright-violation checks on the other would become considerably harder ;-) -- Ra'ike Disk. LKU WPMin 14:19, 3. Dez. 2013 (CET) (although for the latter, day-level precision would suffice for comparison with the web archive)
  28. --Mabschaaf 14:08, 3. Dez. 2013 (CET)
  29. --Itti 14:28, 3. Dez. 2013 (CET)
  30. ...Sicherlich Post 14:52, 3. Dez. 2013 (CET)
  31. --Odeesi talk to me rate me 16:29, 3. Dez. 2013 (CET)
  32. --gbeckmann Diskussion 17:23, 3. Dez. 2013 (CET)
  33. --Zinnmann d 17:24, 3. Dez. 2013 (CET)
  34. --Kolossos 17:41, 3. Dez. 2013 (CET)
  35. -- Andreas Werle (Diskussion) (today, for once, "without" a timestamp...)
  36. --Gleiberg (Diskussion) 18:03, 3. Dez. 2013 (CET)
  37. --Jakob Gottfried (Diskussion) 18:30, 3. Dez. 2013 (CET)
  38. --Wiegels „…“ 18:55, 3. Dez. 2013 (CET)
  39. --Pyfisch (Diskussion) 20:29, 3. Dez. 2013 (CET)
  40. -- NacowY Disk 23:01, 3. Dez. 2013 (CET)
  41. -- RE rillke fragen? 23:17, 3. Dez. 2013 (CET) Yes. Of course not only the API but also the "normal pages" (index.php) should have a (sensible) limit. I reject restricting end-user applications through policies, just as I reject hasty action. A lot will have to be weighed up, and exceptions may have to be created for certain user groups, or new ways of presenting data found. As far as I know, CheckUser data is deleted automatically after 3 months: see User:Catfisheye/Fragen_zur_Checkusertätigkeit_auf_Commons#cite_ref-5
  42. --Christian1985 (Disk) 23:25, 3. Dez. 2013 (CET)
  43. --Jocian 04:45, 4. Dez. 2013 (CET)
  44. -- CC 04:50, 4. Dez. 2013 (CET)
  45. --Don-kun Diskussion 07:10, 4. Dez. 2013 (CET)
  46. --Zeitlupe (Diskussion) 09:09, 4. Dez. 2013 (CET)
  47. --Geitost 09:25, 4. Dez. 2013 (CET)
  48. Everywhere West (Diskussion) 09:29, 4. Dez. 2013 (CET)
  49. -jkb- 09:29, 4. Dez. 2013 (CET)
  50. -- Wurmkraut (Diskussion) 09:47, 4. Dez. 2013 (CET)
  51. Simplicius Hi… ho… Diderot! 09:53, 4. Dez. 2013 (CET)
  52. --Hosse Talk 12:49, 4. Dez. 2013 (CET)
  53. Port(u#o)s 12:57, 4. Dez. 2013 (CET)
  54. --Howwi (Diskussion) 14:26, 4. Dez. 2013 (CET)
  55.  — Felix Reimann 17:17, 4. Dez. 2013 (CET)
  56. --Bubo 18:30, 4. Dez. 2013 (CET)
  57. --Coffins (Diskussion) 19:22, 4. Dez. 2013 (CET)
  58. --Firefly05 (Diskussion) 20:09, 4. Dez. 2013 (CET)
  59. The point is to make clear the principle and the rule-and-exception scheme. --Björn 20:13, 4. Dez. 2013 (CET)
  60. --V ¿ 21:46, 4. Dez. 2013 (CET)
  61. --Merlissimo 21:59, 4. Dez. 2013 (CET)
  62. --Stefan »Στέφανος«  22:02, 4. Dez. 2013 (CET)
  63. -<)kmk(>- (Diskussion) 22:57, 4. Dez. 2013 (CET)
  64. --lutki (Diskussion) 23:06, 4. Dez. 2013 (CET)
  65. -- Ukko 23:22, 4. Dez. 2013 (CET)
  66. --Video2005 (Diskussion) 02:17, 5. Dez. 2013 (CET)
  67. --Baumfreund-FFM (Diskussion) 07:30, 5. Dez. 2013 (CET)
  68. --dealerofsalvation 07:35, 5. Dez. 2013 (CET)
  69. --Gripweed (Diskussion) 09:32, 5. Dez. 2013 (CET)
  70. --Sinuhe20 (Diskussion) 10:05, 5. Dez. 2013 (CET)
  71. --PerfektesChaos 10:22, 5. Dez. 2013 (CET)
  72. --Tkarcher (Diskussion) 13:51, 5. Dez. 2013 (CET)
  73. --BishkekRocks (Diskussion) 14:43, 5. Dez. 2013 (CET)
  74. --PG ein miesepetriger Badener 15:34, 5. Dez. 2013 (CET)
  75. --He3nry Disk. 16:32, 5. Dez. 2013 (CET)
  76. --Sjokolade (Diskussion) 18:15, 5. Dez. 2013 (CET)
  77. --Lienhard Schulz Post 18:43, 5. Dez. 2013 (CET)
  78. --Kein Einstein (Diskussion) 19:35, 5. Dez. 2013 (CET)
  79. --Stefan (Diskussion) 22:19, 5. Dez. 2013 (CET)
  80. --Rauenstein 22:58, 5. Dez. 2013 (CET)
  81. --Anka Wau! 23:45, 5. Dez. 2013 (CET)
  82. --es grüßt ein Fröhlicher DeutscherΛV¿? Diskussionsseite 06:42, 6. Dez. 2013 (CET)
  83. --Doc.Heintz 08:55, 6. Dez. 2013 (CET)
  84. --Shisha-Tom without a time of day, 6. Dez. 2013
  85. --BesondereUmstaende (Diskussion) 14:57, 6. Dez. 2013 (CET)
  86. --Varina (Diskussion) 16:37, 6. Dez. 2013 (CET)
  87. --Studmult (Diskussion) 17:30, 6. Dez. 2013 (CET)
  88. --GT1976 (Diskussion) 20:51, 6. Dez. 2013 (CET)
  89. --Wikifreund (Diskussion) 22:04, 6. Dez. 2013 (CET)
  90. --Wnme 23:07, 6. Dez. 2013 (CET)
  91. -- ST 00:47, 7. Dez. 2013 (CET)
  92. --Flo Beck (Diskussion) 13:45, 7. Dez. 2013 (CET)
  93. IW 16:34, 7. Dez. 2013 (CET)
  94. --Blech (Diskussion) 17:48, 7. Dez. 2013 (CET)
  95. --Falkmart (Diskussion) 18:21, 8. Dez. 2013 (CET)
  96. --Partynia RM 22:53, 8. Dez. 2013 (CET)
  97. --ElRaki 01:09, 9. Dez. 2013 (CET) delete as much user data as possible/keep only as much user data as absolutely necessary
  98. --user:MoSchle--MoSchle (Diskussion) 03:57, 9. Dez. 2013 (CET)
  99. --Daniel749 Disk. (STWPST) 16:32, 9. Dez. 2013 (CET)
  100. --Knopfkind 21:19, 9. Dez. 2013 (CET)
  101. --Saibot2 (Diskussion) 23:14, 9. Dez. 2013 (CET)
  102. --Atlasowa (Diskussion) 15:03, 10. Dez. 2013 (CET) However, the appeal is equally addressed to WMDE, which after all decided to shut down the Toolserver and thereby made the development leading to the DUI possible. Merely acting as a mail carrier to the WMF is not enough. If WMDE can commission expert reports on the culture of donating in Germany in order to lobby the WMF for its own fundraising, then WMDE can surely also commission expert reports on German/European data protection.
  103. ----Fussballmann Kontakt 21:38, 10. Dez. 2013 (CET)
  104. --Steinsplitter (Disk) 23:40, 10. Dez. 2013 (CET)
  105. --Gps-for-five (Diskussion) 03:03, 11. Dez. 2013 (CET)
  106. --Kolja21 (Diskussion) 03:55, 11. Dez. 2013 (CET)
  107. --Laibwächter (Diskussion) 09:50, 11. Dez. 2013 (CET)
  108. -- Achim Raschka (Diskussion) 15:18, 11. Dez. 2013 (CET)
  109. --Alabasterstein (Diskussion) 20:32, 13. Dez. 2013 (CET)
  110. --Grueslayer Diskussion 10:51, 14. Dez. 2013 (CET)
  111. Collect data only when it is absolutely necessary for operation (or legally required). Nothing else should be collected at all. I do not consider the inferences about time zones and place of residence (often stated by users themselves) to be serious at all. What is serious, rather, is that everything in the wiki is logged. I do not think that is necessary. Who needs to know exactly who edited where 10 years ago? After one year, the retained data should be anonymized (that is, it can stay in the article history, since it is needed there, but not in the user contribution list).--Alberto568 (Diskussion) 21:51, 14. Dez. 2013 (CET)
  112. --Horgner (Diskussion) 15:48, 16. Dez. 2013 (CET)
  113. --Oursana (Diskussion) 21:52, 16. Dez. 2013 (CET)
  114. --Meslier (Diskussion) 23:53, 16. Dez. 2013 (CET)
  115. -- Martin Bahmann (Diskussion) 09:20, 18. Dez. 2013 (CET)
  116. DerHexer (Disk.Bew.) 15:24, 19. Dez. 2013 (CET)
  117. Neotarf (Diskussion) 01:58, 20. Dez. 2013 (CET)
  118. --Lutheraner (Diskussion) 13:17, 20. Dez. 2013 (CET)
  119. --Lienhard Schulz (talk) 07:53, 21 December 2013 (UTC)
  120. --Brainswiffer (talk) 16:33, 1 January 2014 (UTC)
  121. Botulph (talk) 23:54, 31 January 2014 (UTC) As always, subject to better insight. Kind regards. +bows+

Comments

Can WMDE get an EU lawyer to assess whether such analysis of data is lawful under the current or draft EU directive and what it would take to respect it? I see that the draft contains some provisions on "analytics"; if the WMF adhered to EU standards (see also #Localisation des serveurs aux Etats-Unis et loi applicable bis) we might automatically solve such [IMHO minor] problems too. --Nemo 16:12, 20 December 2013 (UTC)

See also #Please_add_concerning_user_profiles (permalink, s) and #Generation_of_editor_profiles (permalink, s). PiRSquared17 (talk) 20:36, 20 December 2013 (UTC)

On a more personal note than the official response below, I shall repeat here advice I have regularly given to editors on the English Wikipedia in my capacity as Arbitrator: "Editing a public wiki is an inherently public activity, akin to participating in a meeting in a public place. While we place no requirement that you identify yourself or give any details about yourself to participate – and indeed do our best to allow you to remain pseudonymous – we cannot prevent bystanders from recognizing you by other methods. If the possibility of being recognized places you in danger or is not acceptable to you, then you should not involve yourself in public activities – including editing Wikipedia." MPelletier (WMF) (talk) 21:10, 20 December 2013 (UTC)

We can prevent creating user profiles by aggregating data. It has been done at the toolserver. It can be done at WikiLabs. NNW (talk) 21:29, 20 December 2013 (UTC)
No, you cannot. Those tools existed anyways, just elsewhere. You cannot prevent aggregation of public data without making that data not public anymore; including on the website itself (remove it from the API and people will just screen scrape for it) and in the dumps. Transparency isn't an accident, it's one of the basic principles of wikis in general and of the projects in particular. MPelletier (WMF) (talk) 18:12, 21 December 2013 (UTC)
Laws can prevent it though :) (looks like it may happen rather soon in EU). If everyone here takes extremist stances and collate everything as if there were no differences between publishing data and using it, or querying a database and making someone else query it, then it will be very hard to have any dialogue. To reiterate a point above, if a Wikimedia project includes Google Analytics and sends all private data to Google, our users don't care whether it was put by the WMF or a sysop, they just want it removed. --Nemo 18:23, 21 December 2013 (UTC)
No, actually, laws do not. The directive everyone refers to does not have anything to say about what people are allowed to do with publicly available information, but about private information which edit times most definitely are not.

Contrarywise, whether someone accesses a tool (or project page) is private information and this is why the rules already do forbid disclosing it; so your Google Analytics example is a good illustration of what we do already forbid. MPelletier (WMF) (talk) 20:55, 21 December 2013 (UTC)

I'm glad you have such legal certainties; I do not and I asked lawyers to comment, in the meanwhile I only said that law can forbid something if they wish (this seems rather obvious to me). As for Google Analytics, of course it's not the same thing, but it was just an example where it's easier to agree that it doesn't matter whether it's WMF or an user to place it on our servers (though the proposed draft explicitly does not cover the case of a sysop adding Google Analytics to a project). --Nemo 22:33, 21 December 2013 (UTC)
"your Google Analytics example is a good illustration of what we do already forbid." Oh, really? Just a short while ago a Software Engineer on the Wikimedia Foundation's Analytics team wrote about Analytics for tools hosted on labs?: "I don't think there are any technical reasons people can't use Google Analytics on a Labs instance. The only thing I can think of is that it'd be nice if people used something Open Source like PiWik. But I'll ask and report back in a bit." > later > "Google Analytics or any other analytics solution is strictly forbidden by Labs rules *unless* there's a landing page with a disclaimer that if the user continues, their behavior will be tracked." So that's the "good illustration of what we do already forbid": just put up a disclaimer. --Atlasowa (talk) 00:58, 22 December 2013 (UTC)
"Those tools existed anyways, just elsewhere.": This is told so often and it is still no good point. There are so many bridges and there are so many people crashing their cars into them. Does that mean we have to do it, too? A first step could be just to stop creating user profile on WMF servers. It was the end of the Toolserver limitations that started all the discussion. Of course there will be always someone who can and will do it somewhere but that is no reason to invite people to do it here on servers that are paid with donations for our work. I want to create an encyclopedia, not to collect money for spying on me. NNW (talk) 12:15, 22 December 2013 (UTC)

Additional signatures

  1. --Geolina163 (talk) 16:06, 20 December 2013 (UTC)
  2. --Density (talk) 16:35, 20 December 2013 (UTC)
  3. --Minihaa (talk) 16:57, 20 December 2013 (UTC) please practise data minimization.
  4. --Theaitetos (talk) 17:08, 20 December 2013 (UTC)
  5. -- Sir Gawain (talk) 17:17, 20 December 2013 (UTC)
  6. --1971markus (talk) 18:26, 20 December 2013 (UTC)
  7. --Goldzahn (talk) 19:22, 20 December 2013 (UTC)
  8. --Spischot (talk) 21:38, 20 December 2013 (UTC)
  9. --Bomzibar (talk) 22:43, 20 December 2013 (UTC)
    --Charlez k (talk) 22:51, 20 December 2013 (UTC) already signed, see above (Original signatures) --Krib (talk) 23:05, 20 December 2013 (UTC)
  10. --J. Patrick Fischer (talk) 09:14, 21 December 2013 (UTC)
  11. --Túrelio (talk) 15:07, 21 December 2013 (UTC)
  12. --Poupou l'quourouce (talk) 17:46, 21 December 2013 (UTC)
  13. --Nordlicht8 (talk) 21:54, 21 December 2013 (UTC)
  14. -- FelixReimann (talk) 11:16, 22 December 2013 (UTC)
  15. --Asio otus (talk) 11:54, 22 December 2013 (UTC)
  16. --Rosenzweig (talk) 12:26, 22 December 2013 (UTC)
  17. --Mellebga (talk) 13:47, 25 December 2013 (UTC)
  18. --Pasleim (talk) 15:24, 26 December 2013 (UTC)
  19. Elvaube ?! 13:32, 29 December 2013 (UTC)
  20. --Zipferlak (talk) 13:18, 2 January 2014 (UTC)
  21. --Gerbil (talk) 15:04, 5 January 2014 (UTC)
  22. --Sebastian.Dietrich (talk) 22:41, 9 January 2014 (UTC)
  23. --Stefan Bellini (talk) 18:57, 12 January 2014 (UTC)
  24. --SteKrueBe (talk) 23:48, 12 January 2014 (UTC)
  25. --Wilhelm-Conrad (talk) 23:02, 14 January 2014 (UTC)
  26. --Cubefox (talk) 20:37, 15 January 2014 (UTC)
  27. --Yellowcard (talk) 22:47, 16 January 2014 (UTC)
  28. --Ghilt (talk) 23:55, 19 January 2014 (UTC)

Response

Please note the response by Tfinc above in the Generation of editor profiles section and my follow-up to it. Obfuscating user contributions data or limiting our existing export will not happen. The Wikipedia projects are wikis; edits to them are by nature public activities that have always been, and always must be, available for scrutiny. MPelletier (WMF) (talk) 21:10, 20 December 2013 (UTC)[reply]

We don't need to keep around timestamps down to a fraction of a second forever. PiRSquared17 (talk) 21:13, 20 December 2013 (UTC)[reply]
Not sure about that. I wonder if de.wiki also has agreed to a decrease of its own right to fork, a right which they constantly use as a threat. Making dumps unusable would greatly reduce the contractual power of de.wiki, dunno if they really want it. --Nemo 21:43, 20 December 2013 (UTC)[reply]

While we believe this proposal is based on legitimate concerns, we want to highlight some of the practical considerations of such a proposal. Due to the holidays, we’ve addressed this only briefly, but we hope it serves to explain our perspective.

In summary, public access to metadata around page creation and editing is critical to the health and well-being of the site and is used in numerous places and for numerous use cases:

  • Protecting against vandalism, incorrect and inappropriate content: there are several bots that patrol Wikipedia’s articles that protect the site against these events. Without public access to metadata, the effectiveness of these bots will be much reduced, and it is impossible for humans to perform these tasks at scale.
  • Community workflows: Processes that contribute to the quality and governance of the project will also be affected: blocking users, assessing adminship nominations, determining eligible participants in article deletion discussions.
  • Powertools: certain bulk processes will be broken without public access to this metadata.
  • Research: researchers around the world use this public metadata for analysis that is useful both to the site and to the movement. It is essential that they continue to have access.
  • Forking: In order to have a full copy of our projects and their change histories all metadata needs to be exposed alongside content.

In summary, public and open-licensed revision metadata is vital to the technical and social functioning of Wikipedia, and any removal of this data would have serious impact on a number of processes and actions critical to the project. Tfinc (talk) 00:54, 21 December 2013 (UTC)[reply]

How was it possible for Wikipedia to grow for 13 years without aggregating user data? What has changed since the start of WikiLabs that this is necessary? Why is it necessary for creating an encyclopedia to know the exact second of my edit 5 years ago? Where do the licenses say that the exact second of my edit has to be part of a fork? NNW (talk) 10:38, 21 December 2013 (UTC)[reply]
I understand the part on aggregation and analytics, but the point about seconds is quite silly: sure, seconds could not be necessary in some ideal version of MediaWiki where they don't matter; but they also don't matter at all for your privacy. To avoid inferences about timezone we should remove hours of the day, not seconds. --Nemo 18:12, 21 December 2013 (UTC)[reply]
If you read the appeal above you will see that I do know that talking about seconds is silly. But it is senseless to start with hours when some people don't understand the basic problem with that data. Seconds just carry the topic to extremes so that it may be understood that no one needs five-year-old timestamps for preventing vandalism or whatever. NNW (talk) 12:02, 22 December 2013 (UTC)[reply]
Actually, I read it but I don't see that. The text does not specify what level of precision in timestamps you want to achieve. --Nemo 10:19, 29 December 2013 (UTC)[reply]
I cannot offer a complete solution to this problem. The appeal in a nutshell is "As much transparency as necessary, as much privacy as possible." I am not that much into technical questions. Perhaps some of the suggestions cannot be implemented for some technical reasons I don't know. Perhaps there are some better ways to keep users’ anonymity. All I did was centralize a growing dissatisfaction about the way our data is handled and start a discussion about it. NNW (talk) 11:56, 29 December 2013 (UTC)[reply]
Thanks. This is a frank and reasonable way to frame it. --Nemo 12:03, 29 December 2013 (UTC)[reply]
It's true that most actions against plain vandalism can be performed efficiently if we know only the exact order of events, in order to revert edits correctly.
But precise timestamps are needed where there are disputes about the order of events in the history, for example licence disputes: we need to be able to prove the anteriority of a work. Precise timestamps are then needed, but we could hide this info by replacing the exact timestamps with digital signatures generated by the server, and providing an API reserved for CheckUser admins that would be able to assert which event occurred before another one. It could also be used for anonymizing contributions made by users who asked for their account to be deleted and their past contributions to be fully anonymized (while maintaining the validity of their past work and the provability and permanence/irrevocability of their agreed licences).
Other open projects have experienced this issue when it was difficult to assert the licensing terms (for example OpenStreetMap, before it changed its licence from CC-BY-SA to ODbL for new contributions, needed to check its data according to the time each user actually accepted the new Contributor Terms and actually agreed, or not, to relicense their past contributions, in order to clean up the active online database then published exclusively under the new licence: this did not mean that the old database was illegal, but that it was frozen at a precise timestamp, with all further edits made exclusively under the new licence that users had to accept before continuing to make new edits).
Precise timestamps are therefore needed for the long term, and not just to fight active abuse and spam (with bots interested in a short period of time not exceeding one month; after that time, a bot alone cannot work reliably without human review to restrict its searches, if something must be reverted, or in case of doubt, with all user rights transferred to a special aggregated/anonymized user account detached from the original user).
Note that timestamps and geolocation data stored in media files are a problem. Users should have a way to clean these data out of a media file by reducing the precision (for example keeping only the date, or just the year, and a weaker geolocation), by deleting unnecessary metadata (such as stored hardware IDs of digital cameras, or the version of the tool used to modify the photos, possibly online by using external services like Google Picasa), or by removing other kinds of information which may be stored using stealth techniques such as steganography (using techniques that will be discovered only years later). Commons should have a tool to inspect these metadata and to allow the original uploader to clean up these hidden details, which would be dropped permanently by also dropping the stored old versions of these media files.
Fully anonymizing photos and videos is a really difficult challenge (it is far easier to do it on graphics with reduced color spaces or with vector graphics accepting some randomized alteration of any unnecessary geometric precision), as things initially invisible may be revealed later by new processing algorithms (like those already used now by Google which can precisely identify places and people by looking at some small portions of photos or assembling multiple ones from the same "exposed public user account" and in the same timestamp period, or photos/videos participating in the same topic elsewhere)!
Note that these media analysis tools may also be used to "assert" the licensing terms and legitimate author of a work that has been reused elsewhere without permission (and there are already examples where legitimate Wikimedia contents have been attacked later by abusers trying to take the authorship and building a fake anteriority). This has already done severe damage in Wikimedia projects (for example when several editions of WikiQuotes had to be fully erased and restarted from zero, a few years ago, when we could no longer prove the origin or anteriority of a work). verdy_p (talk) 13:33, 22 December 2013 (UTC)[reply]
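
The timestamp ideas raised in this thread (publishing less precision while keeping the order of events) can be illustrated with a minimal sketch. It is not MediaWiki code: the function names `coarsen` and `publish`, the precision levels, and the record fields `rev_id`, `seq` and `timestamp` are all assumptions made up for the example.

```python
from datetime import datetime, timezone

# Hypothetical precision levels: which datetime fields to reset for each one.
_FIELDS_TO_RESET = {
    "minute": {"second": 0, "microsecond": 0},
    "hour": {"minute": 0, "second": 0, "microsecond": 0},
    "day": {"hour": 0, "minute": 0, "second": 0, "microsecond": 0},
    "year": {"month": 1, "day": 1, "hour": 0, "minute": 0,
             "second": 0, "microsecond": 0},
}


def coarsen(ts, precision):
    """Return ts truncated to the given precision (e.g. 'day' drops the time)."""
    if precision not in _FIELDS_TO_RESET:
        raise ValueError(f"unknown precision: {precision!r}")
    return ts.replace(**_FIELDS_TO_RESET[precision])


def publish(revisions, precision="day"):
    """Turn (rev_id, exact_timestamp) pairs into public records.

    The exact order of events is kept in 'seq', so reverts can still be
    applied in the right order, while the published timestamp is coarsened.
    """
    ordered = sorted(revisions, key=lambda rev: rev[1])
    return [
        {"rev_id": rev_id, "seq": seq,
         "timestamp": coarsen(ts, precision).isoformat()}
        for seq, (rev_id, ts) in enumerate(ordered)
    ]


if __name__ == "__main__":
    revs = [
        ("r2", datetime(2013, 12, 22, 13, 33, 7, tzinfo=timezone.utc)),
        ("r1", datetime(2013, 12, 22, 12, 2, 59, tzinfo=timezone.utc)),
    ]
    for record in publish(revs):
        print(record)
```

A sequence number alone cannot prove anteriority against an outside work, which is the licensing concern raised above; that would still need the exact timestamps to be kept non-publicly, or some signed equivalent.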

Question of context

AxelBoldt, NNW, and everyone else...

I regret to admit that the context in which the members of the appeal came up with the feature request is unclear to me due to the language barrier. Please provide me with links of where the opt-out idea originated; even if they're in German, I will be grateful as I would not have to try to search for the discussion myself. Gryllida (talk) 07:20, 31 December 2013 (UTC)[reply]

As far as I know the opt-out idea was made by Cyberpower678 first when he started the RFC for X!'s Edit Counter [1]. Such tools at the toolserver always had an opt-in (also as far as I know). NNW (talk) 13:08, 31 December 2013 (UTC)[reply]
NNW, is there a place where the lack of an opt-in feature was first discussed for the DUI tool specifically? Gryllida (talk) 15:13, 31 December 2013 (UTC)[reply]
Gryllida, the DUI was the direct result of the RFC for X!'s Edit Counter. Any opt-in/opt-out/nothing-at-all discussions were held there. As Ricordisamoa refused to change anything (see link in the thread below) there was nothing left to discuss. Some reactions to his tool can be found at User talk:Ricordisamoa#Deep user inspector. NNW (talk) 15:40, 31 December 2013 (UTC)[reply]
NNW, «the DUI was the direct result of the RFC for X!'s Edit Counter» is a useful observation. ☺ Where can I see evidence for that, for reference, as it appears to be of relevance to this thread? Gryllida (talk) 15:56, 31 December 2013 (UTC)[reply]
[2]. NNW (talk) 16:05, 31 December 2013 (UTC)[reply]
NNW, you have linked me to the RFC text at the initial stage while its discussion section is empty. Community views could be of interest in this discussion though. ☺ For me to not go through the history manually, could you please locate the RFC in an archive and link me to that? Gryllida (talk) 15:56, 31 December 2013 (UTC)[reply]
Ah, the latest revision appears to contain the archive. Thanks! ☺ Gryllida (talk) 15:58, 31 December 2013 (UTC)[reply]
Even though a translated message about the RfC was spammed to all wikis (by me), most commenters seem to be from enwiki or dewiki. I'd say dewiki mainly wanted to keep opt-in, enwiki wanted to remove it or use opt-out, which is not surprising. PiRSquared17 (talk) 23:55, 1 January 2014 (UTC)[reply]

NNW, thanks for the context. It appears that the tool functions as a proxy to already available information, and the WMF lacks the authority to eliminate it entirely, for example if it were hosted externally. Hence it appears useless for them to add actionable clauses about it to their privacy policy.

I only see work on an Extension as a last resort, so that the DUI tool fails to function at the wikis that choose to request such an extension with community consensus. If the community is willing to experiment, the WMF Labs resources are available for collaborative community work on it. Gryllida (talk) 09:22, 3 January 2014 (UTC)[reply]

Response

Thank you to all the users who contributed to this discussion, and who signed on to this appeal. We take these concerns seriously, and understand why you are concerned, even when we disagree with some of your analysis (as we first discussed in our blog).

As I understand the appeal, there are really four main requests. I’d like to summarize and respond to each of these requests here.

Protecting users

At the highest level, the appeal asks that the Foundation "commit to making the protection of its registered users a higher priority and should take the necessary steps to achieve this". We believe strongly that we have already made protection of all of our users a high priority. This can be seen in our longstanding policies — like the relatively small amount of data that we require to participate and the steps we take to ensure that nonpublic information is not shared with third parties — and in our new policies, like the steps we've taken to add https and filter IP addresses at Labs. We will of course always have to balance many priorities while running the sites, but privacy has been and will remain one of the most important ones.

Reducing available information

More concretely, the appeal expresses concern that the publication of certain information about edits in the dumps, on the API, and on the sites, allows users to deduce information about editors. It therefore requests that we remove that information from dumps and the API.

This information has been public since the beginning of the projects almost 13 years ago. As Tfinc and others have discussed extensively above, the availability of this information has led to the creation of a broad set of tools and practices that are central to the functioning of the projects. We understand that this can lead to the creation of profiles that are in some cases uncomfortably suggestive of certain information about the respective editor. However, we do not think this possibility can justify making a radical change to how the projects have always operated, so we do not plan to act on this request.

Aggregation on Labs

The second major concern presented was that the Wikimedia Labs policy, unlike the Toolserver policy, does not explicitly prohibit volunteer-developed software that aggregates certain types of account information without opt-in consent. Because of this, the appeal requested a ban on such software on servers (like Labs) that are hosted by the Foundation.

To address this concern, I proposed a clarification to the Labs terms of use. Several users have expressed the opinion that this is insufficient, so the discussion is still ongoing about what approach (if any) should be taken on Labs. Anyone interested in this request is urged to contribute to the discussion in that section.

Collection of IP addresses

The final request in the appeal was to not "store and display the IP addresses of registered users". We currently store those addresses, but only for 90 days, as part of our work to fight abuse. This will continue under the new Data Retention Guidelines. We do not display the IP addresses of registered users, except to those volunteers who are involved in our abuse-fighting process, and then only under the terms described in this Privacy Policy and the Access to Nonpublic Information Policy. So we think we are reasonably compliant with this request.

Conclusion

As NNW put it in a comment above, the appeal seeks “as much transparency as necessary, as much privacy as possible.” The WMF strongly agrees with this goal, which is why we have always collected very little personal data, why we do not share that data except in very specific circumstances, and why we have written a very detailed, transparent privacy policy that explains in great detail what we do with the data we have. At the same time, we also recognize that providing information about edits has been part of how we have enabled innovation, flexibility, and growth. After weighing those factors, we have reached the conclusions described above. We hope that the users who signed the appeal will accept this conclusion, and continue to participate and support our shared mission. —LVilla (WMF) (talk) 00:37, 10 January 2014 (UTC)[reply]

Note on Labs Terms / Response to NNW

Hi, NNW: If you are asking here about the change from Toolserver to Labs about when “profiling tools” are allowed, we made the change because the edit information has always been transparently available, so the Toolserver policy was not effective in preventing “profiling” - tools like X edit counter could be (and were) built on other servers. As has been suggested above, since the policy was ineffective, we removed it.
However, this change was never intended to allow anarchy. The current Labs terms of use allows WMF to take down tools, including in response to a community process like the one that occurred for X edit counter. Would it resolve some of your concerns if the Labs terms made that more obvious? For example, we could change the last sentence of this section from:
If you violate this policy ... any projects you run on Labs, can be suspended or terminated. If necessary, the Wikimedia Foundation can also do this in its sole discretion.
to:
If you violate this policy ... any projects you run on Labs, can be suspended or terminated. The Wikimedia Foundation can also suspend or terminate a tool or account at its discretion, such as in response to a community discussion on meta.wikimedia.org.
I think this approach is better than a blanket ban. First, where there is a legitimate and widely-felt community concern that a particular tool is unacceptable, it allows that tool to be dealt with appropriately. Second, it encourages development to happen on Labs, which ultimately gives the community more leverage and control than when tools are built on third-party servers. (For example, tools built on Labs have default filtering of IP addresses to protect users - something that doesn’t automatically happen for tools built elsewhere. So we should encourage use of Labs.) Third, it encourages tool developers to be bold - which is important when encouraging experimentation and innovation. Finally, it allows us to discuss the advantages and disadvantages of specific, actual tools, and allows people to test the features before discussing them, which makes for a more constructive and efficient discussion.
Curious to hear what you (and others) think of this idea. Thanks.-LVilla (WMF) (talk) 00:02, 24 December 2013 (UTC)[reply]
Is there a need in distinguishing WMF's role in administering Labs tools? I would only stress the requirement of Labs Tools to obey this policy, here, and link to a Labs policy on smooth escalation (ask tool author; discuss in community; ask Labs admins; ask WMF). Gryllida (talk) 05:14, 24 December 2013 (UTC)[reply]
WMF is called out separately in the policy because WMF employees ultimately have control (root access, physical control) of the Labs servers, and so ultimately have more power than others. (I think Coren has been recruiting volunteer roots, which changes things a bit, but ultimately WMF still owns the machines, pays for the network services, etc.) I agree that the right order for conversation is probably tool author -> community -> admins, and that the right place for that is not in the terms of use but in an informal policy/guideline on wikitech. -LVilla (WMF) (talk) 17:15, 24 December 2013 (UTC)[reply]
Yah, I just wanted to propose that the policy references both concepts (WMF's ultimate control, and the gradual escalation process) so the users don't assume that appealing to WMF is the only way. Gryllida (talk) 08:38, 25 December 2013 (UTC)[reply]
As I mentioned elsewhere on this page, the talk about "community consensus" raises questions such as "which community?" and "what happens when different communities disagree?" Anomie (talk) 14:30, 24 December 2013 (UTC)[reply]
Right, which is why I didn't propose anything specific about that for the ToU- meta is just an example. Ultimately it'll have to be a case-by-case judgment. -LVilla (WMF) (talk) 17:15, 24 December 2013 (UTC)[reply]
I would perhaps remove the "on Meta" bit then since it bears no useful meaning. «... such as in response to a community discussion.» looks complete to me. There doesn't even have to be a discussion in my view: a single user privately contacting WMF could be enough, granted his report of abuse is accurate. «... such as in response to community feedback.» could be more meaningful. Gryllida (talk) 08:38, 25 December 2013 (UTC)[reply]
This is meant as an example ("such as"), so I think leaving the reference to meta in is OK. Also, this is in addition to the normal reasons for suspension. For the normal reasons for suspension, a report by a single person would be fine, but I think in most cases this sort of discretion will be exercised only after community discussion and consultation, so I think the reference to discussion is a better example than saying "feedback".-LVilla (WMF) (talk) 22:28, 31 December 2013 (UTC)[reply]
I am referring to this argument from above: we made the change because the edit information has always been transparently available, so the Toolserver policy was not effective. The position that any analysis that can be performed by a third party should also be allowable on WMF servers with WMF resources is not convincing. It is clearly possible for a third party to perform comprehensive and intrusive user profiling by collating edit data without the user's prior consent. We could (and should!) still prohibit it on our servers and by our terms-of-use policy. (A different example: it's clearly possible for a third party running a screen scraper to construct a conveniently browsable database of all edits that have ever been oversighted; this doesn't mean WMF should allow it and finance it.) Now, why should this kind of user profiling be prohibited by WMF? Because WMF lives on the goodwill of its editors, and editor NNW above put it best: "I want to create an encyclopedia, not to collect money for spying on me." AxelBoldt (talk) 18:15, 24 December 2013 (UTC)[reply]
You're right, but I think removed (oversighted) edits are not at issue here. Whatever else is available is available, and allowing freely available information to be collected programmatically sounds reasonable to me. Gryllida (talk) 08:38, 25 December 2013 (UTC)[reply]
It's not reasonable if the editors don't want it and if it doesn't further any identifiable objective of the foundation. In fact it is not only unreasonable but it's a misuse of donor funds. AxelBoldt (talk) 22:28, 25 December 2013 (UTC)[reply]
You should be interested in contributing to the #Tool_settings section below. Gryllida (talk) 01:56, 28 December 2013 (UTC)[reply]
Hello LVilla (WMF)! Your suggestion means that any tool that will be programmed in future has to be checked and – if someone thinks that it is necessary – has to be discussed individually. My experiences until now: "the community should not have any say in the matter" and a quite short discussion "Technically feasible, legally okay... but what tools do we want?" started at lists.wikimedia.org. If we want it that way we will have to define who is "community". Is it the sum of all users of all WMF projects? Can single projects or single users declare to keep a tool (e.g. en:WP voted for no opt-out or opt-in for X!'s Edit Counter but that would mean that my edits there will be used in that tool although I deny it completely for my account)? How will we come to a decision: simple majority or best arguments (and who will decide then)? Does a community vote for tool X mean that there is no chance for a tool Y to try it a second time or do we have to discuss it again and again?
We have to be aware of our different cultures of handling private data or even defining what's private and what's not. Labs "doesn't like" (nice term!) "harmful activity" and "misuse of private information". US law obviously doesn't evaluate aggregating data as misuse, I do. We discuss about necessary "transparency" but do not have a definition for it. The time logs of my edits five years ago seem to be important but you don't want to know my name, my address, my sex, my age, my way how I earn my money… which would make my edits, my intentions and my possible vandalism much more transparent than any time log. Some say "the more transparency the better" but this is a discussion of the happy few – but dominating – who live in North America and Western Europe. I think we also should think of those users who live in the Global South and want to edit problematic topics (religion, sexuality…). For those aggregated user profiles may become a real problem and they will always be a minority in any discussion. NNW (talk) 17:56, 28 December 2013 (UTC)[reply]
Everyone involved is aware that privacy values vary a great deal from community to community; but it seems very ill-advised to give the most restrictive standards a veto over the discussion, in practice and in principle. A clear parallel with the discussion over images can be drawn: while it would have been possible to restrict our standards to the subset deemed acceptable by all possible visitors, to do so would have greatly impoverished us. The same goes for usage of public data: we should foster and encourage new creative uses; not attempt (and fail) to preemptively restrict new tools to the minuscule subset nobody could raise an objection to. This does not preclude acting to discourage or disable a tool the community at large objects to – and the Foundation will be responsive to such concerns – but it does mean that this is not something that can be done with blanket bans.

To answer your more explicit questions, the answer will generally be "it depends" (unsatisfying as this sounds). Ultimately yes, the final arbiter will be the Foundation; but whether or not we intervene is dependent entirely on context as a whole; who objects, why, and what could be done to address those concerns. MPelletier (WMF) (talk) 00:48, 1 January 2014 (UTC)[reply]

So for programmers the sky's the limit; it's up to the community to find out which tool might violate their rights and to discuss this again and again and again because every tool has to be dealt with anew. The community has to accept that in the end an RFC like the one for X!’s Edit Counter is just a waste of time and that programmers – of course – are not interested in any discussion or compromise because it might cut their tools. WMF is in the comfortable position that Meta is in the focus of only very few users and the privacy policy does not apply to Labs. It would be fair to admit that under these circumstances WP:ANON becomes absurd and in the near future – with more powerful tools – a lie. I understood "The Wikimedia Foundation, Inc. is a nonprofit charitable organization dedicated to encouraging the growth, development and distribution of free, multilingual, educational content" as "free and multilingual and educational content" but a user profile generated from my editing behaviour isn't educational. NNW (talk) 13:50, 4 January 2014 (UTC)[reply]
Unfortunately - it is. Just think of the possibilities for scientific research... Alexpl (talk) 08:14, 29 January 2014 (UTC)[reply]
A body donation would be great for scientific research, too. NNW (talk) 08:50, 29 January 2014 (UTC)[reply]
I think that's already covered by «Depending on which technology we use, locally stored data can be anything [...] to generally improve our services». Please be sure not to bring your organs close to the servers. ;-) --Nemo 08:57, 29 January 2014 (UTC)[reply]
One could squeeze a few doctoral titles out of in-depth research on contributors' identity in combination with their WP work. Compared to that, a body donation is somewhat trivial. So I do agree we have to identify and neutralise every attempt to collect user data as quickly and effectively as possible. Alexpl (talk) 09:47, 29 January 2014 (UTC)[reply]

Questions from Gryllida

Implementation as Extension

The appeal requests concealing the time of an edit. Would any of the supporters of the appeal be willing to demonstrate a working wiki with the requested change implemented as an Extension which discards edit time where needed? If sufficiently safe and secure, it could be added to a local German wiki by request of the community, and considered by other wiki communities later on. Many thanks. Gryllida (talk) 04:43, 24 December 2013 (UTC)[reply]

Tool settings

Have you considered requesting the Tool author to add an opt-out (or opt-in, as desired) option at a suitable scope? Gryllida (talk) 04:45, 24 December 2013 (UTC)[reply]

Example: editor stats:
«Note, if you don't want your name on this list, please add your name to [[User:Bawolff/edit-stat-opt-out]]».
--Gryllida (talk) 02:14, 28 December 2013 (UTC)[reply]

FYI: The tool address is here. It is not mentioned in the appeal text. (I have notified the tool author, Ricordisamoa, of this discussion and potentially desired feature.) Gryllida (talk) 02:20, 28 December 2013 (UTC)[reply]

User:Ricordisamoa deliberately ignored the idea of an opt-in or opt-out and there is no chance to discuss anything: There's no private data collection, and only WMF could prevent such tools from being hosted on their servers: the community should not have any say in the matter. For complete discussion read Talk:Requests for comment/X!'s Edit Counter#Few questions. NNW (talk) 16:29, 28 December 2013 (UTC)[reply]
@Gryllida and NordNordWest: of course I accept community suggestions (e.g. for improvements to the tool) but the WMF only is competent about legal matters concerning Wikimedia Tool Labs. If there should be any actions, they will have to be taken by the WMF itself. See also [3]. --Ricordisamoa 03:04, 29 December 2013 (UTC)[reply]
Ricordisamoa, would you not be willing to add an opt-out? I would desire it be solved without legal actions or escalation, as it appears to be something within your power and ability, and many users want it. (It seems OK to decline OPT-IN feature request.) Gryllida (talk) 09:07, 29 December 2013 (UTC)[reply]
@Gryllida: No. --Ricordisamoa 16:44, 30 December 2013 (UTC)[reply]
Ricordisamoa, I understand your view. It might make sense to document that in FAQ, if not already, at leisure. I appreciate you being responsive. Gryllida (talk) 07:17, 31 December 2013 (UTC)[reply]
As long as WMF wants to encourage programmers to do anything as long as it is legal, there is no reason for programmers to limit the capabilities of their tools. "Community" is just a word which can be ignored very easily when "community" wants to cut capabilities. Only "improvements" will be accepted and "improvements" mean "more, more, more". NNW (talk) 14:00, 4 January 2014 (UTC)[reply]

Discussion on same topic in other locations

Note that this issue has also been discussed in #Generation_of_editor_profiles and #Please_add_concerning_user_profiles. For a full history of this topic, please make sure to read those sections as well. —LVilla (WMF) (talk) 00:36, 8 January 2014 (UTC)[reply]

Opt-in

There is the possibility of a compulsory opt-in for generating user profiles at Labs. By this we would return to the Toolserver policy which worked fine for years. No information would be reduced, fighting vandalism would still be possible, programmers still could write new tools and of course there will be lots of users who are willing to opt in (like in Toolserver times). On the other hand, all other users who prefer more protection against aggregated user profiles can get it if they want to. I see no reason why this minimal solution of the appeal couldn't be realized. NNW (talk) 13:43, 13 January 2014 (UTC)[reply]
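
As a rough illustration of what a compulsory opt-in would mean for a tool, the sketch below refuses to aggregate anything for a user who is not on an opt-in list. It is only an assumption of how such a gate could look: `build_profile`, the `opted_in` set and the `(username, hour)` edit records are hypothetical and do not describe any existing Labs or Toolserver tool.

```python
from collections import Counter


def build_profile(username, edits, opted_in):
    """Return edits-per-hour-of-day for username, or None without opt-in.

    edits    -- iterable of (username, hour_of_day) pairs (hypothetical format)
    opted_in -- set of usernames who explicitly agreed to be profiled
    """
    if username not in opted_in:
        # No consent recorded: do not aggregate anything for this user.
        return None
    hours = Counter(hour for user, hour in edits if user == username)
    return dict(hours)


if __name__ == "__main__":
    edits = [("Alice", 9), ("Alice", 9), ("Bob", 22)]
    print(build_profile("Alice", edits, opted_in={"Alice"}))
    print(build_profile("Bob", edits, opted_in={"Alice"}))
```

The underlying edit data stays as public as before; the gate only controls whether this particular tool aggregates it, which is the limitation the replies below point out.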

As has been stated elsewhere, this only gives a false sense of security. There are other websites that allow profiling anyway, and there's no way to stop them, so there's no clear reason to pretend that you have a choice. //Shell 20:56, 13 January 2014 (UTC)[reply]
As has been stated elsewhere something that is done somewhere doesn't mean we have to do it, too. NNW (talk) 21:32, 13 January 2014 (UTC)[reply]
Toolserver policy was only enforced upon user request. There's a lingering worry that some upset user will slap a tool author with a take-down request; this is demoralizing to authors after spending many hours developing the software. This discouraging effect is why we don't see many community tracking tools, like the Monthly DAB Challenge. I've got cool and interesting ideas, but won't waste my time. Dispenser (talk) 19:04, 21 January 2014 (UTC)[reply]
With an opt-in there would be no reason for any complaint. Everybody can decide if her/his data gets used for whatever or not and there will be still lots of users who will like and use whatever you are programming. Please think of those authors who spent many hours to create an encyclopedia and find themselves as an object of spying tools afterwards. Believe me: that's demoralizing. NNW (talk) 23:19, 21 January 2014 (UTC)[reply]
Users never spying on each other? I read enough ArbCom to know that's Fucking Bullshit. This goes beyond edit counters and affects community management. English Wikipedians do not want to watch over 2,000 articles for a month to understand what's happening at a WikiProject. Now I cite the DAB challenge as w:User:JustAGal was completely unknown to us until we expanded the data analysis. We've subsequently redesigned tools to work better for her.
Postscript: Dabfix, an automatic disambiguation page creation and cleanup script, only has a single user and may never recoup the hundreds of hours spent programming and testing it. If a tool is never used then I've wasted time in which I could've done something useful. Dispenser (talk) 02:56, 10 February 2014 (UTC)[reply]

Alternative Labs terms proposal: per-project opt-in

The discussion above has been pretty wide-ranging, with some voices in support of opt-in; others in support of opt-out. It is also clear that, for any global proposal, defining who should be consulted is a key challenge. With those two things in mind, Coren and I would like to propose a per-project opt-in; i.e., if a particular project (e.g., Deutsch Wikipedia) wants to require it, then extraction of data from that project will require a per-user opt-in. This gives control directly to specific communities who have been most concerned about the issue, while still preserving flexibility for everyone else. Thoughts/comments welcome. —LVilla (WMF) (talk) 01:13, 4 February 2014 (UTC)[reply]

So 797 communities will have to discuss if they want to have an opt-in. Quite a lot of talk, I think, especially for those who are active on several projects and dislike the idea of aggregated user data at all. I have got 100 or more edits in 20 projects although I don't speak all those languages. How can I vote in such a complex matter when I am not able to understand these languages? Am I allowed to vote in every project in which I have edits or do I have to meet some criteria? Why should a community control an opt-in/no opt-in when it is much easier that everybody takes control over his/her own data? It will lead to much more discontent among users when it will be a decision of projects instead of single users. Not everyone at de:WP thinks data aggregation is a bad thing, not everyone at en:WP likes to see data aggregated. NNW (talk) 10:10, 4 February 2014 (UTC)[reply]
@LVilla (WMF): A question of clarification: Does your proposal mean that in case a project makes this decision and a single user does not opt-in, the user's data will be excluded from the data pool which is accessible for developers on labs and external developers? (Which would be much more than just a declaration of intention but a technical barrier to analyze that user's data at all.) And could you explain the necessity of the intermediate level of legitimation by the respective project? I'm not sure if I understand what it's good for when you at the same time acknowledge that the single user himself has to make the decision to opt in. Wouldn't that be a shift of responsibility that no longer matches the reality? Why not just skip that step? User activity does not in general only take place on one single wiki, in times where contributors use their home wiki + commons + (sometimes) meta or wikidata, it seems to ignore the interdependencies we've built over the years. Alice Wiegand (talk) 23:26, 9 February 2014 (UTC)[reply]
@Lyzzy: If I understand your question correctly, then yes: if the project (such as WM-DE) opts out, then tools whose purpose is to build individual user profiles could not access the data of a user of that project who does not opt-in. The idea of doing this on a per-project basis is primarily because the objection to these sorts of tools appears to be highly specific to one project. (Not to say that everyone else on every other project loves it, but it seems undeniable that the bulk of the objection appears to be from one project.) Secondarily, it is because this rule is primarily symbolic (as discussed elsewhere in the page), so making it a per-project thing allows projects who care to make the symbolic statement without overly complicating the situation for others. Finally, it is because people objected to making it per-tool, because it was unclear what level of community discussion would be sufficient to force an individual tool to become opt-in. By making it per-project, we make it quite clear what sort of community discussion is necessary. This does lead to some inefficiencies, particularly for people who participate on meta and other projects. But none of the proposed solutions are perfect - all of them require discussions in a variety of places and inconveniencing different sets of users. Hope that helps explain the situation and the proposal. —LVilla (WMF) (talk) 02:54, 11 February 2014 (UTC)[reply]
@LVilla (WMF): I'm not sure you understood the question correctly. Would the non-opted-in user's data be somehow hidden in the Tool Labs database replicas as a technical restriction (which seems like it could be a significant performance hit for those wikis and would damage other uses of that data), or would this just be a policy matter that tool authors would be required to code their "data aggregation" tools to decline to function for the non-opted-in user on those wikis? Anomie (talk) 14:37, 11 February 2014 (UTC)[reply]
@LVilla (WMF):, I still don't understand if your proposal includes a technical solution. In part of your statements it reads as if the data of a user who does not opt-in after a project decided to go the opt-in-line will not be accessible to any tool on labs. That's entirely different from anything we talked about earlier (labs specific self-commitment) and it's also different from "there's a tag on the record, so please tell your tool not to analyse it". And because there is some kind of ambiguity, clarity about what the proposal is about is essential. Alice Wiegand (talk) 22:48, 17 February 2014 (UTC)[reply]
@Lyzzy: Sorry about the lack of clarity. The proposal does not include any technical measures. There are two types of technical measures possible:
(1) Publish less information. As described previously, this is inconsistent with how we have always done things, and would break a variety of tools.
(2) Audit individual tools on Labs. Given that most tool developers on Labs are likely to respect the policy, this would introduce a very high cost for a very low benefit.
So, yes, this would be a self-commitment, but the operations team at Labs would be able to kick off specific tools that violate the policy if/when the violation is discovered. Hope that helps clarify. —LuisV (WMF) (talk) 23:19, 18 February 2014 (UTC)[reply]
It does, thanks! Alice Wiegand (talk) 14:21, 19 February 2014 (UTC)[reply]
Edit counters will and have existed with or without Labs adopting the Toolserver policy. What about just letting the DE Wikipedia Luddites block tool links they don't like? Dispenser (talk) 03:11, 10 February 2014 (UTC)[reply]
You might take a look at Requests for comment/X!'s Edit Counter and check where the opt-in supporters come from. It is a bit more complex than de:WP vs. the rest of the world. NNW (talk) 09:02, 10 February 2014 (UTC)[reply]
@Dispenser: I've pointed out repeatedly that I think this is a mostly symbolic policy. We're trying to strike a balance that allows some communities who particularly care to make their symbolic statement. Not ideal, I know, but none of the solutions will please everyone here.—LVilla (WMF) (talk) 02:54, 11 February 2014 (UTC)[reply]
I'm not sure I support opt-in in any case, but this compromise is obviously intended for dewiki IMO. The privacy (if you consider analysis of aggregate data to be private) of users who edit on most wikis would still be gone. PiRSquared17 (talk) 02:59, 11 February 2014 (UTC)[reply]
This supposed privacy never existed in the first place. All the necessary data is already public. All this debate is about forcing people who want to create these tools to do so on third-party servers rather than on Tool Labs. Anomie (talk) 14:40, 11 February 2014 (UTC)[reply]
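The self-commitment described above (no technical barrier, tools simply decline to profile users who have not opted in on projects that require it) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the project codes and the idea of storing opt-ins as (user, project) pairs are hypothetical and not part of the proposal.

```python
def may_build_profile(user, project, opt_in_projects, opted_in_users):
    """Return True if a profiling tool may process this user's data from this project.

    opt_in_projects: set of project codes whose community requires opt-in (assumed format).
    opted_in_users:  set of (user, project) pairs that have opted in (assumed format).
    """
    if project not in opt_in_projects:
        # The project has not asked for opt-in, so the tool works as before.
        return True
    # The project requires opt-in: only users who opted in may be profiled.
    return (user, project) in opted_in_users


# Hypothetical example data.
opt_in_projects = {"dewiki"}
opted_in_users = {("Alice", "dewiki")}

print(may_build_profile("Alice", "dewiki", opt_in_projects, opted_in_users))
print(may_build_profile("Bob", "dewiki", opt_in_projects, opted_in_users))
print(may_build_profile("Bob", "enwiki", opt_in_projects, opted_in_users))
```

A check like this lives entirely in the tool, which is why it remains a policy matter: the underlying data stays public and nothing is hidden in the database replicas.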

This discussion fell asleep a while ago, unfortunately. Right now there is an RFC at en:WP about an edit counter opt-in, which will conflict with EU law if a community decides whether the data of single users is aggregated and shown. I still think that it is not the right of a community to decide this but only the concern of everyone for him-/herself. NNW (talk) 11:31, 10 April 2014 (UTC)[reply]

For anyone still interested in this, I've opened a discussion at en:Wikipedia talk:Requests for comment/User analysis tool. SlimVirgin (talk) 23:13, 9 May 2014 (UTC)[reply]

Non-English speakers beg some love

Just two weeks before the proposed end date for this consultation, the translation statistics are very depressing: the top language is French with 80 % translated, only 5 languages are more than 2/3 translated. This means that 4 months have been wasted not involving the global community in the discussion and not spotting translatability issues that will bite later.
One obvious reason here is that the draft is a +200 % length increase compared to the current policy: otherwise, we'd have roughly 14 languages fully translated instead of 0. If the WMF staff is serious about making a privacy policy that people can understand, well of course that's not easy and it probably entails rewriting it from scratch under new premises, to embed the initial feedback received so far and reduce the length by about 66 %. --Nemo 10:28, 30 December 2013 (UTC)[reply]

If the document is too long, have you tried editing it down? :-) --MZMcBride (talk) 10:30, 30 December 2013 (UTC)[reply]
I've made some specific edit proposals but Luis declared they were not "serious" (though in the end he did make some minor changes). I didn't bother making more. --Nemo 10:32, 30 December 2013 (UTC)[reply]
If you want to suggest edits that make the language clearer without changing the policy, by all means - many people have done that and gotten changes in, including you, and the policy is better (and shorter) for it. (The unserious suggestion, if I recall correctly, essentially amounted to removing an entire section of the document.) -LVilla (WMF) (talk)
I don't know what's aggregated in this group but the policy document is fully translated into several languages, including French and German. — Pajz (talk) 12:06, 30 December 2013 (UTC)[reply]
No it isn't, the "More On What This Privacy Policy Doesn’t Cover" section is not translated at all in German. The only group you should use is the complete one I linked above. --Nemo 20:14, 30 December 2013 (UTC)[reply]
And how is one supposed to do that (see #(Technical:) Cannot translate navigation box title above)? — Pajz (talk) 22:11, 30 December 2013 (UTC)[reply]
Like the rest of the policy, this was originally translated into formal German (as well as four other widely-used languages) by paid translators. It is now untranslated because we rewrote the whole section, making it 1/6th shorter (at Nemo's request and in part based on his suggestions), and the translators haven't caught up. This is unfortunate, but also unavoidable when you make changes to a translated document.
We can't have the professional translators re-translate every time we make a change - besides the money, the overhead of entering the new translations in with every change would be huge. (And my understanding is that the volunteer translators often aren't happy with the quality of the professional translations anyway :) So we can either slow the editing of the policy, or accept that sometimes sections of it will not be translated while we're discussing it. I admit neither option is ideal, but I think we have made the right choice in leaning towards fast changes.
If there are things we're doing wrong that are hindering the volunteer translators, I'm happy to listen to suggestions on that front - I do think we've been fixing translation software mistakes as quickly as we can, but if not, let me know. -LVilla (WMF) (talk) 01:52, 31 December 2013 (UTC)[reply]
If it's really essential that we get this translated beyond 5 languages, why don't we just pay for translation in the top 10-20 languages? This is too important to wait on. Steven Walling (WMF) • talk 00:53, 31 December 2013 (UTC)[reply]
Hi Nemo, what's stopping you from sending out a call for volunteer translators via the usual two channels - the translation notifications system and/or the Translators-l mailing list? As far as I can see, neither has been done in this case so far, so it's likely that many interested translators do not yet know about this translation task. (Regarding the first channel, there is a little technical issue in that notifications can only be sent for translatable pages, not for aggregated groups - cf. bug 56187 - but that can be mitigated by sending at least a notification for the main page, and linking the aggregated group in the accompanying text message.)
Regards, Tbayer (WMF) (talk) 03:42, 31 December 2013 (UTC)[reply]
 Done I eventually sent out a translation notification myself (with an emphasis on main text of the privacy policy, to help translators focus their energy, but also inviting translation of the whole group). Many thanks to the volunteers who have since then already translated or updated around 500 translation units; hopefully more will be done over the coming days. So if it's really the case that there are indeed still serious translation problems remaining, we should have a good chance to uncover them before the deadline a week from now.
BTW, we are planning to do the same for the draft for the new data retention policy, which is going to be published soon.
Regards, Tbayer (WMF) (talk) 23:34, 8 January 2014 (UTC)[reply]

Dearchived, I don't see any solution here. My eleemosynary skills are clearly lacking: be careful to beg a coin, you may get a kick and a punch. --Nemo 09:07, 29 January 2014 (UTC)[reply]

Data Retention Guidelines posted

We're happy to announce that the first draft of the new data retention guidelines is now available for your review, feedback, and translation. This draft is the result of a collaboration between many teams within the Foundation, including Analytics, Operations, Platform, Product, and Legal.

As with the other privacy documents, this draft is just that: a draft. We want to hear from you about how we can make it better. As suggested in the discussion about timelines above, we plan to hold the community consultation period for this draft open until 14 February 2014.

Thanks - looking forward to the discussion. —LVilla (WMF) (talk) 21:30, 9 January 2014 (UTC)[reply]

Great to see. I've commented on Talk:Data retention guidelines. //Shell 00:30, 10 January 2014 (UTC)[reply]


SUL account creation dates

If one looks at Special:CentralAuth/darkweasel94, one can see on which dates I first visited all wikis listed there while logged in. This information, which is information about a user simply having read something, not actually actively done anything, is publicly available about everyone with a global account.

This doesn't appear to be covered either in the current privacy policy or in this draft, although it clearly has privacy implications - it can be used to find out that somebody was online on a certain date/at a certain time even if that user didn't actually contribute anything or do a logged action (other than account "creation"). Users don't normally expect that simply looking at a page will show up in any logs, so I think the privacy policy should make it clear that this is done. darkweasel94 (talk) 08:50, 26 January 2014 (UTC)[reply]

You don't think it's comparable to the new user log? PiRSquared17 (talk) 16:41, 28 January 2014 (UTC)[reply]
As far as I can see it basically is the new user log. The problem is, when you click "create your account" after typing a username, twice the same password, and solve a captcha, you actively do something that can reasonably be expected to be publicly logged. When you click an interlanguage link, you don't reasonably expect that to show up in logs. In general I feel that this draft does not sufficiently address the information available in Special:Log, it simply assumes that "public contributions" includes that. darkweasel94 (talk) 20:29, 28 January 2014 (UTC)[reply]
Hi darkweasel94. We discussed your concern internally and have added some language to the "Information We Collect" section in hopes of addressing this issue. Please let us know if you have any further concerns we can help you with! Mpaulson (WMF) (talk) 01:11, 8 February 2014 (UTC)[reply]
Yes, I think that legally at least this should be sufficient, although from a user's point of view I think it would be better to clarify under what circumstances an account is really created, because people might take that to mean just the creation on the home wiki. Proposed wording to be inserted before the last sentence of the last paragraph of that section (feel free to copy-edit, I'm not a native speaker of English):
If you have a unified account for all Wikimedia Sites, which is true for (but not limited to) all accounts created after 2008-xx-xx, account creation on a particular Wikimedia Site may occur, and be publicly logged, whenever you first visit that Wikimedia Site while logged into that unified account.
I think this (the date needs to be filled in, I couldn't quickly find it) should definitely make it clear even to users who don't know anything about how SUL works. darkweasel94 (talk) 13:45, 8 February 2014 (UTC)[reply]
The date (2008-xx-xx) still hasn't occurred, so it is unknown which date it will be. For the moment, you get a non-SUL account if you try to create an account on a project and someone else already has that user name on some other project without having activated SUL. For example, sulutil:Account tells that the user name "Account" exists on some projects but that there is no SUL account with that name. If you try to create an account with the user name "Account" on some other project, then you will get a non-SUL account with that name.
It would be a good idea to automatically create SUL accounts in cases like this to prevent the creation of further SUL conflicts. --Stefan2 (talk) 16:52, 8 February 2014 (UTC)[reply]
That is interesting to know, thank you. In that case, "all" should probably be replaced with "most", and the date should be whenever new accounts with entirely new names became SUL accounts (as my current account already is). darkweasel94 (talk) 18:02, 8 February 2014 (UTC)[reply]
Hi darkweasel94 and Stefan2. Thank you for your suggestions. I understand why you wanted clarification in the privacy policy that some information may be contained within public logs (and happily added language indicating this). However, I'm unclear as to why the distinctions between various SUL/semi-SUL accounts are something that should be contained within the privacy policy. Could you explain to me what the connection is between this information and the privacy practices of WMF? Thanks! Mpaulson (WMF) (talk) 21:31, 11 February 2014 (UTC)[reply]
What I mean is this: try clicking this link leading to Latin Wikipedia, a project where you haven't yet logged in from your staff account (which is a SUL account). Then, go to Special:CentralAuth/Mpaulson_(WMF). You will see that in the row "la.wikipedia.org", there is now public information about exactly when you clicked that link (or some other link leading to la.wikipedia.org, but in any case it shows that you were online at the time).
This applies only to SUL accounts, however. Non-SUL accounts can log in on only one wiki, so if they click the above link, they'll be logged out on that wiki. I think it's useful to tell people in some way "if your account is relatively new, it's probably a SUL account and affected by this problem". The current wording doesn't really make it clear that simply clicking an interwiki link can count as account creation; from a newbie's point of view, account creation is done by the "create account" link, not somehow else. darkweasel94 (talk) 21:51, 11 February 2014 (UTC)[reply]
Thanks for the clarification. I understand what you are saying, but I think that level of specificity isn't really what we are going for in the privacy policy. While I think it's important (as you noted) to explain that some information about a user's actions may be accessible in public logs, I don't think it's the right place to explain in detail how the information in specific logs is gathered. Mpaulson (WMF) (talk) 23:15, 12 February 2014 (UTC)[reply]


WMF Response to concerns about unsampled data

It is important to understand Wikimedia projects including Wikipedia as they are -- critical Internet and mobile applications in a rapidly changing social and technological landscape. If we are to remain relevant and vibrant we need to be able to understand the behavior and preferences of readers and editors. From a practical standpoint, it's important that we give ourselves the tools that we need to do this. We are not simply collecting data because we can -- there are several categories of use cases that are impossible to satisfy without unsampled data.

For example, we need to understand how desktop and mobile users interact with the Sites as browsing patterns change. To help tackle problems like that, we need to be able to capture information about a session that requires unsampled data, including the number of pages visited, session duration, and other valuable indications of engagement, and ultimately retention. As the Wikimedia movement grows both geographically and across different platforms, we need to understand the new and different ways in which users interact with the Sites.

There are also use cases around long tail behavior that require further research, which sampling would render impossible or, at best, very difficult. For example, there are lines of thinking around ratios of readership and editing where we would like to understand if pages with relatively low readership are actually good sources of editors. While this is only a theory, we need to be able to address this and similar lines of research.

Finally, we want to underline our commitment to privacy. Unsampled data does not have to be any less private than sampled data. Our commitment to aggregation and anonymization would still be applied with the same degree of effectiveness and respect for Wikimedia readers and editors. Toby Negrin, WMF Director of Analytics TNegrin (WMF) (talk) 00:07, 5 February 2014 (UTC)[reply]

It is also worth noting that the recording of raw unsampled data was permitted by the old privacy policy, which said that we sampled such raw logs to provide statistics and kept the raw logs private. We tended not to do it (primarily, as I understand it, for performance reasons), but it was permitted. The change between the old and new policy is that the new policy is much more clearly written, not that the rules in this area changed. (This is a good example of why the policy is so detailed: we wanted everyone to know what we are doing, and have good, serious discussions about it.)
I think it is also worth pointing out that in many cases, sampling is also a poor substitute for a solid set of data retention guidelines. I’m reasonably confident that even increased amounts of unsampled logging will still be a net win for user safety and privacy when combined with the new data retention guidelines.
On a more personal note, because clarity and transparency was our goal all along, it was particularly frustrating to see people speculate about why we “changed” the policy, and accuse us of acting in bad faith. The issue only came up in this discussion because of our extreme dedication to transparency. I hope that instead of that sort of speculation, this last part of the discussion can focus on the actual pros and cons of unsampled data, and why we think it is important to have that option, as Toby has outlined above. —LVilla (WMF) (talk) 00:12, 5 February 2014 (UTC)[reply]
Not "bad faith" but misjudgement. What did you expect us to think when we find, for example, a requirement to log in on Google in order to get access to a Wikimedia project? That feels so wrong. Alexpl (talk) 15:25, 12 February 2014 (UTC)[reply]
Alexpl, that one was fixed. On the other hand, the "response" above doesn't really respond on anything. It rather reinforces the appearance that WMF now wants to follow the model "let's collect stuff we don't really need now, one day it might come handy", rather than the usual privacy-friendly cautious approach exemplified (if not mandated) by the current privacy policy where it talks of sampled logs. --Nemo 15:35, 12 February 2014 (UTC)[reply]
If it comes in handy for Wikipedia only, and access is restricted, it could be debated. But my fear is that third parties, somewhere along that vast research process TNegrin announced, may get their hands on those data. So the requirement for a Google account on a WMF project seemed like a taste of things to come. Alexpl (talk) 16:30, 12 February 2014 (UTC)[reply]
"It rather reinforces the appearance that WMF now wants to follow the model 'let's collect stuff we don't really need now, one day it might come handy', rather than the usual privacy-friendly cautious approach exemplified (if not mandated) by the current privacy policy where it talks of sampled logs." I see the concern, but I think it is wrong. Sampled logs were false security that did not protect the people who had the bad luck to be involved in the sample. (And my understanding is that they were never intended as security, only for performance reasons.) A real data retention policy that applies to all logs, sampled or not, and makes sure that we actively delete sensitive information, is a stronger and more protective policy for all users. So I think we're moving in the right direction with this. —LVilla (WMF) (talk) 23:15, 12 February 2014 (UTC)[reply]

Should users have right to know when, by whom and under which reason they have been checkusered?

Hi, I think the current draft does not address this issue directly, but allowing users to know this information (not the checkuser data itself) should make checkuser more transparent. Is this the right place to propose this idea?--朝鲜的轮子 (talk) 01:54, 9 February 2014 (UTC)[reply]

This would reveal information about other users. Assume that a checkuser tells me that he has been looking at my data. Immediately after that, I go to Meta:Requests for CheckUser information and find that someone has been asking for information about a user who self-identifies as living in the same country as I do. Now I suddenly know that it is very likely that the user has been using the same IP address as I have been using, which may reveal more information to me about who the user might be. This sounds bad, so let's avoid it. --Stefan2 (talk) 20:22, 9 February 2014 (UTC)[reply]

invalid id's in static anchors (also not working in translations)

  • all headings are translatable, so they cannot serve as reliable link targets that work with the same links across languages
  • so we need to define static anchors hidden in tvars.
  • but static anchors must be valid ids and must be unique on the page!
  • Most anchors are invalid as they include invalid characters like spaces, or commas and other punctuation.

This means that all existing static anchors must be changed, making sure they are different from the default id's generated by MediaWiki for section headings. The simplest approach is to note that English headings always start with a capital letter (so MediaWiki will generate id's starting with a capital letter).

Let's then generate only id's in English starting with a lowercase letter; we can abbreviate these id's but we must remove spaces and punctuation, possibly by replacing them with a single hyphen. We can filter out some non-meaningful words present in section headings from the static anchors we generate.

Then we must make sure that anchors used in source links throughout the article will point to the correct target static anchor. Remove all dependency to translatable section headings... verdy_p (talk) 01:12, 12 February 2014 (UTC)[reply]
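The rules above (lowercase first letter, no spaces or punctuation other than a single hyphen, optional dropping of filler words, uniqueness on the page) can be sketched as a small helper. This is a minimal illustration, not the procedure actually used on the page: the function name, the stop-word list and the "s-" prefix and numeric suffix conventions are assumptions made for the example.

```python
import re

# Hypothetical filler-word list; the discussion only speaks of "non-meaningful words".
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "we", "this", "what"}


def static_anchor(heading, used):
    """Build a lowercase, hyphen-separated id from an English heading.

    `used` is the set of ids already present on the page; the new id is added to it.
    """
    # Drop apostrophes so "Doesn't" becomes "doesnt" rather than "doesn-t".
    text = heading.lower().replace("'", "").replace("\u2019", "")
    # Keep only runs of ASCII letters and digits; everything else is a separator.
    words = re.findall(r"[a-z0-9]+", text)
    # Filter filler words, but fall back to all words if nothing would remain.
    kept = [w for w in words if w not in STOP_WORDS] or words
    base = "-".join(kept)
    # Guarantee a lowercase first letter, so the id cannot equal a default
    # heading id, which starts with a capital letter (assumed "s-" prefix).
    if not base or not base[0].isalpha():
        base = "s-" + base
    # Ensure uniqueness on the page by appending a counter (assumed convention).
    candidate = base
    n = 2
    while candidate in used:
        candidate = f"{base}-{n}"
        n += 1
    used.add(candidate)
    return candidate


used_ids = set()
print(static_anchor("Information We Collect", used_ids))  # information-collect
print(static_anchor("Information We Collect", used_ids))  # information-collect-2
```

Whatever scheme is chosen, the same id has to be used both in the target anchor and in the tvar of every link pointing to it.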

You are correct on all counts; I noticed that when I was fixing the broken links that someone else pointed out to me as well. I thought that I had set them all up as unique anchors placed above the header (rather than using any kind of header) and put them in tvars, but it's possible that I made some mistakes here (since it was the first of the privacy policy pages I marked for translation) and didn't get religious about using {{anchor}} until some of the other pages. Right now I'm swamped so I don't anticipate being able to truly fix all of this until we move to foundation wiki (after the board approves) or, at best, in about 2 weeks. I want to make sure that I set enough time aside to be confident that when I change an anchor I change all of the links to it. Because of that my goal at the moment is to 'make it work for now' and most of these links do appear to work (at least in FF and Chrome) on mediawiki. If you have some time to work on it I would very much appreciate the help but don't feel obligated. I'm going to try and work on it one of these evenings but because of my other projects (and a work trip next week) I don't want to promise anything. Jalexander--WMF 01:40, 12 February 2014 (UTC)[reply]
I fixed the invalid id's used by anchors in the page; but a few entries in the "summary" template need to be synchronized in translations (the "$id" used by tvars have identical names as the id's used in the HTML for target anchors in the Privacy policy page). These entries are now fuzzy (I fixed them, in French in order to test that they were all functional). Theses id's no longer contain any space or punctuation except the hyphen, and are often shorter (this makes translations easier to edit too). To translators: as usual, don't translate the "$id" placeholders (as usual), but make sure they are present (don't replace them by updated URLs for a specific language).
Note that this summary template can be viewed now directly in its page: its links will point to the translated version to the Privacy Policy page (in the current user language) instead of linking to the current page as they did before, but for this works only if you click the "purge" button after you have made edits to translation units in the "summary" template.
This is because that template is now fully within a noinclude section, and when viewing the template page, it locally takes a "PP" parameter pointing to the page name of the Privacy page. But when the template is transcluded n the Privacy page, this template parameter is not used, and is empty by default, so the links will point to the local page, as they did before. Hope this helps testing the links.
I also fixed the RTL layout of the page (for Arabic, Hebrew, Divehi...), but this depends on the template {{dir}}.
To Foundation admins: if you import it later to the Foundation site, which does not have the Dir template, the /Left and /Right subtemplates will have to be edited on the WMF site depending on translations. {{dir}} is used in the /Left, /Right and /RHeader subtemplates, using the helper {{pagelang}} which is supposed to return the content language code of the current page, i.e. the suffix after the "/" (even in languages currently not supported by MediaWiki). The Dir utility template should be OK with the list of RTL language translations supported anyway (the minimum list of RTL languages in the Dir template should include "ar", "bcc", "dv", "fa", "he", "ug", "ur", "yi", but you can easily complete the list by using the statistics page of the Translate tool, which gives the full list of autonyms for the language codes supported; this list is sortable by autonym and you get a column of language codes to define in the Dir template).
Also the ugly black double-stroke borders in the small-print section summaries (to the left in English) now use the colors of the main summary box at the top of the page, with a single stroke and the same background color. verdy_p (talk) 09:47, 12 February 2014 (UTC)[reply]
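As an illustration of the id convention described above (no spaces and no punctuation other than the hyphen), here is a minimal sketch in Python. The function name make_anchor_id is hypothetical, and the lowercasing step is an assumption for the example; neither is taken from the actual templates on the page.

```python
import re


def make_anchor_id(heading: str) -> str:
    """Turn a heading into an id made only of letters, digits and hyphens.

    Hypothetical helper for illustration: every run of other characters
    (spaces, punctuation) collapses into a single hyphen. Lowercasing is
    an assumption, not something the discussion specifies.
    """
    slug = re.sub(r"[^A-Za-z0-9]+", "-", heading)
    return slug.strip("-").lower()


if __name__ == "__main__":
    print(make_anchor_id("What This Privacy Policy Does & Doesn't Cover"))
```

Whatever id is produced this way has to be used identically in the anchor and in every "$id" tvar that links to it, which is why changing one without the other breaks the links.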
Thank you so much for all of your help Verdy, I will look through things as soon as I'm able to try and unfuzzy and synchronize. Jalexander--WMF 11:12, 12 February 2014 (UTC)[reply]
@Jalexander: if you have the necessary privilege, can you add "azb" (Southern Azeri, written in the Arabic script, outside Azerbaijan, which uses the Latin script for Azeri "az") to the list of RTL languages in {{Dir}}? In fact there may be a few other codes to add, if I just consider the list displayed on the Translation statistics, where I see other MediaWiki-supported languages using the Hebrew, Arabic, Aramaic, or Divehi scripts in their displayed autonym: some of these are language variants of another language already listed as RTL in their main variant, or not listed as their main variant is LTR (I think this may concern other African languages, like Hausa). The list should be reviewed by the Wikimedia Language Committee. verdy_p (talk) 14:26, 12 February 2014 (UTC)[reply]
Added azb, I agree we should do a review and see what other languages should be getting the treatment. Jalexander--WMF 00:58, 13 February 2014 (UTC)[reply]
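For readers unfamiliar with how the direction lookup discussed in this thread works, the logic can be sketched roughly as follows in Python. This is not the {{dir}} or {{pagelang}} template code; the names RTL_CODES, page_lang and direction are hypothetical, the "en" default is an assumption, and the code list is only the minimum list quoted above plus "azb".

```python
# Hypothetical sketch of the lookup described in this thread; the real
# templates are wikitext, and the set below is only the minimum list
# mentioned in the discussion plus "azb".
RTL_CODES = {"ar", "azb", "bcc", "dv", "fa", "he", "ug", "ur", "yi"}


def page_lang(title: str, default: str = "en") -> str:
    """Return the language suffix after the last "/" of a page title.

    Falling back to "en" for titles without a suffix is an assumption.
    """
    if "/" not in title:
        return default
    return title.rsplit("/", 1)[1]


def direction(lang: str) -> str:
    """Return "rtl" for codes in the RTL list, "ltr" otherwise."""
    return "rtl" if lang.lower() in RTL_CODES else "ltr"


if __name__ == "__main__":
    for title in ("Privacy policy/he", "Privacy policy/fr", "Privacy policy"):
        print(title, direction(page_lang(title)))
```

Any code missing from the set silently falls back to left-to-right, which is why the list needs the review suggested above.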

Slight changes to definition of PII, this policy, and data retention policy as a result of question about headers from Verdy_p

Verdy_p asked a question about HTTP headers on the data retention policy, and so we did some final reviews of our language on that issue. As part of that review, we realized Friday that there was a sentence in the main privacy policy that was poorly drafted (not future-proof) and inaccurate. It prohibited collecting four specific browser properties. This is bad drafting, because it isn't future-proof: what if new browser properties are added by browser manufacturers? What if we realize other existing properties are problematic? It also was inaccurate because some of this sort of data may be collected (but usually not retained or logged) for useful, non-harmful purposes. For example, it could be used to help determine how to make fonts available efficiently as part of our language work.

Reviewing this also made us realize that we'd made a similar drafting mistake in the definition of PII- it was not flexible enough to require us to protect new forms of identifying information we might see in the future.

We think the best way to handle this is in three parts, and have made changes to match:

  1. Broaden the definition of PII by adding "at least", so that if we discover that there are new types of identifying information, we can cover them as necessary. This would cover these four headers, for example, but could also cover other things in the future. (change)
  2. Add headers specifically as an example in the data retention policy, so that it is clear this sort of data has to be protected in the same way as all other PII. (change)
  3. Delete the specific sentence. (change)

We think, on the whole, that these changes make us more able to handle new types of data in the future, while protecting them in the same way we protect other data instead of in a special case. Please let us know if you have any concerns. -LVilla (WMF) (talk) 18:47, 13 February 2014 (UTC)[reply]

Slight changes? Not in my view! This is a MAJOR change. Revoking a commitment, the DAY BEFORE debate is scheduled to close, that browser sniffing is incompatible with this Policy is no slight change. I'm trying not to blow my lid, but I'm really pissed off! The deadline needed extension because of the change and needs extension, retroactively, now. Although it appeared at first that this MAJOR change was slipped in under the wire, I understand that it was prompted by verdy_p's questions starting 1/15. Still, assuming all that LVilla says is valid regarding future-proof-ness, that in no way justifies total removal of the commitment from the policy. The policy is now, once again, a blatant lie. I had fixed it. The time for considering such radical changes was back in December when this was discussed AT LENGTH. @LVilla (WMF): what about that discussion? I'm disappointed that no one else involved in the December discussion said a thing about this troubling change!
  1. Change 1 is awful; see the December discussion. I said then, "Let's not set a bad example and be deceitful about what we collect…" With the changes LVilla has made, Wikimedia IS setTING a bad example and beING deceitful about what IT collectS. FOR SHAME!
  2. Change 2 is awful for the same reasons.
  3. Change 3 … slight? Yeah, and nothing Snowden blew the whistle on was illegal.

I think the community is owed an apology and I think the changes need to be revisited. We need to stop lying to our users. Lying to our fellow users is inexcusable. If anyone wants to talk to me about this offline, let me know. --Elvey (talk) 06:44, 19 March 2014 (UTC)[reply]

WTF?

@LVilla (WMF):, involved in the December discussion:@Geoffbrigham:, @Drdee:, @Stephen LaPorte (WMF): No response to my comment above? If this isn't going to be addressed, I guess I can ping the board directly to let them know, before they vote. --Elvey (talk) 02:24, 25 March 2014 (UTC)[reply]

We didn't respond because I don't think your criticisms are accurate, and your tone suggests you do not want to have a constructive conversation. In particular, the change you've characterized as "deceitful" allows us to add more things, but not take them away, from the list. I think most people would agree that, as we mentioned above, this is a pro-user and pro-future-proofing step - it allows us to protect users more in the future, but not less. If you'd like to take that to the board, feel free, but I'll feel very comfortable explaining to them why you're wrong. Sorry that we disagree. —Luis Villa (WMF) (talk) 18:09, 28 March 2014 (UTC)[reply]
You revoked a commitment to users that browser sniffing is incompatible with this Policy. That is no slight change, no matter how you spin it. And it seems inexplicable to me why you think that revoking a pro-user commitment to collecting less data is a "pro-user" step. But intelligent people disagree sometimes. --Elvey (talk) 02:31, 6 May 2014 (UTC)[reply]
The bottom line is that I pushed for and gained consensus for language that made it clear that the privacy policy would not allow browser sniffing. It was added and stayed in the draft for weeks. Then on the last day, it was removed. Now we have a privacy policy that allows browser sniffing, and yet claims to be informative. That's an untenable situation. That's the bottom line. If this is in any way inaccurate, I welcome corrections. Specific corrections only. Vague assertions based on no specific facts, as in your last comment, are not appropriate. --Elvey (talk) 06:35, 6 May 2014 (UTC) (update 20:49, 10 May 2014 (UTC): @Geoffbrigham:, @Drdee:, @Stephen LaPorte (WMF):, @LuisV (WMF): Well? )[reply]
But wait, the checkuser tool contains the IP address, Operating system and browser in order to identify potential sockpuppet accounts. Are you saying that they cannot do that anymore? Reguyla (talk) 18:15, 16 May 2014 (UTC)[reply]

Closing of the Community Consultation for the Draft Privacy Policy

The community consultation for this Privacy Policy draft has closed as of 14 February 2014. We thank the many community members who have participated in this robust discussion since the opening of the consultation 03 September 2013. Your input has helped create a transparent proposed Privacy Policy that reflects our community's values. You can read more about the consultation and the next steps for the proposed Privacy Policy on the Wikimedia blog. Mpaulson (WMF) (talk) 00:00, 15 February 2014 (UTC)[reply]

Fine

and what happens with this kind of "Schurkenlisten" (list of desperados) https://de.wikipedia.org/w/index.php?title=Benutzer:Seewolf/Liste_der_Schurken_im_Wikipedia-Universum&diff=121652431&oldid=121648831#Angel54? Nothing, many sign, nothing happens. That's Wikipedia policy. Background: I'm a history teacher and they call me an "Antisemite" there - although nothing ever happened in this case: that's a kind of "ratfucking", don't you agree?--Angel54 5 (talk) 01:16, 15 March 2014 (UTC) And I add an utterance of that person, who holds me in prison there (from twitter), to demonstrate what kind of underlying pattern there really is: https://twitter.com/search?q=hkrichel&src=typd&f=realtime [reply]

Harald Krichel ‏@hkrichel 12. Feb. @zynaesthesie Das funktioniert nicht für homophobe Antisemiten, eine überdurchschnittlich häufige Betroffenenkombination (seems to me meanwhile erased)

He answers (translated): That only doesnt work for homophobic antisemites, a more than average combination number of persons concerned.--Angel54 5 (talk) 20:26, 22 March 2014 (UTC) Then take his last one: Harald Krichel ‏@hkrichel 7 Std.[reply]

Merkel hat recht: Ein Twitterverbot ist keine Zensur. Zensur macht man mit feineren Werkzeugen.

Merkel is right: Banning Twitter is not censorship. Censorship is done with finer tools. Meaning: he knows what censorship is and uses that kind of thing according to his own sense of right or wrong. Btw, he has the German WP in his hands, and contributes to it by programming filters that no one ever agreed to. No one knows how far this scheme goes, because most of that filtering is hidden.--Angel54 5 (talk) 20:53, 22 March 2014 (UTC)[reply]

There's still a need for more information

I understand that you didn't want to commit yourself absolutely in the policy. Nonetheless, "Once we receive personal information from you, we keep it for the shortest possible time that is consistent with the maintenance, understanding, and improvement of the Wikimedia Sites, and our obligations under applicable U.S. law." is a question waiting to be asked. Can you provide the users with a report of how long these retention times are, and especially, what obligations you feel you have under U.S. law? Wnt (talk) 10:57, 22 March 2014 (UTC)[reply]

Seconded. --Nemo 11:04, 22 March 2014 (UTC)[reply]
You can ask about the requirements of US law - but you can hardly ask Wikimedia to promise in the Privacy Policy (by giving a specific timespan) that those laws won't change. Alexpl (talk) 09:55, 22 April 2014 (UTC)[reply]
Retention timespans consistent with the maintenance, understanding, and improvement of the Wikimedia Sites can and should be provided.
Retention timespans consistent with perceived obligations under applicable U.S. law can and should be provided.
--Elvey (talk) 02:26, 6 May 2014 (UTC)[reply]
They sure can. But I see little benefit to the users, since such timespans do not apply to warrantless domestic wiretapping and data retention without any judicial oversight by state agencies. Alexpl (talk) 17:03, 8 May 2014 (UTC)[reply]
You're being myopic. Those with dragnet surveillance abilities aren't the only ones who can trample privacy rights. Privacy rights are regularly trampled without dragnet surveillance. --Elvey (talk) 20:55, 10 May 2014 (UTC)[reply]
The archives should prove how myopic I am about third parties. But the fact remains that data could show up after the retention timespan consistent with the law, and I don't want WM to be held accountable for that because it had promised to have that data deleted by a specific date. Something like: "We will delete it after X years - but it won't disappear if the data-mining industry or a state agency has gotten its hands on it before that date" does not sound helpful. Alexpl (talk) 06:05, 12 May 2014 (UTC)[reply]
Hi Wnt. Alexpl (talk) is accurate that we cannot predict whether our obligations under U.S. law will change in the future and require us to keep certain information for a longer or shorter period of time. One of the reasons that we chose not to include time frames in the privacy policy is that we want the flexibility to adjust our retention times as the law or our technological needs change, without seeking board approval for every adjustment. We do, however, provide our users with a better idea of what our promise to keep information for the shortest time possible means through our document retention guidelines. We also recently released requests for user information procedures and guidelines to provide our users with more information about our obligations under U.S. law and how we respond to requests for user information. Finally, we’re happy to answer, to the best of our ability, any specific questions you have if either of those documents don’t address them. RPatel (WMF) (talk) 20:29, 20 May 2014 (UTC)[reply]

Tracking pixel

Where's the discussion which determined that this technique with "less than the best reputation" is needed on the voyage? The phrase "tracking pixel" doesn't even exist in the cookie FAQ. More dirty laundry hanging in the front yard, s'il vous plaît, if you're serious about public comment. MaxEnt (talk) 07:29, 8 May 2014 (UTC)[reply]

In the archive maybe. I'm not qualified to answer the FAQ problem. Alexpl (talk) 08:24, 9 May 2014 (UTC)[reply]
https://meta.wikimedia.org/wiki/Talk:Privacy_policy/Archives/2014 Obviously they're very very serious about creating the appearance of consultation with and acceptance of help from the user community. However, the history of edits shows otherwise: I saw no users arguing for the opaqueness around critical issues like profiling that I tried to address through comments and edits. And yet the edits I proposed and contributed were removed. On the plus side, although the policy is certainly not clear about what is collected, at least it no longer claims to be clear about what is collected. Earlier versions both were not clear and yet claimed to be clear. --Elvey (talk) 03:25, 11 May 2014 (UTC)[reply]
MaxEnt (talk), you can find tracking pixels in our glossary of key terms. If you would like to read some of the discussion we had during the consultation regarding this topic, please see answers from tech here and discussion regarding third party data collection here. RPatel (WMF) (talk) 18:59, 14 May 2014 (UTC)[reply]

Edits about tracking and personal information

This edits User:Elvey was remedied. User:LVilla (WMF) Elvey, please share context? (Like you did for some other thing here). Gryllida (talk) 04:30, 7 January 2014 (UTC)[reply]

To explain why I changed those -
  • this edit removed "retained" from the description of what we do with direct communications between users. I did this because it is not accurate to say that we retain those - we may in some cases but in most cases that I'm aware of we don't.
So does anyone think that justifies silence on this important topic? Not that I've seen (other than staff.)--Elvey (talk) 03:25, 11 May 2014 (UTC)[reply]
  • this edit removed an example about tracking pixels that Elvey had edited. Elvey's edit correctly pointed out that the example was a little hard to understand, but I don't think his edit improved it. I spent a little bit of time trying to explain it better without writing a book or assuming the reader is a web developer, and failed, so I deleted it. If folks want to take another stab at it, I'm happy to discuss it here.
Sorry for not explaining this earlier, User:Elvey - I do appreciate that you were trying to improve it :) —LVilla (WMF) (talk) 00:00, 9 January 2014 (UTC)[reply]
So does anyone think that justifies increasing opacity regarding this important topic? Not that I've seen (other than staff.) --Elvey (talk) 03:25, 11 May 2014 (UTC)[reply]

Layout problem

The blue-box summary for each major section in the left margin seems to be creating blank space in the main prose, as if there were a {{clear}} around it rather than being adjacent to the actual text. I'm using Firefox 29.0 on OS X. Seems to resolve itself if I make my browser window extra wide, so maybe something is hardcoded for some minimum something? Sorry, I can't upload images to meta to illustrate it. DMacks (talk) 00:16, 9 May 2014 (UTC)[reply]

Hi DMacks, thanks for pointing this out! We are looking into whether we can fix this. RPatel (WMF) (talk) 19:03, 14 May 2014 (UTC)[reply]

Exemptions from the Privacy Policy

I'm going to make this brief, because I don't think anyone really cares anyway, but I have a bit of a problem with the wording of this new privacy policy. In particular the part which says that admins and functionaries (checkusers and the like) are exempt. Now I realize that there has been a developed culture where the admins here are treated like royalty and I agree there needs to be some language that allows them to do their tasks. But to say they are exempt from policy referring to private information is a big problem for me. Functionaries I can go with because their identity and age are vetted. But administrators are selected by the community and their identities are never verified. There are enough problems with admin abuse on Wikipedia. We really should not be writing language that specifically excludes them from the privacy policy. Reguyla (talk) 02:17, 15 May 2014 (UTC)[reply]

Are you referring to the "To Protect You, Ourselves & Others" section? The box on the left summarizes the cases when "users with certain administrative rights" can disclose information:
  • enforce or investigate potential violations of Foundation or community-based policies;
  • protect our organization, infrastructure, employees, contractors, or the public; or
  • prevent imminent or serious bodily harm or death to a person.
The third definitely makes sense. The second one is somewhat vague (protect the public/employees from what?), but seems reasonable. However, the first one could potentially be problematic. Violating WMF policy is very different from violating a "community-based" policy. Which part of the new privacy policy are you concerned with? I don't see anything where admins "are exempt", but I admit I only searched the document for the word "admin[istrator]". PiRSquared17 (talk) 22:07, 15 May 2014 (UTC)[reply]
Have you tried uncollapsing? The most important parts of the text are the two collapsed ones. Or, Talk:Privacy_policy/Archives/2014#Google Analytics, GitHub ribbon, Facebook like button, etc. and the three threads linked from it (plus some others). --Nemo 16:34, 16 May 2014 (UTC)[reply]
Oh yeah I read every word, which leads to a separate issue of it being very long and sufficiently complex and legalistic to ensure very few will take the time to read it. With regard to the matter of admins and privacy, there are multiple problems with not clearly defining their role in the privacy policy. For example:
  1. There are about 1400 admins on the English wiki alone with varying levels of activity and interpretations of policy. Of those, only about 500 edit more than once every thirty days, and of those fewer than 100 edit every day.
  2. They are not vetted through the WMF and are anonymous, making privacy security dubious.
  3. Even the functionaries like checkusers are questionable: even though their identifications are verified through the WMF, the verification process is pretty limited and the documentation isn't retained.
So I would recommend rewording the part about Admins like Checkuser, to refer to functionaries instead of admins and I would lose the loose wording of who is exempt. We don't have that many roles, we should just list them. Reguyla (talk) 18:12, 16 May 2014 (UTC)[reply]
@Nemo: Why are those boxes collapsed? They contain important information.
@Reguyla: Ah, I think I see what you are referring to now. "Administrative volunteers, such as CheckUsers or Stewards" is not clear whether it includes normal admins (sysops) or only CU/OS/Stewards (who are at least identified to the Wikimedia Foundation and have specific policies, as well as the access to nonpublic information policy). It would make sense to list out the specific groups or rights this covers. I don't see why admins should be exempt from policies regarding privacy. This wording seems to allow admins, essentially normal users with a few extra buttons, to disregard the privacy of other users, if I am interpreting it correctly.
@LVilla (WMF): are normal admins (sysops) exempt from this policy, or does that wording only apply to CU/OS/Stewards, who have more specific policies? PiRSquared17 (talk) 21:53, 16 May 2014 (UTC)[reply]
Hi Reguyla & PiRSquared17. Thank you for your comments and questions. We wanted to clarify why administrative volunteers are excluded from the privacy policy. The privacy policy is meant to be an agreement between the Foundation and its users on how the Foundation will handle user data. The Foundation can’t control the actions of community members such as administrative volunteers, so we don’t include them under the privacy policy. However, administrative volunteers, including CheckUsers and Stewards are subject to the access to nonpublic information policy (access policy). Under the access policy, these volunteers must sign a confidentiality agreement which requires them to treat any personal information that they handle according to the same standards outlined in the privacy policy. So, even though administrative volunteers are not included in the privacy policy, the access policy and the confidentiality agreement require them to follow the same rules set forth in the privacy policy. I hope that clears up any confusion. RPatel (WMF) (talk) 20:48, 20 May 2014 (UTC)[reply]
The Access to nonpublic information policy does not apply to "normal" sysops who are not identified to the Wikimedia Foundation, but who may have access to some private data (deleted edits). PiRSquared17 (talk) 23:07, 20 May 2014 (UTC)[reply]

Typo

The phrase "such a merger" should read "such as a merger". If this is a community-developed privacy policy draft, why isn't it editable? I shouldn't have to post notices like this just to get a typographical error fixed. Semi-protection from IP vandals ought to be sufficient. If a page as contentious as en:w:Wikipedia:Manual of Style can be editable, so can this.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  08:09, 17 May 2014 (UTC)[reply]

Because, SMcCandlish, this Policy is approved by the Board, and the Board can only approve a particular version. People can't just add whatever they think "improves" the document afterwards, just as administrators can't just "improve" passed legislation. — Pajz (talk) 08:40, 17 May 2014 (UTC) (That said, I'm very sure both Legal and the Board welcome pointers to such errors, I'm just saying that this is unlike something like the Wikipedia Manual of Style.)[reply]
Somewhere in there it still says it's a draft being worked on, not an approved final policy. That's why I thought it should be editable.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  09:57, 17 May 2014 (UTC)[reply]
Thank you, SMcCandlish. We will fix the typo. RPatel (WMF) (talk) 03:05, 20 May 2014 (UTC)[reply]
Fixed! Thanks. RPatel (WMF) (talk) 20:16, 20 May 2014 (UTC)[reply]