Policy:User-Agent policy: Difference between revisions

From Wikimedia Foundation Governance Wiki
As of February 15th 2010, Wikimedia sites require a '''HTTP User-Agent header''' for all requests
 
Anomie (talk | contribs)
Mention that copying a browser user agent for a bot is not allowed
Line 5: Line 5:
:''Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice.''


This change is most likely to affect scripts (bots) accessing Wikimedia websites such as Wikipedia automatically, via api.php or otherwise [http://www.mediawiki.org/wiki/API:FAQ#do_I_get_HTTP_403_errors.3F]. If you run a bot, please send a User-Agent header identifying the bot and supplying some way of contacting you, e.g.: <code><nowiki>User-Agent: MyCoolTool (+http://example.com/MyCoolToolPage/)</nowiki></code>. For more information, please refer to the [http://www.mediawiki.org/wiki/API:Quick_start_guide#Identifying_your_client MediaWiki API Documentation].
This change is most likely to affect scripts (bots) accessing Wikimedia websites such as Wikipedia automatically, via api.php or otherwise [http://www.mediawiki.org/wiki/API:FAQ#do_I_get_HTTP_403_errors.3F]. If you run a bot, please send a User-Agent header identifying the bot and supplying some way of contacting you, e.g.: <code><nowiki>User-Agent: MyCoolTool (+http://example.com/MyCoolToolPage/)</nowiki></code>. Do not copy a browser's user agent for your bot, as bot-like behavior with a browser's user agent will be assumed malicious.[http://lists.wikimedia.org/pipermail/wikitech-l/2010-February/046783.html] For more information, please refer to the [http://www.mediawiki.org/wiki/API:Quick_start_guide#Identifying_your_client MediaWiki API Documentation].


Web browsers generally send a User-Agent string automatically; if you encounter the above error, please refer to your browser's manual to find out how to set the User-Agent string. Note that some plugins or proxies for privacy enhancement may suppress this header. However, for anonymous surfing, it is recommended to send a generic User-Agent string, instead of suppressing it or sending an empty string. Note that other features are much more likely to identify you to a website - if you are interested in protecting your privacy, visit the [http://panopticlick.eff.org/ panopticlick project].

Revision as of 15:54, 24 February 2010

Note: this page is purely informative, reflecting the current state of affairs. To discuss this topic, please use the wikitech-l mailing list.

As of February 15th 2010, Wikimedia sites require an HTTP User-Agent header for all requests. This was an operational decision made by the technical staff, and it was announced and discussed on the technical mailing list [1][2]. The rationale is that clients that do not send a User-Agent string are mostly ill-behaved scripts that cause a lot of load on the servers without benefiting the projects. Note that non-descriptive default values for the User-Agent string, such as the one used by Perl's libwww, may also be blocked from using Wikimedia websites (or parts of them, such as api.php). User agents (browsers or scripts) that fail to send an informative User-Agent string may now encounter an error message like this:

Scripts should use an informative User-Agent string with contact information, or they may be IP-blocked without notice.

This change is most likely to affect scripts (bots) accessing Wikimedia websites such as Wikipedia automatically, via api.php or otherwise [3]. If you run a bot, please send a User-Agent header identifying the bot and supplying some way of contacting you, e.g.: User-Agent: MyCoolTool (+http://example.com/MyCoolToolPage/). Do not copy a browser's user agent for your bot, as bot-like behavior with a browser's user agent will be assumed malicious.[4] For more information, please refer to the MediaWiki API Documentation.
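As an illustration only, a bot written in Python might attach such a header as sketched below. The tool name and contact URL are the placeholder values from the example above, and the helper names build_request and fetch are hypothetical; only the standard library's urllib.request is assumed.

```python
import urllib.request

# Placeholder identity taken from the example above; replace it with your
# own tool name and a page or address where you can be contacted.
USER_AGENT = "MyCoolTool (+http://example.com/MyCoolToolPage/)"


def build_request(url: str) -> urllib.request.Request:
    """Return a Request carrying an informative User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def fetch(url: str) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(build_request(url)) as response:
        return response.read()
```

Calling fetch with an api.php URL would send the request with the header set. Without an explicit header, urllib.request sends its own default User-Agent, which is the kind of non-descriptive value this policy warns about.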

Web browsers generally send a User-Agent string automatically; if you encounter the above error, please refer to your browser's manual to find out how to set the User-Agent string. Note that some plugins or proxies for privacy enhancement may suppress this header. For anonymous surfing, however, it is recommended to send a generic User-Agent string instead of suppressing it or sending an empty string. Note that other features are much more likely to identify you to a website; if you are interested in protecting your privacy, visit the Panopticlick project.