# See http://www.robotstxt.org/wc/norobots.html for documentation on how to use the robots.txt file
# Mainly to reduce server load from bots, we block pages that perform actions
# and search pages. We also block /feed/; RSS readers (rightly, I think) don't
# seem to check robots.txt, so they are unaffected.
# Note: Bing's crawler can be slowed down with:
# Crawl-delay: 1
# http://www.bing.com/community/blogs/webmaster/archive/2009/08/10/crawl-delay-and-the-bing-crawler-msnbot.aspx
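# A sketch of that option, kept commented out so it has no effect.
# Assumptions: the crawler matches the user-agent token "bingbot", and a
# crawler that finds a group naming it ignores the "User-agent: *" group,
# so the Disallow rules would have to be repeated in the new group:
#   User-agent: bingbot
#   Crawl-delay: 1
#   Disallow: */annotate/
#   (followed by the remaining Disallow lines from the "User-agent: *" group)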
# This file uses the non-standard extension characters * and $, which are supported by Google and Yahoo!
# http://code.google.com/web/controlcrawlindex/docs/robots_txt.html
# http://help.yahoo.com/l/us/yahoo/search/webcrawler/slurp-02.html
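# Illustration of those extensions, assuming Google-style matching where *
# matches any run of characters and $ anchors the end of the URL (the example
# URLs are made up):
#   Disallow: */body/*/view_email$
#     matches         /body/123/view_email
#     does not match  /body/123/view_email?page=2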
User-agent: *
Disallow: */annotate/
Disallow: */new/
Disallow: */search/
Disallow: */similar/
Disallow: */track/
Disallow: */upload/
Disallow: */user/contact/
Disallow: */feed/
Disallow: */profile/
Disallow: */signin
Disallow: */request/*/response/
Disallow: */body/*/view_email$
# The following rule was added in Jan 2012 to stop robots crawling pages
# generated in error (see
# https://github.com/mysociety/alaveteli/issues/311). It can be removed
# later in 2012 when the error pages have been dropped from the index.
Disallow: *.json.j*