From: Ian Lance Taylor <ian@airs.com>
To: overseers@sourceware.org
Subject: Re: [v-amwilc at microsoft period com: Robots.txt file restricting msnbot with crawl-delay at http://gcc.gnu.org/robots.txt]
Date: Fri, 18 Nov 2011 01:00:00 -0000
Message-ID: <m3pqgqb5y0.fsf@pepe.airs.com>
In-Reply-To: <20111117175629.GA3124@sourceware.org> (Chris Faylor's message of "Thu, 17 Nov 2011 17:56:29 +0000")
Chris Faylor <cgf-use-the-mailinglist-please@cgf.cx> writes:
> [Reply-To set to overseers]
> Should we change the Crawl-Delay ?
I think it would be fine to try.
Ian
> From: "Amy Wilcox (Murphy & Associates)" <v-amwilc at microsoft period com>
> Subject: Robots.txt file restricting msnbot with crawl-delay at http://gcc.gnu.org/robots.txt
> To: gcc
> Date: Wed, 16 Nov 2011 20:05:26 +0000
>
> Hi,
>
> I am contacting you from the Microsoft Corporation and its Internet search engine Bing (http://www.bing.com) in regards to your robots.txt file at http://gcc.gnu.org/robots.txt. Our customers have alerted us that some of your site content was not visible in our results. We have discovered that you are preventing us from crawling this content by the following crawl-delay settings in your robots.txt.
>
> User-agent: *
> Disallow: /viewcvs
> Disallow: /cgi-bin/
> Disallow: /bugzilla/buglist.cgi
> Crawl-Delay: 60
>
> Your current crawl-delay setting of 60 authorizes us to crawl around 1440 URLs per day (86,400 seconds per day / 60 crawl-delay ) which is not enough to guarantee that new URLs are crawled and indexed. Also this rate will not allow us to crawl older URLs to verify if they have been updated or if they are still available on your site.
>
> Since you have a large number of URLs on your site, we would be pleased if you remove the crawl delay settings in your robots.txt which additionally will increase traffic to your site via Bing and Yahoo search results. If you would like to use a slower or faster crawl rate at different times of the day our Bing Webmaster Tools will allow you to configure these settings (http://www.bing.com/community/site_blogs/b/webmaster/archive/2011/06/08/updates-to-bing-webmaster-tools-data-and-content.aspx) and also assist you further in obtaining the best results possible for your business or website (http://www.bing.com/toolbox/webmaster/ ).
>
> If you have further questions please let me know.
>
> Best regards,
>
> Amy Wilcox
> Web Analyst, Bing from Microsoft
> v-amwilc at microsoft period com
>
>
> ----------
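
For anyone weighing a new value, the arithmetic in the quoted message can be checked with a short sketch. It assumes a crawler that honors Crawl-Delay fetches at most one URL per delay interval, one request at a time, which is the model the message uses. The "msnbot" user-agent token is taken from the subject line, and the helper name max_fetches_per_day is only illustrative.

```python
from urllib.robotparser import RobotFileParser

SECONDS_PER_DAY = 86_400

# The rules quoted in the message above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /viewcvs
Disallow: /cgi-bin/
Disallow: /bugzilla/buglist.cgi
Crawl-Delay: 60
"""


def max_fetches_per_day(crawl_delay_seconds):
    """Upper bound on daily fetches, assuming one request per delay interval."""
    return SECONDS_PER_DAY // crawl_delay_seconds


parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# No msnbot-specific section exists, so the "User-agent: *" entry applies.
delay = parser.crawl_delay("msnbot")

print(delay, max_fetches_per_day(delay))  # 60 1440
```

Under the same assumption, a Crawl-Delay of 10 would allow up to 8640 fetches per day.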
Thread overview: 3+ messages
2011-11-17 17:56 Chris Faylor
2011-11-18 0:22 ` Jonathan Larmour
2011-11-18 1:00 ` Ian Lance Taylor [this message]