From: Ian Lance Taylor <iant@google.com>
To: Richard Guenther <richard.guenther@gmail.com>
Cc: Andrew Haley <aph@redhat.com>, Michael Matz <matz@suse.de>,
gcc-patches@gcc.gnu.org
Subject: Re: Don't let search bots look at buglist.cgi
Date: Tue, 17 May 2011 07:17:00 -0000
Message-ID: <BANLkTikwjjfsbuCacXdvd=5CyqYdEzzChA@mail.gmail.com>
In-Reply-To: <BANLkTikJP8i6i+55qCKfD4YhfMyhJNLigg@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 1131 bytes --]
On Mon, May 16, 2011 at 6:42 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
>>>
>>> httpd being in the top-10 always, fiddling with bugzilla URLs?
>>> (Note, I don't have access to gcc.gnu.org, I'm relaying info from multiple
>>> instances of discussion on #gcc and richi poking on it; that said, it
>>> still might not be web crawlers, that's right, but I'll happily accept
>>> _any_ load improvement on gcc.gnu.org, how unfounded they might seem)
I think that simply blocking buglist.cgi has dropped bugzilla off the
immediate radar.
It also seems to have lowered the load, although I'm not sure if we
are still keeping historical data.
> I for example see also
>
> 66.249.71.59 - - [16/May/2011:13:37:58 +0000] "GET
> /viewcvs?view=revision&revision=169814 HTTP/1.1" 200 1334 "-"
> "Mozilla/5.0 (compatible; Googlebot/2.1;
> +http://www.google.com/bot.html)" (35%) 2060117us
>
> and viewvc is certainly even worse (from an I/O perspective). I thought
> we blocked all bot traffic from the viewvc stuff ...
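This is only happening at the top level: the old rule, Disallow: /viewcvs/,
does not match URLs such as /viewcvs?view=revision&revision=169814, which
have no slash after "viewcvs". I committed this patch to fix this.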
Ian
[-- Attachment #2: foo.patch --]
[-- Type: text/x-patch, Size: 517 bytes --]
Index: robots.txt
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/robots.txt,v
retrieving revision 1.10
diff -u -r1.10 robots.txt
--- robots.txt 13 May 2011 17:09:11 -0000 1.10
+++ robots.txt 17 May 2011 05:19:11 -0000
@@ -2,8 +2,8 @@
# for information about the file format.
# Contact gcc@gcc.gnu.org for questions.
-User-Agent: *
-Disallow: /viewcvs/
+User-agent: *
+Disallow: /viewcvs
Disallow: /cgi-bin/
Disallow: /bugzilla/buglist.cgi
Crawl-Delay: 60
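To illustrate why dropping the trailing slash matters, here is a minimal
sketch using Python's standard-library urllib.robotparser, which treats each
Disallow value as a path prefix. The helper name is_allowed, the constant
names, and the nested path /viewcvs/trunk/ are assumptions made for this
example; the top-level URL is the one from the log line quoted above. Real
crawlers may match rules somewhat differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Rules before the patch (revision 1.10).
OLD_RULES = """\
User-Agent: *
Disallow: /viewcvs/
Disallow: /cgi-bin/
Disallow: /bugzilla/buglist.cgi
Crawl-Delay: 60
"""

# Rules after the patch: the trailing slash on /viewcvs is gone.
NEW_RULES = """\
User-agent: *
Disallow: /viewcvs
Disallow: /cgi-bin/
Disallow: /bugzilla/buglist.cgi
Crawl-Delay: 60
"""

# URL taken from the quoted access-log line (no slash after "viewcvs").
TOP_LEVEL = "/viewcvs?view=revision&revision=169814"
# Hypothetical URL below /viewcvs/, used only for comparison.
NESTED = "/viewcvs/trunk/"


def is_allowed(rules: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for label, rules in (("old", OLD_RULES), ("new", NEW_RULES)):
        for url in (TOP_LEVEL, NESTED):
            verdict = "allowed" if is_allowed(rules, url) else "blocked"
            print(f"{label} rules: {url} -> {verdict}")
```

Under this parser, the old rules block only the nested path and still allow
the top-level query URL, while the new rules block both, because "/viewcvs"
is a prefix of each.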