public inbox for gcc-patches@gcc.gnu.org
From: Ian Lance Taylor <iant@google.com>
To: Richard Guenther <richard.guenther@gmail.com>
Cc: Andrew Haley <aph@redhat.com>, Michael Matz <matz@suse.de>,
	       gcc-patches@gcc.gnu.org
Subject: Re: Don't let search bots look at buglist.cgi
Date: Tue, 17 May 2011 07:17:00 -0000
Message-ID: <BANLkTikwjjfsbuCacXdvd=5CyqYdEzzChA@mail.gmail.com>
In-Reply-To: <BANLkTikJP8i6i+55qCKfD4YhfMyhJNLigg@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 1131 bytes --]

On Mon, May 16, 2011 at 6:42 AM, Richard Guenther
<richard.guenther@gmail.com> wrote:
>>>
>>> httpd being in the top-10 always, fiddling with bugzilla URLs?
>>> (Note, I don't have access to gcc.gnu.org, I'm relaying info from multiple
>>> instances of discussion on #gcc and richi poking on it; that said, it
>>> still might not be web crawlers, that's right, but I'll happily accept
>>> _any_ load improvement on gcc.gnu.org, how unfounded they might seem)

I think that simply blocking buglist.cgi has dropped bugzilla off the
immediate radar.  It also seems to have lowered the load, although I'm
not sure if we are still keeping historical data.


> I for example see also
>
> 66.249.71.59 - - [16/May/2011:13:37:58 +0000] "GET
> /viewcvs?view=revision&revision=169814 HTTP/1.1" 200 1334 "-"
> "Mozilla/5.0 (compatible; Googlebot/2.1;
> +http://www.google.com/bot.html)" (35%) 2060117us
>
> and viewvc is certainly even worse (from an I/O perspective).  I thought
> we blocked all bot traffic from the viewvc stuff ...

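This is only happening at top level: a URL such as /viewcvs?view=...
does not start with /viewcvs/, so the old "Disallow: /viewcvs/" rule
never matched it.  I committed this patch, which drops the trailing
slash, to fix this.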

Ian

[-- Attachment #2: foo.patch --]
[-- Type: text/x-patch, Size: 517 bytes --]

Index: robots.txt
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/robots.txt,v
retrieving revision 1.10
diff -u -r1.10 robots.txt
--- robots.txt	13 May 2011 17:09:11 -0000	1.10
+++ robots.txt	17 May 2011 05:19:11 -0000
@@ -2,8 +2,8 @@
 # for information about the file format.
 # Contact gcc@gcc.gnu.org for questions.
 
-User-Agent: *
-Disallow: /viewcvs/
+User-agent: *
+Disallow: /viewcvs
 Disallow: /cgi-bin/
 Disallow: /bugzilla/buglist.cgi
 Crawl-Delay: 60
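
To illustrate why dropping the trailing slash matters, here is a minimal sketch in Python. It assumes that Disallow rules are matched as plain path prefixes, which is the basic behaviour of the robots.txt format; real crawlers may also handle wildcards, Allow lines and other extensions, and this is not how any particular bot is implemented. The helper name is_disallowed is hypothetical, and the example path is the one from the log line quoted in the message.

```python
def is_disallowed(url_path, disallow_rules):
    """Return True if url_path starts with any non-empty Disallow prefix.

    Simplified model: plain prefix matching only, no wildcards or Allow rules.
    """
    return any(url_path.startswith(rule) for rule in disallow_rules if rule)


# Disallow lines before and after the patch (revision 1.10 and its successor).
old_rules = ["/viewcvs/", "/cgi-bin/", "/bugzilla/buglist.cgi"]
new_rules = ["/viewcvs", "/cgi-bin/", "/bugzilla/buglist.cgi"]

# Top-level request taken from the Googlebot log line quoted above.
top_level = "/viewcvs?view=revision&revision=169814"
# Hypothetical request below the top level, for comparison.
nested = "/viewcvs/trunk/"

for label, rules in (("old", old_rules), ("new", new_rules)):
    print(label, "top-level blocked:", is_disallowed(top_level, rules))
    print(label, "nested blocked:", is_disallowed(nested, rules))
```

Under this model the old rule set blocks only paths below /viewcvs/, while the new one also covers the top-level query URL.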
