From: Ian Lance Taylor
To: Richard Guenther
Cc: overseers@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: Don't let search bots look at buglist.cgi
Date: Mon, 16 May 2011 23:13:00 -0000
In-Reply-To: (Richard Guenther's message of "Mon, 16 May 2011 11:45:42 +0200")

Richard Guenther writes:

> On Fri, May 13, 2011 at 7:14 PM, Ian Lance Taylor wrote:
>> I noticed that buglist.cgi was taking quite a bit of CPU time.  I looked
>> at some of the long running instances, and they were coming from
>> searchbots.  I can't think of a good reason for this, so I have
>> committed this patch to the gcc.gnu.org robots.txt file to not let
>> searchbots search through lists of bugs.  I plan to make a similar
>> change on the sourceware.org and cygwin.com sides.  Please let me know
>> if this seems like a mistake.
>>
>> Does anybody have any experience with
>> http://code.google.com/p/bugzilla-sitemap/ ?  That might be a slightly
>> better approach.
>
> Shouldn't we keep searchbots away from bugzilla completely?  Searchbots
> can crawl the gcc-bugs mailing list archives.

I don't see anything wrong with crawling bugzilla, though, and the
resulting links should be better.

Ian
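
For anyone who wants to sanity-check the distinction drawn above (block the bug lists, still let bots crawl individual bugs), here is a minimal sketch using Python's standard urllib.robotparser. The committed patch is not quoted in this thread, so the rule text, the /bugzilla/ path prefix, the bot name, and the query strings below are all assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: the real gcc.gnu.org robots.txt patch is not shown in
# this thread, and the /bugzilla/ prefix is an assumption.
RULES = """\
User-agent: *
Disallow: /bugzilla/buglist.cgi
"""


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


parser = build_parser(RULES)
base = "http://gcc.gnu.org"

# Bug list queries (the expensive CGI) should be off limits to crawlers.
list_allowed = parser.can_fetch(
    "ExampleBot", base + "/bugzilla/buglist.cgi?quicksearch=ice"
)

# Individual bug pages remain crawlable under these rules.
bug_allowed = parser.can_fetch(
    "ExampleBot", base + "/bugzilla/show_bug.cgi?id=12345"
)

print("buglist.cgi allowed:", list_allowed)
print("show_bug.cgi allowed:", bug_allowed)
```

Since robots.txt rules are prefix matches, a single Disallow line for buglist.cgi would cover every query string on that script while leaving the rest of bugzilla alone.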