From: Ian Lance Taylor
To: Richard Guenther
Cc: overseers@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: Don't let search bots look at buglist.cgi
Date: Mon, 16 May 2011 20:35:00 -0000
In-Reply-To: (Richard Guenther's message of "Mon, 16 May 2011 11:45:42 +0200")
Richard Guenther writes:

> On Fri, May 13, 2011 at 7:14 PM, Ian Lance Taylor wrote:
>> I noticed that buglist.cgi was taking quite a bit of CPU time.  I looked
>> at some of the long running instances, and they were coming from
>> searchbots.  I can't think of a good reason for this, so I have
>> committed this patch to the gcc.gnu.org robots.txt file to not let
>> searchbots search through lists of bugs.  I plan to make a similar
>> change on the sourceware.org and cygwin.com sides.  Please let me know
>> if this seems like a mistake.
>>
>> Does anybody have any experience with
>> http://code.google.com/p/bugzilla-sitemap/ ?  That might be a slightly
>> better approach.
>
> Shouldn't we keep searchbots away from bugzilla completely?  Searchbots
> can crawl the gcc-bugs mailinglist archives.

I don't see anything wrong with crawling bugzilla, though, and the
resulting links should be better.

Ian
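
For illustration only, here is a minimal sketch of the distinction discussed
above: a robots.txt rule that keeps crawlers out of buglist.cgi while still
letting them fetch individual bug pages.  The patch itself is not quoted in
this thread, so the rule text, the /bugzilla/ path prefix, the show_bug.cgi
URL, and the "ExampleBot" agent name are all assumptions.  The check uses
Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (assumption: the real patch is not shown
# in this thread, and the /bugzilla/ prefix is a guess at the layout).
ROBOTS_TXT = """\
User-agent: *
Disallow: /bugzilla/buglist.cgi
"""


def allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Bug list queries (the expensive pages) versus a single bug page.
    print(allowed("/bugzilla/buglist.cgi?query=foo"))
    print(allowed("/bugzilla/show_bug.cgi?id=1"))
```

Disallow rules match by path prefix, so a rule of this shape covers every
buglist.cgi query string and leaves the rest of bugzilla crawlable.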