Date: Mon, 16 May 2011 21:20:00 -0000
From: "Joseph S. Myers"
To: Ian Lance Taylor
cc: Richard Guenther, overseers@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Re: Don't let search bots look at buglist.cgi

On Mon, 16 May 2011, Ian Lance Taylor wrote:

> Richard Guenther writes:
>
> > On Fri, May 13, 2011 at 7:14 PM, Ian Lance Taylor wrote:
> >> I noticed that buglist.cgi was taking quite a bit of CPU time.
> >> I looked at some of the long running instances, and they were coming
> >> from searchbots.  I can't think of a good reason for this, so I have
> >> committed this patch to the gcc.gnu.org robots.txt file to not let
> >> searchbots search through lists of bugs.  I plan to make a similar
> >> change on the sourceware.org and cygwin.com sides.  Please let me know
> >> if this seems like a mistake.
> >>
> >> Does anybody have any experience with
> >> http://code.google.com/p/bugzilla-sitemap/ ?  That might be a slightly
> >> better approach.
> >
> > Shouldn't we keep searchbots away from bugzilla completely?  Searchbots
> > can crawl the gcc-bugs mailinglist archives.
>
> I don't see anything wrong with crawling bugzilla, though, and the
> resulting links should be better.

Indeed.  I think the individual bugs, and the GCC-specific help texts
(such as describekeywords.cgi and describecomponents.cgi), should be
indexed.

-- 
Joseph S. Myers
joseph@codesourcery.com
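
For illustration only, here is a minimal sketch of the behaviour discussed in this thread: buglist.cgi is blocked for crawlers while individual bugs and the help pages stay indexable. The actual robots.txt patch is not reproduced in this message, so the rule text, the /bugzilla/ path prefix, the show_bug.cgi page name, the "ExampleBot" user agent and the allowed() helper are all assumptions. The check uses Python's standard urllib.robotparser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules (not the committed patch): block only the bug-list
# search page, leaving every other Bugzilla page crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bugzilla/buglist.cgi
"""


def allowed(path, agent="ExampleBot"):
    """Return True if `agent` may fetch `path` under ROBOTS_TXT."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, "http://gcc.gnu.org" + path)


if __name__ == "__main__":
    for path in (
        "/bugzilla/buglist.cgi?query_format=advanced",
        "/bugzilla/show_bug.cgi?id=1",
        "/bugzilla/describekeywords.cgi",
    ):
        print(path, "allowed" if allowed(path) else "blocked")
```

Because robots.txt rules match by path prefix, a single Disallow line for buglist.cgi would also cover every query string appended to it, which is where the expensive searches come from.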