From: Ian Lance Taylor
To: overseers@gcc.gnu.org, gcc-patches@gcc.gnu.org
Subject: Don't let search bots look at buglist.cgi
Date: Fri, 13 May 2011 17:14:00 -0000
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux)
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="=-=-="
Mailing-List: contact overseers-help@sourceware.org; run by ezmlm
Sender: overseers-owner@sourceware.org
X-SW-Source: 2011-q2/txt/msg00066.txt.bz2

--=-=-=

I noticed that buglist.cgi was taking quite a bit of CPU time. I looked
at some of the long running instances, and they were coming from
searchbots. I can't think of a good reason for this, so I have
committed this patch to the gcc.gnu.org robots.txt file to not let
searchbots search through lists of bugs. I plan to make a similar
change on the sourceware.org and cygwin.com sides.

Please let me know if this seems like a mistake.

Does anybody have any experience with
http://code.google.com/p/bugzilla-sitemap/ ? That might be a slightly
better approach.

Ian

--=-=-=
Content-Type: text/x-diff
Content-Disposition: inline; filename=foo.patch
Content-Description: patch

Index: robots.txt
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/robots.txt,v
retrieving revision 1.9
diff -u -r1.9 robots.txt
--- robots.txt	22 Sep 2009 19:19:30 -0000	1.9
+++ robots.txt	13 May 2011 17:08:33 -0000
@@ -5,4 +5,5 @@
 User-Agent: *
 Disallow: /viewcvs/
 Disallow: /cgi-bin/
+Disallow: /bugzilla/buglist.cgi
 Crawl-Delay: 60
--=-=-=--
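
For anyone who wants to sanity-check the effect of the new rule, here is a minimal sketch using Python's standard urllib.robotparser. It is only an illustration, not part of the committed change. It assumes the rule group is just the five lines visible in the hunk above (the earlier lines of the real robots.txt are not shown in the diff and are left out), and the bot name, query string, and bug ID are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Only the lines visible in the patch hunk. The real robots.txt has
# earlier lines that the diff does not show; they are omitted here.
PATCHED_RULES = """\
User-Agent: *
Disallow: /viewcvs/
Disallow: /cgi-bin/
Disallow: /bugzilla/buglist.cgi
Crawl-Delay: 60
"""


def build_parser(rules: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser


parser = build_parser(PATCHED_RULES)

BASE = "http://gcc.gnu.org"
BOT = "ExampleBot"  # hypothetical crawler name, matched by "User-Agent: *"

# Hypothetical request paths: a bug list query and a single bug page.
for path in ("/bugzilla/buglist.cgi?quicksearch=foo",
             "/bugzilla/show_bug.cgi?id=12345"):
    verdict = "allowed" if parser.can_fetch(BOT, BASE + path) else "blocked"
    print(f"{path}: {verdict}")

print("crawl delay:", parser.crawl_delay(BOT))
```

Under these assumptions the bug list query is reported as blocked while the individual bug page stays crawlable, which is the intent of the patch: well-behaved crawlers can still index single bugs but should stop running list searches. Matching is by path prefix, so any query string on buglist.cgi is covered.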