From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (qmail 27694 invoked by alias); 11 Feb 2005 21:25:42 -0000
Mailing-List: contact overseers-help@sources.redhat.com; run by ezmlm
Precedence: bulk
List-Archive:
List-Post:
List-Help: ,
Sender: overseers-owner@sources.redhat.com
Received: (qmail 27590 invoked from network); 11 Feb 2005 21:25:37 -0000
Received: from unknown (HELO dair.pair.com) (209.68.1.49) by sourceware.org with SMTP; 11 Feb 2005 21:25:37 -0000
Received: (qmail 36864 invoked by uid 20157); 11 Feb 2005 21:25:36 -0000
Date: Sun, 13 Feb 2005 20:19:00 -0000
From: Hans-Peter Nilsson
X-X-Sender: hp@dair.pair.com
To: overseers@sourceware.org
Subject: Conf changes to htdig sourceware side
Message-ID:
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
X-SW-Source: 2005-q1/txt/msg00216.txt.bz2

Properly marking sourceware.org as the canonical name, with
sources.redhat.com and www.sourceware.org as aliases.  Committed.

This should avoid excluding stuff only referenced as
"sourceware.org" and also avoid http redirects for the parts
that are only reached by http (as opposed to files).

(Whether or not sourceware.org is actually supposed to be the
canonical name is here of less importance than what DNS thinks.)

Index: sourceware.conf
===================================================================
RCS file: /cvs/sourceware/infra/htdig-conf/sourceware.conf,v
retrieving revision 1.19
diff -p -c -u -p -r1.19 sourceware.conf
cvs diff: conflicting specifications of output style
--- sourceware.conf     20 Oct 2003 20:25:01 -0000      1.19
+++ sourceware.conf     11 Feb 2005 21:11:29 -0000
@@ -33,12 +33,13 @@ exclude_too_long_words: true
 #      start_url:             `${common_dir}/start.url`
 #
 # Keep the included file-path in sync with what's in htupdate-sourceware.sh.
-start_url:     http://sources.redhat.com/ \
+start_url:     http://sourceware.org/ \
                `${database_dir}/noindex-follow-urls`
 
 # The old hostname (left side) is here changed to the canonical hostname
 # (right side), to avoid a loop of redirects.
-server_aliases: sourceware.cygnus.com=sources.redhat.com
+server_aliases: sourceware.cygnus.com=sourceware.org \
+       sources.redhat.com=sourceware.org www.sourceware.org=sourceware.org
 
 #
 # This attribute limits the scope of the indexing process.  The default is to
@@ -52,8 +53,8 @@ server_aliases: sourceware.cygnus.com=s
 #
 # Unless we set "limit_normalized", we need this to include all hosts
 # that may be canonicalized into those we are interested in.
-limit_urls_to: ${start_url} http://sourceware.cygnus.com/
-
+limit_urls_to: ${start_url} http://sourceware.cygnus.com/ \
+       http://sources.redhat.com/ http://www.sourceware.org/
 
 #
 # If there are particular pages that you definately do NOT want to index, you
@@ -131,15 +132,15 @@ exclude_urls:  ${site__exclude_urls} ${r
 # This is parsed by generate-htdig-include-list.sh.  Make sure it stays
 # parseable: do not refer to variables and make sure trailing slashes are
 # in place.
-local_urls:    http://sources.redhat.com/ml/=/www/sourceware/ml/ \
-               http://sources.redhat.com/=/www/sourceware/htdocs/
+local_urls:    http://sourceware.org/ml/=/www/sourceware/ml/ \
+               http://sourceware.org/=/www/sourceware/htdocs/
 
 # Include one instance of the old base, sourceware.cygnus.com.  Quite a
 # few URLs contain it.  If you change this without indexing from scratch,
 # the tokens for the old parts will be mapped to the new parts.
-common_url_parts:      http:// http://sources.redhat.com/ml \
-       http://sources.redhat.com ftp://sources.redhat.com/pub\
-       http://sources.redhat.com/ml/cygwin/199 \
+common_url_parts:      http:// http://sourceware.org/ml \
+       http://sourceware.org ftp://sourceware.org/pub\
+       http://sourceware.org/ml/cygwin/199 \
        http://www. ftp:// ftp://ftp. /pub/ .html .png .jpg .jpeg \
        /index.html /index.htm .com/ .com mailto: \
        sourceware.cygnus.com

brgds, H-P
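
For readers unfamiliar with the alias mapping, here is a rough Python sketch of the effect the new server_aliases line is meant to have: any URL whose host is one of the aliases (left side) is rewritten to the canonical host (right side) before further comparison. This is only an illustration under that assumption, not ht://Dig's actual code; the names SERVER_ALIASES and canonicalize are made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table mirroring the new server_aliases line:
# alias host (left side) -> canonical host (right side).
SERVER_ALIASES = {
    "sourceware.cygnus.com": "sourceware.org",
    "sources.redhat.com": "sourceware.org",
    "www.sourceware.org": "sourceware.org",
}


def canonicalize(url: str) -> str:
    """Return url with its host replaced by the canonical name, if aliased.

    Illustration only: userinfo in the URL (user:pass@) is dropped, and
    hosts that are not in the table are left alone (lower-cased).
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    canonical = SERVER_ALIASES.get(host, host)
    # Preserve an explicit port if one was given.
    netloc = canonical if parts.port is None else f"{canonical}:{parts.port}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    for u in (
        "http://sources.redhat.com/ml/gcc/",
        "http://www.sourceware.org/",
        "http://sourceware.org/",
    ):
        print(u, "->", canonicalize(u))
```

With a mapping like this, links written against any of the three old names end up under the single sourceware.org prefix, which is why start_url and local_urls in the patch can list only the canonical host.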