public inbox for overseers@sourceware.org
From: Christopher Faylor <cgf-use-the-mailinglist-please@sourceware.org>
To: "Frank Ch. Eigler" <fche@redhat.com>,
	overseers@sourceware.org, Jonathan Larmour <jifl@jifvik.org>
Subject: Re: sourceware on search engines
Date: Thu, 11 Nov 2010 20:28:00 -0000
Message-ID: <20101111202737.GA26142@ednor.casa.cgf.cx>
In-Reply-To: <4CDC490A.8080901@jifvik.org>

On Thu, Nov 11, 2010 at 07:50:34PM +0000, Jonathan Larmour wrote:
>On 11/11/10 18:09, Frank Ch. Eigler wrote:
>> Hi -
>> 
>>> Frank> They generally are indexed (not included in robots.txt).  Can you give
>>> Frank> an example of what you see missing?
>>>
>>> I've seen this too.
>>> Almost any search that I would expect to hit on sourceware.org instead
>>> pulls up results from elsewhere, often cygwin.ru.
>> 
>> google has funny heuristics about which copy of a mirror to present for
>> any given query.
>
>It seems like this is indeed the origin of my belief too. If I look much
>later in search results I do eventually see sourceware aliases crop up.
>For example a google for "gdb cortex registers" does eventually show a
>result straight from sourceware, but only on p.10.
>
>>> E.g., search for "systemtap signedness roland" on Google.
>>> This shows cygwin.ru, nabble.com, but not sourceware.
>>> Now add "site:sourceware.org" -- I see no hits.
>> 
>> OTOH, sourceware.org/ml/* is indexed by google, and for other queries,
>> we get hits just fine.  So whatever the problem is, it's not as simple
>> as it being blocked.  It's more about freshness or crawling rate or
>> something.
>
>Perhaps it's the presence of all the site aliases? I vaguely recall that
>google lowers the rank of sites that have aliases pointing to the same
>pages - that's a ploy that people have done in the past to try and improve
>their search ranking. Maybe cygwin.com should only have links to the
>cygwin* mailing lists, ecos.sourceware.org to the ecos* lists, and so on?

And don't forget: sources.redhat.com.

http://google.com/search?q=systemtap+signedness+roland+site:sources.redhat.com

cgf

Thread overview: 7+ messages
2010-11-10  7:16 Jonathan Larmour
2010-11-10 14:21 ` Frank Ch. Eigler
2010-11-11 17:59   ` Tom Tromey
2010-11-11 18:06     ` Per Bothner
2010-11-11 18:09     ` Frank Ch. Eigler
2010-11-11 19:50       ` Jonathan Larmour
2010-11-11 20:28         ` Christopher Faylor [this message]
