public inbox for overseers@sourceware.org
* yet yet more htdig: gcc ml:s losing messages?
@ 2003-01-20  5:28 Hans-Peter Nilsson
  2003-01-20 14:41 ` Christopher Faylor
  0 siblings, 1 reply; 9+ messages in thread
From: Hans-Peter Nilsson @ 2003-01-20  5:28 UTC (permalink / raw)
  To: overseers

From the (again!) failed gcc htdig indexing:

Found 636 hits for words=print_operand_address, expected 61
Found 1128 hits for words=rms, expected 311
test-htdig-db: Calling "/sourceware/htdig/gcc/htsearch
 -c /sourceware/htdig/gcc/tmpdir/test_htdig_conf.29230 words\=bugreport 2>&1"
Expected hits 1207, found 170

It's what it looks like: a search for the word "bugreport"
returned 170 hits, but was expected to return 1207 hits.  On
2002-12-21, or actually, at the last successful index before
that due to the bug I mentioned before, things were like this:

Found 1610 hits for words=rms, expected 311
Found 1063 hits for words=print_operand_address, expected 61
Found 3290 hits for words=bugreport, expected 1207
Found 478 hits for words=suspicious, expected 73
Found 5 hits for words=benevolent, expected 1

This *may* of course be reality slapping me to make me pay
attention because I turned more than one knob at a time: conf
was changed before re-index to not index certain attachments in
order to trim the db.  The other word hit results look low too.
Note that test-htdig-db returned early for failure due to low
hits (now fixed).
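
The kind of check described above could be sketched roughly as follows. This is not the real test-htdig-db script; the function name `check_hits` is made up for illustration, and it assumes the "expected" numbers are lower bounds, which is what the passing lines above suggest. The point is only that every word gets checked, instead of returning at the first low count.

```python
# Sketch only: NOT the real test-htdig-db script.
# Assumption: "expected" is the minimum acceptable number of hits.
def check_hits(found, expected_min):
    """Return (word, found, expected) for every word below its minimum."""
    failures = []
    for word, minimum in expected_min.items():
        hits = found.get(word, 0)  # a word with no result counts as 0 hits
        if hits < minimum:
            failures.append((word, hits, minimum))
    return failures  # collected for all words, no early return


# Numbers taken from the failed indexing run quoted above.
expected_min = {"print_operand_address": 61, "rms": 311, "bugreport": 1207}
found = {"print_operand_address": 636, "rms": 1128, "bugreport": 170}

for word, hits, minimum in check_hits(found, expected_min):
    print(f"Expected hits {minimum}, found {hits} for words={word}")
```

With the numbers from the failed run, only "bugreport" is reported.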

So the question is: is there a way to find out whether the gcc
mailing list archives are "sane", as opposed to losing messages?

Judging from the very low hits on the word "bugreport" (used to
be in the ICE bug-report request message), gcc-bugs may be on a
particularly tough diet.

I'll undo the conf changes and re-index again to check...

brgds, H-P

* Re: yet yet more htdig: gcc ml:s losing messages?
  2003-01-20  5:28 yet yet more htdig: gcc ml:s losing messages? Hans-Peter Nilsson
@ 2003-01-20 14:41 ` Christopher Faylor
  2003-01-20 15:17   ` Christopher Faylor
  0 siblings, 1 reply; 9+ messages in thread
From: Christopher Faylor @ 2003-01-20 14:41 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: overseers

On Mon, Jan 20, 2003 at 12:28:08AM -0500, Hans-Peter Nilsson wrote:
>So the question is: is there a way to find out whether the gcc
>mailing list archives are "sane", as opposed to losing messages?

I can't think of any way.  Maybe do a google search (quick!) for the
same and then see if the two jibe?

cgf

* Re: yet yet more htdig: gcc ml:s losing messages?
  2003-01-20 14:41 ` Christopher Faylor
@ 2003-01-20 15:17   ` Christopher Faylor
  2003-01-20 21:12     ` Hans-Peter Nilsson
  0 siblings, 1 reply; 9+ messages in thread
From: Christopher Faylor @ 2003-01-20 15:17 UTC (permalink / raw)
  To: Hans-Peter Nilsson, overseers

On Mon, Jan 20, 2003 at 09:43:07AM -0500, Christopher Faylor wrote:
>On Mon, Jan 20, 2003 at 12:28:08AM -0500, Hans-Peter Nilsson wrote:
>>So the question is: is there a way to find out whether the gcc
>>mailing list archives are "sane", as opposed to losing messages?
>
>I can't think of any way.  Maybe do a google search (quick!) for the
>same and then see if the two jibe?

Btw, I keep meaning to say that I can copy the old db files from the old
machine, if that would help.  I naively thought they would just be
automatically regenerated so I hadn't been rsyncing them since they are
so huge.  However, if it is an issue, I can pull them over.

I've also reactivated your account on 209.249.29.67 if you want to log in
and look around.

cgf

* Re: yet yet more htdig: gcc ml:s losing messages?
  2003-01-20 15:17   ` Christopher Faylor
@ 2003-01-20 21:12     ` Hans-Peter Nilsson
  2003-01-21  7:53       ` Hans-Peter Nilsson
  0 siblings, 1 reply; 9+ messages in thread
From: Hans-Peter Nilsson @ 2003-01-20 21:12 UTC (permalink / raw)
  To: overseers

On Mon, 20 Jan 2003, Christopher Faylor wrote:

> On Mon, Jan 20, 2003 at 09:43:07AM -0500, Christopher Faylor wrote:
> >On Mon, Jan 20, 2003 at 12:28:08AM -0500, Hans-Peter Nilsson wrote:
> >>So the question is: is there a way to find out whether the gcc
> >>mailing list archives are "sane", as opposed to losing messages?
> >
> >I can't think of any way.  Maybe do a google search (quick!) for the
> >same and then see if the two jibe?

Jason ran the obvious(?) find (doh!) and the judgement is that
it's htdig that is fishy.

> Btw, I keep meaning to say that I can copy the old db files from the old
> machine, if that would help.

No, that will not help.

>  I naively thought they would just be
> automatically regenerated

Well, they *should* be regenerated.  I'm on it.

> I've also reactivated your account on 209.249.29.67 if you want to log in
> and look around.

Thanks.  That may help.

brgds, H-P

* Re: yet yet more htdig: gcc ml:s losing messages?
  2003-01-20 21:12     ` Hans-Peter Nilsson
@ 2003-01-21  7:53       ` Hans-Peter Nilsson
  2003-01-21  9:02         ` Jason Molenda
  2003-01-22  1:36         ` Christopher Faylor
  0 siblings, 2 replies; 9+ messages in thread
From: Hans-Peter Nilsson @ 2003-01-21  7:53 UTC (permalink / raw)
  To: overseers

On Mon, 20 Jan 2003, Hans-Peter Nilsson wrote:
> > >On Mon, Jan 20, 2003 at 12:28:08AM -0500, Hans-Peter Nilsson wrote:
> > >>So the question is: is there a way to find out whether the gcc
> > >>mailing list archives are "sane", as opposed to losing messages?
> Jason ran the obvious(?) find (doh!) and the judgement is that
> it's htdig that is fishy.

Looks like locale-related fish.  Trying yet another round,
setting LC_ALL=C.  It definitely affects "sort" (the utility)
results, though I can't trivially repeat the occurrences I saw
of

bugreport
bug_report
bugreport
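
A listing like the one above, where the same word turns up in two separate places, can be detected mechanically. The following is a minimal sketch; the helper name `split_runs` is hypothetical. It assumes that whatever consumes the sorted word list relies on identical words being adjacent, which is an assumption about the indexing pipeline and not something verified here.

```python
def split_runs(words):
    """Return the words that occur in more than one non-adjacent run."""
    seen = set()
    split = []
    previous = None
    for word in words:
        if word != previous:  # a new run starts here
            if word in seen and word not in split:
                split.append(word)  # this word already had an earlier run
            seen.add(word)
        previous = word
    return split


listing = ["bugreport", "bug_report", "bugreport"]
print(split_runs(listing))          # the listing quoted above
print(split_runs(sorted(listing)))  # plain code-point order, as in the C locale
```

The first call reports "bugreport" as split; after sorting by code point the identical entries are adjacent and nothing is reported.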

brgds, H-P

* Re: yet yet more htdig: gcc ml:s losing messages?
  2003-01-21  7:53       ` Hans-Peter Nilsson
@ 2003-01-21  9:02         ` Jason Molenda
  2003-01-21 15:42           ` Christopher Faylor
  2003-01-22  1:36         ` Christopher Faylor
  1 sibling, 1 reply; 9+ messages in thread
From: Jason Molenda @ 2003-01-21  9:02 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: overseers

On Tue, Jan 21, 2003 at 02:53:13AM -0500, Hans-Peter Nilsson wrote:

> 
> Looks like locale-related fish.  Trying yet another round,

Great news!  Much better than if it were SCSI-drive-related fish
like on the old system.  Those were some stinky fish.

:-)

Thanks for all the effort on htdig - I feel a tinge of guilt every
time I see you tweaking the sourceware side to fix problems when
you only really wanted to get that damned gcc search engine working
right. :-)

J

* Re: yet yet more htdig: gcc ml:s losing messages?
  2003-01-21  9:02         ` Jason Molenda
@ 2003-01-21 15:42           ` Christopher Faylor
  2003-01-21 16:55             ` Hans-Peter Nilsson
  0 siblings, 1 reply; 9+ messages in thread
From: Christopher Faylor @ 2003-01-21 15:42 UTC (permalink / raw)
  To: Jason Molenda; +Cc: Hans-Peter Nilsson, overseers

On Tue, Jan 21, 2003 at 01:02:29AM -0800, Jason Molenda wrote:
>Thanks for all the effort on htdig - I feel a tinge of guilt every
>time I see you tweaking the sourceware side to fix problems when
>you only really wanted to get that damned gcc search engine working
>right. :-)

Yeah, tell me about it.  I feel guilty that it isn't working, too.
I thought it was just going to work like everything else.  It sounds
like htdig is a temperamental beast.  We're lucky to have Hans-Peter
around to calm it down.

cgf

* Re: yet yet more htdig: gcc ml:s losing messages?
  2003-01-21 15:42           ` Christopher Faylor
@ 2003-01-21 16:55             ` Hans-Peter Nilsson
  0 siblings, 0 replies; 9+ messages in thread
From: Hans-Peter Nilsson @ 2003-01-21 16:55 UTC (permalink / raw)
  To: Christopher Faylor; +Cc: Jason Molenda, overseers

On Tue, 21 Jan 2003, Christopher Faylor wrote:
> On Tue, Jan 21, 2003 at 01:02:29AM -0800, Jason Molenda wrote:
> >Thanks for all the effort on htdig - I feel a tinge of guilt every
> >time I see you tweaking the sourceware side to fix problems when
> >you only really wanted to get that damned gcc search engine working
> >right. :-)
>
> Yeah, tell me about it.  I feel guilty that it isn't working, too.

Here's a thing to do: wherever in the system (presumably just
one global /etc/* something?) "LC_<whatever>: en_US" is set,
change it to "C".  The breakage is subtle, so it's quite
possible there are other things that almost work.

Actually htdig *does* that, but the result of a setlocale call
is not in effect for calls to "external" programs like
"sort".
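
To illustrate the point about external programs: a child process takes its locale from the environment variables it inherits, so the setting has to be made in the environment passed to the child. A minimal sketch, assuming a POSIX "sort" utility is on the PATH; the function name `sort_in_c_locale` is made up, and this is not how htdig itself invokes sort.

```python
import os
import subprocess


def sort_in_c_locale(lines):
    """Run the external sort(1) with LC_ALL=C set in the child's environment."""
    # Copy the current environment and force the C locale for the child only.
    env = dict(os.environ, LC_ALL="C")
    result = subprocess.run(
        ["sort"],
        input="\n".join(lines) + "\n",
        capture_output=True,
        text=True,
        env=env,
        check=True,
    )
    return result.stdout.splitlines()


print(sort_in_c_locale(["bugreport", "Bugreport", "bug_report"]))
```

In the C locale the comparison is by byte value, so uppercase sorts before lowercase and "_" sorts before "r".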

> I thought it was just going to work like everything else.  It sounds
> like htdig is a temperamental beast.  We're lucky to have Hans-Peter
> around to calm it down.

I should say thanks, but I don't like that thinking. ;-)

I don't like it (htdig-3.1.5) either, but it seems nothing else
can re-index (or update?) in even the same order of magnitude of
time, including later htdig releases. :-(

brgds, H-P

* Re: yet yet more htdig: gcc ml:s losing messages?
  2003-01-21  7:53       ` Hans-Peter Nilsson
  2003-01-21  9:02         ` Jason Molenda
@ 2003-01-22  1:36         ` Christopher Faylor
  1 sibling, 0 replies; 9+ messages in thread
From: Christopher Faylor @ 2003-01-22  1:36 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: overseers

On Tue, Jan 21, 2003 at 02:53:13AM -0500, Hans-Peter Nilsson wrote:
>On Mon, 20 Jan 2003, Hans-Peter Nilsson wrote:
>> > >On Mon, Jan 20, 2003 at 12:28:08AM -0500, Hans-Peter Nilsson wrote:
>> > >>So the question is: is there a way to find out whether the gcc
>> > >>mailing list archives are "sane", as opposed to losing messages?
>> Jason ran the obvious(?) find (doh!) and the judgement is that
>> it's htdig that is fishy.
>
>Looks like locale-related fish.  Trying yet another round,
>setting LC_ALL=C.  It definitely affects "sort" (the utility)
>results, though I can't trivially repeat the occurrences I saw
>of
>
>bugreport
>bug_report
>bugreport

I have set LANG=C in /etc/sysconfig/i18n.  I don't know if that will fix
things or not but from looking at the system startup and profile files
it just may.  I'd have to reboot the system to know for sure, though.
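
Whether LANG=C is enough depends on what else is set: under the POSIX rules LC_ALL takes precedence over the per-category variables (LC_COLLATE for sorting), which in turn take precedence over LANG. A small sketch of that lookup order; the function name is hypothetical, and the final fallback to "C" is an assumption, since the default when nothing is set is implementation-defined.

```python
def effective_collation_locale(env):
    """Pick the locale name that governs collation, per POSIX precedence."""
    for name in ("LC_ALL", "LC_COLLATE", "LANG"):
        value = env.get(name)
        if value:  # unset and empty values both fall through
            return value
    return "C"  # assumption: the implementation default is the C locale


print(effective_collation_locale({"LANG": "C"}))
print(effective_collation_locale({"LANG": "C", "LC_ALL": "en_US"}))
```

So LANG=C alone gives C collation, but an LC_ALL or LC_COLLATE setting elsewhere in the startup files would still override it.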

cgf
