htdig and sources.redhat.com loadavg

public inbox for overseers@sourceware.org

* htdig and sources.redhat.com loadavg
@ 2004-04-05 18:50 David Edelsohn
  2004-04-05 19:36 ` Hans-Peter Nilsson
  2004-04-05 20:51 ` Christopher Faylor
  0 siblings, 2 replies; 54+ messages in thread
From: David Edelsohn @ 2004-04-05 18:50 UTC (permalink / raw)
  To: overseers

	The load average on sources.redhat.com appears to be spiking very
high on Monday, according to reports from people who can log in:

<dnovillo>  15:56:05  up 45 days, 17:01,  1 user,  load average: 23.58, 23.54, 25.16
<echristo>  18:31:24  up 45 days, 19:36,  2 users,  load average: 37.94, 37.67, 35.41

htdig is a candidate for the root cause of the problem.  The heavy disk
traffic and decrease in performance from htdig is generating a backlog of
CVS operations -- 24 cvs processes in disk wait state, which further loads
the system.
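
A rough way to see which commands account for the processes stuck in
disk wait is to group ps output by process state.  This is only a
sketch: the name count_disk_wait is made up for illustration, and the
usage line assumes a procps-style ps that accepts "-o stat=" and
"-o comm=".

```shell
# Count processes in uninterruptible (disk) wait, grouped by command name.
# Input: one "STAT COMMAND" pair per line on stdin.
# count_disk_wait is an illustrative name, not an existing tool.
count_disk_wait() {
    awk '$1 ~ /^D/ { n[$2]++ } END { for (c in n) print n[c], c }' | sort -rn
}

# Typical use on a live system (assumes a procps-style ps):
#   ps -e -o stat= -o comm= | count_disk_wait
```

A large count next to cvs would be consistent with the backlog
described above, though it does not by itself show what caused it.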

	An informal poll of the 30+ people on the #gcc IRC channel did not
find anyone who felt htdig was useful relative to other search options.
Four channel members suggested disabling htdig, so I would like to propose
that to overseers.  The performance problem of sources.redhat.com on
Monday is impacting development.

David

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 18:50 htdig and sources.redhat.com loadavg David Edelsohn
@ 2004-04-05 19:36 ` Hans-Peter Nilsson
  2004-04-05 19:46   ` Phil Edwards
  2004-04-05 21:12   ` Hans-Peter Nilsson
  2004-04-05 20:51 ` Christopher Faylor
  1 sibling, 2 replies; 54+ messages in thread
From: Hans-Peter Nilsson @ 2004-04-05 19:36 UTC (permalink / raw)
  To: David Edelsohn; +Cc: overseers

On Mon, 5 Apr 2004, David Edelsohn wrote:
> 	The load average on sources.redhat.com appears to be spiking very
> high on Monday, according to reports from people who can log in:
>
> <dnovillo>  15:56:05  up 45 days, 17:01,  1 user,  load average: 23.58, 23.54, 25.16
> <echristo>  18:31:24  up 45 days, 19:36,  2 users,  load average: 37.94, 37.67, 35.41
>
> htdig is a candidate for the root cause of the problem.  The heavy disk
> traffic and decrease in performance from htdig is generating a backlog of
> CVS operations -- 24 cvs processes in disk wait state, which further loads
> the system.
>
> 	An informal poll of the 30+ people on the #gcc IRC channel did not
> find anyone who felt htdig was useful relative to other search options.
> Four channel members suggested disabling htdig, so I would like to propose
> that to overseers.  The performance problem of sources.redhat.com on
> Monday is impacting development.

Please be informed that htdig is used on the gcc side and
sourceware side (all other projects) separately.  I don't think
what's said on #gcc should be final (or perhaps should
matter at all) for the sourceware side.

I have no problem shutting down htdig for gcc permanently.

Though it seems it wasn't considered that there could be an
error situation that led to the high load from htdig.  I just
looked and there was; a DB file is over the 2G limit once again,
and the last two updates (done every other day for gcc htdig)
have failed.  Since the updates have been from scratch and the
script retries once in case of failure, the load from htdig has
been much higher than what is normal.  I've shut off gcc htdig
until I get to investigate whether it's fixable (once again).

(Everyone ok with shutting off gcc htdig permanently?)

brgds, H-P

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 19:36 ` Hans-Peter Nilsson
@ 2004-04-05 19:46   ` Phil Edwards
  2004-04-05 19:56     ` Frank Ch. Eigler
                       ` (2 more replies)
  2004-04-05 21:12   ` Hans-Peter Nilsson
  1 sibling, 3 replies; 54+ messages in thread
From: Phil Edwards @ 2004-04-05 19:46 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: David Edelsohn, overseers

On Mon, Apr 05, 2004 at 03:36:34PM -0400, Hans-Peter Nilsson wrote:
> looked and there was; a DB file is over the 2G limit once again,
> and the last two updates (done every other day for gcc htdig)
> have failed.

Can we stick

  # htdigpid stands in for the PID of the running htdig process
  if test -n "`find /wherever/htdig/lives -type f -size +2G`"; then
     echo 'Ah crap, out of space again.'
     kill "$htdigpid"
  fi

in a crontab and be done with this once and for all?


> (Everyone ok with shutting off gcc htdig permanently?)

Works for me.  Anything which can't scan for '++' doesn't really help
for a C++ library mailing list.  Dunno how the rest of gcc feels.  :-)


-- 
Behind everything some further thing is found, forever; thus the tree behind
the bird, stone beneath soil, the sun behind Urth.  Behind our efforts, let
there be found our efforts.
              - Ascian saying, as related by Loyal to the Group of Seventeen

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 19:46   ` Phil Edwards
@ 2004-04-05 19:56     ` Frank Ch. Eigler
  2004-04-05 20:03       ` Phil Edwards
  2004-04-05 20:36     ` Hans-Peter Nilsson
  2004-04-05 20:48     ` Benjamin Kosnik
  2 siblings, 1 reply; 54+ messages in thread
From: Frank Ch. Eigler @ 2004-04-05 19:56 UTC (permalink / raw)
  To: overseers

Hi -

> [...]
> > (Everyone ok with shutting off gcc htdig permanently?)
> 
> Works for me.  Anything which can't scan for '++' doesn't really help
> for a C++ library mailing list.  Dunno how the rest of gcc feels.  :-)

Is there some reason that gcc's htdig.conf doesn't include "+"
within its "extra_word_characters" configuration item?

- FChE

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 19:56     ` Frank Ch. Eigler
@ 2004-04-05 20:03       ` Phil Edwards
  0 siblings, 0 replies; 54+ messages in thread
From: Phil Edwards @ 2004-04-05 20:03 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: overseers

On Mon, Apr 05, 2004 at 03:56:31PM -0400, Frank Ch. Eigler wrote:
> Hi -
> 
> > [...]
> > > (Everyone ok with shutting off gcc htdig permanently?)
> > 
> > Works for me.  Anything which can't scan for '++' doesn't really help
> > for a C++ library mailing list.  Dunno how the rest of gcc feels.  :-)
> 
> Is there some reason that gcc's htdig.conf doesn't include "+"
> within its "extra_word_characters" configuration item?

I think it was tried once, and the database files blew up even faster.


-- 
Behind everything some further thing is found, forever; thus the tree behind
the bird, stone beneath soil, the sun behind Urth.  Behind our efforts, let
there be found our efforts.
              - Ascian saying, as related by Loyal to the Group of Seventeen

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 19:46   ` Phil Edwards
  2004-04-05 19:56     ` Frank Ch. Eigler
@ 2004-04-05 20:36     ` Hans-Peter Nilsson
  2004-04-05 21:15       ` Phil Edwards
  2004-04-05 20:48     ` Benjamin Kosnik
  2 siblings, 1 reply; 54+ messages in thread
From: Hans-Peter Nilsson @ 2004-04-05 20:36 UTC (permalink / raw)
  To: Phil Edwards; +Cc: David Edelsohn, overseers

On Mon, 5 Apr 2004, Phil Edwards wrote:
> On Mon, Apr 05, 2004 at 03:36:34PM -0400, Hans-Peter Nilsson wrote:
> > looked and there was; a DB file is over the 2G limit once again,
> > and the last two updates (done every other day for gcc htdig)
> > have failed.
>
> Can we stick
>
>   if test -n "`find /wherever/htdig/lives -type f -size +2G`"; then
>      echo 'Ah crap, out of space again.'
>      kill "$htdigpid"
>   fi
>
> in a crontab and be done with this once and for all?

No.  The limit isn't related to "out of diskspace"; there's
enough of that.  The limit is of indexing; 2G for files indexed
by a signed integer and no, it doesn't help using the cute
-D_FILE_OFFSET_BITS=64 (sp?).  Feel free to fix it if you care,
but please see older posts on this list before making further suggestions.
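
For reference, the ceiling described above is simply the largest value
a signed 32-bit integer can hold.  A one-line sketch in shell
arithmetic, assuming a shell with 64-bit arithmetic (as in current bash
and dash); max_signed_32 is an illustrative helper name:

```shell
# Largest file offset representable in a signed 32-bit integer: 2^31 - 1.
# max_signed_32 is an illustrative helper name.
max_signed_32() {
    echo $(( (1 << 31) - 1 ))
}

max_signed_32    # prints 2147483647, the same value as 0x7fffffff
```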

It's not like a local search engine is as needed now as it was
when htdig was first installed, and htdig-3.1.5 just can't cope
with the current amount of data.  A later version (3.2 series)
has been tried, but was prohibitively slow at indexing.  Maybe
mnogosearch is an option.  Sorry, I'm just out of time and
interest.

brgds, H-P

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 19:46   ` Phil Edwards
  2004-04-05 19:56     ` Frank Ch. Eigler
  2004-04-05 20:36     ` Hans-Peter Nilsson
@ 2004-04-05 20:48     ` Benjamin Kosnik
  2004-04-05 20:52       ` Ian Lance Taylor
  2 siblings, 1 reply; 54+ messages in thread
From: Benjamin Kosnik @ 2004-04-05 20:48 UTC (permalink / raw)
  To: Phil Edwards; +Cc: hp, dje, overseers


>> (Everyone ok with shutting off gcc htdig permanently?)
>
>Works for me.  Anything which can't scan for '++' doesn't really help
>for a C++ library mailing list.  Dunno how the rest of gcc feels.  :-)

If this was useful for other people, and didn't bring the system to its
knees, then I wouldn't care. As it is, I think htdig is solidly on the
negative side of things.

Personally, I think we should just recommend using google with

site:gcc.gnu.org


-benjamin

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 18:50 htdig and sources.redhat.com loadavg David Edelsohn
  2004-04-05 19:36 ` Hans-Peter Nilsson
@ 2004-04-05 20:51 ` Christopher Faylor
  2004-04-05 21:21   ` Matthew Galgoci
       [not found]   ` <cgf@alum.bu.edu>
  1 sibling, 2 replies; 54+ messages in thread
From: Christopher Faylor @ 2004-04-05 20:51 UTC (permalink / raw)
  To: David Edelsohn; +Cc: overseers

On Mon, Apr 05, 2004 at 02:49:50PM -0400, David Edelsohn wrote:
>An informal poll of the 30+ people on the #gcc IRC channel did not find
>anyone who felt htdig was useful relative to other search options.
>Four channel members suggested disabling htdig, so I would like to
>propose that to overseers.  The performance problem of
>sources.redhat.com on Monday is impacting development.

While I tend to agree that htdig is not a great search tool I don't
think we can just disable it without moving to something else.  Also,
since the system is shared by other projects like binutils and gdb,
polling the gcc community may not offer a representative sample.

I'd previously suggested using a google link for searches but that was
vetoed because google isn't free.

There are other alternative search engines out there but I don't know if
they would constitute less system load or not.  I think I recall
researching mnogosearch and concluding that it would be better but I
never got beyond an aborted attempt to generate the initial database.
Supposedly, it can then be configured to do incremental updates which
are relatively fast.

cgf

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 20:48     ` Benjamin Kosnik
@ 2004-04-05 20:52       ` Ian Lance Taylor
  2004-04-05 20:57         ` Zack Weinberg
  2004-04-08 21:18         ` Gerald Pfeifer
  0 siblings, 2 replies; 54+ messages in thread
From: Ian Lance Taylor @ 2004-04-05 20:52 UTC (permalink / raw)
  To: Benjamin Kosnik; +Cc: Phil Edwards, hp, dje, overseers

Benjamin Kosnik <bkoz@redhat.com> writes:

> >> (Everyone ok with shutting off gcc htdig permanently?)
> >
> >Works for me.  Anything which can't scan for '++' doesn't really help
> >for a C++ library mailing list.  Dunno how the rest of gcc feels.  :-)
> 
> If this was useful for other people, and didn't bring the system to its
> knees, then I wouldn't care. As it is, I think htdig is solidly on the
> negative side of things.
> 
> Personally, I think we should just recommend using google with
> 
> site:gcc.gnu.org

I find the feature of limiting the search to a particular date range
to be very helpful on the current pages.  It lets me track down the
e-mail message associated with a particular patch in a reasonable
fashion.  I don't know how to do that with Google.

That said, clearly htdig is problematic.  As others have said, we
probably need to use a new search engine.  However, that is going to
require somebody to volunteer to do it.

Ian

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 20:52       ` Ian Lance Taylor
@ 2004-04-05 20:57         ` Zack Weinberg
  2004-04-08 21:18         ` Gerald Pfeifer
  1 sibling, 0 replies; 54+ messages in thread
From: Zack Weinberg @ 2004-04-05 20:57 UTC (permalink / raw)
  To: phil; +Cc: hp, dje, overseers, bkoz

Ian Lance Taylor <ian@airs.com> writes:

> I find the feature of limiting the search to a particular date range
> to be very helpful on the current pages.  It lets me track down the
> e-mail message associated with a particular patch in a reasonable
> fashion.  I don't know how to do that with Google.

I have also found this feature helpful in the past (also, the ability
to limit to specific mailing lists); however, it still doesn't work
very well - the lack of any way to search for an exact string, for
instance, has always been a major hindrance.  I would not be upset
if the search facility was disabled until someone volunteers time
to hook up something better.  And perhaps disabling it would flush a
volunteer out of the woodwork. ;-)

zw

* Re: htdig and sources.redhat.com loadavg
       [not found]   ` <cgf@alum.bu.edu>
@ 2004-04-05 21:03     ` David Edelsohn
  2004-04-05 21:08       ` Ian Lance Taylor
  2004-04-06 14:49     ` htdig and sources.redhat.com loadavg David Edelsohn
                       ` (3 subsequent siblings)
  4 siblings, 1 reply; 54+ messages in thread
From: David Edelsohn @ 2004-04-05 21:03 UTC (permalink / raw)
  To: overseers

	The consensus on #gcc seems to be that if htdig causes problems
and has little benefit, and Google is not an option because it is not
free, then it is better to disable htdig until we have a better solution. 

	Also, what is package-grep?  It is gobbling up a lot of cpu time
as well.

Thanks, David

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 21:03     ` David Edelsohn
@ 2004-04-05 21:08       ` Ian Lance Taylor
       [not found]         ` <ian@airs.com>
  0 siblings, 1 reply; 54+ messages in thread
From: Ian Lance Taylor @ 2004-04-05 21:08 UTC (permalink / raw)
  To: David Edelsohn; +Cc: overseers

David Edelsohn <dje@watson.ibm.com> writes:

> 	Also, what is package-grep?  It is gobbling up a lot of cpu time
> as well.

It's a cygwin thing.  You get it by typing something in the "Search
Package List" box on this page:
    http://www.cygwin.com/packages/

I don't know why we are being hit so heavily.  It seems to be coming
from different IP addresses.
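
To check how the hits are spread across clients, the web server log
can be grouped by client address.  A sketch only: it assumes an access
log whose first field is the client address (as in Common Log Format)
and that the string "package-grep" appears in the logged request.
top_clients and the log path in the usage line are hypothetical.

```shell
# Count requests per client address for log lines containing a fixed string.
# Input: access-log lines on stdin with the client address as the first field.
# top_clients is an illustrative name.
top_clients() {
    grep -F -e "$1" | awk '{ n[$1]++ } END { for (ip in n) print n[ip], ip }' | sort -rn
}

# Example (the log path is hypothetical):
#   top_clients package-grep < /path/to/access_log | head
```

Many addresses with a handful of requests each would match the
"different IP addresses" impression; a few addresses with very high
counts would point at a crawler instead.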

Ian

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 19:36 ` Hans-Peter Nilsson
  2004-04-05 19:46   ` Phil Edwards
@ 2004-04-05 21:12   ` Hans-Peter Nilsson
  1 sibling, 0 replies; 54+ messages in thread
From: Hans-Peter Nilsson @ 2004-04-05 21:12 UTC (permalink / raw)
  To: overseers

On Mon, 5 Apr 2004, Hans-Peter Nilsson wrote:

> I've shut off gcc htdig
> until I get to investigate whether it's fixable (once again).

Clarification: I've shut off htdig *indexing* on the gcc side.
Searching should still work fine and include stuff up to
2004-03-31 08:17 UTC.

brgds, H-P

* Re: htdig and sources.redhat.com loadavg
       [not found]         ` <ian@airs.com>
@ 2004-04-05 21:14           ` David Edelsohn
  2004-04-05 22:51             ` Jason Molenda
  0 siblings, 1 reply; 54+ messages in thread
From: David Edelsohn @ 2004-04-05 21:14 UTC (permalink / raw)
  To: overseers

>>>>> Ian Lance Taylor writes:

Ian> I don't know why we are being hit so heavily.  It seems to be coming
Ian> from different IP addresses.

	I am guessing that htdig pushed sourceware into a thrashing mode.
Because CVS doesn't time out or service requests sequentially, people are
just hanging around with long CVS operations while they do other things.
sourceware needs workload management.
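
A crude form of workload management is to have heavy batch jobs check
the load average before they start.  A sketch, assuming a Linux-style
/proc/loadavg whose first field is the 1-minute load average;
load_above and the threshold of 20 are illustrative, not existing
sourceware settings.

```shell
# Succeed (exit 0) if the 1-minute load average exceeds the threshold in $1.
# Input: the contents of /proc/loadavg on stdin.
# load_above is an illustrative name.
load_above() {
    awk -v max="$1" 'NR == 1 { high = ($1 + 0 > max + 0) } END { exit !high }'
}

# Example: defer a batch job while the machine is busy.
#   if load_above 20 < /proc/loadavg; then echo "load too high, deferring"; fi
```

This only keeps new batch work from piling on; it does nothing about
CVS operations that are already waiting.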

David

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 20:36     ` Hans-Peter Nilsson
@ 2004-04-05 21:15       ` Phil Edwards
  2004-04-05 21:23         ` Hans-Peter Nilsson
  0 siblings, 1 reply; 54+ messages in thread
From: Phil Edwards @ 2004-04-05 21:15 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: David Edelsohn, overseers

On Mon, Apr 05, 2004 at 04:36:36PM -0400, Hans-Peter Nilsson wrote:
> On Mon, 5 Apr 2004, Phil Edwards wrote:
> > Can we stick
> >
> >   if test -n "`find /wherever/htdig/lives -type f -size +2G`"; then
> 
> No.  The limit isn't related to "out of diskspace"; there's
> enough of that.  The limit is of indexing; 2G for files indexed

Ignore my echo command, then, and just look at the find.  :-)

-- 
Behind everything some further thing is found, forever; thus the tree behind
the bird, stone beneath soil, the sun behind Urth.  Behind our efforts, let
there be found our efforts.
              - Ascian saying, as related by Loyal to the Group of Seventeen

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 20:51 ` Christopher Faylor
@ 2004-04-05 21:21   ` Matthew Galgoci
  2004-04-05 23:36     ` Zack Weinberg
       [not found]   ` <cgf@alum.bu.edu>
  1 sibling, 1 reply; 54+ messages in thread
From: Matthew Galgoci @ 2004-04-05 21:21 UTC (permalink / raw)
  To: Christopher Faylor; +Cc: David Edelsohn, overseers

On Mon, 5 Apr 2004, Christopher Faylor wrote:

> On Mon, Apr 05, 2004 at 02:49:50PM -0400, David Edelsohn wrote:
> >An informal poll of the 30+ people on the #gcc IRC channel did not find
> >anyone who felt htdig was useful relative to other search options.
> >Four channel members suggested disabling htdig, so I would like to
> >propose that to overseers.  The performance problem of
> >sources.redhat.com on Monday is impacting development.
> 
> While I tend to agree that htdig is not a great search tool I don't
> think we can just disable it without moving to something else.  Also,
> since the system is shared by other projects like binutils and gdb,
> polling the gcc community may not offer a representative sample.
> 
> I'd previously suggested using a google link for searches but that was
> vetoed because google isn't free.
> 
> There are other alternative search engines out there but I don't know if
> they would constitute less system load or not.  I think I recall
> researching mnogosearch and concluding that it would be better but I
> never got beyond an aborted attempt to generate the initial database.
> Supposedly, it can then be configured to do incremental updates which
> are relatively fast.

Ideally the searching should be offloaded to another machine that is dedicated
to the purpose of indexing and running the search database. I'd be willing to set
up mnogosearch to replace htdig, but to do so on sources I think would be a bad
idea.


-- 
Matthew Galgoci
System Administrator and Sr. Manager of Ruminants
Red Hat, Inc
919.754.3700 x44155

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 21:15       ` Phil Edwards
@ 2004-04-05 21:23         ` Hans-Peter Nilsson
  2004-04-05 21:46           ` Phil Edwards
  0 siblings, 1 reply; 54+ messages in thread
From: Hans-Peter Nilsson @ 2004-04-05 21:23 UTC (permalink / raw)
  To: Phil Edwards; +Cc: overseers

On Mon, 5 Apr 2004, Phil Edwards wrote:
> On Mon, Apr 05, 2004 at 04:36:36PM -0400, Hans-Peter Nilsson wrote:
> > On Mon, 5 Apr 2004, Phil Edwards wrote:
> > > Can we stick
> > >
> > >   if test -n "`find /wherever/htdig/lives -type f -size +2G`"; then
> >
> > No.  The limit isn't related to "out of diskspace"; there's
> > enough of that.  The limit is of indexing; 2G for files indexed
>
> Ignore my echo command, then, and just look at the find.  :-)

I don't really understand that sentence or why you persist, but
anyway no, you can't win this game with a cute crontab entry,
except perhaps the change made that prepended an existing one
with "#".

brgds, H-P
PS. feel free to prove me wrong but please prove.

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 21:23         ` Hans-Peter Nilsson
@ 2004-04-05 21:46           ` Phil Edwards
  2004-04-05 22:11             ` Hans-Peter Nilsson
  0 siblings, 1 reply; 54+ messages in thread
From: Phil Edwards @ 2004-04-05 21:46 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: overseers

On Mon, Apr 05, 2004 at 05:23:00PM -0400, Hans-Peter Nilsson wrote:
> On Mon, 5 Apr 2004, Phil Edwards wrote:
> >
> > Ignore my echo command, then, and just look at the find.  :-)
> 
> I don't really understand that sentence or why you persist, but

Huh?  If the problem is that the database files are growing larger than
their limit of 2G -- and that seems to be the issue based on all the old
mail in my mailbox -- then we look for files that are getting too large:

  if test -n "`find /sourceware/htdig/gcc/db -type f -size +1887436k`"; then
     echo 'hey, the dbfiles are over 1.8GB, getting close, go investigate!'
  fi

I find it amazing that we can't at least *monitor* the indexing progress
with automated tools.  Or that nobody's even willing to try.


> anyway no, you can't win this game with a cute crontab entry,
> except perhaps the change made that prepended an existing one
> with "#".

Likewise, I can't parse that.


-- 
Behind everything some further thing is found, forever; thus the tree behind
the bird, stone beneath soil, the sun behind Urth.  Behind our efforts, let
there be found our efforts.
              - Ascian saying, as related by Loyal to the Group of Seventeen

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 21:46           ` Phil Edwards
@ 2004-04-05 22:11             ` Hans-Peter Nilsson
  2004-04-05 22:26               ` Phil Edwards
  0 siblings, 1 reply; 54+ messages in thread
From: Hans-Peter Nilsson @ 2004-04-05 22:11 UTC (permalink / raw)
  To: Phil Edwards; +Cc: overseers

On Mon, 5 Apr 2004, Phil Edwards wrote:
> Huh?  If the problem is that the database files are growing larger than
> their limit of 2G -- and that seems to be the issue based on all the old
> mail in my mailbox -- then we look for files that are getting too large:

No, you misunderstood.  There's no need to search around for
files that "grew too large".  Checking the known set of DB files
and seeing whether one of them is 0x7fffffff bytes long (but I
suggest comparing with a slightly lower number) would give an
improved indication that further attempts to re-index will also
fail so it's no use for the script to retry.  Feel free to
improve the htdig-update script that way if you think it would
improve the situation, but that's not the crontab thing you
suggested.
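
The check described above could look roughly like the following.  This
is a sketch under assumptions, not the actual htdig-update script
(which is not shown in this thread): db_near_limit, HTDIG_DB_LIMIT and
HTDIG_DB_MARGIN are made-up names, and the DB file paths in the usage
line are hypothetical.

```shell
# Limit for files indexed by a signed 32-bit offset: 0x7fffffff bytes.
# HTDIG_DB_LIMIT, HTDIG_DB_MARGIN and db_near_limit are illustrative names.
HTDIG_DB_LIMIT=${HTDIG_DB_LIMIT:-2147483647}
HTDIG_DB_MARGIN=${HTDIG_DB_MARGIN:-104857600}   # compare 100 MiB below the limit

# Succeed (exit 0) if any given DB file is at or above limit minus margin.
db_near_limit() {
    threshold=$((HTDIG_DB_LIMIT - HTDIG_DB_MARGIN))
    for f in "$@"; do
        [ -f "$f" ] || continue
        size=$(( $(wc -c < "$f") ))
        if [ "$size" -ge "$threshold" ]; then
            echo "$f: $size bytes, at or above threshold $threshold" >&2
            return 0
        fi
    done
    return 1
}

# In an update script the retry could then be skipped, e.g.:
#   if db_near_limit /path/to/db/*; then echo "not retrying"; exit 1; fi
```

Comparing against a value somewhat below 0x7fffffff follows the
suggestion above; the 100 MiB margin is an arbitrary choice.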

Still, nothing in that direction will help *now*; it would maybe
help against useless re-indexing attempts if some temporary
measure was taken that'd shrink the DB file to within 2G.  It's
just that I think I've already made all such reasonable
temporary measures in the past.

> > anyway no, you can't win this game with a cute crontab entry,
> > except perhaps the change made that prepended an existing one
> > with "#".
>
> Likewise, I can't parse that.

By prepending the existing htdig-update entry with "#", I
disabled it.  The point was that once htdig DB gets "over the
limit", it stays that way.  Re-indexing makes files (at least
temporarily) slightly larger than with a plain update.

brgds, H-P

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 22:11             ` Hans-Peter Nilsson
@ 2004-04-05 22:26               ` Phil Edwards
  0 siblings, 0 replies; 54+ messages in thread
From: Phil Edwards @ 2004-04-05 22:26 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: overseers

On Mon, Apr 05, 2004 at 06:11:28PM -0400, Hans-Peter Nilsson wrote:
> 
> By prepending the existing htdig-update entry with "#", I

I know what '#' does.  :-)  I don't have perms to read any of those files;
that's why I misunderstood.  Thanks for the explanation.


-- 
Behind everything some further thing is found, forever; thus the tree behind
the bird, stone beneath soil, the sun behind Urth.  Behind our efforts, let
there be found our efforts.
              - Ascian saying, as related by Loyal to the Group of Seventeen

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 21:14           ` David Edelsohn
@ 2004-04-05 22:51             ` Jason Molenda
  2004-04-05 23:39               ` GCC snapshot generation (was Re: htdig and sources.redhat.com loadavg) Zack Weinberg
  0 siblings, 1 reply; 54+ messages in thread
From: Jason Molenda @ 2004-04-05 22:51 UTC (permalink / raw)
  To: overseers

Hi all, sorry I was in meetings all afternoon - just catching
up.

On Mon, Apr 05, 2004 at 05:14:20PM -0400, David Edelsohn wrote:

> 	I am guessing that htdig pushed sourceware into a thrashing mode.
> Because CVS doesn't time out or service requests sequentially, people are
> just hanging around with long CVS operations while they do other things.
> sourceware needs workload management.

The schedule of crontabs is rather carefully considered.  The
service load on sourceware varies considerably by hour-of-the-day --
Monday mornings EST happen to be the highest load of the week.
Sunday is the lowest load.

The htdig index update jobs are scheduled so they finish before
the weekday morning load happens.  When something goes wrong - as
it did in this case - and htdig runs into the morning rush, the
system thrashes for quite a long time.
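
One way to keep a failed or slow index run from spilling into the
morning rush is to give the job a hard time budget and a low CPU
priority.  A minimal sketch, assuming the timeout and nice utilities
from a reasonably recent GNU coreutils are available; run_bounded, the
6h budget and the script path are illustrative and not part of the
existing crontab setup.

```shell
# Run a command at the lowest CPU priority and stop it once a time budget expires.
# run_bounded is an illustrative name; $1 is a duration understood by timeout(1).
run_bounded() {
    budget=$1
    shift
    timeout "$budget" nice -n 19 "$@"
}

# Example (the script path is hypothetical):
#   run_bounded 6h /path/to/htdig-update
```

timeout exits with status 124 when the budget is exceeded, so the
caller can tell a cut-off run from an ordinary failure.  Note that nice
only affects CPU scheduling; it does not by itself reduce disk traffic.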

Matthew Galgoci wrote:

> Ideally the searching should be offloaded to another machine that
> is dedicated to the purpose of indexing and running the search
> database. 

The reason for keeping the search engine on the main system is
that the search engine has direct access to the files; it doesn't
have to go through httpd.  NFS could be used to access the files
from a different system, but you're still introducing a slowdown
by not having local access.

I'm not trying to preclude such a change, I'm just pointing out
the thinking behind the current arrangement.

(well, and the fact that we only had one computer allocated
for the original sourceware system.)

No one has mentioned my favorite possibility:  Not archiving older
e-mail notes.  Or having multiple search archives, divided by time
period.  e.g. epoch - 2001.  2001 - 2003.  2004 - ...

I can't remember if there's a good reason to not do this.  It seems
like a good idea to me, with the obvious caveat that this complicates
the web search engine UI.
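
As an illustration of the split-by-period idea, the search front end
would need some rule that maps a message's year to the archive holding
it.  A sketch only: archive_for_year and the bucket names are made up,
and the boundaries are an assumption (2001 is placed in the middle
archive, since the periods listed above overlap at that year).

```shell
# Map a four-digit year to the name of the search archive that would hold it.
# Bucket names and boundaries are assumptions for illustration.
archive_for_year() {
    year=$1
    if [ "$year" -lt 2001 ]; then
        echo "pre-2001"
    elif [ "$year" -le 2003 ]; then
        echo "2001-2003"
    else
        echo "2004-onward"
    fi
}

# Example:
#   archive_for_year 2002    # prints 2001-2003
```

A search limited to a date range would then only need to consult the
archives whose periods overlap that range.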

J

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 21:21   ` Matthew Galgoci
@ 2004-04-05 23:36     ` Zack Weinberg
  2004-04-06  0:06       ` Matthew Galgoci
  0 siblings, 1 reply; 54+ messages in thread
From: Zack Weinberg @ 2004-04-05 23:36 UTC (permalink / raw)
  To: Matthew Galgoci; +Cc: Christopher Faylor, David Edelsohn, overseers

Matthew Galgoci <mgalgoci@redhat.com> writes:

> Ideally the searching should be offloaded to another machine that is
> dedicated to the purpose of indexing and running the search
> database. I'd be willing to set up mnogosearch to replace htdig, but
> to do so on sources I think would be a bad idea.

Interesting suggestion.  Is Red Hat willing to donate the hardware?

zw

* GCC snapshot generation (was Re: htdig and sources.redhat.com loadavg)
  2004-04-05 22:51             ` Jason Molenda
@ 2004-04-05 23:39               ` Zack Weinberg
  0 siblings, 0 replies; 54+ messages in thread
From: Zack Weinberg @ 2004-04-05 23:39 UTC (permalink / raw)
  To: overseers

Jason Molenda <jason-swarelist@molenda.com> writes:

> The schedule of crontabs is rather carefully considered.  The
> service load on sourceware varies considerably by hour-of-the-day --
> Monday mornings EST happen to be the highest load of the week.
> Sunday is the lowest load.
>
> The htdig index update jobs are scheduled so they finish before
> the weekday morning load happens.  When something goes wrong - as
> it did in this case - and htdig runs into the morning rush, the
> system thrashes for quite a long time.

One thing I was wondering this morning was whether GCC snapshot
generation should be moved to a lower-load time.  Sunday afternoon EST
for instance.  I don't have a good sense of how much it contributes to
the overall problem, but I've had 'cvs update' collide with its locks
on three successive Mondays now.

zw

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 23:36     ` Zack Weinberg
@ 2004-04-06  0:06       ` Matthew Galgoci
  2004-04-06  0:17         ` Matthew Galgoci
  2004-04-06  0:29         ` Zack Weinberg
  0 siblings, 2 replies; 54+ messages in thread
From: Matthew Galgoci @ 2004-04-06  0:06 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Christopher Faylor, David Edelsohn, overseers

On Mon, 5 Apr 2004, Zack Weinberg wrote:

> Matthew Galgoci <mgalgoci@redhat.com> writes:
> 
> > Ideally the searching should be offloaded to another machine that is
> > dedicated to the purpose of indexing and running the search
> > database. I'd be willing to set up mnogosearch to replace htdig, but
> > to do so on sources I think would be a bad idea.
> 
> Interesting suggestion.  Is Red Hat willing to donate the hardware?
> 

Getting hardware at a software company is a pain. I was wondering if we
could hit up IBM or HP and get one of those swanky Opteron 1U servers.

-- 
Matthew Galgoci
System Administrator and Sr. Manager of Ruminants
Red Hat, Inc
919.754.3700 x44155

* Re: htdig and sources.redhat.com loadavg
  2004-04-06  0:06       ` Matthew Galgoci
@ 2004-04-06  0:17         ` Matthew Galgoci
  2004-04-06  0:29         ` Zack Weinberg
  1 sibling, 0 replies; 54+ messages in thread
From: Matthew Galgoci @ 2004-04-06  0:17 UTC (permalink / raw)
  To: Zack Weinberg; +Cc: Christopher Faylor, David Edelsohn, overseers

On Mon, 5 Apr 2004, Matthew Galgoci wrote:

> On Mon, 5 Apr 2004, Zack Weinberg wrote:
> 
> > Matthew Galgoci <mgalgoci@redhat.com> writes:
> > 
> > > Ideally the searching should be offloaded to another machine that is
> > > dedicated to the purpose of indexing and running the search
> > > database. I'd be willing to set up mnogosearch to replace htdig, but
> > > to do so on sources I think would be a bad idea.
> > 
> > Interesting suggestion.  Is Red Hat willing to donate the hardware?
> > 
> 
> Getting hardware at a software company is a pain. I was wondering if maybe
> we could hit up maybe ibm or hp and get one of those swanky opteron 1u servers.

Mind you, I would also be willing to implement mnogosearch with mysql if we can
get a dedicated search and indexer box. I have used mnogosearch with good results
in the past.

-- 
Matthew Galgoci
System Administrator and Sr. Manager of Ruminants
Red Hat, Inc
919.754.3700 x44155

* Re: htdig and sources.redhat.com loadavg
  2004-04-06  0:06       ` Matthew Galgoci
  2004-04-06  0:17         ` Matthew Galgoci
@ 2004-04-06  0:29         ` Zack Weinberg
  1 sibling, 0 replies; 54+ messages in thread
From: Zack Weinberg @ 2004-04-06  0:29 UTC (permalink / raw)
  To: Matthew Galgoci; +Cc: Christopher Faylor, David Edelsohn, overseers

Matthew Galgoci <mgalgoci@redhat.com> writes:

> On Mon, 5 Apr 2004, Zack Weinberg wrote:
>
>> Matthew Galgoci <mgalgoci@redhat.com> writes:
>> 
>> > Ideally the searching should be offloaded to another machine that is
>> > dedicated to the purpose of indexing and running the search
>> > database. I'd be willing to set up mnogosearch to replace htdig, but
>> > to do so on sources I think would be a bad idea.
>> 
>> Interesting suggestion.  Is Red Hat willing to donate the hardware?
>> 
>
> Getting hardware at a software company is a pain. I was wondering if
> maybe we could hit up maybe ibm or hp and get one of those swanky
> opteron 1u servers.

Maybe, but I'm not the person to be asking...

zw

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
       [not found]   ` <cgf@alum.bu.edu>
  2004-04-05 21:03     ` David Edelsohn
@ 2004-04-06 14:49     ` David Edelsohn
  2004-04-06 16:18       ` Jonathan Larmour
  2004-04-06 17:40     ` David Edelsohn
                       ` (2 subsequent siblings)
  4 siblings, 1 reply; 54+ messages in thread
From: David Edelsohn @ 2004-04-06 14:49 UTC (permalink / raw)
  To: overseers

	We're not seeing a lot of improvement with sources.redhat.com
performance: 

<dnovillo> [dnovillo@sourceware ~]$ uptime
<dnovillo>  14:47:34  up 46 days, 15:52,  2 users,  load average: 21.51, 23.89, 26.11
<dnovillo>   PID USER     PRI  NI  SIZE  RSS SHARE STAT CPU %MEM   TIME CPU  COMMAND
<dnovillo> 24943 htdigid   24   0  118M  25M   568 D    66.8  1.2   1:38   0  htdig
<dnovillo>  3435 root      15   0  3740  268   176 S    14.4  0.0   0:48   0  httpd
<dnovillo>  2935 rsyncid   16   1  3936 3936   252 D N  12.6  0.1   0:33   0  rsync

David

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 14:49     ` htdig and sources.redhat.com loadavg David Edelsohn
@ 2004-04-06 16:18       ` Jonathan Larmour
  2004-04-06 16:25         ` David Edelsohn
                           ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Jonathan Larmour @ 2004-04-06 16:18 UTC (permalink / raw)
  To: David Edelsohn; +Cc: overseers

David Edelsohn wrote:
> 	We're not seeing a lot of improvement with sources.redhat.com
> performance: 
> 
> <dnovillo> [dnovillo@sourceware ~]$ uptime
> <dnovillo>  14:47:34  up 46 days, 15:52,  2 users,  load average: 21.51, 23.89, 26.11
> <dnovillo>   PID USER     PRI  NI  SIZE  RSS SHARE STAT CPU %MEM   TIME CPU  COMMAND
> <dnovillo> 24943 htdigid   24   0  118M  25M   568 D    66.8  1.2   1:38   0  htdig
> <dnovillo>  3435 root      15   0  3740  268   176 S    14.4  0.0   0:48   0  httpd
> <dnovillo>  2935 rsyncid   16   1  3936 3936   252 D N  12.6  0.1   0:33   0  rsync

 From a brief poke myself (and I'm no overseer) I'd hazard a guess it may 
be more to do with the 17 simultaneous cvs checkouts as well as 2 rsyncs 
and a couple of ftps. netstat also seems to be reporting a TCP SYN attack 
from tproxy1.NTCU.net (62 sockets in SYN_RECV state).

I don't know about the "supervise" thingy but I know xinetd has a 
"max_load" parameter that could be used to e.g. deny anonymous (not logged 
in) cvs over a certain load (since having 10 cvs operations complete two 
times is better than 20 cvs operations taking nearly forever).
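
A minimal sketch of the kind of load check being suggested, for illustration only. xinetd's "max_load" does the equivalent inside xinetd itself; the function name and the threshold of 20 below are assumptions, not anything configured on sourceware.

```shell
#!/bin/sh
# Hypothetical helper: exit 0 when the 1-minute load average exceeds a
# threshold. The /proc/loadavg contents are passed in as an argument so
# the check can be tried without touching a live system.
load_too_high() {
    loadavg_line=$1   # e.g. "37.94 37.67 35.41 3/412 24943"
    threshold=$2      # e.g. 20
    # awk does the floating-point comparison; plain sh arithmetic cannot.
    echo "$loadavg_line" | awk -v max="$threshold" '{ exit !($1 > max) }'
}
```

It could be used as: load_too_high "$(cat /proc/loadavg)" 20 && echo "refusing anonymous cvs"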

Jifl
-- 
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
 >>>>> Visit us in booth 2527 at the Embedded Systems Conference 2004 <<<<<
March 30 - April 1, San Francisco http://www.esconline.com/electronicaUSA/
--["No sense being pessimistic, it wouldn't work anyway"]-- Opinions==mine

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 16:18       ` Jonathan Larmour
@ 2004-04-06 16:25         ` David Edelsohn
  2004-04-06 16:34         ` Ian Lance Taylor
  2004-04-06 16:41         ` Ian Lance Taylor
  2 siblings, 0 replies; 54+ messages in thread
From: David Edelsohn @ 2004-04-06 16:25 UTC (permalink / raw)
  To: Jonathan Larmour; +Cc: overseers

>>>>> Jonathan Larmour writes:

Jon> From a brief poke myself (and I'm no overseer) I'd hazard a guess it may 
Jon> be more to do with the 17 simultaneous cvs checkouts as well as 2 rsyncs 
Jon> and a couple of ftps. netstat also seems to be reporting a TCP SYN attack 
Jon> from tproxy1.NTCU.net (62 sockets in SYN_RECV state).

Jon> I don't know about the "supervise" thingy but I know xinetd has a 
Jon> "max_load" parameter that could be used to e.g. deny anonymous (not logged 
Jon> in) cvs over a certain load (since having 10 cvs operations complete two 
Jon> times is better than 20 cvs operations taking nearly forever).

	cvs is now aborting due to load:

Fatal error, aborting.
load average of 40 is too high
cvs [checkout aborted]: authorization failed: server gcc.gnu.org rejected access

David

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 16:18       ` Jonathan Larmour
  2004-04-06 16:25         ` David Edelsohn
@ 2004-04-06 16:34         ` Ian Lance Taylor
  2004-04-06 16:39           ` Phil Edwards
  2004-04-07  2:58           ` Christopher Faylor
  2004-04-06 16:41         ` Ian Lance Taylor
  2 siblings, 2 replies; 54+ messages in thread
From: Ian Lance Taylor @ 2004-04-06 16:34 UTC (permalink / raw)
  To: Jonathan Larmour; +Cc: David Edelsohn, overseers

Jonathan Larmour <jifl@eCosCentric.com> writes:

>  From a brief poke myself (and I'm no overseer) I'd hazard a guess it
> may be more to do with the 17 simultaneous cvs checkouts as well as 2
> rsyncs and a couple of ftps. netstat also seems to be reporting a TCP
> SYN attack from tproxy1.NTCU.net (62 sockets in SYN_RECV state).
> 
> I don't know about the "supervise" thingy but I know xinetd has a
> "max_load" parameter that could be used to e.g. deny anonymous (not
> logged in) cvs over a certain load (since having 10 cvs operations
> complete two times is better than 20 cvs operations taking nearly
> forever).

We only permit 10 simultaneous anonymous CVS connections.  However,
there is no limit on the number of CVS operations performed via ssh,
and there are several hundred people with ssh access.

The number of connections from 211.76.240.245 is interesting.  I count
39 connections at the moment, all to port 80.  Looking at the HTTP
logs, though, I don't think it is a TCP_SYN attack.  I think somebody
is downloading the cygwin.com web site, including all the mailing list
messages.

Ian

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 16:34         ` Ian Lance Taylor
@ 2004-04-06 16:39           ` Phil Edwards
  2004-04-07  2:58           ` Christopher Faylor
  1 sibling, 0 replies; 54+ messages in thread
From: Phil Edwards @ 2004-04-06 16:39 UTC (permalink / raw)
  To: dje, overseers, jifl


Maybe we don't need a separate machine for search engines, but rather just
a separate machine for all things cygwin (package searching, ftp, etc).

-- 
Behind everything some further thing is found, forever; thus the tree behind
the bird, stone beneath soil, the sun behind Urth.  Behind our efforts, let
there be found our efforts.
              - Ascian saying, as related by Loyal to the Group of Seventeen

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 16:18       ` Jonathan Larmour
  2004-04-06 16:25         ` David Edelsohn
  2004-04-06 16:34         ` Ian Lance Taylor
@ 2004-04-06 16:41         ` Ian Lance Taylor
  2004-04-07  2:59           ` Christopher Faylor
  2 siblings, 1 reply; 54+ messages in thread
From: Ian Lance Taylor @ 2004-04-06 16:41 UTC (permalink / raw)
  To: Jonathan Larmour; +Cc: David Edelsohn, overseers

Jonathan Larmour <jifl@eCosCentric.com> writes:

>  From a brief poke myself (and I'm no overseer) I'd hazard a guess it
> may be more to do with the 17 simultaneous cvs checkouts as well as 2
> rsyncs and a couple of ftps. netstat also seems to be reporting a TCP
> SYN attack from tproxy1.NTCU.net (62 sockets in SYN_RECV state).

I ran this command on sourceware:

/sbin/iptables -A block -s 211.76.240.245 -i eth0 -j DROP

I'm no iptables expert, but that may block out connections from
tproxy1.ntcu.net.  Let's see if that helps any.

Ian

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
       [not found]   ` <cgf@alum.bu.edu>
  2004-04-05 21:03     ` David Edelsohn
  2004-04-06 14:49     ` htdig and sources.redhat.com loadavg David Edelsohn
@ 2004-04-06 17:40     ` David Edelsohn
  2004-04-06 18:00       ` Jonathan Larmour
  2004-04-08 14:48     ` gcc.gnu.org CVS meta-data corrupt? David Edelsohn
  2004-05-02 11:32     ` sourceware load problem again? David Edelsohn
  4 siblings, 1 reply; 54+ messages in thread
From: David Edelsohn @ 2004-04-06 17:40 UTC (permalink / raw)
  To: overseers

 2949 anoncvs   15   0  3868 1528    76 S    45.3  0.0   2:44   1 cvs
24943 htdigid   15   0  379M  79M   408 D     1.7  3.9  36:42   1 htdig

Is 379M/79M for htdig really a good use of sourceware resources relative
to buffer caches?

David

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 17:40     ` David Edelsohn
@ 2004-04-06 18:00       ` Jonathan Larmour
  2004-04-06 19:43         ` Hans-Peter Nilsson
  0 siblings, 1 reply; 54+ messages in thread
From: Jonathan Larmour @ 2004-04-06 18:00 UTC (permalink / raw)
  To: David Edelsohn; +Cc: overseers

David Edelsohn wrote:
>  2949 anoncvs   15   0  3868 1528    76 S    45.3  0.0   2:44   1 cvs
> 24943 htdigid   15   0  379M  79M   408 D     1.7  3.9  36:42   1 htdig
> 
> Is 379M/79M for htdig really a good use of sourceware resources relative
> to buffer caches?

vmstat says:
>    procs                      memory      swap          io     system      cpu
>  r  b  w   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id
>  0 29  0 645584  40872  24568 751668   11    0  3544   305 1043  1898 16 10 75

750+24MB cache out of 2GB total seems alright (obviously some of the page
cache is exes, but most of those will be shared between identical
processes), and the swapping figures are low, so clearly it's the 79MB figure
that's relevant, which is nothing in comparison. sourceware is I/O bound in
ways that buffers/page cache are unlikely to help. Unsurprisingly, cvs
checkouts in particular are likely to chew a lot of cache.

Remember that the htdig problems of yesterday were to do with the gcc side 
of the server. The htdig you are looking at is the sourceware side which 
should have no real problem (other than the load due to other processes on 
the server). The problems of today seem more likely to me to be people 
updating the cvs checkouts they didn't get yesterday. I certainly don't 
want htdig disabled with no replacement for eCos, but perhaps someone could 
consider doing:

kill -STOP 24943 ; sleep 3600; kill -CONT 24943

to give the rest of the system an opportunity to catch up.
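
The one-liner above can be wrapped so that the process is resumed even if the person running it interrupts the sleep. This is only a sketch; the function name pause_for is made up for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper around the STOP / sleep / CONT sequence.
pause_for() {
    pid=$1
    seconds=$2
    kill -STOP "$pid" || return 1
    # Make sure the process is resumed if we are interrupted while sleeping.
    trap 'kill -CONT "$pid" 2>/dev/null' INT TERM
    sleep "$seconds"
    kill -CONT "$pid"
    status=$?
    trap - INT TERM
    return $status
}
```

Usage would be: pause_for 24943 3600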

It may be interesting to see what projects the anon cvs checkouts are for, 
as for example if they are gcc, then it may help for more to move to 
savannah's anon cvs server.

Jifl
-- 
eCosCentric    http://www.eCosCentric.com/    The eCos and RedBoot experts
 >>>>> Visit us in booth 2527 at the Embedded Systems Conference 2004 <<<<<
March 30 - April 1, San Francisco http://www.esconline.com/electronicaUSA/
--["No sense being pessimistic, it wouldn't work anyway"]-- Opinions==mine

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 18:00       ` Jonathan Larmour
@ 2004-04-06 19:43         ` Hans-Peter Nilsson
  2004-04-06 19:52           ` Ian Lance Taylor
  2004-04-06 19:52           ` Frank Ch. Eigler
  0 siblings, 2 replies; 54+ messages in thread
From: Hans-Peter Nilsson @ 2004-04-06 19:43 UTC (permalink / raw)
  To: Jonathan Larmour; +Cc: David Edelsohn, overseers

On Tue, 6 Apr 2004, Jonathan Larmour wrote:
> Remember that the htdig problems of yesterday were to do with the gcc side
> of the server. The htdig you are looking at is the sourceware side which
> should have no real problem (other than the load due to other processes on
> the server).

I agree with what you said except add that the same problem
(database files needing to grow larger than the access methods
can handle) will very soon be visible on the sourceware side
too.  At this moment, the db.wordlist database is 2065784280
bytes long.
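
To put a number on "very soon": assuming the limit is a signed 32-bit file offset (2^31 - 1 bytes; this is an assumption, the exact limit of the access methods is not stated here), the remaining headroom can be worked out as below. The function name is hypothetical.

```shell
#!/bin/sh
# Assumption: the largest file the access method can handle is
# 2^31 - 1 bytes (signed 32-bit offset).
headroom_bytes() {
    size=$1
    limit=2147483647
    echo $((limit - size))
}

headroom_bytes 2065784280   # prints 81699367, roughly 78 MiB
```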

(Hmm, I *could* pipe that through gzip, which may also
temporarily help with the gcc problem...  I'll try and look into
that later.  Plenty of CPU on sourceware, it's the disk I/O
that's the problem.  I guess (can't access
/proc/ide/hda/settings) that the IDE drives are still not using
DMA...)

> kill -STOP 24943 ; sleep 3600; kill -CONT 24943

The load is currently at 44.  I've stopped htdig.  I'll keep it
stopped for two hours and see if it helps.  Somebody please
remind me in 3-4 hours if you haven't heard from me.

brgds, H-P

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 19:43         ` Hans-Peter Nilsson
@ 2004-04-06 19:52           ` Ian Lance Taylor
  2004-04-06 19:52           ` Frank Ch. Eigler
  1 sibling, 0 replies; 54+ messages in thread
From: Ian Lance Taylor @ 2004-04-06 19:52 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: Jonathan Larmour, David Edelsohn, overseers

Hans-Peter Nilsson <hp@bitrange.com> writes:

> (Hmm, I *could* pipe that through gzip, which may also
> temporarily help with the gcc problem...  I'll try and look into
> that later.  Plenty of CPU on sourceware, it's the disk I/O
> that's the problem.  I guess (can't access
> /proc/ide/hda/settings) that the IDE drives are still not using
> DMA...)

bash> cat /proc/ide/hda/settings
name                    value           min             max             mode
----                    -----           ---             ---             ----
current_speed           0               0               70              rw
init_speed              0               0               70              rw
io_32bit                0               0               3               rw
keepsettings            0               0               1               rw
nice1                   0               0               1               rw
number                  0               0               3               rw
pio_mode                write-only      0               255             w
slow                    0               0               1               rw
unmaskirq               0               0               1               rw
using_dma               1               0               1               rw

Looks to me like DMA is turned on for that drive.

Ian

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 19:43         ` Hans-Peter Nilsson
  2004-04-06 19:52           ` Ian Lance Taylor
@ 2004-04-06 19:52           ` Frank Ch. Eigler
  2004-04-06 23:24             ` Hans-Peter Nilsson
  1 sibling, 1 reply; 54+ messages in thread
From: Frank Ch. Eigler @ 2004-04-06 19:52 UTC (permalink / raw)
  To: Hans-Peter Nilsson; +Cc: David Edelsohn, overseers

Hi -

H-P wrote:

> [...]  Plenty of CPU on sourceware, it's the disk I/O
> that's the problem.  I guess (can't access
> /proc/ide/hda/settings) that the IDE drives are still not using
> DMA...)

The only IDE device on sourceware is a CD-ROM.  BTW, we're
looking into doubling physical RAM on the machine to 4GB.

- FChE

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 19:52           ` Frank Ch. Eigler
@ 2004-04-06 23:24             ` Hans-Peter Nilsson
  0 siblings, 0 replies; 54+ messages in thread
From: Hans-Peter Nilsson @ 2004-04-06 23:24 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: overseers

On Tue, 6 Apr 2004, Frank Ch. Eigler wrote:
> H-P wrote:
>
> > [...]  Plenty of CPU on sourceware, it's the disk I/O
> > that's the problem.  I guess (can't access
> > /proc/ide/hda/settings) that the IDE drives are still not using
> > DMA...)
>
> The only IDE device on sourceware is a CD-ROM.

Oh right, I was thinking of the old system.

Load is 30 now.  I'll continue the htdig process anyway.

brgds, H-P

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 16:34         ` Ian Lance Taylor
  2004-04-06 16:39           ` Phil Edwards
@ 2004-04-07  2:58           ` Christopher Faylor
  1 sibling, 0 replies; 54+ messages in thread
From: Christopher Faylor @ 2004-04-07  2:58 UTC (permalink / raw)
  To: overseers

On Tue, Apr 06, 2004 at 12:34:36PM -0400, Ian Lance Taylor wrote:
>Jonathan Larmour <jifl@eCosCentric.com> writes:
>
>>  From a brief poke myself (and I'm no overseer) I'd hazard a guess it
>> may be more to do with the 17 simultaneous cvs checkouts as well as 2
>> rsyncs and a couple of ftps. netstat also seems to be reporting a TCP
>> SYN attack from tproxy1.NTCU.net (62 sockets in SYN_RECV state).
>> 
>> I don't know about the "supervise" thingy but I know xinetd has a
>> "max_load" parameter that could be used to e.g. deny anonymous (not
>> logged in) cvs over a certain load (since having 10 cvs operations
>> complete two times is better than 20 cvs operations taking nearly
>> forever).
>
>We only permit 10 simultaneous anonymous CVS connections.  However,
>there is no limit on the number of CVS operations performed via ssh,
>and there are several hundred people with ssh access.
>
>The number of connections from 211.76.240.245 is interesting.  I count
>39 connections at the moment, all to port 80.  Looking at the HTTP
>logs, though, I don't think it is a TCP_SYN attack.  I think somebody
>is downloading the cygwin.com web site, including all the mailing list
>messages.

That's usually a sign of a spammer grabbing email addresses.  I've been
turning off access when I notice that.  I have a script
"/home/cgf/bin/wwwstat" which shows connections by IP address that
I run periodically, looking for this type of thing.
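
For anyone without access to that script, a rough equivalent can be sketched as follows. This is not the wwwstat script itself, just an illustration of counting connections per remote address from "netstat -tn" output; it handles IPv4 addresses only, and the function name is made up.

```shell
#!/bin/sh
# Reads `netstat -tn` output on stdin and prints "count ip" pairs,
# busiest remote address first. Column 5 is the foreign address:port.
connections_by_ip() {
    awk '$1 == "tcp" { split($5, a, ":"); count[a[1]]++ }
         END { for (ip in count) print count[ip], ip }' | sort -rn
}
```

Usage would be: netstat -tn | connections_by_ip | head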

cgf

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-06 16:41         ` Ian Lance Taylor
@ 2004-04-07  2:59           ` Christopher Faylor
  0 siblings, 0 replies; 54+ messages in thread
From: Christopher Faylor @ 2004-04-07  2:59 UTC (permalink / raw)
  To: dje, overseers, jifl

On Tue, Apr 06, 2004 at 12:40:44PM -0400, Ian Lance Taylor wrote:
>Jonathan Larmour <jifl@eCosCentric.com> writes:
>
>>  From a brief poke myself (and I'm no overseer) I'd hazard a guess it
>> may be more to do with the 17 simultaneous cvs checkouts as well as 2
>> rsyncs and a couple of ftps. netstat also seems to be reporting a TCP
>> SYN attack from tproxy1.NTCU.net (62 sockets in SYN_RECV state).
>
>I ran this command on sourceware:
>
>/sbin/iptables -A block -s 211.76.240.245 -i eth0 -j DROP
>
>I'm no iptables expert, but that may block out connections from
>tproxy1.ntcu.net.  Let's see if that helps any.

That should do it.

You can see from /etc/sysconfig/iptables that I've added this type of
thing for many other IP addresses as well.

Unfortunately, I don't think it helped.

cgf

^ permalink raw reply	[flat|nested] 54+ messages in thread

* gcc.gnu.org CVS meta-data corrupt?
@ 2004-04-08  4:04 David Edelsohn
  2004-04-08 13:20 ` Christopher Faylor
                   ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: David Edelsohn @ 2004-04-08  4:04 UTC (permalink / raw)
  To: overseers

	After yet another painful CVS update, I decided to watch an update
without the quiet flag and I saw something very interesting.  On the trunk
and on every branch, CVS is getting stuck in

src/gcc/testsuite/g++.old-deja/g++.bob

The directory has 18 files and just hangs for me in every copy that I have
checked out.

	I suspect that something is broken in the CVS meta-data or the
filesystem backing that directory.  This may be a large cause of the
slowdown if every single CVS update is timing out on that directory.

David

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: gcc.gnu.org CVS meta-data corrupt?
  2004-04-08  4:04 gcc.gnu.org CVS meta-data corrupt? David Edelsohn
@ 2004-04-08 13:20 ` Christopher Faylor
  2004-04-08 13:42 ` Frank Ch. Eigler
  2004-04-08 13:42 ` gcc.gnu.org CVS meta-data corrupt? Frank Ch. Eigler
  2 siblings, 0 replies; 54+ messages in thread
From: Christopher Faylor @ 2004-04-08 13:20 UTC (permalink / raw)
  To: David Edelsohn; +Cc: overseers

On Thu, Apr 08, 2004 at 12:04:36AM -0400, David Edelsohn wrote:
>After yet another painful CVS update, I decided to watch an update
>without the quiet flag and I saw something very interesting.  On the
>trunk and on every branch, CVS is getting stuck in
>
>src/gcc/testsuite/g++.old-deja/g++.bob
>
>The directory has 18 files and just hangs for me in every copy that I
>have checked out.
>
>I suspect that something is broken in the CVS meta-data or the
>filesystem backing that directory.  This may be a large cause of the
>slowdown if every single CVS update is timing out on that directory.

The fileattr data looks fine and copying the directory to /tmp seems to
work ok.

I wiped out the fileattr data and did a checkout on my own system.
There was no noticeable delay.  The load average is around 5 right now
so maybe this isn't a good test.

The next time I see a bunch of cvs processes hanging I will check to see
what they are waiting for to see if there is a common denominator.
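
One way such a check might look, as a sketch only: list the processes in uninterruptible sleep ("D" state) together with the kernel wait channel ps reports for them. The function name is hypothetical, and how useful the WCHAN column is depends on the kernel and ps version.

```shell
#!/bin/sh
# Reads `ps -eo pid,stat,wchan:32,comm` output on stdin and keeps only
# the rows whose state starts with D (uninterruptible sleep), skipping
# the header line.
blocked_processes() {
    awk 'NR > 1 && $2 ~ /^D/'
}
```

Usage would be: ps -eo pid,stat,wchan:32,comm | blocked_processes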

cgf

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: gcc.gnu.org CVS meta-data corrupt?
  2004-04-08  4:04 gcc.gnu.org CVS meta-data corrupt? David Edelsohn
  2004-04-08 13:20 ` Christopher Faylor
  2004-04-08 13:42 ` Frank Ch. Eigler
@ 2004-04-08 13:42 ` Frank Ch. Eigler
  2004-04-08 14:13   ` David Edelsohn
  2 siblings, 1 reply; 54+ messages in thread
From: Frank Ch. Eigler @ 2004-04-08 13:42 UTC (permalink / raw)
  To: David Edelsohn; +Cc: overseers

Hi -

dje wrote:

> [...]	I suspect that something is broken in the CVS meta-data or the
> filesystem backing that directory.  This may be a large cause of the
> slowdown if every single CVS update is timing out on that directory.

By the way, this particular problem did not occur yesterday -
routine du's and cvs lock-file searches proceeded unblocked,
so I doubt it was responsible for the slowdown lately.

- FChE

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: gcc.gnu.org CVS meta-data corrupt?
  2004-04-08  4:04 gcc.gnu.org CVS meta-data corrupt? David Edelsohn
  2004-04-08 13:20 ` Christopher Faylor
@ 2004-04-08 13:42 ` Frank Ch. Eigler
  2004-04-08 13:54   ` system rebooted (was Re: gcc.gnu.org CVS meta-data corrupt?) Christopher Faylor
  2004-04-08 13:42 ` gcc.gnu.org CVS meta-data corrupt? Frank Ch. Eigler
  2 siblings, 1 reply; 54+ messages in thread
From: Frank Ch. Eigler @ 2004-04-08 13:42 UTC (permalink / raw)
  To: David Edelsohn; +Cc: overseers

Hi -

> [...]
> 	I suspect that something is broken in the CVS meta-data or the
> filesystem backing that directory.  This may be a large cause of the
> slowdown if every single CVS update is timing out on that directory.

The same thing happens to ordinary UNIX file access in the area:
processes simply get stuck in "D" state.  The system logs are eerily
quiet on the subject.  I suspect the machine needs a reboot and an fsck.
Hmm, while poking around, it just froze. :-(

- FChE

^ permalink raw reply	[flat|nested] 54+ messages in thread

* system rebooted (was Re: gcc.gnu.org CVS meta-data corrupt?)
  2004-04-08 13:42 ` Frank Ch. Eigler
@ 2004-04-08 13:54   ` Christopher Faylor
  0 siblings, 0 replies; 54+ messages in thread
From: Christopher Faylor @ 2004-04-08 13:54 UTC (permalink / raw)
  To: overseers

On Thu, Apr 08, 2004 at 05:50:48AM -0400, Frank Ch. Eigler wrote:
>The system logs are eerily quiet on the subject.  I suspect the machine
>needs a reboot and an fsck.

# uptime
 13:53:17  up  2:08,  1 user,  load average: 1.92, 3.78, 3.77

Looks like somebody did just that.

cgf

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: gcc.gnu.org CVS meta-data corrupt?
  2004-04-08 13:42 ` gcc.gnu.org CVS meta-data corrupt? Frank Ch. Eigler
@ 2004-04-08 14:13   ` David Edelsohn
  2004-04-08 14:21     ` Christopher Faylor
  0 siblings, 1 reply; 54+ messages in thread
From: David Edelsohn @ 2004-04-08 14:13 UTC (permalink / raw)
  To: Frank Ch. Eigler; +Cc: overseers

>>>>> Frank Ch Eigler writes:

>> [...]	I suspect that something is broken in the CVS meta-data or the
>> filesystem backing that directory.  This may be a large cause of the
>> slowdown if every single CVS update is timing out on that directory.

Frank> By the way, this particular problem did not occur yesterday -
Frank> routine du's and cvs lock-file searches proceeded unblocked,
Frank> so I doubt it was responsible for the slowdown lately.

	CVS updates occur alphabetically.  I have been seeing long delays
between updating "gcc", especially the testsuite subdirectory, and
"libstdc++-v3".  Intervening directories are not modified frequently, so it
is hard to determine where the time is being spent without non-quiet mode
or the "-t" option.

	I had assumed the delay was due to the size of testsuite or libjava.
Manually updating those directories in non-quiet mode last night (with
gcc.gnu.org showing a load of 70) demonstrated that those large
directories are updated very rapidly.  Therefore, I believe that the long
delays were due to some CVS meta-data problem or disk problem which has
been lingering over the past few weeks.

	I do not have enough access to the system or historical
information to justify my hypothesis, but I thought my personal
observations might be helpful.

David

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: gcc.gnu.org CVS meta-data corrupt?
  2004-04-08 14:13   ` David Edelsohn
@ 2004-04-08 14:21     ` Christopher Faylor
  0 siblings, 0 replies; 54+ messages in thread
From: Christopher Faylor @ 2004-04-08 14:21 UTC (permalink / raw)
  To: David Edelsohn; +Cc: overseers

On Thu, Apr 08, 2004 at 10:13:40AM -0400, David Edelsohn wrote:
>I do not have enough access to the system or historical information to
>justify my hypothesis, but I thought my personal observations might be
>helpful.

Yes, any data like this helps.  Thanks.

Are you still seeing problems today, David?  The system seems to be
much improved after its reboot.

cgf

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: gcc.gnu.org CVS meta-data corrupt?
       [not found]   ` <cgf@alum.bu.edu>
                       ` (2 preceding siblings ...)
  2004-04-06 17:40     ` David Edelsohn
@ 2004-04-08 14:48     ` David Edelsohn
  2004-04-08 14:53       ` Frank Ch. Eigler
  2004-05-02 11:32     ` sourceware load problem again? David Edelsohn
  4 siblings, 1 reply; 54+ messages in thread
From: David Edelsohn @ 2004-04-08 14:48 UTC (permalink / raw)
  To: overseers

>>>>> Christopher Faylor writes:

Chris> Are you still seeing problems today, David?  The system seems to be
Chris> much improved after its reboot.

	CVS updates now appear to be much faster.

	Someone manually fsck'ed the filesystem with the latest reboot,
but the previous reboot showed a clean filesystem unmount avoiding an
automatic fsck?

	We probably never will know how the filesystem became corrupted.

Thanks, David

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: gcc.gnu.org CVS meta-data corrupt?
  2004-04-08 14:48     ` gcc.gnu.org CVS meta-data corrupt? David Edelsohn
@ 2004-04-08 14:53       ` Frank Ch. Eigler
  2004-04-08 15:18         ` Christopher Faylor
  0 siblings, 1 reply; 54+ messages in thread
From: Frank Ch. Eigler @ 2004-04-08 14:53 UTC (permalink / raw)
  To: David Edelsohn; +Cc: overseers

Hi -

dje wrote:

> [...]
> 	Someone manually fsck'ed the filesystem with the latest reboot,
> but the previous reboot showed a clean filesystem unmount avoiding an
> automatic fsck?

I don't know whether this morning a real fsck was done on the filesystems.
I touched /forcefsck so next time (after the memory upgrade?), it'll all
get checked for sure.

> 	We probably never will know how the filesystem became corrupted.

It could also have been a kernel data structure problem, with nothing
actually wrong on disk.


- FChE

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: gcc.gnu.org CVS meta-data corrupt?
  2004-04-08 14:53       ` Frank Ch. Eigler
@ 2004-04-08 15:18         ` Christopher Faylor
  0 siblings, 0 replies; 54+ messages in thread
From: Christopher Faylor @ 2004-04-08 15:18 UTC (permalink / raw)
  To: overseers; +Cc: David Edelsohn

On Thu, Apr 08, 2004 at 10:53:45AM -0400, Frank Ch. Eigler wrote:
>Hi -
>
>dje wrote:
>
>> [...]
>> 	Someone manually fsck'ed the filesystem with the latest reboot,
>> but the previous reboot showed a clean filesystem unmount avoiding an
>> automatic fsck?
>
>I don't know whether this morning a real fsck was done on the filesystems.
>I touched /forcefsck so next time (after the memory upgrade?), it'll all
>get checked for sure.

According to tune2fs, the last time a real fsck was performed on the
cvs filesystem was Tue May 13 22:06:19 2003.  That doesn't seem right
to me, though.  I thought Matt had done a check after that time.
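
For reference, the date can be pulled out of the "tune2fs -l" listing along these lines. This is a sketch; the function name is made up and the device path in the usage line is a placeholder, not the real cvs partition.

```shell
#!/bin/sh
# Reads `tune2fs -l <device>` output on stdin and prints the value of
# the "Last checked:" field with the label and padding stripped.
last_fsck() {
    sed -n 's/^Last checked:[[:space:]]*//p'
}
```

Usage would be: tune2fs -l /dev/sdXN | last_fsck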

The system was apparently down for three hours, too, so it seems like an
fsck might have been done.  It's odd that none of the partitions that I
sampled reflected that fact, though.

cgf

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: htdig and sources.redhat.com loadavg
  2004-04-05 20:52       ` Ian Lance Taylor
  2004-04-05 20:57         ` Zack Weinberg
@ 2004-04-08 21:18         ` Gerald Pfeifer
  1 sibling, 0 replies; 54+ messages in thread
From: Gerald Pfeifer @ 2004-04-08 21:18 UTC (permalink / raw)
  To: Ian Lance Taylor
  Cc: Benjamin Kosnik, Phil Edwards, Hans-Peter Nilsson, dje, overseers

On Mon, 5 Apr 2004, Ian Lance Taylor wrote:
> I find the feature of limiting the search to a particular date range
> to be very helpful on the current pages.  It lets me track down the
> e-mail message associated with a particular patch in a reasonable
> fashion.  I don't know how to do that with Google.

Seconded.  FWIW, I think Jason and H-P really did a great job setting
things up (and it's a pity that the software now doesn't scale).

Gerald

^ permalink raw reply	[flat|nested] 54+ messages in thread

* sourceware load problem again?
@ 2004-04-29 19:40 David Edelsohn
  2004-04-29 19:45 ` Christopher Faylor
  0 siblings, 1 reply; 54+ messages in thread
From: David Edelsohn @ 2004-04-29 19:40 UTC (permalink / raw)
  To: overseers

	Sourceware has become extremely sluggish again.  Does anyone know
the cause?

David

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: sourceware load problem again?
  2004-04-29 19:40 sourceware load problem again? David Edelsohn
@ 2004-04-29 19:45 ` Christopher Faylor
  0 siblings, 0 replies; 54+ messages in thread
From: Christopher Faylor @ 2004-04-29 19:45 UTC (permalink / raw)
  To: David Edelsohn; +Cc: overseers

On Thu, Apr 29, 2004 at 03:30:18PM -0400, David Edelsohn wrote:
>Sourceware has become extremely sluggish again.  Does anyone know the
>cause?

The load average is average for this time of day and the CVS access
seems acceptable here.

cgf

* Re: sourceware load problem again?
       [not found]   ` <cgf@alum.bu.edu>
                       ` (3 preceding siblings ...)
  2004-04-08 14:48     ` gcc.gnu.org CVS meta-data corrupt? David Edelsohn
@ 2004-05-02 11:32     ` David Edelsohn
  4 siblings, 0 replies; 54+ messages in thread
From: David Edelsohn @ 2004-05-02 11:32 UTC (permalink / raw)
  To: overseers

>>>>> Christopher Faylor writes:

Chris> The load average is average for this time of day and the CVS access
Chris> seems acceptable here.

	Okay, thanks.  Maybe it is some sort of network problem near my
Internet connection.

David

end of thread, other threads:[~2004-04-29 19:45 UTC | newest]

Thread overview: 54+ messages
2004-04-05 18:50 htdig and sources.redhat.com loadavg David Edelsohn
2004-04-05 19:36 ` Hans-Peter Nilsson
2004-04-05 19:46   ` Phil Edwards
2004-04-05 19:56     ` Frank Ch. Eigler
2004-04-05 20:03       ` Phil Edwards
2004-04-05 20:36     ` Hans-Peter Nilsson
2004-04-05 21:15       ` Phil Edwards
2004-04-05 21:23         ` Hans-Peter Nilsson
2004-04-05 21:46           ` Phil Edwards
2004-04-05 22:11             ` Hans-Peter Nilsson
2004-04-05 22:26               ` Phil Edwards
2004-04-05 20:48     ` Benjamin Kosnik
2004-04-05 20:52       ` Ian Lance Taylor
2004-04-05 20:57         ` Zack Weinberg
2004-04-08 21:18         ` Gerald Pfeifer
2004-04-05 21:12   ` Hans-Peter Nilsson
2004-04-05 20:51 ` Christopher Faylor
2004-04-05 21:21   ` Matthew Galgoci
2004-04-05 23:36     ` Zack Weinberg
2004-04-06  0:06       ` Matthew Galgoci
2004-04-06  0:17         ` Matthew Galgoci
2004-04-06  0:29         ` Zack Weinberg
     [not found]   ` <cgf@alum.bu.edu>
2004-04-05 21:03     ` David Edelsohn
2004-04-05 21:08       ` Ian Lance Taylor
     [not found]         ` <ian@airs.com>
2004-04-05 21:14           ` David Edelsohn
2004-04-05 22:51             ` Jason Molenda
2004-04-05 23:39               ` GCC snapshot generation (was Re: htdig and sources.redhat.com loadavg) Zack Weinberg
2004-04-06 14:49     ` htdig and sources.redhat.com loadavg David Edelsohn
2004-04-06 16:18       ` Jonathan Larmour
2004-04-06 16:25         ` David Edelsohn
2004-04-06 16:34         ` Ian Lance Taylor
2004-04-06 16:39           ` Phil Edwards
2004-04-07  2:58           ` Christopher Faylor
2004-04-06 16:41         ` Ian Lance Taylor
2004-04-07  2:59           ` Christopher Faylor
2004-04-06 17:40     ` David Edelsohn
2004-04-06 18:00       ` Jonathan Larmour
2004-04-06 19:43         ` Hans-Peter Nilsson
2004-04-06 19:52           ` Ian Lance Taylor
2004-04-06 19:52           ` Frank Ch. Eigler
2004-04-06 23:24             ` Hans-Peter Nilsson
2004-04-08 14:48     ` gcc.gnu.org CVS meta-data corrupt? David Edelsohn
2004-04-08 14:53       ` Frank Ch. Eigler
2004-04-08 15:18         ` Christopher Faylor
2004-05-02 11:32     ` sourceware load problem again? David Edelsohn
2004-04-08  4:04 gcc.gnu.org CVS meta-data corrupt? David Edelsohn
2004-04-08 13:20 ` Christopher Faylor
2004-04-08 13:42 ` Frank Ch. Eigler
2004-04-08 13:54   ` system rebooted (was Re: gcc.gnu.org CVS meta-data corrupt?) Christopher Faylor
2004-04-08 13:42 ` gcc.gnu.org CVS meta-data corrupt? Frank Ch. Eigler
2004-04-08 14:13   ` David Edelsohn
2004-04-08 14:21     ` Christopher Faylor
2004-04-29 19:40 sourceware load problem again? David Edelsohn
2004-04-29 19:45 ` Christopher Faylor
