public inbox for overseers@sourceware.org
* high load average -> slow builds
From: Jim Meyering @ 2013-01-15  8:45 UTC
  To: overseers

I've been getting warnings that my cvs-to-git processes
are hitting their locks several times per day: i.e., a prior
job is still running when the next cron-triggered job would have run.
That's not a problem in and of itself, but is indicative of another.
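
The lock check described above can be sketched roughly as follows. This is only an illustration of the general pattern, not the actual cvs-to-git script: the helper names `try_lock` and `run_job` and the lock path are hypothetical, and it assumes a Unix system where `fcntl.flock` is available.

```python
import fcntl


def try_lock(path):
    """Return a locked file handle, or None if another job holds the lock."""
    handle = open(path, "a")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting for the holder.
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        handle.close()
        return None
    return handle


def run_job(lock_path, job):
    """Run job() only if no earlier run is still holding the lock."""
    handle = try_lock(lock_path)
    if handle is None:
        return False  # prior run still active: skip this cron slot
    try:
        job()
        return True
    finally:
        handle.close()  # closing the file releases the lock
```

A skipped slot (a `False` return) is the point at which a wrapper would typically send the kind of warning mail mentioned above.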

Over the last week or so, the load average has consistently been
above 20, so it's no surprise that things take longer.

I will probably reduce the cvs-to-git crontab frequency just to
stop the nag mail.  But longer term, what's the schedule for the
hardware update?

FYI, here's a snapshot of top output:

top - 08:39:36 up 812 days, 17:27,  2 users,  load average: 22.23, 26.47, 25.77
Tasks: 318 total,  16 running, 299 sleeping,   0 stopped,   3 zombie
Cpu(s): 18.3% us,  7.7% sy, 43.6% ni, 15.7% id, 14.5% wa,  0.2% hi,  0.0% si
Mem:  16631336k total, 15213152k used,  1418184k free,   199780k buffers
Swap:  2096376k total,    22868k used,  2073508k free, 12319912k cached

  PID USER      PR  NI %CPU    TIME+  %MEM  VIRT  RES  SHR S COMMAND
13593 apache    33   8 67.9   0:07.56  0.4 73536  67m 1516 R python
 8219 bugzilla  25   0 56.0  26:17.24  3.1  506m 502m 3104 R email_in.pl
11343 apache    33   8 40.1   0:14.88  0.1 17440  11m 3072 R python
 3181 apache    24   8 20.2   0:26.09  0.1 18400  11m 3116 R python
 9252 nobody    39  19 16.9   0:04.37  0.2 91132  37m  16m S git
14388 apache    28   8 10.6   0:00.32  0.0  8656 5240 2020 R python
 5096 nobody    34  19  8.9   5:38.26  2.1  698m 340m  45m S git
16448 anoncvs   15   0  6.3   2:16.35  0.1 29916  22m 2204 D svnserve
14199 nobody    39  19  5.6   2:29.12  1.2  491m 198m  46m R git
23042 clamav    16   0  5.0   0:21.00  0.5 89240  75m 3780 S spamd
24027 root      18   0  5.0   0:08.12  0.0  3104  632  412 S sh
18369 nobody    35  19  4.3   0:24.01  0.6  258m  97m  52m R git
19618 nobody    39  19  3.3   0:26.54  0.5  155m  87m  40m R git
14969 meyering  35  19  3.0   1:48.54  0.2 33720  27m  832 R cvs
  567 root      15   0  1.7  13286:44  0.0     0    0    0 S md10_raid5
14724 htdigid   34  19  1.7   2:37.78  0.1 19076  11m 1848 S indexer
14075 meyering  16   0  1.0   0:00.09  0.0  3716 1144  780 R top
14397 clamav    17   0  1.0   0:00.03  0.1 13556 9896  912 S qpsmtpd-forkser
24026 root      15   0  1.0   0:00.80  0.0  2652  512  416 D find
 2345 apache    23   8  0.7   0:00.22  0.0 25800 7896 1872 S httpd
 3825 root      15   0  0.7   1348:44  0.0     0    0    0 S kjournald
 9641 root      15   0  0.7 281:10.73  0.0  3140  644  544 S syslogd
14859 meyering  34  19  0.7   0:17.02  0.3 60072  56m  560 R cvsps
18584 ftp       15   0  0.7   1:03.52  0.0  6680 2324 1220 S proftpd
22318 apache    23   8  0.7   0:00.48  0.1 26948 9084 2628 S httpd
24817 dberlin   26  10  0.7   0:05.59  0.2 48632  40m 2320 D svn
31895 root      34  19  0.7   1:34.42  0.0  3160  632  384 R updatedb
   79 root      15   0  0.3   3431:53  0.0     0    0    0 S kswapd0
  573 root      15   0  0.3   6:40.87  0.0     0    0    0 S md4_raid1
 3828 root      15   0  0.3 491:51.42  0.0     0    0    0 S kjournald
 7003 apache    23   8  0.3   0:02.93  0.0 25896 8164 1928 S httpd
 8114 nobody    15   0  0.3   7:35.53  0.0  6312 1924 1212 S proftpd
 9583 nscd      16   0  0.3 668:54.44  0.0  135m 2836 1916 S nscd
11865 dnscache  15   0  0.3   3906:14  0.3 50624  48m  232 S dnscache
12855 apache    23   8  0.3   0:00.01  0.0 25084 7392 1820 S httpd
14900 meyering  35  19  0.3   0:23.21  0.0  5848 1124  936 R cvs
19775 apache    23   8  0.3   0:00.23  0.0 25188 7640 1932 D httpd
22589 apache    23   8  0.3   0:00.31  0.0 25384 7832 1920 D httpd
25270 apache    23   8  0.3   0:00.11  0.1 26396 8780 1940 S httpd
28140 apache    24   8  0.3   0:00.33  0.0 25868 8068 1872 D httpd
28269 apache    23   8  0.3   0:00.13  0.0 25172 7556 1872 D httpd
31920 apache    23   8  0.3   0:00.35  0.0 25884 8080 1924 S httpd
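
A listing like this is easier to read when the %CPU column is summed per user. The sketch below does that for a few rows copied from the snapshot; the function name `cpu_by_user` is hypothetical, and it assumes the twelve-column layout shown in the header line above.

```python
from collections import defaultdict

# A few rows copied from the snapshot above (columns as in the top header).
SAMPLE = """\
13593 apache    33   8 67.9   0:07.56  0.4 73536  67m 1516 R python
 8219 bugzilla  25   0 56.0  26:17.24  3.1  506m 502m 3104 R email_in.pl
11343 apache    33   8 40.1   0:14.88  0.1 17440  11m 3072 R python
 9252 nobody    39  19 16.9   0:04.37  0.2 91132  37m  16m S git
"""


def cpu_by_user(top_rows):
    """Sum the %CPU column (fifth field) per USER (second field)."""
    totals = defaultdict(float)
    for line in top_rows.splitlines():
        fields = line.split()
        if len(fields) < 12 or not fields[0].isdigit():
            continue  # skip headers and blank lines
        totals[fields[1]] += float(fields[4])
    return dict(totals)


# Print users from highest to lowest summed %CPU.
for user, cpu in sorted(cpu_by_user(SAMPLE).items(), key=lambda kv: -kv[1]):
    print(f"{user:10s} {cpu:6.1f}")
```

Feeding it the full process table instead of `SAMPLE` would give the per-user totals for the whole snapshot.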

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: high load average -> slow builds
  2013-01-15  8:45 high load average -> slow builds Jim Meyering
@ 2013-01-15 18:25 ` Jonathan Larmour
  2013-01-15 18:31   ` Jonathan Larmour
  2013-01-15 21:18 ` Frank Ch. Eigler
  1 sibling, 1 reply; 5+ messages in thread
From: Jonathan Larmour @ 2013-01-15 18:25 UTC (permalink / raw)
  To: Jim Meyering; +Cc: overseers

On 15/01/13 08:45, Jim Meyering wrote:
> 
> top - 08:39:36 up 812 days, 17:27,  2 users,  load average: 22.23, 26.47, 25.77
> Tasks: 318 total,  16 running, 299 sleeping,   0 stopped,   3 zombie
> Cpu(s): 18.3% us,  7.7% sy, 43.6% ni, 15.7% id, 14.5% wa,  0.2% hi,  0.0% si
> Mem:  16631336k total, 15213152k used,  1418184k free,   199780k buffers
> Swap:  2096376k total,    22868k used,  2073508k free, 12319912k cached
> 
>   PID USER      PR  NI %CPU    TIME+  %MEM  VIRT  RES  SHR S COMMAND
> 13593 apache    33   8 67.9   0:07.56  0.4 73536  67m 1516 R python
>  8219 bugzilla  25   0 56.0  26:17.24  3.1  506m 502m 3104 R email_in.pl

That email_in.pl is odd. There's one running now, occupying a full CPU, and
it has been running for 45 minutes and counting. It comes from this:

bugzilla 25689 25675 25689  0    1 17:31 ?        00:00:00 /qmail/bin/preline sh -c /www/gcc/bugzilla/email_in.pl -v 2>>/sourceware/bugzilla/email_in.log || exit 99

Presumably started directly from qmail. I tried stracing the perl script
instance itself, but whatever it's doing is all user space.

It seems too much of a coincidence that it's there at the top when you
looked and when I looked.

The log at this time includes:

Comments cannot be longer than 65535 characters.

which seems fair enough as a one-off, except this log message seems to get
triggered a lot - about once an hour, which doesn't seem right.
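
One way to check a rate like "about once an hour" is to bucket matching log entries by clock hour. The sketch below assumes the entries have already been turned into (timestamp, message) pairs; how to get timestamps out of email_in.log is not shown in this thread, so that step is left out, and the name `hits_per_hour` is hypothetical.

```python
from collections import Counter
from datetime import datetime


def hits_per_hour(events, needle):
    """Count events whose message contains needle, bucketed by clock hour."""
    counts = Counter()
    for when, message in events:
        if needle in message:
            # Truncate the timestamp to the start of its hour.
            counts[when.replace(minute=0, second=0, microsecond=0)] += 1
    return counts


# Hypothetical timestamps, for illustration only; not taken from the real log.
events = [
    (datetime(2013, 1, 15, 16, 5), "Comments cannot be longer than 65535 characters."),
    (datetime(2013, 1, 15, 17, 7), "Comments cannot be longer than 65535 characters."),
    (datetime(2013, 1, 15, 17, 40), "some other message"),
]
for hour, count in sorted(hits_per_hour(events, "Comments cannot be longer").items()):
    print(hour.isoformat(), count)
```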

Also:
SENDER=gcc-bugzilla@gcc.gnu.org
which implies bugzilla was the one sending the mail.

Has the script got stuck, perhaps sending ever larger email in a loop? Or
is someone abusing it?

It's hard for me to see which email it's dealing with. Perhaps
someone more familiar with bugzilla can work out where (or if) it spools
the message temporarily.

Jifl
-- 
--["No sense being pessimistic, it wouldn't work anyway"]-- Opinions==mine

* Re: high load average -> slow builds
From: Jonathan Larmour @ 2013-01-15 18:31 UTC
  To: Jim Meyering; +Cc: overseers

On 15/01/13 18:25, Jonathan Larmour wrote:
> 
> Also:
> SENDER=gcc-bugzilla@gcc.gnu.org
> which implies bugzilla was the one sending the mail.
> 
> Has the script got stuck, perhaps sending ever larger email in a loop? Or
> is someone abusing it?

Doh, I missed this obvious process in the 'ps' output:

bugzilla 25675  0.0  0.0  1356  284 ?        S    17:31   0:00 bin/qmail-local -- bugzilla /sourceware/bugzilla bugzilla-gcc - gcc sourceware.org gcc-bugzilla@gcc.gnu.org |bouncesaying "No forwarding information (#5.2.1)"

So it looks a lot like someone (probably, inadvertently, a spammer) has
fabricated a mail which has caused gcc bugzilla's email interface to get
stuck in a loop of ever larger mails.

Whoever's managing the gcc bugzilla needs to a) work out how to kill it
off so it won't just restart, and b) fix (or report) this flaw in the
bugzilla email interface.
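
As a rough idea of what a guard against this kind of loop could look like, the sketch below rejects a message before processing if it already carries a Delivered-To header for the receiving address, has passed through too many hops, or has an oversized body. This is a generic illustration, not how Bugzilla's email_in.pl or qmail actually behaves: the function name `should_reject`, the hop limit, and the example address are all assumptions; only the 65535 figure comes from the log message quoted earlier in the thread.

```python
from email import message_from_string

MAX_HOPS = 25           # hypothetical limit on the number of Received: headers
MAX_BODY_CHARS = 65535  # comment limit quoted from the log earlier in the thread


def should_reject(raw_message, own_address):
    """Return a reason string if the message looks like a loop, else None."""
    msg = message_from_string(raw_message)
    delivered = [v.strip().lower() for v in msg.get_all("Delivered-To", [])]
    if own_address.lower() in delivered:
        return "already delivered to this address"
    if len(msg.get_all("Received", [])) > MAX_HOPS:
        return "too many hops"
    body = msg.get_payload()
    # Multipart messages return a list here; only plain string bodies are checked.
    if isinstance(body, str) and len(body) > MAX_BODY_CHARS:
        return "body too large"
    return None
```

Checking before any reply is generated matters here, since a reply sent back into the same address is what would keep such a loop growing.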

Jifl
-- 
--["No sense being pessimistic, it wouldn't work anyway"]-- Opinions==mine

* Re: high load average -> slow builds
From: Daniel Berlin @ 2013-01-15 18:36 UTC
  To: Jonathan Larmour, LpSolit; +Cc: Jim Meyering, overseers

+Frédéric Buclin

Note, however, that he is blocked from upgrading bugzilla in general
because the installed versions of various dependencies (python, perl,
etc.) are too old.


* Re: high load average -> slow builds
From: Frank Ch. Eigler @ 2013-01-15 21:18 UTC
  To: Jim Meyering; +Cc: overseers

Hi -

On Tue, Jan 15, 2013 at 09:45:07AM +0100, Jim Meyering wrote:

> [...] But longer term, what's the schedule for the hardware update?

As soon as we have the time to do the job right; sorry, no time fixed.

> [...]
>  8219 bugzilla  25   0 56.0  26:17.24  3.1  506m 502m 3104 R email_in.pl
> [...]

These should be gone.  (There was another mailing-loop in effect that
was corrected with some .qmail-FOO magic.)

- FChE
