public inbox for overseers@sourceware.org
* Multiple cvs/git mirror tasks?
@ 2012-07-13 15:21 Jason Merrill
  2012-07-13 15:43 ` Jim Meyering
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Merrill @ 2012-07-13 15:21 UTC
  To: jim; +Cc: overseers

I'm noticing overlapping cvs/git mirroring tasks running on sourceware:

> meyering  5577  0.0  0.0  3492  900 ?        SNs  14:50   0:00 /bin/sh /home/meyering/bin/mirror-sw binutils
> meyering  5608  0.0  0.0  3492  412 ?        SN   14:50   0:00 /bin/sh /home/meyering/bin/mirror-sw binutils
> meyering  5609  0.0  0.0  3492  500 ?        SN   14:50   0:00 /bin/sh /home/meyering/bin/mirror-sw binutils
...
> meyering 11613  0.0  0.0  2948  896 ?        SNs  14:38   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb
> meyering 11665  0.0  0.0  2948  404 ?        SN   14:38   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb
> meyering 11666  0.0  0.0  2948  496 ?        SN   14:38   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb
...
> meyering 27713  0.0  0.0  2632  896 ?        SNs  15:08   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb
> meyering 27728  0.0  0.0  2632  404 ?        SN   15:08   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb
> meyering 27729  0.0  0.0  2632  496 ?        SN   15:08   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb

Is this intentional?

Jason


* Re: Multiple cvs/git mirror tasks?
  2012-07-13 15:21 Multiple cvs/git mirror tasks? Jason Merrill
@ 2012-07-13 15:43 ` Jim Meyering
  2012-07-13 15:55   ` Frank Ch. Eigler
  2012-07-16 18:52   ` Jason Merrill
  0 siblings, 2 replies; 11+ messages in thread
From: Jim Meyering @ 2012-07-13 15:43 UTC
  To: Jason Merrill; +Cc: overseers

Jason Merrill wrote:

> I'm noticing overlapping cvs/git mirroring tasks running on sourceware:
>
>> meyering 5577 0.0 0.0 3492 900 ?  SNs 14:50 0:00 /bin/sh
>> /home/meyering/bin/mirror-sw binutils
>> meyering 5608 0.0 0.0 3492 412 ?  SN 14:50 0:00 /bin/sh
>> /home/meyering/bin/mirror-sw binutils
>> meyering 5609 0.0 0.0 3492 500 ?  SN 14:50 0:00 /bin/sh
>> /home/meyering/bin/mirror-sw binutils
> ...
>> meyering 11613 0.0 0.0 2948 896 ?  SNs 14:38 0:00 /bin/sh
>> /home/meyering/bin/mirror-sw gdb
>> meyering 11665 0.0 0.0 2948 404 ?  SN 14:38 0:00 /bin/sh
>> /home/meyering/bin/mirror-sw gdb
>> meyering 11666 0.0 0.0 2948 496 ?  SN 14:38 0:00 /bin/sh
>> /home/meyering/bin/mirror-sw gdb
> ...
>> meyering 27713 0.0 0.0 2632 896 ?  SNs 15:08 0:00 /bin/sh
>> /home/meyering/bin/mirror-sw gdb
>> meyering 27728 0.0 0.0 2632 404 ?  SN 15:08 0:00 /bin/sh
>> /home/meyering/bin/mirror-sw gdb
>> meyering 27729 0.0 0.0 2632 496 ?  SN 15:08 0:00 /bin/sh
>> /home/meyering/bin/mirror-sw gdb
>
> Is this intentional?

Not by me.
Those jobs are supposed to be started by these crontab entries:

0-59/15 * * * * sh -c "nice $HOME/bin/mirror-sw lvm"
20-59/30  * * * * sh -c "nice $HOME/bin/mirror-sw binutils"
8-59/30   * * * * sh -c "nice $HOME/bin/mirror-sw gdb"
14        * * * * sh -c "nice $HOME/bin/mirror-sw newlib"

Any idea how we'd get three of each, with each triple starting
at the same minute?
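
For reference, a minute field of the form START-END/STEP matches every
STEP-th minute from START up to END.  A minimal sketch that expands the
fields used above with seq(1); the helper name cron_minutes is
hypothetical, and it handles only this range/step form (no lists, no "*"):

```shell
#!/bin/sh
# cron_minutes FIELD: expand a "START-END/STEP" crontab minute field.
# Hypothetical helper for illustration; not part of cron or mirror-sw.
cron_minutes()
{
  range=${1%/*} step=${1#*/}            # "8-59/30" -> "8-59" and "30"
  first=${range%-*} last=${range#*-}    # "8-59"    -> "8" and "59"
  # Unquoted on purpose: word splitting joins seq's lines with spaces.
  echo $(seq "$first" "$step" "$last")
}

cron_minutes 0-59/15    # lvm:      0 15 30 45
cron_minutes 20-59/30   # binutils: 20 50
cron_minutes 8-59/30    # gdb:      8 38
```

The gdb slots (minutes 8 and 38) line up with the 14:38 and 15:08 start
times in the listing, and minute 50 with the 14:50 binutils start.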


* Re: Multiple cvs/git mirror tasks?
  2012-07-13 15:43 ` Jim Meyering
@ 2012-07-13 15:55   ` Frank Ch. Eigler
  2012-07-13 16:05     ` Jim Meyering
  2012-07-16 18:52   ` Jason Merrill
  1 sibling, 1 reply; 11+ messages in thread
From: Frank Ch. Eigler @ 2012-07-13 15:55 UTC
  To: Jim Meyering; +Cc: Jason Merrill, overseers

Hi -

meyering wrote:

> [...] Any idea how we'd get three of each, with each triple starting
> at the same minute?

There appeared to be a bunch of crond's running for some reason;
stopped all but one.

- FChE


* Re: Multiple cvs/git mirror tasks?
  2012-07-13 15:55   ` Frank Ch. Eigler
@ 2012-07-13 16:05     ` Jim Meyering
  2012-07-13 16:12       ` Jim Meyering
  0 siblings, 1 reply; 11+ messages in thread
From: Jim Meyering @ 2012-07-13 16:05 UTC
  To: Frank Ch. Eigler; +Cc: Jason Merrill, overseers

Frank Ch. Eigler wrote:
> Hi -
>
> meyering wrote:
>
>> [...] Any idea how we'd get three of each, with each triple starting
>> at the same minute?
>
> There appeared to be a bunch of crond's running for some reason;
> stopped all but one.

Thanks.  I've killed the binutils-related processes and restarted
one manually.


* Re: Multiple cvs/git mirror tasks?
  2012-07-13 16:05     ` Jim Meyering
@ 2012-07-13 16:12       ` Jim Meyering
  2012-07-13 16:17         ` Jim Meyering
  0 siblings, 1 reply; 11+ messages in thread
From: Jim Meyering @ 2012-07-13 16:12 UTC
  To: Frank Ch. Eigler; +Cc: Jason Merrill, overseers

Jim Meyering wrote:

> Frank Ch. Eigler wrote:
>> Hi -
>>
>> meyering wrote:
>>
>>> [...] Any idea how we'd get three of each, with each triple starting
>>> at the same minute?
>>
>> There appeared to be a bunch of crond's running for some reason;
>> stopped all but one.
>
> Thanks.  I've killed the binutils-related processes and restarted
> one manually.

How strange.
The run I just started via "bash -x ~/bin/mirror-sw binutils" is now
showing up in ps auxww output as three different processes:

    meyering 32694  0.0  0.0  6352 1020 pts/0    S+   16:05   0:00 \
      bash -x /home/meyering/bin/mirror-sw binutils
    meyering 32704  0.0  0.0  6352  504 pts/0    S+   16:05   0:00 \
      bash -x /home/meyering/bin/mirror-sw binutils
    meyering 32705  0.0  0.0  6352  560 pts/0    S+   16:05   0:00 \
      bash -x /home/meyering/bin/mirror-sw binutils

How strange.
Should there still be this many crond processes?

    $ ps auxww|grep -c '^root.*crond'
    6


* Re: Multiple cvs/git mirror tasks?
  2012-07-13 16:12       ` Jim Meyering
@ 2012-07-13 16:17         ` Jim Meyering
  0 siblings, 0 replies; 11+ messages in thread
From: Jim Meyering @ 2012-07-13 16:17 UTC
  To: Frank Ch. Eigler; +Cc: Jason Merrill, overseers

Jim Meyering wrote:

> Jim Meyering wrote:
>
>> Frank Ch. Eigler wrote:
>>> Hi -
>>>
>>> meyering wrote:
>>>
>>>> [...] Any idea how we'd get three of each, with each triple starting
>>>> at the same minute?
>>>
>>> There appeared to be a bunch of crond's running for some reason;
>>> stopped all but one.
>>
>> Thanks.  I've killed the binutils-related processes and restarted
>> one manually.
>
> How strange.
> The run I just started via "bash -x ~/bin/mirror-sw binutils" is now
> showing up in ps auxww output as three different processes:
>
>     meyering 32694  0.0  0.0  6352 1020 pts/0    S+   16:05   0:00 \
>       bash -x /home/meyering/bin/mirror-sw binutils
>     meyering 32704  0.0  0.0  6352  504 pts/0    S+   16:05   0:00 \
>       bash -x /home/meyering/bin/mirror-sw binutils
>     meyering 32705  0.0  0.0  6352  560 pts/0    S+   16:05   0:00 \
>       bash -x /home/meyering/bin/mirror-sw binutils
>
> How strange.

I suspect it's normal after all, since it's running this:

    filter_stderr()
    {
      # Usage: filter_stderr EGREP_REGEXP 'command'
      regex=$1 cmd=$2
      (exec 3>&1; eval "$cmd" 2>&1 >&3 | grep -E -v "$regex" 1>&2)
    }
    ...
    filter_stderr "$mirror_re" "$mirror_cmd"

and ps is just showing the eval'd subshell and filter-running processes
with the name of the parent.
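
To make the descriptor juggling concrete, here is a small self-contained
run of the same function.  The sample command and the '^noise' pattern
are made up for illustration; only filter_stderr itself comes from the
script:

```shell
#!/bin/sh
filter_stderr()
{
  # Usage: filter_stderr EGREP_REGEXP 'command'
  regex=$1 cmd=$2
  # fd 3 remembers the real stdout.  Inside the pipeline, "2>&1" points
  # stderr at the pipe and ">&3" then points stdout back at the real
  # stdout, so only stderr reaches grep, which re-emits it on stderr.
  (exec 3>&1; eval "$cmd" 2>&1 >&3 | grep -E -v "$regex" 1>&2)
}

# Hypothetical command that writes to both streams.
demo_cmd='echo out; echo "noise: ignore me" >&2; echo "real error" >&2'
filter_stderr '^noise' "$demo_cmd"
# stdout receives "out"; stderr receives only "real error"
```

The parenthesized subshell and the left-hand side of the pipeline are
forked copies of the script's own shell, which is consistent with ps
listing them under the script's name.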


* Re: Multiple cvs/git mirror tasks?
  2012-07-13 15:43 ` Jim Meyering
  2012-07-13 15:55   ` Frank Ch. Eigler
@ 2012-07-16 18:52   ` Jason Merrill
  2012-07-20 16:20     ` Jim Meyering
  1 sibling, 1 reply; 11+ messages in thread
From: Jason Merrill @ 2012-07-16 18:52 UTC
  To: Jim Meyering; +Cc: overseers

On 07/13/2012 11:43 AM, Jim Meyering wrote:
> Those jobs are supposed to be started by these crontab entries:
>
> 0-59/15 * * * * sh -c "nice $HOME/bin/mirror-sw lvm"
> 20-59/30  * * * * sh -c "nice $HOME/bin/mirror-sw binutils"
> 8-59/30   * * * * sh -c "nice $HOME/bin/mirror-sw gdb"
> 14        * * * * sh -c "nice $HOME/bin/mirror-sw newlib"

It seems that when the machine is heavily loaded one of these jobs can 
take more than the allotted time, which is why I was seeing three of 
them (including two for gdb) running at the same time; it would probably 
be good to use a mkdir mutex to make sure that you only have one running 
at a time.

Thanks,
Jason


* Re: Multiple cvs/git mirror tasks?
  2012-07-16 18:52   ` Jason Merrill
@ 2012-07-20 16:20     ` Jim Meyering
  2012-07-24 18:20       ` Jason Merrill
  0 siblings, 1 reply; 11+ messages in thread
From: Jim Meyering @ 2012-07-20 16:20 UTC
  To: Jason Merrill; +Cc: overseers

Jason Merrill wrote:

> On 07/13/2012 11:43 AM, Jim Meyering wrote:
>> Those jobs are supposed to be started by these crontab entries:
>>
>> 0-59/15 * * * * sh -c "nice $HOME/bin/mirror-sw lvm"
>> 20-59/30  * * * * sh -c "nice $HOME/bin/mirror-sw binutils"
>> 8-59/30   * * * * sh -c "nice $HOME/bin/mirror-sw gdb"
>> 14        * * * * sh -c "nice $HOME/bin/mirror-sw newlib"
>
> It seems that when the machine is heavily loaded one of these jobs can
> take more than the allotted time, which is why I was seeing three of
> them (including two for gdb) running at the same time; it would
> probably be good to use a mkdir mutex to make sure that you only have
> one running at a time.

Hi Jason,

It is not uncommon (particularly after a tagging) for a gdb or binutils
mirror-sw run to take much longer than the usual minute or two. When
that happens or when the system is overloaded, you may see concurrent
runs for two or more projects.  However, that is not the norm.

For example, I see that just as I started writing this, the gdb sync
started [btw, it completed in under a minute]:

    meyering  7750  0.0  0.0  2720  904 ?        SNs  15:38   0:00 /bin/sh
      /home/meyering/bin/mirror-sw gdb
    meyering  9407  0.0  0.0  2720  412 ?        SN   15:38   0:00 /bin/sh
      /home/meyering/bin/mirror-sw gdb
    meyering  9408  0.0  0.0  2720  512 ?        SN   15:38   0:00 /bin/sh
      /home/meyering/bin/mirror-sw gdb

At first glance, you may think there are three invocations of the script,
but as recently discussed here, that's merely due to a subshelled
pipeline in the script: each element of the pipeline ends up looking
(in ps output) like another invocation of the parent script.  It really
was invoked only once.

I'm leery about adding naive cross-project locks, because that could
let a busy project starve one that needs only a tiny update.

Using lockfile(1) and a timeout might help.

I could add a per-project locking mechanism to prevent e.g., two gdb
syncs from running in parallel when the first one takes more than the
30m between cron invocations.  That could save some system resources,
but would not improve correctness or integrity.

For starters, I've instrumented mirror-sw with code to record the
duration of each run from now on.  You can see some numbers already:

    sourceware$ tail ~meyering/log/*
    ==> /home/meyering/log/gdb <==
    2012-07-20-16:10:00 5

    ==> /home/meyering/log/lvm <==
    2012-07-20-16:15:17 16

    ==> /home/meyering/log/newlib <==
    2012-07-20-16:14:28 27

Let's see if/when things overlap.
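
The instrumentation itself is not shown in this thread.  A minimal
sketch of one way to produce log lines in the "TIMESTAMP SECONDS" format
above, assuming the timestamp marks the start of the run and that
date(1) supports the widely available (but non-POSIX) "+%s" format;
log_duration is a hypothetical helper name:

```shell
#!/bin/sh
# log_duration LOGFILE COMMAND...: run COMMAND, then append
# "START-TIMESTAMP ELAPSED-SECONDS" to LOGFILE.
# Hypothetical helper; the real code in mirror-sw may differ.
log_duration()
{
  log=$1; shift
  start_stamp=$(date +%Y-%m-%d-%H:%M:%S)
  start=$(date +%s)          # seconds since the epoch (GNU/BSD date)
  "$@"
  status=$?
  end=$(date +%s)
  echo "$start_stamp $((end - start))" >> "$log"
  return $status             # preserve the command's exit status
}
```

It could be used as, e.g., log_duration "$HOME/log/gdb" some-sync-command,
where the command stands for whatever the script runs to do the sync.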

Besides, I've heard there might be a hardware upgrade coming...
That might make performance-related tweaks premature.


* Re: Multiple cvs/git mirror tasks?
  2012-07-20 16:20     ` Jim Meyering
@ 2012-07-24 18:20       ` Jason Merrill
  2012-07-25  8:08         ` Jim Meyering
  0 siblings, 1 reply; 11+ messages in thread
From: Jason Merrill @ 2012-07-24 18:20 UTC
  To: Jim Meyering; +Cc: overseers

On 07/20/2012 12:19 PM, Jim Meyering wrote:
> At first glance, you may think there are three invocations of the script,
> but as recently discussed here, that's merely due to a subshelled
> pipeline in the script: each element of the pipeline ends up looking
> (in ps output) like another invocation of the parent script.  It really
> was invoked only once.

Right, but in the ps output I sent before, there were two gdb runs (each 
with 3 processes) running at the same time.
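
One rough way to tell separate cron invocations from pipeline children
in that listing: only the first process of each triple carries the
session-leader flag (the trailing "s" in "SNs").  A minimal sketch,
assuming that flag marks exactly one process per cron-started run;
count_runs is a hypothetical helper, not part of mirror-sw:

```shell
#!/bin/sh
# count_runs: read "ps aux"-style lines on stdin and print, per project,
# how many session leaders (STAT column containing "s") ran mirror-sw.
# Heuristic only: assumes one session leader per cron-started run.
count_runs()
{
  awk 'NF >= 11 && $(NF-1) ~ /mirror-sw$/ && $8 ~ /s/ { n[$NF]++ }
       END { for (p in n) print p, n[p] }' | sort
}

count_runs <<'EOF'
meyering  5577  0.0  0.0  3492  900 ?        SNs  14:50   0:00 /bin/sh /home/meyering/bin/mirror-sw binutils
meyering  5608  0.0  0.0  3492  412 ?        SN   14:50   0:00 /bin/sh /home/meyering/bin/mirror-sw binutils
meyering  5609  0.0  0.0  3492  500 ?        SN   14:50   0:00 /bin/sh /home/meyering/bin/mirror-sw binutils
meyering 11613  0.0  0.0  2948  896 ?        SNs  14:38   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb
meyering 11665  0.0  0.0  2948  404 ?        SN   14:38   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb
meyering 11666  0.0  0.0  2948  496 ?        SN   14:38   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb
meyering 27713  0.0  0.0  2632  896 ?        SNs  15:08   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb
meyering 27728  0.0  0.0  2632  404 ?        SN   15:08   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb
meyering 27729  0.0  0.0  2632  496 ?        SN   15:08   0:00 /bin/sh /home/meyering/bin/mirror-sw gdb
EOF
# prints:
# binutils 1
# gdb 2
```

The heuristic does not apply to a run started by hand from a terminal,
where none of the three processes is a session leader.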

> I'm leery about adding naive cross-project locks, because that could
> let a busy project starve one that needs only a tiny update.
>
> Using lockfile(1) and a timeout might help.
>
> I could add a per-project locking mechanism to prevent e.g., two gdb
> syncs from running in parallel when the first one takes more than the
> 30m between cron invocations.  That could save some system resources,
> but would not improve correctness or integrity.

That makes sense to me.  The times when it would run long are likely to 
correspond with times when system resources are tightest, as with the 
time I noticed.

Jason


* Re: Multiple cvs/git mirror tasks?
  2012-07-24 18:20       ` Jason Merrill
@ 2012-07-25  8:08         ` Jim Meyering
  2012-07-25 13:05           ` Jason Merrill
  0 siblings, 1 reply; 11+ messages in thread
From: Jim Meyering @ 2012-07-25  8:08 UTC
  To: Jason Merrill; +Cc: overseers

Jason Merrill wrote:
> On 07/20/2012 12:19 PM, Jim Meyering wrote:
>> At first glance, you may think there are three invocations of the script,
>> but as recently discussed here, that's merely due to a subshelled
>> pipeline in the script: each element of the pipeline ends up looking
>> (in ps output) like another invocation of the parent script.  It really
>> was invoked only once.
>
> Right, but in the ps output I sent before, there were two gdb runs
> (each with 3 processes) running at the same time.
>
>> I'm leery about adding naive cross-project locks, because that could
>> let a busy project starve one that needs only a tiny update.
>>
>> Using lockfile(1) and a timeout might help.
>>
>> I could add a per-project locking mechanism to prevent e.g., two gdb
>> syncs from running in parallel when the first one takes more than the
>> 30m between cron invocations.  That could save some system resources,
>> but would not improve correctness or integrity.
>
> That makes sense to me.  The times when it would run long are likely
> to correspond with times when system resources are tightest, as with
> the time I noticed.

I've just adjusted the script (~meyering/bin/mirror-sw) to do that.
[I did have qualms about doing this locking business in sh,
and even implemented it first in Perl, mainly because using a
diagnostic to distinguish the EEXIST case from other mkdir failure
is so ugly.  Finally I opted for this in-script sh code rather than
a separate perl script.]

Now it starts like this:

  ...
  repo=$1

  # =============================================================
  # Require a lock per repository, in case a sync operation
  # takes longer than the interval between cron job invocations.
  # It doesn't affect correctness when two of these jobs run concurrently,
  # but it is definitely wasted work and can contribute to performance problems.

  lock_dir=$HOME/mirror-git-to-cvs/$repo.lock

  export LC_ALL=C # we require an English diagnostic from mkdir
  err=$(mkdir "$lock_dir" 2>&1)
  if test $? = 0; then
    echo $$ > "$lock_dir/pid"
  else
    case $err in
      *exists)
        test -d "$lock_dir" || die "$lock_dir is not a directory?!?"
        pid=$(cat "$lock_dir/pid") || die "no $lock_dir/pid file?!?"
        # See if PID is still valid; signal 0 tests for existence
        # without delivering a signal.
        kill -0 "$pid" 2>/dev/null \
          && die "process $pid remains" \
          || {
               # process $pid is dead; steal lock
               echo $$ > "$lock_dir/pid"
             }
        ;;
      *)
        # mkdir failed; print the diagnostic we recorded
        die "$err"
        ;;
    esac
  fi
  # =============================================================

and cleans up at the end with "rm -r $lock_dir"
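
A run that dies before reaching that final "rm -r" leaves the lock
behind until the next run steals it.  A minimal sketch of one
alternative, assuming it is acceptable to release the lock from an EXIT
trap; acquire_lock is a hypothetical name, and the stale-lock check
above is left out for brevity:

```shell
#!/bin/sh
# acquire_lock DIR: take a mkdir-based lock and arrange for its removal
# on any exit path.  Hypothetical helper; not the code in mirror-sw.
acquire_lock()
{
  lock_dir=$1
  mkdir "$lock_dir" 2>/dev/null || return 1
  echo $$ > "$lock_dir/pid"
  # Install the traps only after the lock is ours, so that a run which
  # failed to get the lock never removes someone else's.
  trap 'rm -rf "$lock_dir"' EXIT
  # Turn common signals into a normal exit so the EXIT trap runs.
  trap 'exit 129' HUP
  trap 'exit 130' INT
  trap 'exit 143' TERM
}
```

With this in place the script would call acquire_lock "$lock_dir" near
the top and would not need an explicit cleanup step at the end.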

FYI, since I've begun collecting sync-job duration data,
the vast majority of them have finished in under 10 minutes.
Here are the few (duration in seconds) that did not:

    1132
    803
    797
    704
    684
    671
    658
    657
    656
    646
    631
    609
    607
    602
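
A list like the one above can be pulled from the per-project logs with a
short filter.  A minimal sketch, assuming the "TIMESTAMP SECONDS" log
format shown earlier in the thread; slow_runs is a hypothetical helper
name:

```shell
#!/bin/sh
# slow_runs THRESHOLD FILE...: print the durations (in seconds) that
# exceed THRESHOLD, largest first.  Hypothetical helper for illustration.
slow_runs()
{
  threshold=$1; shift
  # Keep only well-formed two-field lines whose second field is larger
  # than the threshold; "+ 0" forces a numeric comparison.
  awk -v t="$threshold" 'NF == 2 && $2 + 0 > t { print $2 }' "$@" |
    sort -rn
}
```

For example, slow_runs 600 ~meyering/log/* would list every run that
took longer than 10 minutes.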


* Re: Multiple cvs/git mirror tasks?
  2012-07-25  8:08         ` Jim Meyering
@ 2012-07-25 13:05           ` Jason Merrill
  0 siblings, 0 replies; 11+ messages in thread
From: Jason Merrill @ 2012-07-25 13:05 UTC
  To: Jim Meyering; +Cc: overseers

On 07/25/2012 04:07 AM, Jim Meyering wrote:
> I've just adjusted the script (~meyering/bin/mirror-sw) to do that.

Thanks.

Jason


