From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <systemtap-return-18097-listarch-systemtap=sources.redhat.com@sourceware.org>
Received: (qmail 6552 invoked by alias); 19 Jul 2011 08:59:09 -0000
Received: (qmail 6541 invoked by uid 22791); 19 Jul 2011 08:59:07 -0000
X-SWARE-Spam-Status: No, hits=-6.9 required=5.0	tests=AWL,BAYES_00,RCVD_IN_DNSWL_HI,RP_MATCHES_RCVD,SPF_HELO_PASS,TW_GF
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28)    by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Tue, 19 Jul 2011 08:58:32 +0000
Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11])	by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id p6J8wVAD021678	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK)	for <systemtap@sourceware.org>; Tue, 19 Jul 2011 04:58:31 -0400
Received: from springer.wildebeest.org (ovpn-116-31.ams2.redhat.com [10.36.116.31])	by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with SMTP id p6J8wUvR024493	for <systemtap@sourceware.org>; Tue, 19 Jul 2011 04:58:31 -0400
Received: by springer.wildebeest.org (Postfix, from userid 500)	id 0201B4BA9A; Tue, 19 Jul 2011 10:58:29 +0200 (CEST)
Subject: Making the transport layer more robust
From: Mark Wielaard <mjw@redhat.com>
To: systemtap@sourceware.org
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Date: Tue, 19 Jul 2011 08:59:00 -0000
Message-ID: <1311065908.9144.27.camel@springer.wildebeest.org>
Mime-Version: 1.0
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <systemtap.sourceware.org>
List-Subscribe: <mailto:systemtap-subscribe@sourceware.org>
List-Post: <mailto:systemtap@sourceware.org>
List-Help: <mailto:systemtap-help@sourceware.org>, <http://sourceware.org/lists.html#faqs>
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2011-q3/txt/msg00051.txt.bz2

Hi,

Since I was noticing some crashes related to Bug 12960 - _stp_ctl_send
tries to msleep when out of memory, I went through the transport layer
and cleaned some stuff up. This also solves Bug 10189 - STP_START gets
lost in a warning flood. The whole thing now feels a lot more robust. I
added a testcase testsuite/systemtap.base/warn_overflow.stp that often
could crash my machine. Now it never crashes and staprun exists nicely
while providing some warning messages about the warn/system related
floods and the module gets cleanly unloaded.

Since this was somewhat tricky most of the patches just add a lot of
comments about the transport layers, the message types and the
interactions between the messages and locks used in the code. It would
be nice if someone could read through the patches to see if there is
anything I missed or misunderstood.

I tested on f14 x86_64 and rhel5 x86_64 on which I could reproduce the
crashers previously and they are totally gone. The test results on f14
x86_64 look pretty good. I am still investigating the test results on
rhel5, they don't look that good (in particular exelib.exp fails a lot
of tests because it is unable to find process.mark probe points), but I
don't believe any of the failures is new or caused by my changes. Except
maybe one on rhel5 testing some abnormal exit case, pr10854.exp.

pr10854.exp acts strangely on rhel5, it seems fine on f14. It just sits
there waiting the reap staprun, which will never happen since it tries
to pkill it at the same time, that could be because the startup/exit of
staprun/stapio is much more robust now, but I don't fully understand the
expect spawn, catch, wait logic. Maybe it is some strange bug in the
rhel5 expect? Maybe I changed some expectation of staprun/stapio/module
interaction? Any help understanding the expect logic would be
appreciated.

Apart from that stap feels a lot more robust now. I do occasionally see
kernel message "hrtimer: interrupt took 30014948 ns" while running
(multiple) make installchecks, which is somewhat worrying, but doesn't
immediately seem to be bad nor new (I have seen these before some time
back).

I haven't yet tested on really old kernels, like those used on rhel4,
where we are using an almost completely different transport layer
implementation (even using different message types in some cases). I'll
try and find such an old machine and test a bit (does anybody still use
such an old beast with systemtap git trunk?). But in general I think we
should deprecate this part of the transport layer. It makes things
really complex.

I hope that after the next release I can cleanup the transport layer a
bit more. In particular I think we should really get rid of the
difference between _stp_ctl_send and _stp_ctl_write, which leads to
needing an extra timer just to flush the messages from the queue (and
check for the exit condition). If we can replace that logic with just a
wait queue based trigger then the code could become a little simpler.
The extra timer triggers made analyzing the above two bugs that much
harder.

Again, patch review of the commits would be really appreciated. If only
to make sure there are others that get to know this part of the code a
bit more.

Thanks,

Mark

commit d8edf750d3c8815784a32c96c9fd939ce5e2f85b
Author: Mark Wielaard <mjw@redhat.com>
Date:   Mon Jul 18 20:55:38 2011 +0200

    PR10189 and PR12960 reserve system cmd messages for delivery.
=20=20=20=20
    runtime/transport/control.c kept one pool for all cmd messages that
    the module had to deliver to staprun/io.  This pool could become
    empty.  This meant essential control message would not be delivered.
    Leading to the module not properly starting and/or exiting.
=20=20=20=20
    We now set aside buffers for one time messages (STP_START, STP_EXIT,
    STP_TRANSPORT, STAP_REQUEST_EXIT) and "overflow" messages that get
    delivered whenever one of the dynamically allocated messages cannot
    get a free slot from the pool (STP_OOB_DATA - warnings and errors,
    STP_SYSTEM and STP_REALTIME_DATA).
=20=20=20=20
    The type field is used to mark whether or not a special pre-allocated
    buffer is currently unused. This needs careful locking using a new
    &_stp_ctl_special_msg_lock that is used in the new helper functions
    _stp_ctl_get_buffer and _stp_ctl_free_buffer.
=20=20=20=20
    Now when we run out of message buffers we just drop the message and
    printk. stapio will have received either the one time message or an
    overflow message, there is nothing more we can do.
=20=20=20=20
    The STP_DEFAULT_BUFFERS for debugfs.c got decreased again to allow
    8 pre-allocated and 32 dynamic (pending) cmd messages.
=20=20=20=20
    A new testcase testsuite/systemtap.base/warn_overflow.exp was added.

commit 8a581846923f5bd210644a05add78e13ed5591f1
Author: Mark Wielaard <mjw@redhat.com>
Date:   Mon Jul 18 16:11:11 2011 +0200

    staprun STP_OOB_DATA payload.data "WARNING:" prefix shouldn't be transl=
ated.
=20=20=20=20
    The STP_OOB_DATA payload data prefix "WARNING:" or "ERROR:" is part of
    the module cmd protocol.

commit 3eafa6b8fd7e016eb5236815f84844d87cb5c708
Author: Mark Wielaard <mjw@redhat.com>
Date:   Fri Jul 15 23:54:47 2011 +0200

    PR12960 Don't msleep in _stp_ctl_send when out of memory.
=20=20=20=20
    This is mainly a documentation patch to better explain the transport
    layers and the interaction between _stp_ctl_read_cmd, _stp_ctl_send and
    _stp_ctl_write.
=20=20=20=20
    It also contains the first step to resolve PR12960. The msleep() in
    _stp_ctl_send() has been replaced with a loop that checks whether there
    are messages on the queue, tries to wake up _stp_ctl_read_cmd so stapio
    has a change to read some of the pending messages and a small mdelay
    (which is save, because it doesn't actually sleep or schedule). It
    only prevents the crash and makes the possibility of loosing control
    messages slightly less. A followup patch will introduce special buffers
    to hold cannot be lost messages so the module will always be able to
    properly shut down.
=20=20=20=20
    STP_DEFAULT_BUFFERS for debugfs also got increased a little from 50 to =
64.

commit c714669f502d29e9c5317a648986709be2ff7454
Author: Mark Wielaard <mjw@redhat.com>
Date:   Fri Jul 15 17:06:52 2011 +0200

    Create dropped file with mode 0400 in relay_v2.c to make it not world r=
eadable.

commit 27d0d0cabb7452088ab5336195b0a07ee1413fda
Author: Mark Wielaard <mjw@redhat.com>
Date:   Fri Jul 15 15:11:41 2011 +0200

    Document stp control channel command values under runtime/transport.