From: Mark Wielaard
To: systemtap@sourceware.org
Subject: Re: Making the transport layer more robust
Date: Mon, 15 Aug 2011 08:24:00 -0000
Message-ID: <20110815082346.GA10032@hermans.wildebeest.org>
References: <1311065908.9144.27.camel@springer.wildebeest.org> <20110812174324.GA1394@hermans.wildebeest.org>
In-Reply-To: <20110812174324.GA1394@hermans.wildebeest.org>
X-SW-Source: 2011-q3/txt/msg00160.txt.bz2

Hi,

On Fri, Aug 12, 2011 at 07:43:24PM +0200, Mark Wielaard wrote:
> commit 46ac9ed5bad86641e552bee4e42a2d973ffc12d0
> Author: Mark Wielaard
> Date:   Fri Aug 12 19:34:20 2011 +0200
>
>     Remove _stp_ctl_work_timer from module transport layer.
>
>     The _stp_ctl_work_timer would trigger every 20ms to check whether
>     there were cmd messages queued, but not announced yet and to
>     check the _stp_exit_flag was set.
>
>     This commit makes all control messages announce themselves and
>     check the _stp_exit_flag in the _stp_ctl_read_cmd loop (delivery
>     is still possibly delayed since the messages are just pushed on
>     a wait queue).

And with the timer out of the way it wasn't too hard to add poll
support to the command channel so that we can use a sleeping select
on the channel instead of busy-polling in stapio.

commit a9e19b380f9814630018e79b8cafa3c675dd182c
Author: Mark Wielaard
Date:   Sun Aug 14 23:07:46 2011 +0200

    Implement and use select to wait for cmd channel data.

    Add a poll implementation to runtime/transport/control.c
    (_stp_ctl_poll_cmd) based on the _stp_ctl_ready_q wait queue.
    Check whether select is supported in runtime/staprun/mainloop.c
    (stp_main_loop) and use pselect with a sigmask that includes SIGURG
    to get EINTR notifications whenever an interruptible event occurred.

I am not seeing any regressions with this, but the signal code in
runtime/staprun/mainloop.c is pretty, uhm, creative, so some extra
review and testing would certainly be appreciated.

This has a nice effect on the stapio impact during probing.
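As an aside, for anyone who wants to see the shape of the sleeping
wait described in the commit message above: the following is only a
minimal user-space sketch, not the code in runtime/staprun/mainloop.c.
The helper name wait_and_read and its return convention are made up
for illustration, and the mask handling follows the general POSIX
pselect idiom (the caller supplies the signal mask that is in effect
only while sleeping), which may differ from what stp_main_loop does.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the systemtap sources).
 * Sleep until fd becomes readable or a signal arrives, then read.
 *
 * wait_mask is installed atomically by pselect for the duration of
 * the sleep and the previous mask is restored on return, so a signal
 * that is blocked everywhere else can still wake the sleeper.
 *
 * Returns the number of bytes read, 0 on EOF, or -1 with errno set
 * (errno == EINTR means a signal handler ran and woke us up).
 */
static ssize_t wait_and_read(int fd, void *buf, size_t len,
                             const sigset_t *wait_mask)
{
    fd_set rfds;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);

    /* NULL timeout: block indefinitely instead of waking periodically. */
    if (pselect(fd + 1, &rfds, NULL, NULL, NULL, wait_mask) < 0)
        return -1;

    /* fd is the only descriptor in the set, so it must be readable. */
    return read(fd, buf, len);
}
```

A caller would typically loop on this, re-checking its exit flags
whenever the helper returns -1 with errno set to EINTR, rather than
sleeping for a fixed interval and re-reading.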
With stap 1.6:

$ stap -e 'global scs; probe syscall.* { if (execname() == "stapio") scs[name]++ }' -c 'sleep 10'
scs["read"]=0x5b
scs["fcntl"]=0x52
scs["ppoll"]=0x32
scs["nanosleep"]=0x28
scs["execve"]=0x5
scs["kill"]=0x1
scs["sigreturn"]=0x1
scs["rt_sigaction"]=0x1
scs["rt_sigprocmask"]=0x1
scs["wait4"]=0x1
scs["write"]=0x1

With stap from git trunk:

$ stap -e 'global scs; probe syscall.* { if (execname() == "stapio") scs[name]++ }' -c 'sleep 10'
scs["read"]=0x34
scs["ppoll"]=0x32
scs["execve"]=0x5
scs["fcntl"]=0x4
scs["kill"]=0x1
scs["pselect6"]=0x1
scs["sigreturn"]=0x1
scs["rt_sigaction"]=0x1
scs["rt_sigprocmask"]=0x1
scs["wait4"]=0x1
scs["write"]=0x1

So in this example one pselect6 replaces ~38 reads, ~80 fcntls and ~40
nanosleeps. The remaining reads and (timing out) ppolls come from the
relay channel. I haven't investigated yet whether those can be
eliminated too.

Cheers,

Mark
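P.S. Since the counts above are printed in hex, here is a small C
sketch that redoes the subtraction behind the approximate savings.
The table and the saved_calls helper are made up for illustration;
the only assumption is that a syscall missing from one of the two
listings was called zero times in that run.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Per-syscall counts copied from the two stap runs (hex as printed). */
struct syscall_count {
    const char *name;
    int before;   /* stap 1.6 */
    int after;    /* git trunk */
};

static const struct syscall_count counts[] = {
    { "read",      0x5b, 0x34 },
    { "fcntl",     0x52, 0x04 },
    { "nanosleep", 0x28, 0x00 },  /* absent from the trunk listing */
    { "pselect6",  0x00, 0x01 },  /* absent from the 1.6 listing */
};

/*
 * Hypothetical helper: number of calls saved for one syscall
 * (negative if trunk makes more calls). Returns 0 for syscalls
 * that are not in the table.
 */
static int saved_calls(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof counts / sizeof counts[0]; i++)
        if (strcmp(counts[i].name, name) == 0)
            return counts[i].before - counts[i].after;
    return 0;
}
```

With these numbers saved_calls gives 39 for read, 78 for fcntl and 40
for nanosleep, at the cost of one extra pselect6, which matches the
rough figures quoted above.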