From: "Turgis, Frederic"
To: Mark Wielaard
CC: "systemtap@sourceware.org"
Date: Mon, 05 Sep 2011 14:32:00 -0000
Subject: RE: Making the transport layer more robust
Message-ID: <13872098A06B02418CF379A158C0F1460163182646@dnce02.ent.ti.com>
References: <1311065908.9144.27.camel@springer.wildebeest.org> <20110812174324.GA1394@hermans.wildebeest.org> <4E4965B3.6080700@redhat.com> <1313500965.3393.5.camel@springer.wildebeest.org> <13872098A06B02418CF379A158C0F1460162DC0A08@dnce02.ent.ti.com> <1315222009.3431.22.camel@springer.wildebeest.org>
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
>The kernel side "polling" is not just for exit, it is for any cmd
>message that is generated from a possible "unsafe" context

Then I have probably not understood the code correctly (the code I read predates the latest changes; now this polling is mandatory, as I mention later in this mail). An "unsafe" context is associated with an unannounced message, isn't it? Well, even for announced messages, I have the impression that reading a message relies only on user-side polling, because the kernel side is not waiting for a wake-up of _stp_ctl_ready_q. Here is my understanding, although I did not take the time to capture traces:

- for announced messages, or on kernel polling (_stp_ctl_work_callback), I understand that we trigger a read through wake_up_interruptible(&_stp_ctl_wq);

- on the user side, the main read loop does:

        flags |= O_NONBLOCK;
        fcntl(control_channel, F_SETFL, flags);
        nb = read(control_channel, &recvbuf, sizeof(recvbuf));

  so I expect a non-blocking read (although there may be another place where we read cmd messages). This ends up in _stp_ctl_read_cmd on the kernel side, doing:

        while (list_empty(&_stp_ctl_ready_q)) {
                spin_unlock_irqrestore(&_stp_ctl_ready_lock, flags);
                if (file->f_flags & O_NONBLOCK)
                        /* non-blocking read: we rely on polling to recheck _stp_ctl_ready_q */
                        return -EAGAIN;
                if (wait_event_interruptible(_stp_ctl_wq, !list_empty(&_stp_ctl_ready_q)))
                        /* code not reached, so kernel polling (or even message announcement) useless? */
                        return -ERESTARTSYS;

>I am very interested in any results you get from the new code.

Never tested bulk mode. We quite like filling up a buffer and doing one long buffer dump, but doing very small regular dumps could also work.

Our modifications are just ugly hacks to understand the internals. Some of them make sense, but for other parts we probably have different requirements between a server and an embedded platform. The capability to tune a timer would be OK (or maybe bulk mode would be good).

Here are the v1.3 experiments we performed a few months ago (the latest months have been too busy with customers to share this earlier :-( ). It seemed to work well at two levels:

- task scheduling monitoring: the systemtap work queues and the staprun/stapio processes were seen only every second or more, which is OK, and their occurrence matched the timer setting;

- power consumption monitoring (requires specific HW) showed no CPU activity (interrupts and timers during the idle task are not seen at scheduler level).

Control channel userspace polling (well, I consider "control" everything that is not trace/data output from the script):

-       usleep (250*1000);  /* sleep 250ms between polls */
+       usleep (2000*1000); /* sleep 2000ms between polls */
  -> no longer needed with pselect()

Control channel kernel polling (you might find it a bit extreme ;-) ):

-       if (likely(_stp_ctl_attached))
-               queue_delayed_work(_stp_wq, &_stp_work, STP_WORK_TIMER);
+       //if (likely(_stp_ctl_attached))
+       //      queue_delayed_work(_stp_wq, &_stp_work, STP_WORK_TIMER);
  -> reworked ;-)

Data channel userspace timeout of select():

-       struct timespec tim = {.tv_sec=0, .tv_nsec=200000000}, *timeout = &tim;
+       struct timespec tim = {.tv_sec=5, .tv_nsec=0}, *timeout = &tim;
  -> it is only a timeout, so it seems fair for it to be that high

Data channel kernel polling:

-#define STP_RELAY_TIMER_INTERVAL        ((HZ + 99) / 100)
+#define STP_RELAY_TIMER_INTERVAL        HZ /* ((HZ + 99) / 100) */
  -> wake up every second; we may need a tunable

Of course reliability depends on the data trace throughput. The main contributor is task scheduling monitoring, around 0.5 MB/s max. I had done the computation of number of relayfs buffers * buffer size: we could not overflow all the buffers even with these infrequent wake-ups to dump the trace.

For v1.5 (and next), we handle the control channel kernel side through STP_CTL_INTERVAL and got rid of our old ugly patch. It may require a tunable for embedded tests, as it does not sound very logical not to poll regularly if we want messages back quickly.

Regards
fred

Texas Instruments France SA, 821 Avenue Jack Kilby, 06270 Villeneuve Loubet. 036 420 040 R.C.S Antibes. Capital de EUR 753.920