From: nathans@aconex.com
Date: Thu, 16 Sep 2010 23:18:00 -0000
To: "Frank Ch. Eigler", Ken McDonell, Greg Banks
Cc: systemtap@sources.redhat.com, pcp@oss.sgi.com
Subject: Re: [pcp] suitability of PCP for event tracing

Hi guys,

----- "Ken McDonell" wrote:
> On 16/09/2010 12:07 PM, Greg Banks wrote:
> > Frank Ch. Eigler wrote:
>
> I think we should pursue the discussion of this approach a little
> further.  There is only one layer of buffering needed, at the PMDA.
>
> So far, the obstacles to this approach would appear to be ...
>
> 1. Need buffer management and per-client state in the PMDA (actually
> it is per PMAPI context, which is a little more complicated, but
> doable) ...
>
> I don't see either issue as a big deal, and together they are an
> order of magnitude simpler than supporting the sort of asynchronous
> callbacks from PMCD that have been suggested.
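Agreed on 1. - to make it concrete, the sort of per-context state I
have in mind is no more than a bounded ring of event records hung off
each context.  A rough sketch only (every name here is invented, none
of this is existing pmda(3) API):

    #include <sys/time.h>

    /*
     * Illustrative per-context event buffering for a PMDA.
     * MAXEVENTS bounds per-context memory use; the oldest record
     * is dropped once the ring fills up.
     */
    #define MAXEVENTS 1024

    typedef struct {
        struct timeval  stamp;      /* when the event occurred */
        void            *payload;   /* encoded event record */
    } event_t;

    typedef struct {
        int             ctx;        /* PMAPI context id, from PMCD */
        unsigned int    head;       /* index of the oldest event */
        unsigned int    count;      /* events currently buffered */
        event_t         ring[MAXEVENTS];
    } ctx_buffer_t;

    /* append an event, overwriting the oldest once the ring is full */
    static void
    buffer_event(ctx_buffer_t *cb, const event_t *ev)
    {
        cb->ring[(cb->head + cb->count) % MAXEVENTS] = *ev;
        if (cb->count < MAXEVENTS)
            cb->count++;                            /* still filling */
        else
            cb->head = (cb->head + 1) % MAXEVENTS;  /* dropped oldest */
    }

On each fetch for context N the PMDA would drain whatever is in that
context's ring, which I think also answers most of the "what has this
client already seen" question, since there is one client per context.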
> 2. Latency for event notification ... the client can control the
> polling interval (down to a few milliseconds demonstrably works), so
> I expect you'd be able to tune the latency to match the semantic
> demands.  If really low latency is needed then any TPC-based
> mechanism is probably

[sp. TCP] :) ... local context mode could be used in that situation
(PM_CONTEXT_LOCAL), which would map more closely to the current trace
tools and doesn't use TCP.  I haven't seen any reason why this scheme
won't work for our little-used local context friend - good thing we
did not remove that code, eh Ken? ;)

> not going to work well, and PCP may be the wrong tool for that space.

Local context should be fine, and perhaps it should be the default
mode for any generic PCP tracing client tool (which, I imagine, we'll
soon be needing).

> > c. does not break any existing PMDAs or PMAPI clients

I guess it remains to be seen what (existing) tools will do with the
trace data ... I'm guessing for the most part they will ignore it, as
many of them already do for the STRING/AGGREGATE types (pmie, pmval,
etc).  So there's still plenty of work to be done to do a good job of
adding support to the client tools - almost certainly a new
tracing-specific tool will be needed.

> d. be doable in a very short time ... for instance wrapping an array
> of events inside a "special" data aggregate is simple and isolated,
> and there is already the basis for the required PMCD-PMDA
> interaction to ensure the context id is known to the PMDA, and the
> existing context cleanup code in PMCD provides the place to notify
> PMDAs that a context will no longer be requesting events.
>
> So, can anyone mount a convincing argument that the requirements
> would demand changes to allow asynchronous behaviour between PMAPI
> clients <---> PMCD <---> PMDAs?

My main concerns center on the PMDA buffering scheme ... things like:
how does a PMDA decide what a sensible timeframe for buffering data
is?  (We'll probably need some kind of per-PMDA memory limit on
buffer size, rather than a time frame.)  Also, will the PMDA have to
keep track of which clients have been sent which (portions of?)
buffered data?  (In the case of multiple clients with different
request frequencies ... this might get a bit hairy.)

Also, we've not really considered the additional requirements that we
have in archive mode.  Unlike the sampled data, traces have explicit
start and end points, which we will need to know about.  For example,
if I construct a chart with a starting offset (-S) at 10am and ending
(-T) at 10:15, and a trace started at 9:45 which completes at 10:10,
I'd expect to see that trace displayed, even though the trace data
would (AIUI, in this proposal) all be stored at the time the trace
was sampled.  Actually, I'm not sure how this will look: does a trace
have to end before a PMDA would see it?  That'd be a bit lame.  Or
would we export start and end events separately?  Then we need a way
to tie them back together in the client tools.  Or, in this example
of a long-running trace (relative to the client sample time), does
the PMDA report "trace X is in-progress" on each sample?  That'd be a
bit wasteful of disk space ... hmm, it's not clear what the best
approach here will be.  We could extend the existing temporal index
to also index the start/end times of traces, so that we can quickly
find whether a client sample covers a trace.  Either way, I suspect
"trace start" and "trace end" may each need to be a new metric type
(in addition, that is, to the PM_TYPE_COUNTER, PM_TYPE_INSTANT and
PM_TYPE_DISCRETE we have now).
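If we do end up exporting start and end events separately, the
obvious way to tie them back together is a shared identifier carried
in both records - again just a sketch, none of these names exist
anywhere yet:

    #include <sys/time.h>
    #include <stdint.h>

    /* Hypothetical paired trace records, for illustration only */
    #define TRACE_START 0x1
    #define TRACE_END   0x2

    typedef struct {
        uint64_t        traceid;  /* identical in start+end records */
        uint32_t        flags;    /* TRACE_START and/or TRACE_END */
        struct timeval  stamp;
        /* ... trace payload ... */
    } trace_rec_t;

A client (or an extended temporal index) could then match records on
traceid, and a short trace that starts and ends within a single
sample could carry both flags in one record, avoiding the per-sample
"in-progress" chatter.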
> If not, I strongly suggest we work to flesh out the changes needed
> to make a variable length array of structured event records
> available through the existing poll-based APIs.

I'm not far away from sending out some prototype JSON PDU support
(got distracted after starting, then tossing, XML and switching to
JSON), which adds a libpcp_json library that I think would be handy
here.

FWIW, the structured data approach should be just fine for capturing
the parent/child trace relationships which I want us to tackle as
well (from those papers I forwarded); for traces that support this
concept we can carry those as additional JSON maps (or XML elements,
or ...), so I am content there.

A lot of work here, but it's all fascinating stuff & gonna be great
fun to code!

cheers.

--
Nathan
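P.S. for anyone who wants to experiment with the local context +
polling combination in the meantime, the client end is just the
standard PMAPI fetch loop - something like this (error handling
trimmed, and "some.event.metric" is a stand-in name):

    #include <pcp/pmapi.h>
    #include <unistd.h>

    int
    main(void)
    {
        char      *name = "some.event.metric";  /* stand-in name */
        pmID      pmid;
        pmResult  *result;

        /* in-process access to DSO PMDAs - no pmcd, no TCP */
        if (pmNewContext(PM_CONTEXT_LOCAL, NULL) < 0)
            return 1;
        if (pmLookupName(1, &name, &pmid) < 0)
            return 1;
        for (;;) {
            if (pmFetch(1, &pmid, &result) < 0)
                break;
            /* ... decode and display any buffered events ... */
            pmFreeResult(result);
            usleep(10000);      /* 10ms polling interval */
        }
        return 0;
    }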