From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 2589 invoked by alias); 30 Jun 2008 01:05:44 -0000
Received: (qmail 2572 invoked by uid 22791); 30 Jun 2008 01:05:43 -0000
X-Spam-Status: No, hits=-1.2 required=5.0 tests=AWL,BAYES_50,SPF_HELO_PASS,SPF_PASS
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (66.187.233.31)
    by sourceware.org (qpsmtpd/0.31) with ESMTP; Mon, 30 Jun 2008 01:05:26 +0000
Received: from int-mx1.corp.redhat.com (int-mx1.corp.redhat.com [172.16.52.254])
    by mx1.redhat.com (8.13.8/8.13.8) with ESMTP id m5U15Ow9029855;
    Sun, 29 Jun 2008 21:05:24 -0400
Received: from pobox-3.corp.redhat.com (pobox-3.corp.redhat.com [10.11.255.67])
    by int-mx1.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m5U15OF5006044;
    Sun, 29 Jun 2008 21:05:24 -0400
Received: from touchme.toronto.redhat.com (IDENT:postfix@touchme.yyz.redhat.com [10.15.16.9])
    by pobox-3.corp.redhat.com (8.13.1/8.13.1) with ESMTP id m5U14xEA027833;
    Sun, 29 Jun 2008 21:05:23 -0400
Received: from ton.toronto.redhat.com (ton.yyz.redhat.com [10.15.16.15])
    by touchme.toronto.redhat.com (Postfix) with ESMTP id A1BE88001FF;
    Sun, 29 Jun 2008 21:04:42 -0400 (EDT)
Received: from ton.toronto.redhat.com (localhost.localdomain [127.0.0.1])
    by ton.toronto.redhat.com (8.13.1/8.13.1) with ESMTP id m5U14QP4010706;
    Sun, 29 Jun 2008 21:04:26 -0400
Received: (from fche@localhost)
    by ton.toronto.redhat.com (8.13.1/8.13.1/Submit) id m5U14N1E010705;
    Sun, 29 Jun 2008 21:04:23 -0400
Date: Mon, 30 Jun 2008 13:57:00 -0000
From: "Frank Ch. Eigler"
To: ksummit-2008-discuss@lists.linux-foundation.org
Cc: systemtap@sources.redhat.com
Subject: DTrace
Message-ID: <20080630010423.GA7068@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.1i
X-Scanned-By: MIMEDefang 2.58 on 172.16.52.254
Mailing-List: contact systemtap-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id:
List-Subscribe:
List-Post:
List-Help: ,
Sender: systemtap-owner@sourceware.org
X-SW-Source: 2008-q2/txt/msg00792.txt.bz2

Hi -

Please forgive me for "crashing" the discussion party here.  I would
like to clarify some systemtap-related issues that people have raised.
(I'm one of its developers.)  I'll just list individual points, roughly
in the order they were raised.  For a fuller treatment of any of the
topics, please involve our public mailing list.

* postgres, other dtrace-probe-instrumented userspace programs

We aim to piggyback on these efforts by reusing the dtrace
instrumentation calls embedded into postgres etc., if at all possible.

* "klunky and prone to break in unexpected ways"

There's a germ of truth there, but OTOH the case James ran into
involved complications beyond normal symbolic debugging too (possibly
having to search separately compiled modules for definitions of opaque
struct-pointer types).  We're working on it; our bug/feature list is in
public bugzilla.

* "unhappy week with dwarf"

Guilty as charged. :-)

* kprobes, markers

Performance of kprobes-based probes is about 1 us per hit overhead.
Markers are on the order of tens of nanoseconds, which makes a huge
difference for frequently-hit probes.  We'd be happy to interface to
other event sources like ftrace or whatever, as long as they provide
suitable kernel-module-accessible APIs.

* user-space probing

We're finally getting very close on this.  Yes, it'd use the IBM
uprobes prototype above the Red Hat utrace work as a lower layer, which
we hope will get upstream as soon as possible.
It will behave analogously to dtrace: executing probes in kernel space.
If it can be made safe (and we think it can), it's a huge performance
win over trying to do it in userspace (with some gang of debugging
processes or whatever).

* oprofile

It's a fine special-purpose tool.  We hope to hook into the same sorts
of underlying hardware performance counters to enable the same
profiling capability in systemtap, except well integrated with the rest
of the probing events / scripts.  perfmon2 upstream would be very
helpful.

* dtrace "just works"

Yeah, so I hear, but think about how different their target environment
is.  Their kernel hardly changes (several fixed APIs, ABIs): this has
huge implications.  Their kernel developers were willing to accept
inserted probes (~ markers) and a bunch of build system changes (debug
info subset transcribing).  Here in linux land, we suffer multifaceted
tensions and it is hard to go toward a goal without obstructions
(well-meaning as they may be).

A bunch of third-party scripts are often conflated with "dtrace"; having
the same for systemtap is just a matter of growing the user community
enough, and giving them a good tool to build on top of.  A growing set
of runnable end-user scripts is already packaged with systemtap,
intended for use by nonexperts.  More help (e.g. concise problem
statements about what you'd like to measure/see) would be welcome.

* integrating systemtap runtime into kernel

We did some analysis of how much of the runtime code is novel and
relevant to the kernel.  We came up with a fraction like 20% (IIRC;
still searching for a link to the thread).  Some of the code is indeed
in need of some cleanup love.  Some of it has been necessary to work
around kernel disruptions (e.g., unexporting stuff like
kallsyms_lookup).  The parts that are deeply kernel-version-sensitive
(and would thus benefit from your maintenance) are quite small.  We're
still open to trying to pursue copying/upstreaming some of this code
into the kernel.
* tapsets

Theodore is mistaken that we are deflecting the job of authoring
tapsets (probe macros that abstract architecture and kernel
version changes: $foo->bar->baz, function names).  We have asked for
help, and have received a little, but the group has in fact authored a
growing collection of this stuff.  We would welcome having tapsets be
included with the kernel and cared for by you guys.

* debuginfo

Yes, it's very helpful & necessary if one wants to place probes at just
about any statement and extract just about any data value.  It's the
same prerequisite that crash or kgdb would have, since we operate at a
similar level of object/source code visibility.  Other distros are
learning to package this admittedly bulky data up, so it'll be a matter
of a largish download for distro users.  Kernel developers will of
course have the data generated locally already.

We've recently gained the ability to work on symbol table level data
only.  It's a compromise technology: it shrinks the installation
footprint, but we get only function-entry probes, we lose data typing,
and we can only get at ABI-dictated positional integral arguments.

* systemtap building

The only thing unusual about building the thing is the use of the
elfutils library to parse elf/dwarf data; links to that are provided
and one can link to a private copy if the system lacks it.

* systemtap releases

True, we've been spotty with formal releases, though they are archived
and available, and we're moving to a more regular release schedule very
shortly.  The weekly snapshots have been good (except a recent
unfortunate regression that hits 2.6.25 kernels particularly badly;
that's holding up the new release plans).

Thanks for reading; sorry about the length.

- FChE