From mboxrd@z Thu Jan 1 00:00:00 1970
Subject: Re: dwarf unwinder (only works on i386/x86_64) - now with user space unwinding
From: Mark Wielaard
To: systemtap@sourceware.org
In-Reply-To: <1239977157.2336.33.camel@fedora.wildebeest.org>
References: <1239977157.2336.33.camel@fedora.wildebeest.org>
Date: Tue, 21 Apr 2009 20:58:00 -0000
Message-Id: <1240347505.19523.41.camel@hermans.wildebeest.org>

Hi,

On Fri, 2009-04-17 at 16:05 +0200, Mark Wielaard wrote:
> I fixed up a couple of small things and enabled the dwarf unwinder for
> in-kernel unwinding (merge commit 7c2136cf), this also fixes bug #5748
> for which there is a testcase in testsuite/systemtap.context/context.exp
> (backtrace.tcl) that now has all calling functions in the trace
> (for a specific test module inserted). There can be some improvements
> to the code. The unwinder sometimes goes on after falling off the
> stack, when it should really use the fallback stack unwinder that was
> the default before. But in general the stack traces are more complete
> than before. There should be more tests written though.
>
> Currently the dwarf unwinder is only enabled for i386 and x86_64
> [...]
> I am working on using the dwarf unwinder also for user space
> backtracing. First using the debug_frame tables that we also are using
> for the kernel case, but maybe switching to the eh_frame tables (it
> isn't clear which one is really the most accurate at the moment, we
> might need to consult both, but I am trying to avoid doing that for
> now).

I merged my user_unwind branch to master, including some test cases.
uprobes_ustack.exp is the main one; it does a couple of backtraces
through an exe and a shared library. This now works on i386 and x86_64.

This has only been lightly tested, but it should fail gently (you get
no backtrace or only a partial one) in case of failures. There is some
cleanup to do with respect to adjustStartAddress() in unwind.c versus
_stp_module_relocate() and _stp_mod_sec_lookup() in sym.c. See the
comments/XXX/TODO there.

It should work against any probe that provides the CONTEXT with a user
pt_regs, but the test cases all use uprobe points for now. Any utrace
probe hook should also work, except that it will almost always point
into the executable's VDSO, which we currently don't track and from
which we cannot unwind at the moment (bug #10080). Timer probes should
work, since they should provide a user pt_regs when interrupting a user
process, which you can test with the context tapset function
user_mode(). But they will only work if you explicitly included the
symbols and unwind data (with -d) for the executable and libraries that
get interrupted. The last point is a general gotcha that might surprise
the user.
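
Until something better exists, the -d requirement above could be worked around by hand. The following is only a sketch: ldd_to_d_flags is a hypothetical helper (not part of systemtap) that turns `ldd` output into one -d option per resolved library, and the commented-out stap invocation is an untested illustration. Entries without a path, such as the VDSO, are skipped.

```shell
# Hypothetical helper (not part of systemtap): read `ldd` output on stdin
# and print one "-d <path>" option for every library resolved to a path.
# Lines without "=> /path" (the VDSO, the dynamic loader) are ignored.
ldd_to_d_flags() {
  awk '$2 == "=>" && $3 ~ /^\// { printf "-d %s ", $3 }'
}

# Usage sketch (untested assumption, requires systemtap and privileges):
# D_FLAGS=$(ldd /bin/ls | ldd_to_d_flags)
# stap -d /bin/ls $D_FLAGS -e \
#   'probe timer.profile {
#      if (user_mode() && pid() == target()) print_ubacktrace()
#    }' -c /bin/ls
```

The helper only does text processing, so it can be tried on saved `ldd` output without running stap at all.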
We might want to provide an easy option to include all shared libraries
of an executable a user wants to trace, since if you miss one with -d,
backtraces are either unavailable or truncated.

There are a couple of new tapset functions included for inspecting user
backtraces and symbols. They are in ucontext-symbols.stp and
ucontext-unwind.stp. For now I have marked them EXPERIMENTAL (by adding
that string to their description) since they don't seem optimal yet; I
want to reorganize them (and their kernel context counterparts)
according to bug #6580. Suggestions welcome.

There are three somewhat obvious drawbacks with the new ucontext
functions at the moment.

There really should be a way to associate a ubacktrace() string with
the task. Currently print_ustack() interprets the given backtrace in
the context of the current task, which can differ between the time
ubacktrace() is called and the time print_ustack() is called.

A special subcase of that is the end probe (where the process is most
likely already gone, and stapio is the current process). So something
like the following will not work properly:

  stap -d /lib/libc.so.6 -ve \
    'global b;
     probe process("/bin/ls").function("main").call { b = ubacktrace(); }
     probe end { print_ustack(b); }' -c '/bin/ls'

Which is a pity, since you might want to do some filtering and
statistics in your end probe, but then you cannot get at the symbolic
names of your trace.

And finally, ubacktrace() returns a string capped to MAXSTRINGLEN, which
is only 128. On a 64-bit architecture that means only 6 entries. So one
has to explicitly add -DMAXSTRINGLEN=something_much_bigger to handle
larger traces. (print_ubacktrace() doesn't have that restriction, but
it doesn't let you capture the output and it is more expensive, since
it does a full symbol lookup immediately for the stack trace.)

Ideas on how to make these things more intuitive for the user are
appreciated.

Cheers,

Mark
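
As a rough check of the MAXSTRINGLEN arithmetic above, here is a small sketch. It assumes each backtrace entry is stored as "0x" plus two hex digits per pointer byte plus one separating space, and that one byte is reserved for the terminating NUL; both are assumptions about the string layout, not something taken from the runtime sources. entries_fit is a hypothetical helper name.

```shell
# entries_fit MAXSTRINGLEN POINTER_BYTES
# Assumed entry layout: "0x" + 2 hex digits per byte + one space.
entries_fit() {
  maxlen=$1
  ptr_bytes=$2
  entry_len=$(( 2 + 2 * ptr_bytes + 1 ))
  # Reserve one byte for the terminating NUL (assumption).
  echo $(( (maxlen - 1) / entry_len ))
}

entries_fit 128 8     # 64-bit, MAXSTRINGLEN 128: prints 6
entries_fit 128 4     # 32-bit, MAXSTRINGLEN 128: prints 11
entries_fit 4096 8    # 64-bit with -DMAXSTRINGLEN=4096: prints 215
```

Under these assumptions a 64-bit entry takes 19 characters, which gives the 6 entries mentioned above, and the value passed to -DMAXSTRINGLEN can be sized from the stack depth one expects to capture.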