public inbox for systemtap@sourceware.org
* dwarf unwinder (only works on i386/x86_64)
@ 2009-04-17 14:06 Mark Wielaard
  2009-04-21 20:58 ` dwarf unwinder (only works on i386/x86_64) - now with user space unwinding Mark Wielaard
  2009-04-28 18:02 ` [Query] Re: dwarf unwinder (only works on i386/x86_64) Prerna Saxena
  0 siblings, 2 replies; 10+ messages in thread
From: Mark Wielaard @ 2009-04-17 14:06 UTC (permalink / raw)
  To: systemtap

Hi,

I fixed up a couple of small things and enabled the dwarf unwinder for
in-kernel unwinding (merge commit 7c2136cf). This also fixes bug #5748,
for which there is a testcase in testsuite/systemtap.context/context.exp
(backtrace.tcl) that now has all calling functions in the trace (for a
specific test module inserted). The code can still be improved: the
unwinder sometimes goes on after falling off the stack, when it should
really use the fallback stack unwinder that was the default before. But
in general the stack traces are more complete than before. More tests
should be written, though.

Currently the dwarf unwinder is only enabled for i386 and x86_64 in
runtime/runtime.h:

/* dwarf unwinder only tested so far on i386 and x86_64. */
#if (defined(__i386__) || defined(__x86_64__))
#ifndef STP_USE_DWARF_UNWINDER
#define STP_USE_DWARF_UNWINDER
#endif
#endif

This is because the unwinder needs register setup initialization, which
is currently only defined for i386 and x86_64 in
runtime/unwind/[i386|x86_64].h. To support another architecture, add a
new header file that defines:
- an appropriate struct unwind_frame_info,
- a function arch_unw_init_frame_info() that initializes it from a
  struct pt_regs, and
- a function arch_unw_user_mode() that, given a struct
  unwind_frame_info, detects whether the unwinder has reached the end
  of the kernel stack/start of the user space stack.

These don't actually work very well in the i386/x86_64 cases, btw,
because the dwarf unwinder cannot currently unwind through the assembly
level functions that set up the kernel stack on kernel entry. This is a
general issue with unwinding through assembly functions that don't have
cfi information, which Roland is looking into.
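
As a rough illustration of the per-architecture hooks described above,
here is a minimal sketch for an imaginary architecture. Only the names
struct unwind_frame_info, arch_unw_init_frame_info() and
arch_unw_user_mode() come from this mail. The register layout
(sketch_pt_regs), the SKETCH_STATUS_USER bit and the call_frame field
are assumptions made up for the example, and the real headers may well
implement some of these as macros rather than functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct pt_regs on an imaginary
 * architecture; the real layout comes from the kernel headers. */
struct sketch_pt_regs {
    unsigned long pc;      /* program counter at the probe point */
    unsigned long sp;      /* stack pointer */
    unsigned long status;  /* processor status word */
};

/* Assumption: bit 0 of the status word is set while in user mode. */
#define SKETCH_STATUS_USER 0x1UL

/* Per-architecture unwinder state, seeded from the probe's registers. */
struct unwind_frame_info {
    struct sketch_pt_regs regs;
    int call_frame;  /* assumed flag; cleared for the initial frame */
};

/* Copy the probe-time registers into the unwinder state. */
static void arch_unw_init_frame_info(struct unwind_frame_info *info,
                                     const struct sketch_pt_regs *regs)
{
    assert(info != NULL && regs != NULL);
    info->regs = *regs;
    info->call_frame = 0;
}

/* Return 1 once the current frame belongs to user space, i.e. the
 * unwinder has walked off the end of the kernel stack; 0 otherwise. */
static int arch_unw_user_mode(const struct unwind_frame_info *info)
{
    assert(info != NULL);
    return (info->regs.status & SKETCH_STATUS_USER) != 0;
}
```

A port would replace sketch_pt_regs with the real struct pt_regs and
derive the user-mode test from whatever the architecture actually
provides (status register, segment register, address range, ...).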

Then in the architecture specific runtime/stack-[arch].c file you can
use these to enable the dwarf unwinder in your __stp_stack_print()
function under #ifdef STP_USE_DWARF_UNWINDER, and otherwise fall back to
some _stp_stack_print_fallback function that does the original heuristic
stack walking. At least that is how i386 and x86_64 set things up.
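
The shape of that dispatch could look roughly like the sketch below. It
is not the code from runtime/stack-[arch].c: all sketch_* names are
hypothetical, both unwinders are stubs that just write fake addresses,
and the run-time retry with the heuristic walker when the dwarf
unwinder produces nothing is an assumption about the desired behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Normally set by runtime/runtime.h on i386/x86_64. */
#define STP_USE_DWARF_UNWINDER

/* Test knob: how many frames the stub dwarf unwinder can produce. */
static int sketch_dwarf_frames = 0;

/* Stub standing in for the dwarf unwinder; fills in fake addresses. */
static int sketch_dwarf_unwind(unsigned long *out, int max)
{
    int i, n = sketch_dwarf_frames < max ? sketch_dwarf_frames : max;
    for (i = 0; i < n; i++)
        out[i] = 0x1000UL + (unsigned long)i;
    return n;
}

/* Stub standing in for the heuristic stack walker. */
static int sketch_fallback_walk(unsigned long *out, int max)
{
    if (max < 1)
        return 0;
    out[0] = 0x2000UL;
    return 1;
}

/* Collect up to max frames; *used_fallback reports which path ran. */
static int sketch_stack_print(unsigned long *out, int max, int *used_fallback)
{
    assert(out != NULL && used_fallback != NULL);
    *used_fallback = 0;
#ifdef STP_USE_DWARF_UNWINDER
    {
        int n = sketch_dwarf_unwind(out, max);
        if (n > 0)
            return n;
    }
#endif
    /* Dwarf unwinder compiled out or gave up: use the heuristic walker. */
    *used_fallback = 1;
    return sketch_fallback_walk(out, max);
}
```

With STP_USE_DWARF_UNWINDER undefined, the same function compiles down
to just the fallback path, which matches what the other architectures
get today.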

I am working on using the dwarf unwinder also for user space
backtracing. First using the debug_frame tables that we also are using
for the kernel case, but maybe switching to the eh_frame tables (it
isn't clear which one is really the most accurate at the moment, we
might need to consult both, but I am trying to avoid doing that for
now).

Cheers,

Mark


* Re: dwarf unwinder (only works on i386/x86_64) - now with user space unwinding
  2009-04-17 14:06 dwarf unwinder (only works on i386/x86_64) Mark Wielaard
@ 2009-04-21 20:58 ` Mark Wielaard
  2009-05-21  8:03   ` dwarf unwinder (only works on i386/x86_64) - now with eh_frame and debug_frame fallback Mark Wielaard
  2009-04-28 18:02 ` [Query] Re: dwarf unwinder (only works on i386/x86_64) Prerna Saxena
  1 sibling, 1 reply; 10+ messages in thread
From: Mark Wielaard @ 2009-04-21 20:58 UTC (permalink / raw)
  To: systemtap

Hi,

On Fri, 2009-04-17 at 16:05 +0200, Mark Wielaard wrote:
> I fixed up a couple of small things and enabled the dwarf unwinder for
> in-kernel unwinding (merge commit 7c2136cf), this also fixes bug #5748
> for which there is a testcase in testsuite/systemtap.context/context.exp
> (backtrace.tcl) that now has all calling functions in the trace (for a
> specific test module inserted). There can be some improvements to the
> code. The unwinder sometimes goes on after falling off the stack, when
> it should really use the fallback stack unwinder that was the default
> before. But in general the stack traces are more complete than before.
> There should be more tests written though.
> 
> Currently the dwarf unwinder is only enabled for i386 and x86_64
> [...]
> I am working on using the dwarf unwinder also for user space
> backtracing. First using the debug_frame tables that we also are using
> for the kernel case, but maybe switching to the eh_frame tables (it
> isn't clear which one is really the most accurate at the moment, we
> might need to consult both, but I am trying to avoid doing that for
> now).

I merged my user_unwind branch to master including some test cases.
uprobes_ustack.exp is the main one that does a couple of backtraces
through an exe and a shared library. This now works on i386 and x86_64.

This has only been lightly tested, but it should fail gently (you just
get no backtrace or a partial one) when something goes wrong.

There is some cleanup to do with respect to the adjustStartAddress() in
unwind.c versus the _stp_module_relocate() and _stp_mod_sec_lookup() in
sym.c. See the comments/XXX/TODO there.

It should work against any probe that provides the CONTEXT with a user
pt_regs. But the test cases all use uprobe points for now. Any utrace
probe hook should also work, except that it will almost always point
into the executable's VDSO, which we currently don't track and from
which we cannot unwind atm (bug #10080). Timer probes should work, since
they should provide a user pt_regs when interrupting a user process,
which you can test with the context tapset function user_mode(). But
they will only work if you explicitly included the symbols and unwind
data (with -d) for the libraries and executable that get interrupted.

The last point is a general gotcha that might surprise the user. We
might want to provide an easy option to include all shared libraries of
an executable a user wants to trace, since if you miss one with -d
backtraces are either unavailable or truncated.

There are a couple of new tapset functions included for inspecting user
backtraces and symbols. They are in ucontext-symbols.stp and
ucontext-unwind.stp. For now I have marked them EXPERIMENTAL (by adding
that string to their description) since they don't seem optimal yet. I
want to reorganize them (and their kernel context counterparts)
according to bug #6580. Suggestions welcome.

There are three somewhat obvious drawbacks with the new ucontext
functions at the moment. There really should be a way to associate a
ubacktrace() string with the task. Currently print_ustack() interprets
the given backtrace in the context of the current task, which can be
different between when ubacktrace() and when print_ustack() is called.

A special subcase of that is the end probe (where the process is most
likely already gone, and stapio is the current process). So something
like the following will not work properly:
stap -d /lib/libc.so.6 -ve \
'global b;
 probe process("/bin/ls").function("main").call { b = ubacktrace(); }
 probe end { print_ustack(b); }' -c '/bin/ls'
Which is a pity since you might want to do some filtering and statistics
in your end probe, but then you cannot get at the symbolic names of your
trace.

And finally ubacktrace() returns a string capped to MAXSTRINGLEN, which
is only 128. On a 64bit architecture that means only 6 entries. So one
has to explicitly add -DMAXSTRINGLEN=something_much_bigger to handle
larger traces. (print_ubacktrace() doesn't have that restriction, but it
doesn't allow capturing the output and is more expensive, since it does
a full symbol lookup immediately for the stack trace.)
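
The "only 6 entries" figure can be checked with a little arithmetic.
The sketch below assumes each address is stored as "0x" plus two hex
digits per byte plus one separating space, and that one byte of the
buffer is reserved for the terminating NUL. The exact format the
runtime uses may differ slightly, and the helper name is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Number of backtrace entries that fit in a string buffer of buf_len
 * bytes (terminating NUL included) for pointers of ptr_bytes bytes. */
static size_t sketch_backtrace_capacity(size_t buf_len, size_t ptr_bytes)
{
    /* "0x" + two hex digits per byte + one separating space */
    size_t per_entry = 2 + 2 * ptr_bytes + 1;

    assert(buf_len > 0 && ptr_bytes > 0);
    return (buf_len - 1) / per_entry;
}
```

Under those assumptions a 128 byte buffer holds 6 entries with 8-byte
pointers (19 characters each) and 11 entries with 4-byte pointers, so
-DMAXSTRINGLEN=1024 would give roughly 53 entries on a 64bit machine.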

Ideas on how to make these things more intuitive for the user
appreciated.

Cheers,

Mark


* [Query] Re: dwarf unwinder (only works on i386/x86_64)
  2009-04-17 14:06 dwarf unwinder (only works on i386/x86_64) Mark Wielaard
  2009-04-21 20:58 ` dwarf unwinder (only works on i386/x86_64) - now with user space unwinding Mark Wielaard
@ 2009-04-28 18:02 ` Prerna Saxena
  2009-04-28 18:19   ` Roland McGrath
  1 sibling, 1 reply; 10+ messages in thread
From: Prerna Saxena @ 2009-04-28 18:02 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: systemtap, roland

Hi Mark,
An elementary query regarding the dwarf-unwinder implementation...

Mark Wielaard wrote:
> ......
>   
> I am working on using the dwarf unwinder also for user space
> backtracing. First using the debug_frame tables that we also are using
> for the kernel case, but maybe switching to the eh_frame tables (it
> isn't clear which one is really the most accurate at the moment, we
> might need to consult both, but I am trying to avoid doing that for
> now).
>
>   
I was trying to contrast the ".eh_frame" vs ".debug_frame" 
specifications for keeping track of stack backtraces. Both appear rather 
similar wrt information they maintain.

The Exception header ".eh_frame" section seems to be present in vmlinux 
even when kernel is compiled without debuginfo.

i. What gcc flags cause this section to be compiled?
ii. This section seems to be a better bet than DWARF to base the
unwinder on, because a ".debug_frame" based unwinder might not be
useful in case of a kernel compiled without debuginfo.

Looks like I'm missing some reasoning here, could you throw some light ? 
:-)
> Cheers,
>
> Mark
>
>   
Regards,

-- 
Prerna Saxena

Linux Technology Centre,
IBM Systems and Technology Lab,
Bangalore, India 


* Re: [Query] Re: dwarf unwinder (only works on i386/x86_64)
  2009-04-28 18:02 ` [Query] Re: dwarf unwinder (only works on i386/x86_64) Prerna Saxena
@ 2009-04-28 18:19   ` Roland McGrath
  2009-04-28 19:52     ` Mark Wielaard
  0 siblings, 1 reply; 10+ messages in thread
From: Roland McGrath @ 2009-04-28 18:19 UTC (permalink / raw)
  To: Prerna Saxena; +Cc: Mark Wielaard, systemtap

> I was trying to contrast the ".eh_frame" vs ".debug_frame" 
> specifications for keeping track of stack backtraces. Both appear rather 
> similar wrt information they maintain.

.debug_frame format is that specified in the formal DWARF standard.
.eh_frame format is a slight variant of that format, optimized for being
used by a process itself without address fixups.  (It is used by C++
exception handling, the backtrace() function, and so forth.)

> The Exception header ".eh_frame" section seems to be present in vmlinux 
> even when kernel is compiled without debuginfo.

This depends on lots of details of the kernel build that vary across
versions and machines.  In today's kernels, it is not usually there.

> i. what gcc flags cause this section to be compiled ?

-funwind-tables and similar options.

> ii. This section seemingly appears to be a better bet than DWARF to base 
> the unwinder on--- because a ".debug_frame" based unwinder might not be 
> useful in case of a kernel compiled without debuginfo.

It is a somewhat hairy subject.  But in short, this is not so in current
kernels.  That is not entirely apropos, because it's only the situation for
the kernel, and there are also user binaries to consider.  There it is an
even more complex subject.  The overall answer is that the answer is complex,
but potentially both sections are involved.


Thanks,
Roland


* Re: [Query] Re: dwarf unwinder (only works on i386/x86_64)
  2009-04-28 18:19   ` Roland McGrath
@ 2009-04-28 19:52     ` Mark Wielaard
  2009-04-28 20:15       ` Roland McGrath
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Wielaard @ 2009-04-28 19:52 UTC (permalink / raw)
  To: Roland McGrath; +Cc: Prerna Saxena, systemtap

Hi,

Roland already answered most of the questions. A while ago I wrote a
high level overview of all the moving pieces related to unwinding.
Maybe someone finds it useful:
http://gnu.wildebeest.org/diary/2007/08/23/stack-unwinding/

On Tue, 2009-04-28 at 11:17 -0700, Roland McGrath wrote:
> > ii. This section seemingly appears to be a better bet than DWARF to base 
> > the unwinder on--- because a ".debug_frame" based unwinder might not be 
> > useful in case of a kernel compiled without debuginfo.
> 
> It is a somewhat hairy subject.  But in short, this is not so in current
> kernels.  That is not entirely apropos, because it's only the situation for
> the kernel, and there are also user binaries to consider.  There it is an
> even more complex subject.  The overall answer is that the answer is complex,
> but potentially both sections are involved.

The original choice for using debug_frame was because it was always
available (since we required debuginfo already) and it was complete. GCC
4.4 changed this though (and the uprobes_ustack.exp test does indeed
fail when built with gcc 4.4). With that version .debug_frame is no
longer complete: if unwind data is emitted into .eh_frame, it is not
also emitted into .debug_frame (so no duplication), and .debug_frame is
only emitted when .eh_frame is not. So we have to start doing
something more clever. Defaulting to .eh_frame (at least for user space)
might be a good idea, and maybe then combining the two tables (and maybe
creating our own search table).

.eh_frame and .debug_frame are encoded slightly differently, but
supporting both is not hard.
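
One concrete example of the encoding difference is how an entry is
recognised as a CIE and how an FDE names its CIE. In 32-bit
.debug_frame the CIE id is 0xffffffff and an FDE stores the section
offset of its CIE; in .eh_frame the CIE id is 0 and an FDE stores the
distance back from its own id field to the CIE. The sketch below
covers only that detail (32-bit DWARF format, no augmentation or
pointer encoding handling), and the sketch_* names are made up.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_table { SKETCH_DEBUG_FRAME, SKETCH_EH_FRAME };

/* Is the entry whose id field holds `id` a CIE (as opposed to an FDE)? */
static int sketch_is_cie(enum sketch_table t, uint32_t id)
{
    return t == SKETCH_EH_FRAME ? id == 0 : id == UINT32_MAX;
}

/* Section offset of the CIE an FDE refers to.  id_field_off is the
 * offset of the FDE's id field within the section. */
static uint32_t sketch_cie_offset(enum sketch_table t,
                                  uint32_t id_field_off, uint32_t id)
{
    assert(!sketch_is_cie(t, id));
    /* .eh_frame: relative, counted back from the id field itself.
     * .debug_frame: absolute offset from the start of the section. */
    return t == SKETCH_EH_FRAME ? id_field_off - id : id;
}
```

For an FDE whose id field sits at offset 0x1c, the value 0x1c in
.eh_frame and the value 0 in .debug_frame both point at a CIE at the
very start of the section.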

Cheers,

Mark


* Re: [Query] Re: dwarf unwinder (only works on i386/x86_64)
  2009-04-28 19:52     ` Mark Wielaard
@ 2009-04-28 20:15       ` Roland McGrath
  0 siblings, 0 replies; 10+ messages in thread
From: Roland McGrath @ 2009-04-28 20:15 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: Prerna Saxena, systemtap

> The original choice for using debug_frame was because it was always
> available (since we required debuginfo already) and it was complete. GCC
> 4.4 changed this though (and the uprobes_ustack.exp test does indeed
> fail when built with gcc 4.4). [...]

To be fair, 4.4 only changed one of the many wrinkles that fold together
here.  It was never really the case that either one was always complete
when present.  (It's a big hairy subject.)


* Re: dwarf unwinder (only works on i386/x86_64) - now with eh_frame and debug_frame fallback
  2009-04-21 20:58 ` dwarf unwinder (only works on i386/x86_64) - now with user space unwinding Mark Wielaard
@ 2009-05-21  8:03   ` Mark Wielaard
  2009-05-21 18:44     ` Roland McGrath
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Wielaard @ 2009-05-21  8:03 UTC (permalink / raw)
  To: systemtap

Hi,

Yesterday I pushed some commits to make the dwarf unwinder use both
debug_frame and eh_frame tables. At first I had wanted to just use
debug_frame tables for the kernel and eh_frame for user space, but
depending on the gcc version, architecture (and apparently GNU/Linux
distro defaults), either table can be missing or only have partial
coverage (see gcc options -fexceptions, -fnon-call-exceptions,
-funwind-tables and -fasynchronous-unwind-tables). So currently we
default to the debug_frame table, and if that fails to unwind for a
particular location we fall back and retry using the eh_frame table.
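
The lookup order described here can be sketched as follows. This only
models "which table has an entry covering this pc"; a real FDE lookup
also has to parse and run the CFI program, and a failure there would
trigger the same fallback. The types and the sketch_* names are
hypothetical, and the linear scan is just to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Address range [start, end) covered by one FDE. */
struct sketch_range { unsigned long start, end; };

struct sketch_table {
    const struct sketch_range *fdes;
    size_t count;
};

enum sketch_source {
    SKETCH_NONE,              /* neither table covers the pc */
    SKETCH_FROM_DEBUG_FRAME,
    SKETCH_FROM_EH_FRAME
};

/* Linear scan for an entry covering pc; NULL if there is none. */
static const struct sketch_range *
sketch_find(const struct sketch_table *t, unsigned long pc)
{
    size_t i;
    for (i = 0; i < t->count; i++)
        if (pc >= t->fdes[i].start && pc < t->fdes[i].end)
            return &t->fdes[i];
    return NULL;
}

/* Try debug_frame first; only if it has nothing for this particular
 * location, retry with eh_frame. */
static enum sketch_source
sketch_unwind_lookup(const struct sketch_table *debug_frame,
                     const struct sketch_table *eh_frame,
                     unsigned long pc)
{
    assert(debug_frame != NULL && eh_frame != NULL);
    if (sketch_find(debug_frame, pc) != NULL)
        return SKETCH_FROM_DEBUG_FRAME;
    if (sketch_find(eh_frame, pc) != NULL)
        return SKETCH_FROM_EH_FRAME;
    return SKETCH_NONE;
}
```

Because the decision is made per location, a binary whose tables each
have only partial coverage can still unwind wherever either one has an
entry.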

This makes the uprobes_ustack.exp testcase (and hopefully user stack
traces in general) work against gcc 4.4 (which is the default compiler
for Fedora 11). Please do test and let me know of any situations where
things don't seem to work (especially if the uprobes_ustack.exp testcase
fails). Currently the dwarf unwinder is only enabled on i386 and x86_64.
It would be interesting to see if it can easily be enabled on other
architectures.

The "ugly" code in these patches is in adjustStartAddress() in
runtime/unwind.c. This really should go into _stp_module_relocate or
read_pointer. One tricky issue here is that we read the eh_frame section
during translation time and then load it in kernel space at module init
time. eh_frame tables can use pointer encodings that are absolute or
pc_relative (actually data relative), so we need to readjust for the new
load location of the eh_frame.
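
To make the relocation problem concrete, here is a minimal sketch of
decoding two pointer encodings from a copy of .eh_frame. It is not the
read_pointer()/adjustStartAddress() code from runtime/unwind.c: the
function name and parameters are hypothetical, only two encodings are
handled, and host byte order is assumed. The DW_EH_PE_* values are the
standard ones. The point is that a pc-relative value is relative to
where the field lived in the original section (orig_sec_addr + off),
not to where the copy happens to sit in kernel memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DW_EH_PE_udata4 0x03  /* 4-byte unsigned value */
#define DW_EH_PE_sdata4 0x0b  /* 4-byte signed value */
#define DW_EH_PE_pcrel  0x10  /* relative to the field's own address */

/* Decode the pointer stored at offset `off` in `copy`, a buffer holding
 * the bytes of an .eh_frame section whose original address was
 * orig_sec_addr.  Returns 0 on success, -1 for unsupported encodings. */
static int sketch_read_pointer(const uint8_t *copy, uint64_t off,
                               uint64_t orig_sec_addr, uint8_t enc,
                               uint64_t *out)
{
    assert(copy != NULL && out != NULL);

    if (enc == DW_EH_PE_udata4) {
        /* Absolute: the stored value is the address itself. */
        uint32_t v;
        memcpy(&v, copy + off, sizeof v);
        *out = v;
        return 0;
    }
    if (enc == (DW_EH_PE_pcrel | DW_EH_PE_sdata4)) {
        /* Relative: rebase on the field's ORIGINAL address, since the
         * address of the copy has nothing to do with the target. */
        int32_t rel;
        memcpy(&rel, copy + off, sizeof rel);
        *out = orig_sec_addr + off + (uint64_t)(int64_t)rel;
        return 0;
    }
    return -1;
}
```

Using the address of the copy instead of orig_sec_addr in the second
branch would produce exactly the kind of skewed start addresses that
need readjusting after the section is loaded somewhere else.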

Some optimizations that could be done:
- Use the eh_frame_hdr binary search table
  (needs careful auditing of adjustStartAddress -> read_pointer).
- Try to read eh_frame in-place from user space
  (risks tricky page fault issues if not available)
- Merge debug_frame and eh_frame at runtime and build our own
  binary search hdr.
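
For the binary search table ideas above, the lookup itself would be
something like the sketch below: given entries sorted by initial
location, find the last one that starts at or before the pc. The entry
layout is a simplification (the real eh_frame_hdr table stores encoded
pointers, which is where the read_pointer auditing comes in), and the
sketch_* names are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One row of a search table sorted by initial_loc. */
struct sketch_hdr_entry {
    uint64_t initial_loc;  /* first address covered by the FDE */
    uint64_t fde;          /* where the FDE lives (opaque here) */
};

/* Return the entry with the greatest initial_loc <= pc, or NULL if pc
 * lies before the first entry.  The caller still has to check that pc
 * is inside the address range of the FDE that was found. */
static const struct sketch_hdr_entry *
sketch_hdr_search(const struct sketch_hdr_entry *tab, size_t n, uint64_t pc)
{
    size_t lo = 0, hi = n;

    assert(tab != NULL || n == 0);
    /* Invariant: entries below lo start at or before pc, entries at or
     * above hi start after pc. */
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (tab[mid].initial_loc <= pc)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo > 0 ? &tab[lo - 1] : NULL;
}
```

A merged debug_frame/eh_frame table would just need to be sorted the
same way for this to work on both sources at once.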

But for now I won't be working on those, unless the backtraces become a
bottleneck for actual code using them.

Next steps to make stacktraces better are:
- Add more tests (in particular ones that test prelinking
  and missing or split-file debuginfo).
- Make vma-tracker more robust (_stp_tf_mmap_cb).
  Wenji sent me some notes on things it seems to miss. If we cannot
  track a location to a _stp_module we cannot unwind.
- Track vdso for process symbols/backtraces. PR10080.
- Simplify unwind interface. Architecture dependent code has too
  much duplication. Need to just handle address, not function symbol
  printing.
- Nicer fallback to in-kernel unwinder/backtrace, in particular
  for backtracing from non-pt_regs probe context. PR6961.
- unwind through kretprobes. PR6436/PR9999.
- Better tapset functions for handling stacks. PR6580.

I'll be away for a couple of days, but will be back early next week.

Cheers,

Mark


* Re: dwarf unwinder (only works on i386/x86_64) - now with eh_frame and debug_frame fallback
  2009-05-21  8:03   ` dwarf unwinder (only works on i386/x86_64) - now with eh_frame and debug_frame fallback Mark Wielaard
@ 2009-05-21 18:44     ` Roland McGrath
  2009-05-21 22:57       ` Mark Wielaard
  0 siblings, 1 reply; 10+ messages in thread
From: Roland McGrath @ 2009-05-21 18:44 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: systemtap

> The "ugly" code in these patches is in adjustStartAddress() in
> runtime/unwind.c. This really should go into _stp_module_relocate or
> read_pointer. One tricky issue here is that we read the eh_frame section
> during translation time and then load it in kernel space at module init
> time. eh_frame tables can use pointer encodings that are absolute or
> pc_relative (actually data relative), so we need to readjust for the new
> load location of the eh_frame.

In the long run I think the right thing here will be to convert the data at
translation time.  That is, make all addresses use a simple "absolute" form
(as is usual in .debug_frame), which really means "loadbase-relative" for
DSOs--i.e., the same as addresses in the symbol table, etc.  Then at run
time you just have one uniform way to treat addresses in each module.
That keeps things as simple as possible at runtime.

> Some optimizations that could be done:
> - Use the eh_frame_hdr binary search table
>   (needs careful auditing of adjustStartAddress -> read_pointer).
[...]
> - Merge debug_frame and eh_frame at runtime and build our own
>   binary search hdr.

By "runtime" here, you mean "translation time", right?  In the unspecified
future, elfutils libs will provide easy-to-use code for merging tables,
emitting them in whichever format, and generating binary search tables.
Probably any such optimization concerns can wait for that.


Thanks,
Roland


* Re: dwarf unwinder (only works on i386/x86_64) - now with eh_frame and debug_frame fallback
  2009-05-21 18:44     ` Roland McGrath
@ 2009-05-21 22:57       ` Mark Wielaard
  2009-05-22  1:19         ` Roland McGrath
  0 siblings, 1 reply; 10+ messages in thread
From: Mark Wielaard @ 2009-05-21 22:57 UTC (permalink / raw)
  To: Roland McGrath; +Cc: systemtap

On Thu, 2009-05-21 at 11:44 -0700, Roland McGrath wrote:
> > The "ugly" code in these patches is in adjustStartAddress() in
> > runtime/unwind.c. This really should go into _stp_module_relocate or
> > read_pointer. One tricky issue here is that we read the eh_frame section
> > during translation time and then load it in kernel space at module init
> > time. eh_frame tables can use pointer encodings that are absolute or
> > pc_relative (actually data relative), so we need to readjust for the new
> > load location of the eh_frame.
> 
> In the long run I think the right thing here will be to convert the data at
> translation time.  That is, make all addresses use a simple "absolute" form
> (as is usual in .debug_frame), which really means "loadbase-relative" for
> DSOs--i.e., the same as addresses in the symbol table, etc.  Then at run
> time you just have one uniform way to treat addresses in each module.
> That keeps things as simple as possible at runtime.

Yes, agreed.

> > Some optimizations that could be done:
> > - Use the eh_frame_hdr binary search table
> >   (needs careful auditing of adjustStartAddress -> read_pointer).
> [...]
> > - Merge debug_frame and eh_frame at runtime and build our own
> >   binary search hdr.
> 
> By "runtime" here, you mean "translation time", right?  In the unspecified
> future, elfutils libs will provide easy-to-use code for merging tables,
> emitting them in whichever format, and generating binary search tables.
> Probably any such optimization concerns can wait for that.

Yes, I meant translation time. It would be wonderful if elfutils
provided an easy way to merge the tables, transform them to be
"loadbase-relative" and generate a binary-search hdr for the result. I
am not sure how urgent such a cleanup and optimization is, since we
don't have much experience with the user backtraces (or the kernel dwarf
unwinder for that matter). It isn't on my "short-list" at the moment.

Cheers,

Mark


* Re: dwarf unwinder (only works on i386/x86_64) - now with eh_frame and debug_frame fallback
  2009-05-21 22:57       ` Mark Wielaard
@ 2009-05-22  1:19         ` Roland McGrath
  0 siblings, 0 replies; 10+ messages in thread
From: Roland McGrath @ 2009-05-22  1:19 UTC (permalink / raw)
  To: Mark Wielaard; +Cc: systemtap

> Yes, I meant translation time. It would be wonderful if elfutils
> provides an easy way to merge the tables, transform them to be
> "loadbase-relative" and generate a binary-search hdr for the result. I
> am not sure how urgent such a cleanup and optimization is, we don't
> have much experience with the user backtraces (or the kernel dwarf
> unwinder for that matter). It isn't on my "short-list" at the moment.

It's certainly on the "in the fullness of time" list for elfutils. :-)

