On Wed, 2017-04-26 at 17:27 +0200, Ulf Hermann wrote:
> However, if you strip .eh_frame and .eh_frame_hdr from the exe (thus
> triggering the fp unwinding on the first frame), you will see that it
> skips sigusr2. At the same time it invents another frame 0x403f40 on
> the main thread. Apparently pthread_join creates two stack frames. As
> it correctly unwinds the rest, the latter seemed harmless to me.

I am a little concerned about testing against an executable from which
.eh_frame has been forcibly removed. Since that is an allocated section,
removing it messes up the binary (which shows, because the symbol table
doesn't match anymore). It seems nicer to do the checks with a hacked up
libdwfl/frame_unwind.c that simply doesn't handle CFI and so always uses
the frame pointer unwinder:

diff --git a/libdwfl/frame_unwind.c b/libdwfl/frame_unwind.c
index fb42c1a..6de64e5 100644
--- a/libdwfl/frame_unwind.c
+++ b/libdwfl/frame_unwind.c
@@ -539,6 +539,7 @@ new_unwound (Dwfl_Frame *state)
 static void
 handle_cfi (Dwfl_Frame *state, Dwarf_Addr pc, Dwarf_CFI *cfi, Dwarf_Addr bias)
 {
+  return;
   Dwarf_Frame *frame;
   if (INTUSE(dwarf_cfi_addrframe) (cfi, pc, &frame) != 0)
     {

You are right that in that case we lose/skip over sigusr2 from raise and
end up at stdarg directly if we remove the pc & 0x1 check. But... that
really is because we deliberately skip it. Proper/simple
link-register/frame unwinding should say:

-  newPc = newLr & (~0x1);
+  newPc = lr;

The newPc is the current link register, not the new one. With that we
get the backtrace as expected.

But... I now realize why you needed something like that in the case of
mixed CFI/no-framepointer and no-CFI/framepointer code, like we have in
our testcase. In that case there is no good way to determine whether or
not there really were proper frame pointers and/or how the previous
frame was unwound. Our testcase is somewhat mean in that it uses
signal/no-return code, which is hard to unwind properly without full
frame pointers or full CFI.
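
To make the "simple" rule above concrete, here is a minimal toy sketch
of one link-register/frame-pointer unwind step. It is not the libdwfl
or backend code: struct regs, stack_mem, read_word, unwind_step,
toy_stack_init and toy_backtrace are all hypothetical names, and the
layout assumes an AArch64-style frame record (saved fp at [fp], saved
lr at [fp + 8]). It also leaves out any sanity check on the stack
pointer direction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register state for the sketch (not Dwfl_Frame). */
struct regs { uint64_t pc, lr, fp, sp; };

/* Toy stack memory, addressed in bytes, 8-byte words. */
#define STACK_WORDS 16
static uint64_t stack_mem[STACK_WORDS];

static bool read_word(uint64_t addr, uint64_t *out)
{
  if (addr == 0 || addr % 8 != 0 || addr / 8 >= STACK_WORDS)
    return false;
  *out = stack_mem[addr / 8];
  return true;
}

/* One unwind step: the caller's pc is the *current* lr, taken as-is.
   The new fp/lr come from the frame record the current fp points at.
   A failed read is not fatal; it only stops the step after this one. */
static bool unwind_step(const struct regs *cur, struct regs *next)
{
  if (cur->lr == 0)
    return false;
  next->pc = cur->lr;                        /* newPc = lr */
  if (!read_word(cur->fp, &next->fp))        /* newFp = [fp] */
    next->fp = 0;
  if (!read_word(cur->fp + 8, &next->lr))    /* newLr = [fp + 8] */
    next->lr = 0;
  next->sp = cur->fp + 16;
  return true;
}

/* Assumed layout: outer's record at 0x40, middle's record at 0x20. */
static void toy_stack_init(void)
{
  stack_mem[0x20 / 8] = 0x40;    /* middle: saved fp (outer's record) */
  stack_mem[0x28 / 8] = 0x3000;  /* middle: return address into outer */
  stack_mem[0x40 / 8] = 0;       /* outer: no caller frame record */
  stack_mem[0x48 / 8] = 0;       /* outer: no return address, chain ends */
}

/* Collect pcs from the starting state; returns the number of frames. */
static size_t toy_backtrace(struct regs cur, uint64_t *pcs, size_t max)
{
  assert(max > 0);
  size_t n = 0;
  struct regs next;
  pcs[n++] = cur.pc;
  while (n < max && unwind_step(&cur, &next))
    {
      pcs[n++] = next.pc;
      cur = next;
    }
  return n;
}
```

Starting from a leaf that has not pushed its own frame record (pc =
0x1000, lr = 0x2000, fp = 0x20), this walks leaf, middle, outer. Under
the same assumptions, if the innermost function had already pushed a
record holding the same return address that is still in lr, the caller
would show up twice, since the step has no way to tell the two
situations apart.
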
And with the simpler code that doesn't try to guess whether or not to
skip a frame you do end up with an extra sigusr2 and/or main frame.

We could try to be clever and realize the link register and pc are the
same and then use the newLR instead as newPC. That however might just
mean that it is a recursive call to the same function.

So maybe the proper "fix" for that is to make our testcase a little
less strict and allow the occasional extra frame instead of trying to
make the frame pointer unwinder "extra smart".

Maybe something like the attached patch?

Cheers,

Mark