public inbox for elfutils@sourceware.org
* [Bug libdw/30272] New: Unwinding multithreaded musl applications fails
@ 2023-03-24 23:39 godlygeek at gmail dot com
From: godlygeek at gmail dot com @ 2023-03-24 23:39 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=30272

            Bug ID: 30272
           Summary: Unwinding multithreaded musl applications fails
           Product: elfutils
           Version: unspecified
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: libdw
          Assignee: unassigned at sourceware dot org
          Reporter: godlygeek at gmail dot com
                CC: elfutils-devel at sourceware dot org
  Target Milestone: ---

Unwinding multithreaded applications linked against musl libc on x86-64 seems
to fail, getting stuck on `__clone`:

TID 241:
...
#20 0x00007f6f2f74f08b start
#21 0x00007f6f2f75138e __clone
#22 0x00007f6f2f75138e __clone
#23 0x00007f6f2f75138e __clone
...
#253 0x00007f6f2f75138e __clone
#254 0x00007f6f2f75138e __clone
#255 0x00007f6f2f75138e __clone
eu-stack: tid 241: shown max number of frames (256, use -n 0 for unlimited)


GDB seems to detect the condition that libdw is getting stuck on, emitting a
warning message:

#44 0x00007f8f83e4d08b in start (p=0x7f8f836b8b00) at src/thread/pthread_create.c:203
#45 0x00007f8f83e4f38e in __clone () at src/thread/x86_64/clone.s:22
Backtrace stopped: frame did not save the PC

I believe it's detecting that two frames in a row have the same DWARF CFA, if I
understand correctly.


Reproducer:

docker run -it --privileged python:3.10-alpine sh

And in the container:

apk add --update musl-dbg elfutils
python3.10 -c "import os, threading; threading.Thread(target=lambda: os.system(f'eu-stack --pid={os.getpid()}')).start()"

That spawns a thread that forks a subprocess that runs `eu-stack` on its
parent, and reproduces the issue. If you remove the thread and just run:

python3.10 -c "import os; os.system(f'eu-stack --pid={os.getpid()}')"

then unwinding succeeds, ending at `_start`.
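
For readability, the two one-liners above can be written out as a small script. This is only a sketch: the function names `eu_stack_command`, `dump_own_stack` and `reproduce` are made up for illustration, and it uses `subprocess.run` where the one-liners use `os.system`.

```python
import os
import shutil
import subprocess
import threading


def eu_stack_command(pid):
    """Build the eu-stack invocation from the report for the given pid."""
    return ["eu-stack", f"--pid={pid}"]


def dump_own_stack():
    """Run eu-stack as a child process against this (parent) process."""
    if shutil.which("eu-stack") is None:
        # Assumption: outside the Alpine container eu-stack may be missing.
        print("eu-stack not found; install elfutils first")
        return None
    return subprocess.run(eu_stack_command(os.getpid()), check=False).returncode


def reproduce(threaded):
    """Dump the stack from a worker thread (failing case) or the main thread."""
    if not threaded:
        return dump_own_stack()
    result = []
    worker = threading.Thread(target=lambda: result.append(dump_own_stack()))
    worker.start()
    worker.join()
    return result[0]
```

Calling `reproduce(True)` should correspond to the threaded case that gets stuck on `__clone`, and `reproduce(False)` to the case that ends at `_start`.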

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug libdw/30272] Unwinding multithreaded musl applications fails
From: godlygeek at gmail dot com @ 2023-04-02 23:42 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=30272

--- Comment #1 from Matt Wozniski <godlygeek at gmail dot com> ---
I encountered this issue using `dwfl_getthread_frames`, and found a workaround:
in the callback, call `dwfl_frame_reg` to read the stack pointer register, and
break out if it is the same for two frames in a row. That seems to work. I'm not
sure if that's entirely correct, though. Are there any legitimate cases where
two different frames passed to the callback would have the same stack pointer?
My impression is that the stack pointer should change for every function call
because the return address is stored on the stack, but perhaps there are some
architectures where that isn't the case...
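
A minimal sketch of that check, modelled in plain Python rather than against the real libdw API. Frames are hypothetical `(pc, sp)` tuples with made-up values; in real code they would come from the `dwfl_getthread_frames` callback and `dwfl_frame_reg`. The names `make_frame_callback` and `walk` are invented for illustration, and the sketch does not settle whether equal stack pointers can legitimately occur.

```python
def make_frame_callback(visit):
    """Build a per-frame callback that stops on a repeated stack pointer."""
    last_sp = None

    def callback(pc, sp):
        nonlocal last_sp
        if last_sp is not None and sp == last_sp:
            return False  # stop; plays the role of DWARF_CB_ABORT
        last_sp = sp
        visit(pc, sp)
        return True  # continue; plays the role of DWARF_CB_OK

    return callback


def walk(frames, callback):
    """Feed frames to the callback until it asks to stop."""
    for pc, sp in frames:
        if not callback(pc, sp):
            break


# Made-up values: one `start` frame, then `__clone` repeating.
frames = [(0x08B, 0x7000)] + [(0x38E, 0x7010)] * 5
seen = []
walk(frames, make_frame_callback(lambda pc, sp: seen.append((pc, sp))))
```

With these inputs, only the `start` frame and the first `__clone` frame are visited; the walk stops at the first repeat.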


* [Bug libdw/30272] Unwinding multithreaded musl applications fails
From: sam at gentoo dot org @ 2023-04-03  5:44 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=30272

Sam James <sam at gentoo dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |sam at gentoo dot org

--- Comment #2 from Sam James <sam at gentoo dot org> ---
See also https://marc.info/?l=musl&m=168023060722100&w=2.


* [Bug libdw/30272] Unwinding multithreaded musl applications fails
From: mark at klomp dot org @ 2023-04-06 22:16 UTC (permalink / raw)
  To: elfutils-devel

https://sourceware.org/bugzilla/show_bug.cgi?id=30272

Mark Wielaard <mark at klomp dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mark at klomp dot org

--- Comment #3 from Mark Wielaard <mark at klomp dot org> ---
This does seem to be a bug in musl, which doesn't seem to emit enough CFI, as the
email thread mentioned in comment #2 says.

Note that glibc explicitly marks the end of the stack in clone by marking the
pc as undefined in its CFI:

sysdeps/unix/sysv/linux/aarch64/clone.S:        cfi_undefined (x30)
sysdeps/unix/sysv/linux/aarch64/clone3.S:       cfi_undefined (x30)
sysdeps/unix/sysv/linux/alpha/clone.S:  cfi_undefined(ra)
sysdeps/unix/sysv/linux/csky/abiv2/clone.S:     cfi_undefined (lr)
sysdeps/unix/sysv/linux/i386/clone.S:   cfi_undefined (eip);
sysdeps/unix/sysv/linux/i386/clone3.S:  cfi_undefined (eip)
sysdeps/unix/sysv/linux/loongarch/clone.S:      cfi_undefined (1)
sysdeps/unix/sysv/linux/loongarch/clone3.S:     cfi_undefined (1)
sysdeps/unix/sysv/linux/m68k/clone.S:   cfi_undefined (pc)      /* Mark end of stack */
sysdeps/unix/sysv/linux/mips/clone.S:   cfi_undefined ($31)
sysdeps/unix/sysv/linux/nios2/clone.S:  cfi_undefined (ra)
sysdeps/unix/sysv/linux/riscv/clone.S:  cfi_undefined (ra)
sysdeps/unix/sysv/linux/s390/s390-32/clone.S:   cfi_undefined (r14)
sysdeps/unix/sysv/linux/s390/s390-64/clone.S:   cfi_undefined (r14)
sysdeps/unix/sysv/linux/x86_64/clone.S: cfi_undefined (rip);
sysdeps/unix/sysv/linux/x86_64/clone3.S:        cfi_undefined (rip)
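
To illustrate why an undefined return address matters, here is a toy model in Python. It is not how libdw or any real unwinder is implemented: the `unwind` function and both tables are invented, and the `stuck` table only mimics the symptom from the original report, without claiming anything about the CFI musl actually emits.

```python
UNDEFINED = None  # stands in for an "undefined" return-address rule


def unwind(first, caller_of, max_frames=256):
    """Follow caller links until the return address is undefined or the cap is hit."""
    frames = []
    current = first
    while current is not UNDEFINED and len(frames) < max_frames:
        frames.append(current)
        current = caller_of[current]
    return frames


# End of stack marked explicitly, as with cfi_undefined in glibc's clone.
marked = {"start": "__clone", "__clone": UNDEFINED}

# No end-of-stack marker: the toy unwinder keeps finding __clone again.
stuck = {"start": "__clone", "__clone": "__clone"}

marked_frames = unwind("start", marked)
stuck_frames = unwind("start", stuck)
```

In this model, the marked table yields two frames and stops, while the stuck table only stops at the frame cap, much like the 256-frame limit eu-stack reported.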

