public inbox for glibc-bugs@sourceware.org
* [Bug nptl/30530] New: deadlock during deallocation of threads and fork
@ 2023-06-08 15:37 ced.couton at gmail dot com
2023-06-08 16:17 ` [Bug nptl/30530] " schwab@linux-m68k.org
2023-06-08 19:29 ` ced.couton at gmail dot com
0 siblings, 2 replies; 3+ messages in thread
From: ced.couton at gmail dot com @ 2023-06-08 15:37 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30530
Bug ID: 30530
Summary: deadlock during deallocation of threads and fork
Product: glibc
Version: 2.31
Status: UNCONFIRMED
Severity: normal
Priority: P2
Component: nptl
Assignee: unassigned at sourceware dot org
Reporter: ced.couton at gmail dot com
CC: drepper.fsp at gmail dot com
Target Milestone: ---
Hi,
For a long time we have occasionally hit a deadlock during thread deallocation
and fork after performing a job (in Ruby). It seems to have become more
frequent since we added more threads to our job.
What does the deadlock look like? Using gdb we obtained the following backtraces.
First, the list of all threads:
(gdb) info thread
  Id  Target Id  Frame
* 1   LWP 31163  0x00007f3c778aba14 in futex_wait (private=0, expected=2, futex_word=0x7f3c587ff71c) at ../sysdeps/nptl/futex-internal.h:144
  2   LWP 31165  __lll_lock_wait_private (futex=0x7f3c778c53e0 <stack_cache_lock>) at lowlevellock.c:35
  3   LWP 31167  __lll_lock_wait_private (futex=0x7f3c778c53e0 <stack_cache_lock>) at lowlevellock.c:35
  4   LWP 31170  __lll_lock_wait_private (futex=0x7f3c778c53e0 <stack_cache_lock>) at lowlevellock.c:35
  5   LWP 31171  __lll_lock_wait_private (futex=0x7f3c778c53e0 <stack_cache_lock>) at lowlevellock.c:35
  6   LWP 34739  __lll_lock_wait_private (futex=0x7f3c778c53e0 <stack_cache_lock>) at lowlevellock.c:35
If we focus on the main thread, we can see that it is stuck while freeing a
dynamic library in the forked process:
(gdb) bt
#0  0x00007f3c778aba14 in futex_wait (private=0, expected=2, futex_word=0x7f3c587ff71c) at ../sysdeps/nptl/futex-internal.h:144
#1  futex_wait_simple (private=0, expected=2, futex_word=0x7f3c587ff71c) at ../sysdeps/nptl/futex-internal.h:175
#2  __wait_lookup_done () at allocatestack.c:1239
#3  0x00007f3c7843e7dc in _dl_close_worker (map=map@entry=0x7f3c634d8700, force=force@entry=false) at dl-close.c:536
#4  0x00007f3c7843e97e in _dl_close (_map=0x7f3c634d8700) at dl-close.c:859
#5  0x00007f3c77805a90 in __GI__dl_catch_exception (exception=exception@entry=0x7fffb8030820, operate=operate@entry=0x7f3c77a48340 <dlclose_doit>, args=args@entry=0x7f3c634d8700) at dl-error-skeleton.c:208
#6  0x00007f3c77805b4f in __GI__dl_catch_error (objname=objname@entry=0x7f3c76a2dcd0, errstring=errstring@entry=0x7f3c76a2dcd8, mallocedp=mallocedp@entry=0x7f3c76a2dcc8, operate=operate@entry=0x7f3c77a48340 <dlclose_doit>, args=args@entry=0x7f3c634d8700) at dl-error-skeleton.c:227
#7  0x00007f3c77a48a65 in _dlerror_run (operate=operate@entry=0x7f3c77a48340 <dlclose_doit>, args=0x7f3c634d8700) at dlerror.c:170
#8  0x00007f3c77a48374 in __dlclose (handle=<optimized out>) at dlclose.c:46
#9  0x00007f3c671c9e01 in library_free (library=0x7f3c63571898) at DynamicLibrary.c:171
#10 0x00007f3c77f49c83 in vm_ccs_free (alive=1, objspace=0x0, klass=36, ccs=0x7f3c634d7300) at gc.c:3254
#11 rb_vm_ccs_free (ccs=0x7f3c634d7300) at gc.c:3285
#12 0x00007f3c77057488 in ?? ()
#13 0x0000000000000ac8 in ?? ()
#14 0x00007f3c61952ff0 in ?? ()
#15 0x0000000000000000 in ?? ()
In the backtrace of one of the other threads, we can see that it is blocked
while deallocating its stack:
(gdb) bt
#0  __lll_lock_wait_private (futex=0x7f3c778c53e0 <stack_cache_lock>) at lowlevellock.c:35
#1  0x00007f3c778ab10c in __deallocate_stack (pd=pd@entry=0x7f3c5158f700) at allocatestack.c:790
#2  0x00007f3c778abda9 in __free_tcb (pd=pd@entry=0x7f3c5158f700) at pthread_create.c:368
#3  0x00007f3c778ac07b in start_thread (arg=<optimized out>) at pthread_create.c:573
#4  0x00007f3c777cba2f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
All the other threads are waiting for the mutex held by this glibc/nptl function:
https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/allocatestack.c;h=110ba18f5dbdb3054ee0b9545a76757a3ae74568;hb=9ea3686266dca3f004ba874745a4087a89682617#l1214
That function is itself stuck while attempting to take the lock here:
https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/allocatestack.c;h=110ba18f5dbdb3054ee0b9545a76757a3ae74568;hb=9ea3686266dca3f004ba874745a4087a89682617#l1239
One last clue: when I run x/3 0x7f3c587ff71c on the address of the stuck futex
word, I get "2 0 0", as if the mutex were still considered locked but with no
trace of an owner pid?
(gdb) x/3 0x7f3c587ff71c
0x7f3c587ff71c: 2 0 0
While we can see
Either this gscope_flagp has already been freed, so futex_wait_simple is stuck,
or something else is going on…
Has this already been seen?
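As a general illustration of how a lock word can look "held" with no owner after fork, here is a minimal sketch. It is not glibc code and not a confirmed diagnosis of this bug: demo_lock, holder and child_sees_stale_lock are hypothetical names, and the child calls pthread_mutex_trylock only to report the state instead of hanging. The sketch relies on the fact that fork copies memory, including lock state, while only the calling thread exists in the child.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names throughout; an illustration, not glibc internals. */
static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t lock_taken;   /* posted once the holder thread owns demo_lock */
static sem_t may_release;  /* posted when the holder thread may unlock */

static void *holder(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&demo_lock);
    sem_post(&lock_taken);
    sem_wait(&may_release);
    pthread_mutex_unlock(&demo_lock);
    return NULL;
}

/* Returns 1 if the child found demo_lock already held, 0 if the child
   could take it, and -1 on error. */
int child_sees_stale_lock(void)
{
    pthread_t tid;
    int status = 0;

    if (sem_init(&lock_taken, 0, 0) != 0 || sem_init(&may_release, 0, 0) != 0)
        return -1;
    if (pthread_create(&tid, NULL, holder, NULL) != 0)
        return -1;
    sem_wait(&lock_taken);  /* the holder thread now owns demo_lock */

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: the holder thread was not copied, but the locked state
           of demo_lock was, so nothing here can ever unlock it.
           trylock is not async-signal-safe; it is used for the demo only. */
        _exit(pthread_mutex_trylock(&demo_lock) == EBUSY ? 1 : 0);
    }

    /* Parent: let the holder finish, then collect the child's verdict. */
    sem_post(&may_release);
    pthread_join(tid, NULL);
    sem_destroy(&lock_taken);
    sem_destroy(&may_release);
    if (pid < 0 || waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

A blocking pthread_mutex_lock in the child would wait forever in this situation, which is the general shape of a post-fork deadlock on an inherited lock.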
--
You are receiving this mail because:
You are on the CC list for the bug.
* [Bug nptl/30530] deadlock during deallocation of threads and fork
From: schwab@linux-m68k.org @ 2023-06-08 16:17 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30530
--- Comment #1 from Andreas Schwab <schwab@linux-m68k.org> ---
Threads and fork don't mix well. After a fork only async-signal-safe functions
can be called in the child.
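A minimal sketch of the discipline described above, assuming a POSIX system: the child of a multithreaded process calls only write() and _exit(), both of which are on the POSIX async-signal-safe list, and avoids printf(), malloc(), dlclose() and exit(). The names worker and fork_with_safe_child and the exit code 7 are hypothetical, chosen only for illustration.

```c
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static sem_t worker_may_exit;  /* hypothetical: keeps a second thread alive */

static void *worker(void *arg)
{
    (void)arg;
    sem_wait(&worker_may_exit);
    return NULL;
}

/* Forks while a second thread is running and returns the child's exit
   status (7 in this sketch), or -1 on error. */
int fork_with_safe_child(void)
{
    static const char msg[] = "child: exiting\n";
    pthread_t tid;
    int status = 0;

    if (sem_init(&worker_may_exit, 0, 0) != 0)
        return -1;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;

    pid_t pid = fork();
    if (pid == 0) {
        /* Child: async-signal-safe calls only. In real code an exec*()
           call would typically follow at this point. */
        if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0) {
            /* ignore: there is nothing safe to do about it here */
        }
        _exit(7);
    }

    /* Parent: release the worker, then reap the child. */
    sem_post(&worker_may_exit);
    pthread_join(tid, NULL);
    sem_destroy(&worker_may_exit);
    if (pid < 0 || waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The child uses _exit() rather than exit() so that no atexit handlers or destructors run in it; such handlers may take locks that were held by threads which no longer exist in the child.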
* [Bug nptl/30530] deadlock during deallocation of threads and fork
From: ced.couton at gmail dot com @ 2023-06-08 19:29 UTC (permalink / raw)
To: glibc-bugs
https://sourceware.org/bugzilla/show_bug.cgi?id=30530
--- Comment #2 from Cédric Couton <ced.couton at gmail dot com> ---
Can you point to an example of a non-async-signal-safe function that is called
in the thread's stack?