public inbox for glibc-bugs@sourceware.org
* [Bug nptl/30530] New: deadlock during deallocation of threads and fork
@ 2023-06-08 15:37 ced.couton at gmail dot com
  2023-06-08 16:17 ` [Bug nptl/30530] " schwab@linux-m68k.org
  2023-06-08 19:29 ` ced.couton at gmail dot com
  0 siblings, 2 replies; 3+ messages in thread
From: ced.couton at gmail dot com @ 2023-06-08 15:37 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30530

            Bug ID: 30530
           Summary: deadlock during deallocation of threads and fork
           Product: glibc
           Version: 2.31
            Status: UNCONFIRMED
          Severity: normal
          Priority: P2
         Component: nptl
          Assignee: unassigned at sourceware dot org
          Reporter: ced.couton at gmail dot com
                CC: drepper.fsp at gmail dot com
  Target Milestone: ---

Hi,

For a long time we have occasionally hit a deadlock during the deallocation of
threads and fork after performing a job (in Ruby). It seems to occur more often
since we added more threads to the job.
What does the deadlock look like? Using gdb we obtained these backtraces:

Here is the list of all threads:
(gdb) info thread
  Id   Target Id                                   Frame 
* 1    LWP 31163                                   0x00007f3c778aba14 in
futex_wait (private=0, expected=2, futex_word=0x7f3c587ff71c) at
../sysdeps/nptl/futex-internal.h:144
  2    LWP 31165                                   __lll_lock_wait_private
(futex=0x7f3c778c53e0 <stack_cache_lock>) at lowlevellock.c:35
  3    LWP 31167                                   __lll_lock_wait_private
(futex=0x7f3c778c53e0 <stack_cache_lock>) at lowlevellock.c:35
  4    LWP 31170                                   __lll_lock_wait_private
(futex=0x7f3c778c53e0 <stack_cache_lock>) at lowlevellock.c:35
  5    LWP 31171                                   __lll_lock_wait_private
(futex=0x7f3c778c53e0 <stack_cache_lock>) at lowlevellock.c:35
  6    LWP 34739                                   __lll_lock_wait_private
(futex=0x7f3c778c53e0 <stack_cache_lock>) at lowlevellock.c:35

If we focus on the main thread, we can see that it is stuck while freeing a
dynamic library of the fork:
(gdb) bt
#0  0x00007f3c778aba14 in futex_wait (private=0, expected=2,
futex_word=0x7f3c587ff71c) at ../sysdeps/nptl/futex-internal.h:144
#1  futex_wait_simple (private=0, expected=2, futex_word=0x7f3c587ff71c) at
../sysdeps/nptl/futex-internal.h:175
#2  __wait_lookup_done () at allocatestack.c:1239
#3  0x00007f3c7843e7dc in _dl_close_worker (map=map@entry=0x7f3c634d8700,
force=force@entry=false) at dl-close.c:536
#4  0x00007f3c7843e97e in _dl_close (_map=0x7f3c634d8700) at dl-close.c:859
#5  0x00007f3c77805a90 in __GI__dl_catch_exception
(exception=exception@entry=0x7fffb8030820, operate=operate@entry=0x7f3c77a48340
<dlclose_doit>, args=args@entry=0x7f3c634d8700) at dl-error-skeleton.c:208
#6  0x00007f3c77805b4f in __GI__dl_catch_error
(objname=objname@entry=0x7f3c76a2dcd0,
errstring=errstring@entry=0x7f3c76a2dcd8,
mallocedp=mallocedp@entry=0x7f3c76a2dcc8, operate=operate@entry=0x7f3c77a48340
<dlclose_doit>, args=args@entry=0x7f3c634d8700)
    at dl-error-skeleton.c:227
#7  0x00007f3c77a48a65 in _dlerror_run (operate=operate@entry=0x7f3c77a48340
<dlclose_doit>, args=0x7f3c634d8700) at dlerror.c:170
#8  0x00007f3c77a48374 in __dlclose (handle=<optimized out>) at dlclose.c:46
#9  0x00007f3c671c9e01 in library_free (library=0x7f3c63571898) at
DynamicLibrary.c:171
#10 0x00007f3c77f49c83 in vm_ccs_free (alive=1, objspace=0x0, klass=36,
ccs=0x7f3c634d7300) at gc.c:3254
#11 rb_vm_ccs_free (ccs=0x7f3c634d7300) at gc.c:3285
#12 0x00007f3c77057488 in ?? ()
#13 0x0000000000000ac8 in ?? ()
#14 0x00007f3c61952ff0 in ?? ()
#15 0x0000000000000000 in ?? ()

In the backtrace of one of the other threads, we can see that it is blocked
while deallocating its stack memory:
(gdb) bt
#0  __lll_lock_wait_private (futex=0x7f3c778c53e0 <stack_cache_lock>) at
lowlevellock.c:35
#1  0x00007f3c778ab10c in __deallocate_stack (pd=pd@entry=0x7f3c5158f700) at
allocatestack.c:790
#2  0x00007f3c778abda9 in __free_tcb (pd=pd@entry=0x7f3c5158f700) at
pthread_create.c:368
#3  0x00007f3c778ac07b in start_thread (arg=<optimized out>) at
pthread_create.c:573
#4  0x00007f3c777cba2f in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

All threads are waiting for the mutex held by this glibc/nptl function:
https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/allocatestack.c;h=110ba18f5dbdb3054ee0b9545a76757a3ae74568;hb=9ea3686266dca3f004ba874745a4087a89682617#l1214
This function is in turn stuck while trying to lock the mutex here:
https://sourceware.org/git/?p=glibc.git;a=blob;f=nptl/allocatestack.c;h=110ba18f5dbdb3054ee0b9545a76757a3ae74568;hb=9ea3686266dca3f004ba874745a4087a89682617#l1239

One last clue: when I run x/3 0x7f3c587ff71c on the memory address of the stuck
mutex, I get "2 0 0", as if the mutex were still considered locked but with no
trace of an owner pid?
(gdb) x/3 0x7f3c587ff71c
0x7f3c587ff71c: 2       0       0
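
For reference, the expected=2 in frame #0 and the 2 read back here would be
consistent with a gscope flag in its "wait" state rather than a lock word that
records an owner. The sketch below is only an illustration: the three constants
are local copies of the values glibc 2.31 is assumed to define as
THREAD_GSCOPE_FLAG_UNUSED/USED/WAIT, and gscope_flag_name and
waiter_still_blocked are hypothetical helpers, not glibc functions.

```c
/* Local copies of the flag values glibc 2.31 is ASSUMED to use for
   THREAD_GSCOPE_FLAG_UNUSED / _USED / _WAIT. Check the tls.h of the
   exact build before relying on them. */
enum gscope_flag {
    GSCOPE_FLAG_UNUSED = 0, /* thread is not inside a symbol lookup */
    GSCOPE_FLAG_USED   = 1, /* thread is inside a symbol lookup */
    GSCOPE_FLAG_WAIT   = 2  /* another thread is waiting for it to finish */
};

/* Hypothetical helper: name the state of a word read with gdb's x command. */
const char *gscope_flag_name(int word)
{
    switch (word) {
    case GSCOPE_FLAG_UNUSED: return "UNUSED";
    case GSCOPE_FLAG_USED:   return "USED";
    case GSCOPE_FLAG_WAIT:   return "WAIT";
    default:                 return "unknown";
    }
}

/* Hypothetical helper: a waiter that sleeps on the flag with an expected
   value of WAIT keeps waiting for as long as the word still reads WAIT.
   Only the thread that owns the flag is expected to reset it. */
int waiter_still_blocked(int word)
{
    return word == GSCOPE_FLAG_WAIT;
}
```

With the value from the dump, gscope_flag_name(2) gives "WAIT" and
waiter_still_blocked(2) is nonzero, which would match a waiter whose target
thread never resets its flag. The two zero words after it would then simply be
neighbouring fields, not an owner or a count.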

While we can see 
Either this gscope_flagp has already been freed, so futex_wait_simple is stuck,
or something else is going on…

Has this been seen before?

-- 
You are receiving this mail because:
You are on the CC list for the bug.


* [Bug nptl/30530] deadlock during deallocation of threads and fork
  2023-06-08 15:37 [Bug nptl/30530] New: deadlock during deallocation of threads and fork ced.couton at gmail dot com
@ 2023-06-08 16:17 ` schwab@linux-m68k.org
  2023-06-08 19:29 ` ced.couton at gmail dot com
  1 sibling, 0 replies; 3+ messages in thread
From: schwab@linux-m68k.org @ 2023-06-08 16:17 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30530

--- Comment #1 from Andreas Schwab <schwab@linux-m68k.org> ---
Threads and fork don't mix well.  After a fork only async-signal-safe functions
can be called in the child.
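
As a minimal sketch of that rule (not taken from the reporter's code): the
child below restricts itself to write() and _exit(), both of which are on the
POSIX async-signal-safe list. run_child_safely is a hypothetical helper name
used only for illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, and in the child call only async-signal-safe
   functions. Returns the child's exit status, or -1 on failure. */
int run_child_safely(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: only the forking thread exists here, but locks held by
           other threads at fork time remain locked in the copied memory.
           write() and _exit() do not depend on such locks. */
        static const char msg[] = "child: alive\n";
        if (write(STDERR_FILENO, msg, sizeof msg - 1) < 0)
            _exit(1);
        _exit(0);
    }

    /* Parent: collect the child's exit status. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Functions such as printf(), malloc(), exit() and dlclose() are not on that
list, so a child forked from a multithreaded process would normally go straight
to execve() or _exit().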


* [Bug nptl/30530] deadlock during deallocation of threads and fork
  2023-06-08 15:37 [Bug nptl/30530] New: deadlock during deallocation of threads and fork ced.couton at gmail dot com
  2023-06-08 16:17 ` [Bug nptl/30530] " schwab@linux-m68k.org
@ 2023-06-08 19:29 ` ced.couton at gmail dot com
  1 sibling, 0 replies; 3+ messages in thread
From: ced.couton at gmail dot com @ 2023-06-08 19:29 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=30530

--- Comment #2 from Cédric Couton <ced.couton at gmail dot com> ---
Can you point to an example of a non-async-signal-safe function that is called
in the stack of the thread?

