public inbox for glibc-bugs@sourceware.org
* [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup
       [not found] <bug-654-131@http.sourceware.org/bugzilla/>
@ 2010-12-10  3:43 ` r0bertz at gentoo dot org
  2010-12-10 12:51 ` r0bertz at gentoo dot org
  2012-05-06  9:04 ` aj at suse dot de
  2 siblings, 0 replies; 10+ messages in thread
From: r0bertz at gentoo dot org @ 2010-12-10  3:43 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=654

Zhang, Le <r0bertz at gentoo dot org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |r0bertz at gentoo dot org

--- Comment #8 from Zhang, Le <r0bertz at gentoo dot org> 2010-12-10 02:56:50 UTC ---
Hey, guys, I found that this testcase can't trigger the bug in recent glibc.
How was this bug finally resolved?
I have searched the glibc git commit messages for dl_load_lock, but nothing
showed up.

-- 
Configure bugmail: http://sourceware.org/bugzilla/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.



* [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup
       [not found] <bug-654-131@http.sourceware.org/bugzilla/>
  2010-12-10  3:43 ` [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup r0bertz at gentoo dot org
@ 2010-12-10 12:51 ` r0bertz at gentoo dot org
  2012-05-06  9:04 ` aj at suse dot de
  2 siblings, 0 replies; 10+ messages in thread
From: r0bertz at gentoo dot org @ 2010-12-10 12:51 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=654

--- Comment #9 from ZHANG, Le <r0bertz at gentoo dot org> 2010-12-10 03:43:08 UTC ---
(In reply to comment #4)
> This is the same deadlock as has been fixed by:
> 2004-07-07  Ulrich Drepper  <drepper@redhat.com>
> 
>         * elf/dl-fini.c (_dl_fini): Move the unlock of the ld.so lock
>         before the loop running the destructors.
...
[snip]
...

Ah, sorry, overlooked this comment.


* [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup
       [not found] <bug-654-131@http.sourceware.org/bugzilla/>
  2010-12-10  3:43 ` [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup r0bertz at gentoo dot org
  2010-12-10 12:51 ` r0bertz at gentoo dot org
@ 2012-05-06  9:04 ` aj at suse dot de
  2 siblings, 0 replies; 10+ messages in thread
From: aj at suse dot de @ 2012-05-06  9:04 UTC (permalink / raw)
  To: glibc-bugs

http://sourceware.org/bugzilla/show_bug.cgi?id=654

Andreas Jaeger <aj at suse dot de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |aj at suse dot de
         Resolution|                            |FIXED

--- Comment #10 from Andreas Jaeger <aj at suse dot de> 2012-05-06 09:03:13 UTC ---
I checked glibc 2.15 and the testcase works for me; I assume this was fixed
with:

2006-10-27  Ulrich Drepper  <drepper@redhat.com>

    * elf/dl-close.c (_dl_close_worker): Renamed from _dl_close and
    split out locking and parameter checking.
    (_dl_close): Call _dl_close_worker after locking and checking.
    * elf/dl-open.c (_dl_open): Call _dl_close_worker instead of
    _dl_close.
    * elf/Makefile: Add rules to build and run tst-thrlock.
    * elf/tst-thrlock.c:  New file.


If you still have the problem with glibc 2.15, please reopen and tell us a
better way to reproduce.
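
For readers unfamiliar with the pattern named in the ChangeLog entry, here is
a minimal sketch of splitting a locked entry point from an unlocked worker.
It is not glibc code: `struct handle`, `table_lock`, `close_worker`,
`close_handle`, and `open_handle` are hypothetical names, and a plain mutex
stands in for the loader lock. Whether this refactoring is what actually
fixed the bug remains an assumption.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical object with a reference count; not a glibc type.  */
struct handle { int refcount; };

/* Hypothetical stand-in for the loader lock.  */
static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Worker: does the real work.  The caller must already hold table_lock
   and must have validated H.  */
static void close_worker(struct handle *h)
{
  h->refcount--;
}

/* Public entry point: locking and parameter checking live here.  */
int close_handle(struct handle *h)
{
  if (h == NULL)
    return -1;
  pthread_mutex_lock(&table_lock);
  if (h->refcount <= 0) {
    pthread_mutex_unlock(&table_lock);
    return -1;
  }
  close_worker(h);
  pthread_mutex_unlock(&table_lock);
  return 0;
}

/* An open path that fails midway already holds the lock, so it undoes
   its work through the worker instead of the locking entry point.  */
int open_handle(struct handle *h, int simulate_failure)
{
  if (h == NULL)
    return -1;
  pthread_mutex_lock(&table_lock);
  h->refcount++;
  if (simulate_failure) {
    close_worker(h);
    pthread_mutex_unlock(&table_lock);
    return -1;
  }
  pthread_mutex_unlock(&table_lock);
  return 0;
}
```

With a non-recursive mutex, calling close_handle from the failure path of
open_handle would block on a lock the thread already owns; routing that path
through the worker avoids the second acquisition.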


* [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup
  2005-01-12 10:47 [Bug nptl/654] New: " alexei dot khlebnikov at datacon dot at
                   ` (5 preceding siblings ...)
  2005-01-17 12:47 ` alexei dot khlebnikov at datacon dot at
@ 2006-05-02 22:03 ` drepper at redhat dot com
  6 siblings, 0 replies; 10+ messages in thread
From: drepper at redhat dot com @ 2006-05-02 22:03 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From drepper at redhat dot com  2006-05-02 22:03 -------
Nothing related to C++, exceptions, and dlopen can be critical.

-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|critical                    |normal


http://sourceware.org/bugzilla/show_bug.cgi?id=654

------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.



* [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup
  2005-01-12 10:47 [Bug nptl/654] New: " alexei dot khlebnikov at datacon dot at
                   ` (4 preceding siblings ...)
  2005-01-17 12:43 ` alexei dot khlebnikov at datacon dot at
@ 2005-01-17 12:47 ` alexei dot khlebnikov at datacon dot at
  2006-05-02 22:03 ` drepper at redhat dot com
  6 siblings, 0 replies; 10+ messages in thread
From: alexei dot khlebnikov at datacon dot at @ 2005-01-17 12:47 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From alexei dot khlebnikov at datacon dot at  2005-01-17 12:47 -------
Created an attachment (id=369)
 --> (http://sources.redhat.com/bugzilla/attachment.cgi?id=369&action=view)
Proposed patch, the first try.

This is the same patch as listed in comment #5.


-- 


http://sources.redhat.com/bugzilla/show_bug.cgi?id=654


* [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup
  2005-01-12 10:47 [Bug nptl/654] New: " alexei dot khlebnikov at datacon dot at
                   ` (3 preceding siblings ...)
  2005-01-13 13:15 ` jakub at redhat dot com
@ 2005-01-17 12:43 ` alexei dot khlebnikov at datacon dot at
  2005-01-17 12:47 ` alexei dot khlebnikov at datacon dot at
  2006-05-02 22:03 ` drepper at redhat dot com
  6 siblings, 0 replies; 10+ messages in thread
From: alexei dot khlebnikov at datacon dot at @ 2005-01-17 12:43 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From alexei dot khlebnikov at datacon dot at  2005-01-17 12:43 -------
I've constructed a patch that unlocks dl_load_lock just before running the
destructors, and locks it again just after that. My testcase now runs properly,
but I don't know whether or not my patch has any side effects. So, dear glibc
developers, please review it and either confirm that the patch is correct or
point out where I am wrong. Thanks.

The patch:
---
--- glibc/elf/dl-close.c.orig   2005-01-09 09:27:52.000000000 +0100
+++ glibc/elf/dl-close.c        2005-01-17 15:04:52.000000000 +0100
@@ -265,6 +265,10 @@
       }
   assert (new_opencount[0] == 0);

+  /* Release dl_load_lock during running destructors,
+     like in dl-fini.c. */
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
+
   /* Call all termination functions at once.  */
 #ifdef SHARED
   bool do_audit = GLRO(dl_naudit) > 0 && !GL(dl_ns)[ns]._ns_loaded->l_auditing;
@@ -389,6 +393,9 @@
       assert (imap->l_type == lt_loaded || imap->l_opencount > 0);
     }

+  /* Destructors finished, acquire dl_load_lock again. */
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+
 #ifdef SHARED
   /* Auditing checkpoint: we will start deleting objects.  */
   if (__builtin_expect (do_audit, 0))
---


-- 


http://sources.redhat.com/bugzilla/show_bug.cgi?id=654


* [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup
  2005-01-12 10:47 [Bug nptl/654] New: " alexei dot khlebnikov at datacon dot at
                   ` (2 preceding siblings ...)
  2005-01-13 12:52 ` alexei dot khlebnikov at datacon dot at
@ 2005-01-13 13:15 ` jakub at redhat dot com
  2005-01-17 12:43 ` alexei dot khlebnikov at datacon dot at
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: jakub at redhat dot com @ 2005-01-13 13:15 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From jakub at redhat dot com  2005-01-13 13:15 -------
This is the same deadlock as has been fixed by:
2004-07-07  Ulrich Drepper  <drepper@redhat.com>

        * elf/dl-fini.c (_dl_fini): Move the unlock of the ld.so lock
        before the loop running the destructors.
for destructors that are run at exit time.
At the moment ld.so holds dl_load_lock while running shared library
destructors, and the same lock is used indirectly by libgcc_s.so when
unwinding.  If you call pthread_cancel in a shared library destructor that is
run during dlclose, dl_load_lock is held by the thread calling pthread_cancel,
but the cancelled thread needs to be unwound.  Because the same destructor also
calls pthread_join to wait for the cancelled thread, while the cancelled thread
waits for dl_load_lock to be released (which would happen only when dlclose is
about to return), the two threads deadlock.

The fix is to avoid running shared library destructors with dl_load_lock held,
but that's certainly not trivial.
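
The lock ordering described above can be modelled in a single file. The
sketch below is only a model, not the original testcase and not ld.so code:
`load_lock` is a hypothetical plain mutex standing in for dl_load_lock,
`worker` stands in for the cancelled thread whose unwind needs that lock, and
`simulate_dlclose` stands in for dlclose running a destructor that joins the
thread. So that the model terminates, the worker gives up after 200 ms
instead of blocking forever.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for ld.so's dl_load_lock.  */
static pthread_mutex_t load_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models the cancelled thread: unwinding needs load_lock.  The real
   thread would wait indefinitely; the model times out after 200 ms.  */
static void *worker(void *arg)
{
  int *result = arg;
  struct timespec deadline;

  clock_gettime(CLOCK_REALTIME, &deadline);
  deadline.tv_nsec += 200L * 1000L * 1000L;
  if (deadline.tv_nsec >= 1000000000L) {
    deadline.tv_sec += 1;
    deadline.tv_nsec -= 1000000000L;
  }
  *result = pthread_mutex_timedlock(&load_lock, &deadline);
  if (*result == 0)
    pthread_mutex_unlock(&load_lock);
  return NULL;
}

/* Models dlclose: take the lock, then run a "destructor" that joins the
   worker.  If drop_lock is nonzero, the lock is released around the join.
   Returns the worker's lock result: 0, or ETIMEDOUT in the deadlock case.  */
int simulate_dlclose(int drop_lock)
{
  pthread_t tid;
  int worker_result = -1;

  pthread_mutex_lock(&load_lock);
  if (pthread_create(&tid, NULL, worker, &worker_result) != 0) {
    pthread_mutex_unlock(&load_lock);
    return -1;
  }
  if (drop_lock)
    pthread_mutex_unlock(&load_lock);
  pthread_join(tid, NULL);            /* the destructor's pthread_join */
  if (drop_lock)
    pthread_mutex_lock(&load_lock);
  pthread_mutex_unlock(&load_lock);
  return worker_result;
}
```

Calling simulate_dlclose(0) models the reported hang: the worker cannot get
the lock while the joiner holds it. Calling simulate_dlclose(1) models
running the destructor without the lock held, in which case the worker
proceeds and the join returns.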

-- 


http://sources.redhat.com/bugzilla/show_bug.cgi?id=654


* [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup
  2005-01-12 10:47 [Bug nptl/654] New: " alexei dot khlebnikov at datacon dot at
  2005-01-12 10:49 ` [Bug nptl/654] " alexei dot khlebnikov at datacon dot at
  2005-01-13 12:31 ` alexei dot khlebnikov at datacon dot at
@ 2005-01-13 12:52 ` alexei dot khlebnikov at datacon dot at
  2005-01-13 13:15 ` jakub at redhat dot com
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: alexei dot khlebnikov at datacon dot at @ 2005-01-13 12:52 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From alexei dot khlebnikov at datacon dot at  2005-01-13 12:52 -------
I've investigated the problem further.
I've found the approximate place in libc where the hang occurs.
It is in the file nptl/pthread_join.c, line 86, which looks like this:
---
  /* Wait for the child.  */
  lll_wait_tid (pd->tid);
---

lll_wait_tid is a macro containing inline assembly that I don't understand so far:
---
/* The kernel notifies a process with uses CLONE_CLEARTID via futex
   wakeup when the clone terminates.  The memory location contains the
   thread ID while the clone is running and is reset to zero
   afterwards.

   The macro parameter must not have any side effect.  */
#define lll_wait_tid(tid) \
  do {									      \
    int __ignore;							      \
    register __typeof (tid) _tid asm ("edx") = (tid);			      \
    if (_tid != 0)							      \
      __asm __volatile (LLL_EBX_LOAD					      \
			"1:\tmovl %1, %%eax\n\t"			      \
			LLL_ENTER_KERNEL				      \
			"cmpl $0, (%%ebx)\n\t"				      \
			"jne,pn 1b\n\t"					      \
			LLL_EBX_LOAD					      \
			: "=&a" (__ignore)				      \
			: "i" (SYS_futex), LLL_EBX_REG (&tid), "S" (0),	      \
			  "c" (FUTEX_WAIT), "d" (_tid),			      \
			  "i" (offsetof (tcbhead_t, sysinfo)));		      \
  } while (0)
---
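
Going by the comment on the macro, it can be read roughly as the loop below.
This is a C rendering under stated assumptions, not glibc's implementation:
it assumes Linux and the raw futex system call, and `wait_tid_sketch`,
`clear_and_wake`, and `demo_wait_tid` are hypothetical names. The helper
thread only emulates the clear-and-wake that the kernel performs when a
thread exits, so that the sketch can run on its own.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/futex.h>
#include <pthread.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Sleep in the kernel until *tid_word reads 0.  As in the macro, the
   value passed to FUTEX_WAIT is the TID read on entry.  */
static void wait_tid_sketch(int *tid_word)
{
  int tid = __atomic_load_n(tid_word, __ATOMIC_ACQUIRE);

  if (tid == 0)
    return;
  do
    /* Returns at once if *tid_word != tid, otherwise sleeps until woken.  */
    syscall(SYS_futex, tid_word, FUTEX_WAIT, tid, NULL, NULL, 0);
  while (__atomic_load_n(tid_word, __ATOMIC_ACQUIRE) != 0);
}

/* Hypothetical helper: emulates thread exit by clearing the word and
   waking one waiter after a short delay.  */
static void *clear_and_wake(void *arg)
{
  int *tid_word = arg;

  usleep(50 * 1000);
  __atomic_store_n(tid_word, 0, __ATOMIC_RELEASE);
  syscall(SYS_futex, tid_word, FUTEX_WAKE, 1, NULL, NULL, 0);
  return NULL;
}

/* Returns the final value of the TID word, which is 0 once the wait ends.  */
int demo_wait_tid(void)
{
  static int tid_word;
  pthread_t helper;

  tid_word = 1234;                    /* pretend TID of a running thread */
  if (pthread_create(&helper, NULL, clear_and_wake, &tid_word) != 0)
    return -1;
  wait_tid_sketch(&tid_word);
  pthread_join(helper, NULL);
  return __atomic_load_n(&tid_word, __ATOMIC_ACQUIRE);
}
```

Seen this way, a hang at this line means the TID word never became zero,
that is, the joined thread never finished exiting.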


-- 


http://sources.redhat.com/bugzilla/show_bug.cgi?id=654


* [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup
  2005-01-12 10:47 [Bug nptl/654] New: " alexei dot khlebnikov at datacon dot at
  2005-01-12 10:49 ` [Bug nptl/654] " alexei dot khlebnikov at datacon dot at
@ 2005-01-13 12:31 ` alexei dot khlebnikov at datacon dot at
  2005-01-13 12:52 ` alexei dot khlebnikov at datacon dot at
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: alexei dot khlebnikov at datacon dot at @ 2005-01-13 12:31 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From alexei dot khlebnikov at datacon dot at  2005-01-13 12:30 -------
I've tested the same testcase on another system, which has kernel 2.4.20 and
glibc 2.3.2 with linuxthreads. The program ran just fine. The test was
conducted today, 2005-01-13.

The output:
---
$ ./run
loading ./libtestmod.so now
Constructor called
pureShutdown::func(void*) called
hi there, new thread is up and running, thread id is 16386
Constructor finished
= thread 16386 is still running...
= thread 16386 is still running...
= thread 16386 is still running...
= thread 16386 is still running...
unloading ./libtestmod.so now
Destructor called
modShutdown() called
bye, cancelling down thread 16386
running pthread_join(g_tid, &result) ...
returned from pthread_join(g_tid, &result) !
all's well that end's well
modShutdown() finished
Destructor finished
$
---

System information:
CPU: Intel(R) Xeon(TM) CPU 2.80GHz
Distribution: SuSE Linux 8.2
Kernel: 2.4.20-64GB-SMP, from the SuSE distribution

Glibc version:
---
$ /lib/libc.so.6
GNU C Library stable release version 2.3.2, by Roland McGrath et al.
Copyright (C) 2003 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 3.3 20030226 (prerelease) (SuSE Linux).
Compiled on a Linux 2.4.20 system on 2003-03-13.
Available extensions:
        GNU libio by Per Bothner
        crypt add-on version 2.1 by Michael Glad and others
        linuxthreads-0.10 by Xavier Leroy
        NoVersion patch for broken glibc 2.0 binaries
        BIND-8.2.3-T5B
        libthread_db work sponsored by Alpha Processor Inc
        NIS(YP)/NIS+ NSS modules 0.19 by Thorsten Kukuk
Report bugs using the `glibcbug' script to <bugs@gnu.org>.
---

GCC version:
---
$ gcc -v
Reading specs from /usr/lib/gcc-lib/i486-suse-linux/3.3/specs
Configured with: ../configure --enable-threads=posix --prefix=/usr
--with-local-prefix=/usr/local --infodir=/usr/share/info --mandir=/usr/share/man
--libdir=/usr/lib --enable-languages=c,c++,f77,objc,java,ada --disable-checking
--enable-libgcj --with-gxx-include-dir=/usr/include/g++ --with-slibdir=/lib
--with-system-zlib --enable-shared --enable-__cxa_atexit i486-suse-linux
Thread model: posix
gcc version 3.3 20030226 (prerelease) (SuSE Linux)
---

Ld/Binutils version:
---
$ ld -v
GNU ld version 2.13.90.0.18 20030121 (SuSE Linux)
---


-- 


http://sources.redhat.com/bugzilla/show_bug.cgi?id=654


* [Bug nptl/654] Cancelling nptl thread on dlclose() leads to application hangup
  2005-01-12 10:47 [Bug nptl/654] New: " alexei dot khlebnikov at datacon dot at
@ 2005-01-12 10:49 ` alexei dot khlebnikov at datacon dot at
  2005-01-13 12:31 ` alexei dot khlebnikov at datacon dot at
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 10+ messages in thread
From: alexei dot khlebnikov at datacon dot at @ 2005-01-12 10:49 UTC (permalink / raw)
  To: glibc-bugs


------- Additional Comments From alexei dot khlebnikov at datacon dot at  2005-01-12 10:49 -------
Created an attachment (id=350)
 --> (http://sources.redhat.com/bugzilla/attachment.cgi?id=350&action=view)
Testcase for the bug.


-- 


http://sources.redhat.com/bugzilla/show_bug.cgi?id=654

