From mboxrd@z Thu Jan 1 00:00:00 1970
Received: by sourceware.org (Postfix, from userid 48) id 403673857C6E; Mon, 13 Sep 2021 02:50:10 +0000 (GMT)
From: "xujing99 at huawei dot com"
To: glibc-bugs@sourceware.org
Subject: [Bug dynamic-link/19329] dl-tls.c assert failure at concurrent pthread_create and dlopen
Date: Mon, 13 Sep 2021 02:50:09 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: glibc
X-Bugzilla-Component: dynamic-link
X-Bugzilla-Version: 2.22
X-Bugzilla-Severity: normal
X-Bugzilla-Who: xujing99 at huawei dot com
X-Bugzilla-Status: RESOLVED
X-Bugzilla-Resolution: FIXED
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: nszabolcs at gmail dot com
X-Bugzilla-Target-Milestone: 2.34
Content-Type: text/plain; charset="UTF-8"
X-Bugzilla-URL: http://sourceware.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: Glibc-bugs mailing list

https://sourceware.org/bugzilla/show_bug.cgi?id=19329

--- Comment #39 from xujing ---
(In reply to Szabolcs Nagy from comment #38)
> (In reply to xujing from comment #35)
> > (In reply to cvs-commit@gcc.gnu.org from comment #31)
> > > commit 1387ad6225c2222f027790e3f460e31aa5dd2c54
> > > Author: Szabolcs Nagy
> > > Date: Wed Dec 30 19:19:37 2020 +0000
> > >
> > > elf: Fix data races in pthread_create and TLS access [BZ #19329]
> > >
> > this patch use dl_load_lock in _dl_allocate_tls_init, is there a problem
> > when dlopen a dynamic library which will call pthread_create? I think it
> > will cause dl_load_lock and dl_load_lock dead lock.
>
> the real bug is that ctors are run with the dlopen lock held.
> that can causes deadlocks anyway (a ctor can create threads
> and that thread can call dlopen). this is bug 15686 which is not
> easy to fix, but that's the right solution. (in general, running
> user callbacks while libc internal locks are held is wrong.)
>
> that bug is now more exposed because the lock is also taken
> at _dl_allocate_tls_init during thread creation. however i
> expect that to be called in the parent thread only, so there
> should be no deadlock when ctor calls pthread_create, only
> when the child thread calls it again (which i considered rare).
>
> if you have example code that you think should work but now
> deadlocks, then please report it.

I'm sorry, I misled you. I think there is an ABBA deadlock issue in some
scenarios.

Suppose I have a C++ dynamic library (named libA.so) that contains a global
object. When dlopen loads libA.so, the global object calls the
post-constructor at initialization and takes its own lock (named A_lock).
Assume that two threads execute the following sequences:

Thread1: dlopen(libA.so) => hold dl_load_lock => load libA.so => init global
         object from libA.so => wait for A_lock
Thread2: my own code holds A_lock => pthread_create => _dl_allocate_tls_init
         => wait for dl_load_lock

In this case, an ABBA deadlock occurs. Is this a bug?

My stack looks like this:
Thread 1 (LWP 136013):
#0  0x00007f57a108510d in ?? () from /usr/lib64/libpthread.so.0
#1  0x00007f57a107e4d1 in pthread_mutex_lock () from /usr/lib64/libpthread.so.0
#1  stack waiting for holding A_lock
...
#6  0x00007f5781c1bb8b in LogProcess::Init (strProcName=..., nProcHandle=nProcHandle@entry=0) at ./service/biz_frame/code/server/src/logging/logprocess.cpp:107
...
#20 0x00007f57a0fef21f in _dl_catch_exception () from /usr/lib64/libc.so.6
#21 0x00007f57a786442b in ?? () from /lib64/ld-linux-x86-64.so.2
#22 0x00007f57a3de2296 in ?? () from /usr/lib64/libdl.so.2
#23 0x00007f57a0fef21f in _dl_catch_exception () from /usr/lib64/libc.so.6
#24 0x00007f57a0fef2af in _dl_catch_error () from /usr/lib64/libc.so.6
#25 0x00007f57a3de2985 in ?? () from /usr/lib64/libdl.so.2
#26 0x00007f57a3de2351 in dlopen () from /usr/lib64/libdl.so.2
...
...
#38 0x00007f57a0fb3520 in clone () from /usr/lib64/libc.so.6

Thread 2 (LWP 134627):
#0  0x00007f57a108510d in ?? () from /usr/lib64/libpthread.so.0
#1  0x00007f57a107e580 in pthread_mutex_lock () from /usr/lib64/libpthread.so.0
#2  0x00007f57a7863835 in _dl_allocate_tls_init () from /lib64/ld-linux-x86-64.so.2
#3  0x00007f57a107cb7c in pthread_create () from /usr/lib64/libpthread.so.0
...
#10 Stack holding A_lock
...
#14 0x0000561689e0d579 in main ()
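
For readers who want to see the lock ordering in isolation, here is a minimal sketch of the two sequences described above. It is only a model, not glibc code: two Python locks stand in for dl_load_lock and A_lock, and the function and thread names are made up for illustration. Each thread takes its first lock, waits until the other thread holds its own first lock, and then probes the second lock without blocking, so the script reports the deadlock condition instead of hanging.

```python
import threading


def abba_demo():
    """Model the two lock orders from the report without actually hanging."""
    load_lock = threading.Lock()  # stand-in for glibc's dl_load_lock (assumption)
    a_lock = threading.Lock()     # stand-in for the application's A_lock (assumption)

    both_hold_first = threading.Barrier(2)  # both threads hold their first lock
    both_probed = threading.Barrier(2)      # both threads have probed the second lock
    would_block = {}

    def worker(name, first, second):
        with first:
            both_hold_first.wait()
            # A blocking acquire here is where the real program hangs;
            # probe with a non-blocking acquire instead.
            got = second.acquire(blocking=False)
            would_block[name] = not got
            # Keep holding the first lock until the other thread has probed too.
            both_probed.wait()
            if got:
                second.release()

    threads = [
        # Thread1 in the report: dlopen takes dl_load_lock, the ctor wants A_lock.
        threading.Thread(target=worker, args=("dlopen_thread", load_lock, a_lock)),
        # Thread2 in the report: holds A_lock, pthread_create wants dl_load_lock.
        threading.Thread(target=worker, args=("create_thread", a_lock, load_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return would_block


if __name__ == "__main__":
    for name, blocked in sorted(abba_demo().items()):
        print(name, "would block:", blocked)
```

Both entries come back True, which is the ABBA condition: each side holds the lock the other side is waiting for. In the model, making both threads take the locks in the same order removes the condition; whether that is practical in the real program depends on code that is not shown in this report.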