public inbox for glibc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug dynamic-link/27137] New: lazy tlsdesc relocation has data races
@ 2020-12-31 16:08 nsz at gcc dot gnu.org
  2021-04-09 17:53 ` [Bug dynamic-link/27137] " woodard at redhat dot com
                   ` (7 more replies)
  0 siblings, 8 replies; 9+ messages in thread
From: nsz at gcc dot gnu.org @ 2020-12-31 16:08 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27137

            Bug ID: 27137
           Summary: lazy tlsdesc relocation has data races
           Product: glibc
           Version: 2.32
            Status: NEW
          Severity: normal
          Priority: P2
         Component: dynamic-link
          Assignee: unassigned at sourceware dot org
          Reporter: nsz at gcc dot gnu.org
  Target Milestone: ---

on x86 tlsdesc can be lazily relocated, but that

(1) requires locks for _dl_make_tlsdesc_dynamic
(2) allocations in _dl_make_tlsdesc_dynamic can fail
(3) calls _dl_try_allocate_static_tls without locks

from these issues (1) and (2) applies to normal tls access
too (bug 16133 and bug 16134) but (3) is specific to lazy
tlsdesc relocations: there are a lot of accesses to global
state that was not designed to be done without holding
locks such as writing to GL(dl_tls_static_optional),
GL(dl_tls_static_used) and map->l_tls_offset.

on arm and aarch64 lazy tlsdesc reloc was removed earlier
because of data races between accesses to the same tls
variable, these are data races with concurrent thread
creation and independent tls accesses.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug dynamic-link/27137] lazy tlsdesc relocation has data races
  2020-12-31 16:08 [Bug dynamic-link/27137] New: lazy tlsdesc relocation has data races nsz at gcc dot gnu.org
@ 2021-04-09 17:53 ` woodard at redhat dot com
  2021-04-09 17:53 ` woodard at redhat dot com
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: woodard at redhat dot com @ 2021-04-09 17:53 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27137

Ben Woodard <woodard at redhat dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |woodard at redhat dot com

--- Comment #1 from Ben Woodard <woodard at redhat dot com> ---
It is hard to argue that this is the same problem but the same bit of code is
to blame. The calculation for the TLSDESC relocation ends up crashing the
dynamic linker when done lazily in some cases. The patch that addresses these
race conditions by disabling lazy relocations also fixes this problem. In this
particular case, the problem doesn't appear to be a data race, somehow some
libraries are being compiled in such a way that they use the gnu2 TLS variant
but they do not contain a PLT or GOT this causes the relocation calculation to
go awry and execution crashes in the dynamic linker.

Steps to Reproduce:
1. compile up auditor.so and dlmain.c
2. LD_AUDIT=/path/to/auditor.so /path/to/dlmain /path/to/libmemkind.so

The patch that addresses this problem also addresses the problem I am
describing. 
The key one needed to fix this problem is:
https://sourceware.org/pipermail/libc-alpha/2021-February/122637.html
It simply disables lazy binding of gnu2 TLSDESC relocations. Unfortunately,
there is a minor bug in that patch which the author said that he will address
with the second version of the patch set.
https://sourceware.org/pipermail/libc-alpha/2021-April/124889.html

It is fairly likely that a variant of this problem also exists in i386 but I
have yet to see that in the wild. It therefore may be worth considering also
adding patch 12 from Szabolcs Nagy as well when fixing this problem.
https://sourceware.org/pipermail/libc-alpha/2021-February/122638.html

Patches 13 and 14 are not strictly necessary to address this bug but they
remove the code that has the bug in it.
https://sourceware.org/pipermail/libc-alpha/2021-February/122639.html
https://sourceware.org/pipermail/libc-alpha/2021-February/122640.html

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug dynamic-link/27137] lazy tlsdesc relocation has data races
  2020-12-31 16:08 [Bug dynamic-link/27137] New: lazy tlsdesc relocation has data races nsz at gcc dot gnu.org
  2021-04-09 17:53 ` [Bug dynamic-link/27137] " woodard at redhat dot com
@ 2021-04-09 17:53 ` woodard at redhat dot com
  2021-04-12  8:36 ` nsz at gcc dot gnu.org
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: woodard at redhat dot com @ 2021-04-09 17:53 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27137

--- Comment #2 from Ben Woodard <woodard at redhat dot com> ---
Created attachment 13359
  --> https://sourceware.org/bugzilla/attachment.cgi?id=13359&action=edit
simple reproducer of this problem

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug dynamic-link/27137] lazy tlsdesc relocation has data races
  2020-12-31 16:08 [Bug dynamic-link/27137] New: lazy tlsdesc relocation has data races nsz at gcc dot gnu.org
  2021-04-09 17:53 ` [Bug dynamic-link/27137] " woodard at redhat dot com
  2021-04-09 17:53 ` woodard at redhat dot com
@ 2021-04-12  8:36 ` nsz at gcc dot gnu.org
  2021-04-15  8:48 ` cvs-commit at gcc dot gnu.org
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: nsz at gcc dot gnu.org @ 2021-04-12  8:36 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27137

--- Comment #3 from Szabolcs Nagy <nsz at gcc dot gnu.org> ---
(In reply to Ben Woodard from comment #1)
> somehow some libraries are being compiled in such a way that they use the
> gnu2 TLS variant but they do not contain a PLT or GOT this causes the
> relocation calculation to go awry and execution crashes in the dynamic
> linker.

that would be a linker bug as the DT_TLSDESC_PLT/GOT are needed
for lazy tlsdesc to work, however..

libmemkind.so DYNAMIC section has

 0x000000000000001e (FLAGS)              BIND_NOW
 0x000000006ffffffb (FLAGS_1)            Flags: NOW

i.e. it does not support lazy entry points.

The bug in this case is in ldaudit: it disables bind now and
tries to relocate everything lazily (so it can use plt hooks)
that's wrong for tlsdesc, i opened bug 27721.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug dynamic-link/27137] lazy tlsdesc relocation has data races
  2020-12-31 16:08 [Bug dynamic-link/27137] New: lazy tlsdesc relocation has data races nsz at gcc dot gnu.org
                   ` (2 preceding siblings ...)
  2021-04-12  8:36 ` nsz at gcc dot gnu.org
@ 2021-04-15  8:48 ` cvs-commit at gcc dot gnu.org
  2021-04-15  8:49 ` cvs-commit at gcc dot gnu.org
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-15  8:48 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27137

--- Comment #4 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Szabolcs Nagy <nsz@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=8f7e09f4dbdb5c815a18b8285fbc5d5d7bc17d86

commit 8f7e09f4dbdb5c815a18b8285fbc5d5d7bc17d86
Author: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date:   Thu Feb 11 11:29:23 2021 +0000

    x86_64: Avoid lazy relocation of tlsdesc [BZ #27137]

    Lazy tlsdesc relocation is racy because the static tls optimization and
    tlsdesc management operations are done without holding the dlopen lock.

    This similar to the commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67
    for aarch64, but it fixes a different race: bug 27137.

    Another issue is that ld auditing ignores DT_BIND_NOW and thus tries to
    relocate tlsdesc lazily, but that does not work in a BIND_NOW module
    due to missing DT_TLSDESC_PLT. Unconditionally relocating tlsdesc at
    load time fixes this bug 27721 too.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug dynamic-link/27137] lazy tlsdesc relocation has data races
  2020-12-31 16:08 [Bug dynamic-link/27137] New: lazy tlsdesc relocation has data races nsz at gcc dot gnu.org
                   ` (3 preceding siblings ...)
  2021-04-15  8:48 ` cvs-commit at gcc dot gnu.org
@ 2021-04-15  8:49 ` cvs-commit at gcc dot gnu.org
  2021-04-15 10:45 ` nsz at gcc dot gnu.org
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-04-15  8:49 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27137

--- Comment #5 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Szabolcs Nagy <nsz@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ddcacd91cc10ff92d6201eda87047d029c14158d

commit ddcacd91cc10ff92d6201eda87047d029c14158d
Author: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date:   Thu Feb 11 11:40:11 2021 +0000

    i386: Avoid lazy relocation of tlsdesc [BZ #27137]

    Lazy tlsdesc relocation is racy because the static tls optimization and
    tlsdesc management operations are done without holding the dlopen lock.

    This similar to the commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67
    for aarch64, but it fixes a different race: bug 27137.

    On i386 the code is a bit more complicated than on x86_64 because both
    rel and rela relocs are supported.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug dynamic-link/27137] lazy tlsdesc relocation has data races
  2020-12-31 16:08 [Bug dynamic-link/27137] New: lazy tlsdesc relocation has data races nsz at gcc dot gnu.org
                   ` (4 preceding siblings ...)
  2021-04-15  8:49 ` cvs-commit at gcc dot gnu.org
@ 2021-04-15 10:45 ` nsz at gcc dot gnu.org
  2021-05-11 16:17 ` cvs-commit at gcc dot gnu.org
  2021-06-10  2:30 ` xujing99 at huawei dot com
  7 siblings, 0 replies; 9+ messages in thread
From: nsz at gcc dot gnu.org @ 2021-04-15 10:45 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27137

Szabolcs Nagy <nsz at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |FIXED
             Status|NEW                         |RESOLVED
            Version|2.32                        |2.34

--- Comment #6 from Szabolcs Nagy <nsz at gcc dot gnu.org> ---
fixed for 2.34

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug dynamic-link/27137] lazy tlsdesc relocation has data races
  2020-12-31 16:08 [Bug dynamic-link/27137] New: lazy tlsdesc relocation has data races nsz at gcc dot gnu.org
                   ` (5 preceding siblings ...)
  2021-04-15 10:45 ` nsz at gcc dot gnu.org
@ 2021-05-11 16:17 ` cvs-commit at gcc dot gnu.org
  2021-06-10  2:30 ` xujing99 at huawei dot com
  7 siblings, 0 replies; 9+ messages in thread
From: cvs-commit at gcc dot gnu.org @ 2021-05-11 16:17 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27137

--- Comment #7 from cvs-commit at gcc dot gnu.org <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Szabolcs Nagy <nsz@sourceware.org>:

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=1387ad6225c2222f027790e3f460e31aa5dd2c54

commit 1387ad6225c2222f027790e3f460e31aa5dd2c54
Author: Szabolcs Nagy <szabolcs.nagy@arm.com>
Date:   Wed Dec 30 19:19:37 2020 +0000

    elf: Fix data races in pthread_create and TLS access [BZ #19329]

    DTV setup at thread creation (_dl_allocate_tls_init) is changed
    to take the dlopen lock, GL(dl_load_lock).  Avoiding data races
    here without locks would require design changes: the map that is
    accessed for static TLS initialization here may be concurrently
    freed by dlclose.  That use after free may be solved by only
    locking around static TLS setup or by ensuring dlclose does not
    free modules with static TLS, however currently every link map
    with TLS has to be accessed at least to see if it needs static
    TLS.  And even if that's solved, still a lot of atomics would be
    needed to synchronize DTV related globals without a lock. So fix
    both bug 19329 and bug 27111 with a lock that prevents DTV setup
    running concurrently with dlopen or dlclose.

    _dl_update_slotinfo at TLS access still does not use any locks
    so CONCURRENCY NOTES are added to explain the synchronization.
    The early exit from the slotinfo walk when max_modid is reached
    is not strictly necessary, but does not hurt either.

    An incorrect acquire load was removed from _dl_resize_dtv: it
    did not synchronize with any release store or fence and
    synchronization is now handled separately at thread creation
    and TLS access time.

    There are still a number of racy read accesses to globals that
    will be changed to relaxed MO atomics in a followup patch. This
    should not introduce regressions compared to existing behaviour
    and avoid cluttering the main part of the fix.

    Not all TLS access related data races got fixed here: there are
    additional races at lazy tlsdesc relocations see bug 27137.

    Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [Bug dynamic-link/27137] lazy tlsdesc relocation has data races
  2020-12-31 16:08 [Bug dynamic-link/27137] New: lazy tlsdesc relocation has data races nsz at gcc dot gnu.org
                   ` (6 preceding siblings ...)
  2021-05-11 16:17 ` cvs-commit at gcc dot gnu.org
@ 2021-06-10  2:30 ` xujing99 at huawei dot com
  7 siblings, 0 replies; 9+ messages in thread
From: xujing99 at huawei dot com @ 2021-06-10  2:30 UTC (permalink / raw)
  To: glibc-bugs

https://sourceware.org/bugzilla/show_bug.cgi?id=27137

xujing <xujing99 at huawei dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xujing99 at huawei dot com

--- Comment #8 from xujing <xujing99 at huawei dot com> ---
(In reply to Ben Woodard from comment #2)
> Created attachment 13359 [details]
> simple reproducer of this problem

Hi,how can I reproduce on i386? I try to reproduce it on x86_64 by building
32bits program. The steps are as follows:
1. add -m32 to make, like this:
    dlmain: dlmain.c
            gcc -O2 -m32 -g -o dlmain dlmain.c -ldl
    auditor.so: auditor.c
            gcc -O2 -m32 -g -fPIC -shared -o auditor.so auditor.c
2. make
3. make auditor.so
4. LD_AUDIT=/path/to/auditor.so /path/to/dlmain /path/to/libmemkind.so

And I try to build(get) libmemkind.so(ELF 32-bit LSB shared object), but I
can't success.
I also try to use other shared library(libreadline.so), I can't reproduce
too(on x86_64 can't reproduce too. I think any shared library can replace
libmemkind.so to reproduce but not).

Is it only produce on 32bits mathine? Or other problem?

Thanks.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2021-06-10  2:30 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-31 16:08 [Bug dynamic-link/27137] New: lazy tlsdesc relocation has data races nsz at gcc dot gnu.org
2021-04-09 17:53 ` [Bug dynamic-link/27137] " woodard at redhat dot com
2021-04-09 17:53 ` woodard at redhat dot com
2021-04-12  8:36 ` nsz at gcc dot gnu.org
2021-04-15  8:48 ` cvs-commit at gcc dot gnu.org
2021-04-15  8:49 ` cvs-commit at gcc dot gnu.org
2021-04-15 10:45 ` nsz at gcc dot gnu.org
2021-05-11 16:17 ` cvs-commit at gcc dot gnu.org
2021-06-10  2:30 ` xujing99 at huawei dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).