public inbox for libc-alpha@sourceware.org
* [PATCH 00/15] Dynamic TLS related data race fixes
@ 2021-02-15 11:56 Szabolcs Nagy
  2021-02-15 11:56 ` [PATCH 01/15] aarch64: free tlsdesc data on dlclose [BZ #27403] Szabolcs Nagy
                   ` (17 more replies)
  0 siblings, 18 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 11:56 UTC (permalink / raw)
  To: libc-alpha

This series tries to address several long-standing dynamic
linker races:

19329 - race between tls access, pthread_create and dlopen
27111 - race between tls access, pthread_create and dlclose
21349 - a lazy symbol binding race (not related to TLS)
27137 - lazy tlsdesc relocation on x86 is racy

and there were a number of minor issues uncovered:

27135 - dtv gaps are not reused
27136 - dtv off-by-1 error
27403 - aarch64 memleak on dlclose

There are related issues that are not fixed here, but I tried
not to make the situation worse:

16133 - tls access is not as-safe
16134 - tls allocation aborts on oom
27112 - generation count may overflow on 32 bit targets

Some of the patches can be handled independently.

The set is also available in the nsz/bug19329 branch.

The new test cases are placed early in the series so they can
be run before the fixes are applied.

A common theme between the races is that the GL(dl_load_lock)
lock protects various pieces of shared dynamic linker global
state that are written under that lock, but in various
situations (lazy binding, TLS access, thread creation) that
state is read without the lock.

An earlier analysis of bug 19329:
https://sourceware.org/pipermail/libc-alpha/2020-December/121090.html

In some cases atomics are used for synchronization; this means
all accesses to such an object have to be atomic even if they
happen behind a lock.  I think this is unnecessary and can be
confusing:
https://sourceware.org/pipermail/libc-alpha/2020-December/121200.html

This patchset does not try to turn non-racy reads into relaxed
atomic reads.

The removal of lazy tlsdesc relocation may allow further
changes at the end.  The x86 changes were not extensively
tested.  I ran the glibc tests on x86_64, i386 and aarch64
linux.

Maninder Singh (1):
  elf: Fix data race in _dl_name_match_p [BZ #21349]

Szabolcs Nagy (14):
  aarch64: free tlsdesc data on dlclose [BZ #27403]
  Add test case for [BZ #19329]
  Add a DTV setup test [BZ #27136]
  elf: Fix a DTV setup issue [BZ #27136]
  elf: Fix comments and logic in _dl_add_to_slotinfo
  elf: Refactor _dl_update_slotinfo to avoid use after free
  elf: Fix data races in pthread_create and TLS access [BZ #19329]
  elf: Use relaxed atomics for racy accesses [BZ #19329]
  elf: Fix DTV gap reuse logic [BZ #27135]
  x86_64: Avoid lazy relocation of tlsdesc [BZ #27137]
  i386: Avoid lazy relocation of tlsdesc [BZ #27137]
  x86_64: Remove lazy tlsdesc relocation related code
  i386: Remove lazy tlsdesc relocation related code
  elf: Remove lazy tlsdesc relocation related code

 elf/dl-close.c                 |  26 ++--
 elf/dl-load.c                  |  18 ++-
 elf/dl-misc.c                  |   4 +-
 elf/dl-open.c                  |  15 +--
 elf/dl-tls.c                   | 133 +++++++++++--------
 elf/tlsdeschtab.h              |  53 +-------
 nptl/Makefile                  |  25 +++-
 nptl/tst-tls7.c                |  69 ++++++++++
 nptl/tst-tls7mod-dep.c         |   1 +
 nptl/tst-tls7mod.c             |   1 +
 nptl/tst-tls8.c                |  96 ++++++++++++++
 nptl/tst-tls8mod-bad.c         |   2 +
 sysdeps/aarch64/dl-lookupcfg.h |  27 ++++
 sysdeps/aarch64/tlsdesc.c      |   1 -
 sysdeps/arm/tlsdesc.c          |   1 -
 sysdeps/i386/dl-machine.h      |  76 +++++------
 sysdeps/i386/dl-tlsdesc.S      | 156 ----------------------
 sysdeps/i386/dl-tlsdesc.h      |   6 +-
 sysdeps/i386/tlsdesc.c         | 230 ---------------------------------
 sysdeps/x86_64/dl-machine.h    |  19 ++-
 sysdeps/x86_64/dl-tlsdesc.S    | 104 ---------------
 sysdeps/x86_64/dl-tlsdesc.h    |   4 +-
 sysdeps/x86_64/tlsdesc.c       | 108 ----------------
 23 files changed, 393 insertions(+), 782 deletions(-)
 create mode 100644 nptl/tst-tls7.c
 create mode 100644 nptl/tst-tls7mod-dep.c
 create mode 100644 nptl/tst-tls7mod.c
 create mode 100644 nptl/tst-tls8.c
 create mode 100644 nptl/tst-tls8mod-bad.c
 create mode 100644 sysdeps/aarch64/dl-lookupcfg.h

-- 
2.17.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 01/15] aarch64: free tlsdesc data on dlclose [BZ #27403]
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
@ 2021-02-15 11:56 ` Szabolcs Nagy
  2021-04-01 12:57   ` Adhemerval Zanella
  2021-02-15 11:56 ` [PATCH 02/15] elf: Fix data race in _dl_name_match_p [BZ #21349] Szabolcs Nagy
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 11:56 UTC (permalink / raw)
  To: libc-alpha

DL_UNMAP_IS_SPECIAL and DL_UNMAP were not defined. The definitions are
now copied from arm, since the same is needed on aarch64. The cleanup
of tlsdesc data is handled by the custom _dl_unmap.

Fixes bug 27403.
---
 sysdeps/aarch64/dl-lookupcfg.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)
 create mode 100644 sysdeps/aarch64/dl-lookupcfg.h

diff --git a/sysdeps/aarch64/dl-lookupcfg.h b/sysdeps/aarch64/dl-lookupcfg.h
new file mode 100644
index 0000000000..c1ae676ae1
--- /dev/null
+++ b/sysdeps/aarch64/dl-lookupcfg.h
@@ -0,0 +1,27 @@
+/* Configuration of lookup functions.
+   Copyright (C) 2006-2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library.  If not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#define DL_UNMAP_IS_SPECIAL
+
+#include_next <dl-lookupcfg.h>
+
+struct link_map;
+
+extern void _dl_unmap (struct link_map *map);
+
+#define DL_UNMAP(map) _dl_unmap (map)
-- 
2.17.1



* [PATCH 02/15] elf: Fix data race in _dl_name_match_p [BZ #21349]
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
  2021-02-15 11:56 ` [PATCH 01/15] aarch64: free tlsdesc data on dlclose [BZ #27403] Szabolcs Nagy
@ 2021-02-15 11:56 ` Szabolcs Nagy
  2021-04-01 14:01   ` Adhemerval Zanella
  2021-02-15 11:57 ` [PATCH 03/15] Add test case for [BZ #19329] Szabolcs Nagy
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 11:56 UTC (permalink / raw)
  To: libc-alpha; +Cc: Maninder Singh

From: Maninder Singh <maninder1.s@samsung.com>

dlopen updates libname_list by writing to lastp->next, but concurrent
reads in _dl_name_match_p were not synchronized when it was called
without holding GL(dl_load_lock), which can happen during lazy symbol
resolution.

This patch fixes the race between _dl_name_match_p reading
lastp->next and add_name_to_object writing to it.  The race could
cause a segfault on targets with weak memory ordering when
lastp->next->name is read, which was observed on an arm system.
Fixes bug 21349.

(Code is from Maninder Singh, comments and description are from
Szabolcs Nagy.)

Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
---
 elf/dl-load.c | 18 +++++++++++++++++-
 elf/dl-misc.c |  4 +++-
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/elf/dl-load.c b/elf/dl-load.c
index 9e2089cfaa..be54bafad5 100644
--- a/elf/dl-load.c
+++ b/elf/dl-load.c
@@ -438,7 +438,23 @@ add_name_to_object (struct link_map *l, const char *name)
   newname->name = memcpy (newname + 1, name, name_len);
   newname->next = NULL;
   newname->dont_free = 0;
-  lastp->next = newname;
+  /* CONCURRENCY NOTES:
+
+     Make sure the initialization of newname happens before its address is
+     read from the lastp->next store below.
+
+     GL(dl_load_lock) is held here (and by other writers, e.g. dlclose), so
+     readers of libname_list->next (e.g. _dl_check_caller or the reads above)
+     can use that for synchronization, however the read in _dl_name_match_p
+     may be executed without holding the lock during _dl_runtime_resolve
+     (i.e. lazy symbol resolution when a function of library l is called).
+
+     The release MO store below synchronizes with the acquire MO load in
+     _dl_name_match_p.  Other writes need to synchronize with that load too,
+     however those happen either early when the process is single threaded
+     (dl_main) or when the library is unloaded (dlclose) and the user has to
+     synchronize library calls with unloading.  */
+  atomic_store_release (&lastp->next, newname);
 }
 
 /* Standard search directories.  */
diff --git a/elf/dl-misc.c b/elf/dl-misc.c
index 082f75f459..d4803bba4e 100644
--- a/elf/dl-misc.c
+++ b/elf/dl-misc.c
@@ -347,7 +347,9 @@ _dl_name_match_p (const char *name, const struct link_map *map)
     if (strcmp (name, runp->name) == 0)
       return 1;
     else
-      runp = runp->next;
+      /* Synchronize with the release MO store in add_name_to_object.
+	 See CONCURRENCY NOTES in add_name_to_object in dl-load.c.  */
+      runp = atomic_load_acquire (&runp->next);
 
   return 0;
 }
-- 
2.17.1



* [PATCH 03/15] Add test case for [BZ #19329]
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
  2021-02-15 11:56 ` [PATCH 01/15] aarch64: free tlsdesc data on dlclose [BZ #27403] Szabolcs Nagy
  2021-02-15 11:56 ` [PATCH 02/15] elf: Fix data race in _dl_name_match_p [BZ #21349] Szabolcs Nagy
@ 2021-02-15 11:57 ` Szabolcs Nagy
  2021-04-02 19:10   ` Adhemerval Zanella
  2021-02-15 11:59 ` [PATCH 04/15] Add a DTV setup test [BZ #27136] Szabolcs Nagy
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 11:57 UTC (permalink / raw)
  To: libc-alpha

From: Szabolcs Nagy <szabolcs.nagy@arm.com>

Test concurrent dlopen and pthread_create when the loaded
modules have TLS.  This triggers dl-tls assertion failures
more reliably than the tst-stack4 test.

The dlopened module has 100 DT_NEEDED dependencies; the
number of concurrent threads created during dlopen depends
on filesystem speed and hardware.  At most 3 threads exist
at a time to limit resource usage.

Doing the test in a fork loop can make it more reliable.

---
v4:
- rebased, updated copyright year.
- moved to tests-internal because of <atomic.h>

v3:
- use the new test support code.
- better Makefile usage so modules are cleaned properly.

v2:
- undef NDEBUG.
- join nop threads so at most 3 threads exist at a time.
- remove stack size setting (resource usage is no longer a concern).
- stop creating threads after dlopen observably finished.
- print the number of threads created for debugging.

2016-12-13  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	* nptl/Makefile (tests): Add tst-tls7.
	(modules-names): Add tst-tls7mod, tst-tls7mod-dep.
	* nptl/tst-tls7.c: New file.
	* nptl/tst-tls7mod-dep.c: New file.
	* nptl/tst-tls7mod.c: New file.
---
 nptl/Makefile          | 17 +++++++++--
 nptl/tst-tls7.c        | 69 ++++++++++++++++++++++++++++++++++++++++++
 nptl/tst-tls7mod-dep.c |  1 +
 nptl/tst-tls7mod.c     |  1 +
 4 files changed, 85 insertions(+), 3 deletions(-)
 create mode 100644 nptl/tst-tls7.c
 create mode 100644 nptl/tst-tls7mod-dep.c
 create mode 100644 nptl/tst-tls7mod.c

diff --git a/nptl/Makefile b/nptl/Makefile
index 8fb7fee6db..208629876d 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -355,20 +355,25 @@ tests += tst-cancelx7 tst-cancelx17 tst-cleanupx4
 
 ifeq ($(build-shared),yes)
 tests += tst-compat-forwarder tst-audit-threads
-tests-internal += tst-tls3 tst-tls3-malloc tst-tls5 tst-stackguard1
+tests-internal += tst-tls3 tst-tls3-malloc tst-tls5 tst-stackguard1 tst-tls7
 ifeq ($(have-z-execstack),yes)
 tests += tst-execstack
 endif
 endif
 
+one-hundred = $(foreach x,0 1 2 3 4 5 6 7 8 9, \
+  0$x 1$x 2$x 3$x 4$x 5$x 6$x 7$x 8$x 9$x)
+tst-tls7mod-deps = $(one-hundred:%=tst-tls7mod-dep-%.so)
+
 modules-names = tst-tls3mod \
 		tst-tls5mod tst-tls5moda tst-tls5modb tst-tls5modc \
 		tst-tls5modd tst-tls5mode tst-tls5modf tst-stack4mod \
 		tst-execstack-mod \
 		tst-compat-forwarder-mod tst-audit-threads-mod1 \
-		tst-audit-threads-mod2
+		tst-audit-threads-mod2 \
+		tst-tls7mod tst-tls7mod-dep
 extra-test-objs += $(addsuffix .os,$(strip $(modules-names))) \
-		   tst-cleanup4aux.o tst-cleanupx4aux.o
+		   tst-cleanup4aux.o tst-cleanupx4aux.o $(tst-tls7mod-deps)
 test-extras += tst-cleanup4aux tst-cleanupx4aux
 
 tst-tls3mod.so-no-z-defs = yes
@@ -517,6 +522,12 @@ $(objpfx)tst-tls6.out: tst-tls6.sh $(objpfx)tst-tls5 \
 	$(evaluate-test)
 endif
 
+$(objpfx)tst-tls7: $(libdl) $(shared-thread-library)
+$(objpfx)tst-tls7.out: $(objpfx)tst-tls7mod.so
+$(objpfx)tst-tls7mod.so: $(tst-tls7mod-deps:%=$(objpfx)%)
+$(tst-tls7mod-deps:%=$(objpfx)%): $(objpfx)tst-tls7mod-dep.so
+	cp -f $< $@
+
 $(objpfx)tst-dlsym1: $(libdl) $(shared-thread-library)
 
 ifeq (yes,$(build-shared))
diff --git a/nptl/tst-tls7.c b/nptl/tst-tls7.c
new file mode 100644
index 0000000000..4b3f9005fe
--- /dev/null
+++ b/nptl/tst-tls7.c
@@ -0,0 +1,69 @@
+/* Test concurrent dlopen and pthread_create: BZ 19329.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <dlfcn.h>
+#include <pthread.h>
+#include <stdio.h>
+#include <atomic.h>
+
+#include <support/xthread.h>
+
+#define THREADS 10000
+
+static volatile int done;
+
+static void *
+start (void *a)
+{
+  if (dlopen ("tst-tls7mod.so", RTLD_LAZY) == NULL)
+    {
+      printf ("dlopen failed: %s\n", dlerror ());
+      exit (1);
+    }
+  atomic_store_relaxed (&done, 1);
+  return 0;
+}
+
+static void *
+nop (void *a)
+{
+  return 0;
+}
+
+static int
+do_test (void)
+{
+  pthread_t t1, t2;
+  int i;
+
+  /* Load a module with lots of dependencies and TLS.  */
+  t1 = xpthread_create (0, start, 0);
+
+  /* Concurrently create lots of threads until dlopen is observably done.  */
+  for (i = 0; i < THREADS && !atomic_load_relaxed (&done); i++)
+    {
+      t2 = xpthread_create (0, nop, 0);
+      xpthread_join (t2);
+    }
+
+  xpthread_join (t1);
+  printf ("threads created during dlopen: %d\n", i);
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/nptl/tst-tls7mod-dep.c b/nptl/tst-tls7mod-dep.c
new file mode 100644
index 0000000000..206ece4fb3
--- /dev/null
+++ b/nptl/tst-tls7mod-dep.c
@@ -0,0 +1 @@
+int __thread x;
diff --git a/nptl/tst-tls7mod.c b/nptl/tst-tls7mod.c
new file mode 100644
index 0000000000..206ece4fb3
--- /dev/null
+++ b/nptl/tst-tls7mod.c
@@ -0,0 +1 @@
+int __thread x;
-- 
2.17.1



* [PATCH 04/15] Add a DTV setup test [BZ #27136]
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (2 preceding siblings ...)
  2021-02-15 11:57 ` [PATCH 03/15] Add test case for [BZ #19329] Szabolcs Nagy
@ 2021-02-15 11:59 ` Szabolcs Nagy
  2021-04-02 19:35   ` Adhemerval Zanella
  2021-02-15 11:59 ` [PATCH 05/15] elf: Fix a DTV setup issue " Szabolcs Nagy
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 11:59 UTC (permalink / raw)
  To: libc-alpha

The test relies on reusing slotinfo entries after dlclose, which
can result in non-monotonically increasing generation counters in
the slotinfo list.  It can trigger bug 27136.

The test requires a large number of modules with TLS, so the
modules of tst-tls7 are reused instead of duplicating them.
---
 nptl/Makefile          | 10 ++++-
 nptl/tst-tls8.c        | 96 ++++++++++++++++++++++++++++++++++++++++++
 nptl/tst-tls8mod-bad.c |  2 +
 3 files changed, 106 insertions(+), 2 deletions(-)
 create mode 100644 nptl/tst-tls8.c
 create mode 100644 nptl/tst-tls8mod-bad.c

diff --git a/nptl/Makefile b/nptl/Makefile
index 208629876d..a11b598efd 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -354,7 +354,7 @@ LDFLAGS-pthread.so = -Wl,--enable-new-dtags,-z,nodelete,-z,initfirst
 tests += tst-cancelx7 tst-cancelx17 tst-cleanupx4
 
 ifeq ($(build-shared),yes)
-tests += tst-compat-forwarder tst-audit-threads
+tests += tst-compat-forwarder tst-audit-threads tst-tls8
 tests-internal += tst-tls3 tst-tls3-malloc tst-tls5 tst-stackguard1 tst-tls7
 ifeq ($(have-z-execstack),yes)
 tests += tst-execstack
@@ -371,7 +371,7 @@ modules-names = tst-tls3mod \
 		tst-execstack-mod \
 		tst-compat-forwarder-mod tst-audit-threads-mod1 \
 		tst-audit-threads-mod2 \
-		tst-tls7mod tst-tls7mod-dep
+		tst-tls7mod tst-tls7mod-dep tst-tls8mod-bad
 extra-test-objs += $(addsuffix .os,$(strip $(modules-names))) \
 		   tst-cleanup4aux.o tst-cleanupx4aux.o $(tst-tls7mod-deps)
 test-extras += tst-cleanup4aux tst-cleanupx4aux
@@ -528,6 +528,12 @@ $(objpfx)tst-tls7mod.so: $(tst-tls7mod-deps:%=$(objpfx)%)
 $(tst-tls7mod-deps:%=$(objpfx)%): $(objpfx)tst-tls7mod-dep.so
 	cp -f $< $@
 
+# Reuse tst-tls7mod-dep*.so.
+tst-tls8mod-bad.so-no-z-defs = yes
+$(objpfx)tst-tls8: $(libdl) $(shared-thread-library)
+$(objpfx)tst-tls8.out: $(objpfx)tst-tls8mod-bad.so \
+		       $(tst-tls7mod-deps:%=$(objpfx)%)
+
 $(objpfx)tst-dlsym1: $(libdl) $(shared-thread-library)
 
 ifeq (yes,$(build-shared))
diff --git a/nptl/tst-tls8.c b/nptl/tst-tls8.c
new file mode 100644
index 0000000000..be7b64c9be
--- /dev/null
+++ b/nptl/tst-tls8.c
@@ -0,0 +1,96 @@
+/* Test dtv setup if entries don't have monotonically increasing generations.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <dlfcn.h>
+#include <pthread.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <support/check.h>
+#include <support/xdlfcn.h>
+#include <support/xthread.h>
+
+#define NMOD 100
+static void *mod[NMOD];
+
+static void
+load_fail (void)
+{
+  /* Expected to fail because of a missing symbol.  */
+  void *m = dlopen ("tst-tls8mod-bad.so", RTLD_NOW);
+  if (m != NULL)
+    FAIL_EXIT1 ("dlopen of tst-tls8mod-bad.so succeeded\n");
+}
+
+static void
+load_mod (int i)
+{
+  char buf[100];
+  sprintf (buf, "tst-tls7mod-dep-%02d.so", i);
+  mod[i] = xdlopen (buf, RTLD_LAZY);
+}
+
+static void
+unload_mod (int i)
+{
+  if (mod[i])
+    xdlclose (mod[i]);
+  mod[i] = 0;
+}
+
+static void
+access (int i)
+{
+  dlerror ();
+  int *p = dlsym (mod[i], "x");
+  printf ("mod[%d]: &x = %p\n", i, p);
+  if (p == NULL)
+    FAIL_EXIT1 ("dlsym failed: %s\n", dlerror ());
+  ++*p;
+}
+
+static void *
+start (void *a)
+{
+  for (int i = 0; i < NMOD; i++)
+    if (mod[i])
+      access (i);
+  return 0;
+}
+
+static int
+do_test (void)
+{
+  int i;
+
+  for (i = 0; i < NMOD; i++)
+    {
+      load_mod (i);
+      /* Bump the generation of mod[0] without using new dtv slot.  */
+      unload_mod (0);
+      load_fail (); /* Ensure GL(dl_tls_dtv_gaps) is true: see bug 27135.  */
+      load_mod (0);
+      /* Access TLS in all loaded modules.  */
+      pthread_t t = xpthread_create (0, start, 0);
+      xpthread_join (t);
+    }
+  for (i = 0; i < NMOD; i++)
+    unload_mod (i);
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/nptl/tst-tls8mod-bad.c b/nptl/tst-tls8mod-bad.c
new file mode 100644
index 0000000000..c1aed8ea7d
--- /dev/null
+++ b/nptl/tst-tls8mod-bad.c
@@ -0,0 +1,2 @@
+void missing_symbol (void);
+void f (void) {missing_symbol ();}
-- 
2.17.1



* [PATCH 05/15] elf: Fix a DTV setup issue [BZ #27136]
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (3 preceding siblings ...)
  2021-02-15 11:59 ` [PATCH 04/15] Add a DTV setup test [BZ #27136] Szabolcs Nagy
@ 2021-02-15 11:59 ` Szabolcs Nagy
  2021-04-02 19:46   ` Adhemerval Zanella
  2021-02-15 11:59 ` [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo Szabolcs Nagy
                   ` (12 subsequent siblings)
  17 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 11:59 UTC (permalink / raw)
  To: libc-alpha

The max modid is a valid index in the dtv; it should not be skipped.

The bug is observable if the last module has modid == 64 and its
generation is the same as or less than the max generation of the
previous modules.  Then dtv[0].counter implies dtv[64] is
initialized but it isn't.  Fixes bug 27136.
---
 elf/dl-tls.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index dd76829e74..79b93ad91b 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -590,7 +590,7 @@ _dl_allocate_tls_init (void *result)
 	}
 
       total += cnt;
-      if (total >= GL(dl_tls_max_dtv_idx))
+      if (total > GL(dl_tls_max_dtv_idx))
 	break;
 
       listp = listp->next;
-- 
2.17.1



* [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (4 preceding siblings ...)
  2021-02-15 11:59 ` [PATCH 05/15] elf: Fix a DTV setup issue " Szabolcs Nagy
@ 2021-02-15 11:59 ` Szabolcs Nagy
  2021-04-02 20:50   ` Adhemerval Zanella
  2021-02-15 12:00 ` [PATCH 07/15] elf: Refactor _dl_update_slotinfo to avoid use after free Szabolcs Nagy
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 11:59 UTC (permalink / raw)
  To: libc-alpha

From: Szabolcs Nagy <szabolcs.nagy@arm.com>

Since

  commit a509eb117fac1d764b15eba64993f4bdb63d7f3c
  Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]

the generation counter update is not needed in the failure path.
---
 elf/dl-tls.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index 79b93ad91b..24d00c14ef 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -998,16 +998,7 @@ _dl_add_to_slotinfo (struct link_map *l, bool do_add)
 		+ TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
       if (listp == NULL)
 	{
-	  /* We ran out of memory.  We will simply fail this
-	     call but don't undo anything we did so far.  The
-	     application will crash or be terminated anyway very
-	     soon.  */
-
-	  /* We have to do this since some entries in the dtv
-	     slotinfo array might already point to this
-	     generation.  */
-	  ++GL(dl_tls_generation);
-
+	  /* We ran out of memory while resizing the dtv slotinfo list.  */
 	  _dl_signal_error (ENOMEM, "dlopen", NULL, N_("\
 cannot create TLS data structures"));
 	}
-- 
2.17.1



* [PATCH 07/15] elf: Refactor _dl_update_slotinfo to avoid use after free
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (5 preceding siblings ...)
  2021-02-15 11:59 ` [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo Szabolcs Nagy
@ 2021-02-15 12:00 ` Szabolcs Nagy
  2021-04-06 19:40   ` Adhemerval Zanella
  2021-04-07 17:05   ` Adhemerval Zanella
  2021-02-15 12:01 ` [PATCH 08/15] elf: Fix data races in pthread_create and TLS access [BZ #19329] Szabolcs Nagy
                   ` (10 subsequent siblings)
  17 siblings, 2 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 12:00 UTC (permalink / raw)
  To: libc-alpha

map is not valid to access here because it can be freed by a
concurrent dlclose, so don't check the modid.

The map == 0 and map != 0 code paths can be shared (avoiding
the dtv resize in case of map == 0 is just an optimization:
a larger dtv than necessary would be fine too).
---
 elf/dl-tls.c | 21 +++++----------------
 1 file changed, 5 insertions(+), 16 deletions(-)

diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index 24d00c14ef..f8b32b3ecb 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -743,6 +743,8 @@ _dl_update_slotinfo (unsigned long int req_modid)
 	{
 	  for (size_t cnt = total == 0 ? 1 : 0; cnt < listp->len; ++cnt)
 	    {
+	      size_t modid = total + cnt;
+
 	      size_t gen = listp->slotinfo[cnt].gen;
 
 	      if (gen > new_gen)
@@ -758,25 +760,12 @@ _dl_update_slotinfo (unsigned long int req_modid)
 
 	      /* If there is no map this means the entry is empty.  */
 	      struct link_map *map = listp->slotinfo[cnt].map;
-	      if (map == NULL)
-		{
-		  if (dtv[-1].counter >= total + cnt)
-		    {
-		      /* If this modid was used at some point the memory
-			 might still be allocated.  */
-		      free (dtv[total + cnt].pointer.to_free);
-		      dtv[total + cnt].pointer.val = TLS_DTV_UNALLOCATED;
-		      dtv[total + cnt].pointer.to_free = NULL;
-		    }
-
-		  continue;
-		}
-
 	      /* Check whether the current dtv array is large enough.  */
-	      size_t modid = map->l_tls_modid;
-	      assert (total + cnt == modid);
 	      if (dtv[-1].counter < modid)
 		{
+		  if (map == NULL)
+		    continue;
+
 		  /* Resize the dtv.  */
 		  dtv = _dl_resize_dtv (dtv);
 
-- 
2.17.1



* [PATCH 08/15] elf: Fix data races in pthread_create and TLS access [BZ #19329]
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (6 preceding siblings ...)
  2021-02-15 12:00 ` [PATCH 07/15] elf: Refactor _dl_update_slotinfo to avoid use after free Szabolcs Nagy
@ 2021-02-15 12:01 ` Szabolcs Nagy
  2021-02-15 12:01 ` [PATCH 09/15] elf: Use relaxed atomics for racy accesses " Szabolcs Nagy
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 12:01 UTC (permalink / raw)
  To: libc-alpha

DTV setup at thread creation (_dl_allocate_tls_init) is changed
to take the dlopen lock, GL(dl_load_lock).  Avoiding data races
here without locks would require design changes: the map that is
accessed for static TLS initialization here may be concurrently
freed by dlclose.  That use after free may be solved by only
locking around static TLS setup or by ensuring dlclose does not
free modules with static TLS, however currently every link map
with TLS has to be accessed at least to see if it needs static
TLS.  And even if that's solved, still a lot of atomics would be
needed to synchronize DTV related globals without a lock. So fix
both bug 19329 and bug 27111 with a lock that prevents DTV setup
running concurrently with dlopen or dlclose.

_dl_update_slotinfo at TLS access still does not use any locks
so CONCURRENCY NOTES are added to explain the synchronization.
The early exit from the slotinfo walk when max_modid is reached
is not strictly necessary, but does not hurt either.

An incorrect acquire load was removed from _dl_resize_dtv: it
did not synchronize with any release store or fence and
synchronization is now handled separately at thread creation
and TLS access time.

There are still a number of racy read accesses to globals that
will be changed to relaxed MO atomics in a followup patch. This
should not introduce regressions compared to existing behaviour
and avoid cluttering the main part of the fix.

Not all TLS access related data races are fixed here: there are
additional races at lazy tlsdesc relocations; see bug 27137.
---
 elf/dl-tls.c | 63 +++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 47 insertions(+), 16 deletions(-)

diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index f8b32b3ecb..33c06782b1 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -471,14 +471,11 @@ extern dtv_t _dl_static_dtv[];
 #endif
 
 static dtv_t *
-_dl_resize_dtv (dtv_t *dtv)
+_dl_resize_dtv (dtv_t *dtv, size_t max_modid)
 {
   /* Resize the dtv.  */
   dtv_t *newp;
-  /* Load GL(dl_tls_max_dtv_idx) atomically since it may be written to by
-     other threads concurrently.  */
-  size_t newsize
-    = atomic_load_acquire (&GL(dl_tls_max_dtv_idx)) + DTV_SURPLUS;
+  size_t newsize = max_modid + DTV_SURPLUS;
   size_t oldsize = dtv[-1].counter;
 
   if (dtv == GL(dl_initial_dtv))
@@ -524,11 +521,14 @@ _dl_allocate_tls_init (void *result)
   size_t total = 0;
   size_t maxgen = 0;
 
+  /* Protects global dynamic TLS related state.  */
+  __rtld_lock_lock_recursive (GL(dl_load_lock));
+
   /* Check if the current dtv is big enough.   */
   if (dtv[-1].counter < GL(dl_tls_max_dtv_idx))
     {
       /* Resize the dtv.  */
-      dtv = _dl_resize_dtv (dtv);
+      dtv = _dl_resize_dtv (dtv, GL(dl_tls_max_dtv_idx));
 
       /* Install this new dtv in the thread data structures.  */
       INSTALL_DTV (result, &dtv[-1]);
@@ -596,6 +596,7 @@ _dl_allocate_tls_init (void *result)
       listp = listp->next;
       assert (listp != NULL);
     }
+  __rtld_lock_unlock_recursive (GL(dl_load_lock));
 
   /* The DTV version is up-to-date now.  */
   dtv[0].counter = maxgen;
@@ -730,12 +731,29 @@ _dl_update_slotinfo (unsigned long int req_modid)
 
   if (dtv[0].counter < listp->slotinfo[idx].gen)
     {
-      /* The generation counter for the slot is higher than what the
-	 current dtv implements.  We have to update the whole dtv but
-	 only those entries with a generation counter <= the one for
-	 the entry we need.  */
+      /* CONCURRENCY NOTES:
+
+	 Here the dtv needs to be updated to new_gen generation count.
+
+	 This code may be called during TLS access when GL(dl_load_lock)
+	 is not held.  In that case the user code has to synchronize with
+	 dlopen and dlclose calls of relevant modules.  A module m is
+	 relevant if the generation of m <= new_gen and dlclose of m is
+	 synchronized: a memory access here happens after the dlopen and
+	 before the dlclose of relevant modules.  The dtv entries for
+	 relevant modules need to be updated, other entries can be
+	 arbitrary.
+
+	 This e.g. means that the first part of the slotinfo list can be
+	 accessed race free, but the tail may be concurrently extended.
+	 Similarly relevant slotinfo entries can be read race free, but
+	 other entries are racy.  However updating a non-relevant dtv
+	 entry does not affect correctness.  For a relevant module m,
+	 max_modid >= modid of m.  */
       size_t new_gen = listp->slotinfo[idx].gen;
       size_t total = 0;
+      size_t max_modid = atomic_load_relaxed (&GL(dl_tls_max_dtv_idx));
+      assert (max_modid >= req_modid);
 
       /* We have to look through the entire dtv slotinfo list.  */
       listp =  GL(dl_tls_dtv_slotinfo_list);
@@ -745,12 +763,14 @@ _dl_update_slotinfo (unsigned long int req_modid)
 	    {
 	      size_t modid = total + cnt;
 
+	      /* Later entries are not relevant.  */
+	      if (modid > max_modid)
+		break;
+
 	      size_t gen = listp->slotinfo[cnt].gen;
 
 	      if (gen > new_gen)
-		/* This is a slot for a generation younger than the
-		   one we are handling now.  It might be incompletely
-		   set up so ignore it.  */
+		/* Not relevant.  */
 		continue;
 
 	      /* If the entry is older than the current dtv layout we
@@ -767,7 +787,7 @@ _dl_update_slotinfo (unsigned long int req_modid)
 		    continue;
 
 		  /* Resize the dtv.  */
-		  dtv = _dl_resize_dtv (dtv);
+		  dtv = _dl_resize_dtv (dtv, max_modid);
 
 		  assert (modid <= dtv[-1].counter);
 
@@ -789,8 +809,17 @@ _dl_update_slotinfo (unsigned long int req_modid)
 	    }
 
 	  total += listp->len;
+	  if (total > max_modid)
+	    break;
+
+	  /* Synchronize with _dl_add_to_slotinfo.  Ideally this would
+	     be consume MO since we only need to order the accesses to
+	     the next node after the read of the address and on most
+	     hardware (other than alpha) a normal load would do that
+	     because of the address dependency.  */
+	  listp = atomic_load_acquire (&listp->next);
 	}
-      while ((listp = listp->next) != NULL);
+      while (listp != NULL);
 
       /* This will be the new maximum generation counter.  */
       dtv[0].counter = new_gen;
@@ -982,7 +1011,7 @@ _dl_add_to_slotinfo (struct link_map *l, bool do_add)
 	 the first slot.  */
       assert (idx == 0);
 
-      listp = prevp->next = (struct dtv_slotinfo_list *)
+      listp = (struct dtv_slotinfo_list *)
 	malloc (sizeof (struct dtv_slotinfo_list)
 		+ TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
       if (listp == NULL)
@@ -996,6 +1025,8 @@ cannot create TLS data structures"));
       listp->next = NULL;
       memset (listp->slotinfo, '\0',
 	      TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
+      /* Synchronize with _dl_update_slotinfo.  */
+      atomic_store_release (&prevp->next, listp);
     }
 
   /* Add the information into the slotinfo data structure.  */
-- 
2.17.1


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [PATCH 09/15] elf: Use relaxed atomics for racy accesses [BZ #19329]
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (7 preceding siblings ...)
  2021-02-15 12:01 ` [PATCH 08/15] elf: Fix data races in pthread_create and TLS access [BZ #19329] Szabolcs Nagy
@ 2021-02-15 12:01 ` Szabolcs Nagy
  2021-02-15 12:01 ` [PATCH 10/15] elf: Fix DTV gap reuse logic [BZ #27135] Szabolcs Nagy
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 12:01 UTC (permalink / raw)
  To: libc-alpha

This is a follow-up patch to the fix for bug 19329.  It adds
relaxed MO atomics to accesses that are racy, but for which
relaxed MO is enough.
---
 elf/dl-close.c | 20 +++++++++++++-------
 elf/dl-open.c  |  5 ++++-
 elf/dl-tls.c   | 31 +++++++++++++++++++++++--------
 3 files changed, 40 insertions(+), 16 deletions(-)

diff --git a/elf/dl-close.c b/elf/dl-close.c
index c51becd06b..3720e47dd1 100644
--- a/elf/dl-close.c
+++ b/elf/dl-close.c
@@ -79,9 +79,10 @@ remove_slotinfo (size_t idx, struct dtv_slotinfo_list *listp, size_t disp,
 	{
 	  assert (old_map->l_tls_modid == idx);
 
-	  /* Mark the entry as unused. */
-	  listp->slotinfo[idx - disp].gen = GL(dl_tls_generation) + 1;
-	  listp->slotinfo[idx - disp].map = NULL;
+	  /* Mark the entry as unused.  These can be read concurrently.  */
+	  atomic_store_relaxed (&listp->slotinfo[idx - disp].gen,
+				GL(dl_tls_generation) + 1);
+	  atomic_store_relaxed (&listp->slotinfo[idx - disp].map, NULL);
 	}
 
       /* If this is not the last currently used entry no need to look
@@ -96,8 +97,8 @@ remove_slotinfo (size_t idx, struct dtv_slotinfo_list *listp, size_t disp,
 
       if (listp->slotinfo[idx - disp].map != NULL)
 	{
-	  /* Found a new last used index.  */
-	  GL(dl_tls_max_dtv_idx) = idx;
+	  /* Found a new last used index.  This can be read concurrently.  */
+	  atomic_store_relaxed (&GL(dl_tls_max_dtv_idx), idx);
 	  return true;
 	}
     }
@@ -571,7 +572,9 @@ _dl_close_worker (struct link_map *map, bool force)
 					GL(dl_tls_dtv_slotinfo_list), 0,
 					imap->l_init_called))
 		/* All dynamically loaded modules with TLS are unloaded.  */
-		GL(dl_tls_max_dtv_idx) = GL(dl_tls_static_nelem);
+		/* Can be read concurrently.  */
+		atomic_store_relaxed (&GL(dl_tls_max_dtv_idx),
+				      GL(dl_tls_static_nelem));
 
 	      if (imap->l_tls_offset != NO_TLS_OFFSET
 		  && imap->l_tls_offset != FORCED_DYNAMIC_TLS_OFFSET)
@@ -769,8 +772,11 @@ _dl_close_worker (struct link_map *map, bool force)
   /* If we removed any object which uses TLS bump the generation counter.  */
   if (any_tls)
     {
-      if (__glibc_unlikely (++GL(dl_tls_generation) == 0))
+      size_t newgen = GL(dl_tls_generation) + 1;
+      if (__glibc_unlikely (newgen == 0))
 	_dl_fatal_printf ("TLS generation counter wrapped!  Please report as described in "REPORT_BUGS_TO".\n");
+      /* Can be read concurrently.  */
+      atomic_store_relaxed (&GL(dl_tls_generation), newgen);
 
       if (tls_free_end == GL(dl_tls_static_used))
 	GL(dl_tls_static_used) = tls_free_start;
diff --git a/elf/dl-open.c b/elf/dl-open.c
index ab7aaa345e..83b8e96a5c 100644
--- a/elf/dl-open.c
+++ b/elf/dl-open.c
@@ -395,9 +395,12 @@ update_tls_slotinfo (struct link_map *new)
 	}
     }
 
-  if (__builtin_expect (++GL(dl_tls_generation) == 0, 0))
+  size_t newgen = GL(dl_tls_generation) + 1;
+  if (__builtin_expect (newgen == 0, 0))
     _dl_fatal_printf (N_("\
 TLS generation counter wrapped!  Please report this."));
+  /* Can be read concurrently.  */
+  atomic_store_relaxed (&GL(dl_tls_generation), newgen);
 
   /* We need a second pass for static tls data, because
      _dl_update_slotinfo must not be run while calls to
diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index 33c06782b1..c4466bd9fc 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -175,7 +175,9 @@ _dl_next_tls_modid (void)
       /* No gaps, allocate a new entry.  */
     nogaps:
 
-      result = ++GL(dl_tls_max_dtv_idx);
+      result = GL(dl_tls_max_dtv_idx) + 1;
+      /* Can be read concurrently.  */
+      atomic_store_relaxed (&GL(dl_tls_max_dtv_idx), result);
     }
 
   return result;
@@ -359,10 +361,12 @@ allocate_dtv (void *result)
   dtv_t *dtv;
   size_t dtv_length;
 
+  /* Relaxed MO, because the dtv size is later rechecked, not relied on.  */
+  size_t max_modid = atomic_load_relaxed (&GL(dl_tls_max_dtv_idx));
   /* We allocate a few more elements in the dtv than are needed for the
      initial set of modules.  This should avoid in most cases expansions
      of the dtv.  */
-  dtv_length = GL(dl_tls_max_dtv_idx) + DTV_SURPLUS;
+  dtv_length = max_modid + DTV_SURPLUS;
   dtv = calloc (dtv_length + 2, sizeof (dtv_t));
   if (dtv != NULL)
     {
@@ -767,7 +771,7 @@ _dl_update_slotinfo (unsigned long int req_modid)
 	      if (modid > max_modid)
 		break;
 
-	      size_t gen = listp->slotinfo[cnt].gen;
+	      size_t gen = atomic_load_relaxed (&listp->slotinfo[cnt].gen);
 
 	      if (gen > new_gen)
 		/* Not relevant.  */
@@ -779,7 +783,8 @@ _dl_update_slotinfo (unsigned long int req_modid)
 		continue;
 
 	      /* If there is no map this means the entry is empty.  */
-	      struct link_map *map = listp->slotinfo[cnt].map;
+	      struct link_map *map
+		= atomic_load_relaxed (&listp->slotinfo[cnt].map);
 	      /* Check whether the current dtv array is large enough.  */
 	      if (dtv[-1].counter < modid)
 		{
@@ -923,7 +928,12 @@ __tls_get_addr (GET_ADDR_ARGS)
 {
   dtv_t *dtv = THREAD_DTV ();
 
-  if (__glibc_unlikely (dtv[0].counter != GL(dl_tls_generation)))
+  /* Update is needed if dtv[0].counter < the generation of the accessed
+     module.  The global generation counter is used here as it is easier
+     to check.  Synchronization for the relaxed MO access is guaranteed
+     by user code, see CONCURRENCY NOTES in _dl_update_slotinfo.  */
+  size_t gen = atomic_load_relaxed (&GL(dl_tls_generation));
+  if (__glibc_unlikely (dtv[0].counter != gen))
     return update_get_addr (GET_ADDR_PARAM);
 
   void *p = dtv[GET_ADDR_MODULE].pointer.val;
@@ -946,7 +956,10 @@ _dl_tls_get_addr_soft (struct link_map *l)
     return NULL;
 
   dtv_t *dtv = THREAD_DTV ();
-  if (__glibc_unlikely (dtv[0].counter != GL(dl_tls_generation)))
+  /* This may be called without holding the GL(dl_load_lock).  Reading
+     an arbitrary gen value is fine since this is best-effort code.  */
+  size_t gen = atomic_load_relaxed (&GL(dl_tls_generation));
+  if (__glibc_unlikely (dtv[0].counter != gen))
     {
       /* This thread's DTV is not completely current,
 	 but it might already cover this module.  */
@@ -1032,7 +1045,9 @@ cannot create TLS data structures"));
   /* Add the information into the slotinfo data structure.  */
   if (do_add)
     {
-      listp->slotinfo[idx].map = l;
-      listp->slotinfo[idx].gen = GL(dl_tls_generation) + 1;
+      /* Can be read concurrently.  See _dl_update_slotinfo.  */
+      atomic_store_relaxed (&listp->slotinfo[idx].map, l);
+      atomic_store_relaxed (&listp->slotinfo[idx].gen,
+			    GL(dl_tls_generation) + 1);
     }
 }
-- 
2.17.1



* [PATCH 10/15] elf: Fix DTV gap reuse logic [BZ #27135]
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (8 preceding siblings ...)
  2021-02-15 12:01 ` [PATCH 09/15] elf: Use relaxed atomics for racy accesses " Szabolcs Nagy
@ 2021-02-15 12:01 ` Szabolcs Nagy
  2021-02-15 12:02 ` [PATCH 11/15] x86_64: Avoid lazy relocation of tlsdesc [BZ #27137] Szabolcs Nagy
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 12:01 UTC (permalink / raw)
  To: libc-alpha

For some reason only dlopen failure caused dtv gaps to be reused.

It is possible that the intent was to never reuse modids for a
different module, but after a dlopen failure all gaps are reused,
not just the ones caused by the unfinished dlopen.

So the code already has to handle reused modids, which seems to
work; however, the data races at thread creation and TLS access
(see bug 19329 and bug 27111) may be more severe if slots are
reused, so this is scheduled after those fixes.  I think fixing
the races is not simpler if reuse is disallowed, and reuse has
other benefits, so set GL(dl_tls_dtv_gaps) whenever entries are
removed from the middle of the slotinfo list.  The value does not
have to be correct: an incorrect true value causes the next modid
query to do a slotinfo walk, an incorrect false value leaves
gaps and new entries are added at the end.

Fixes bug 27135.
---
 elf/dl-close.c |  6 +++++-
 elf/dl-open.c  | 10 ----------
 elf/dl-tls.c   |  5 +----
 3 files changed, 6 insertions(+), 15 deletions(-)

diff --git a/elf/dl-close.c b/elf/dl-close.c
index 3720e47dd1..9f31532f41 100644
--- a/elf/dl-close.c
+++ b/elf/dl-close.c
@@ -88,7 +88,11 @@ remove_slotinfo (size_t idx, struct dtv_slotinfo_list *listp, size_t disp,
       /* If this is not the last currently used entry no need to look
 	 further.  */
       if (idx != GL(dl_tls_max_dtv_idx))
-	return true;
+	{
+	  /* There is an unused dtv entry in the middle.  */
+	  GL(dl_tls_dtv_gaps) = true;
+	  return true;
+	}
     }
 
   while (idx - disp > (disp == 0 ? 1 + GL(dl_tls_static_nelem) : 0))
diff --git a/elf/dl-open.c b/elf/dl-open.c
index 83b8e96a5c..661f26977e 100644
--- a/elf/dl-open.c
+++ b/elf/dl-open.c
@@ -890,16 +890,6 @@ no more namespaces available for dlmopen()"));
 	 state if relocation failed, for example.  */
       if (args.map)
 	{
-	  /* Maybe some of the modules which were loaded use TLS.
-	     Since it will be removed in the following _dl_close call
-	     we have to mark the dtv array as having gaps to fill the
-	     holes.  This is a pessimistic assumption which won't hurt
-	     if not true.  There is no need to do this when we are
-	     loading the auditing DSOs since TLS has not yet been set
-	     up.  */
-	  if ((mode & __RTLD_AUDIT) == 0)
-	    GL(dl_tls_dtv_gaps) = true;
-
 	  _dl_close_worker (args.map, true);
 
 	  /* All l_nodelete_pending objects should have been deleted
diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index c4466bd9fc..b0257185e9 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -187,10 +187,7 @@ _dl_next_tls_modid (void)
 size_t
 _dl_count_modids (void)
 {
-  /* It is rare that we have gaps; see elf/dl-open.c (_dl_open) where
-     we fail to load a module and unload it leaving a gap.  If we don't
-     have gaps then the number of modids is the current maximum so
-     return that.  */
+  /* The count is the max unless dlclose or failed dlopen created gaps.  */
   if (__glibc_likely (!GL(dl_tls_dtv_gaps)))
     return GL(dl_tls_max_dtv_idx);
 
-- 
2.17.1



* [PATCH 11/15] x86_64: Avoid lazy relocation of tlsdesc [BZ #27137]
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (9 preceding siblings ...)
  2021-02-15 12:01 ` [PATCH 10/15] elf: Fix DTV gap reuse logic [BZ #27135] Szabolcs Nagy
@ 2021-02-15 12:02 ` Szabolcs Nagy
  2021-04-09  0:14   ` Ben Woodard
  2021-02-15 12:02 ` [PATCH 12/15] i386: " Szabolcs Nagy
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 12:02 UTC (permalink / raw)
  To: libc-alpha

Lazy tlsdesc relocation is racy because the static tls optimization and
tlsdesc management operations are done without holding the dlopen lock.

This is similar to commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67
for aarch64, but it fixes a different race: bug 27137.
---
 sysdeps/x86_64/dl-machine.h | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
index 103eee6c3f..9a876a371e 100644
--- a/sysdeps/x86_64/dl-machine.h
+++ b/sysdeps/x86_64/dl-machine.h
@@ -570,12 +570,21 @@ elf_machine_lazy_rel (struct link_map *map,
     }
   else if (__glibc_likely (r_type == R_X86_64_TLSDESC))
     {
-      struct tlsdesc volatile * __attribute__((__unused__)) td =
-	(struct tlsdesc volatile *)reloc_addr;
+      const Elf_Symndx symndx = ELFW (R_SYM) (reloc->r_info);
+      const ElfW (Sym) *symtab = (const void *)D_PTR (map, l_info[DT_SYMTAB]);
+      const ElfW (Sym) *sym = &symtab[symndx];
+      const struct r_found_version *version = NULL;
 
-      td->arg = (void*)reloc;
-      td->entry = (void*)(D_PTR (map, l_info[ADDRIDX (DT_TLSDESC_PLT)])
-			  + map->l_addr);
+      if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+	{
+	  const ElfW (Half) *vernum =
+	    (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
+	  version = &map->l_versions[vernum[symndx] & 0x7fff];
+	}
+
+      /* Always initialize TLS descriptors completely at load time, in
+	 case static TLS is allocated for it that requires locking.  */
+      elf_machine_rela (map, reloc, sym, version, reloc_addr, skip_ifunc);
     }
   else if (__glibc_unlikely (r_type == R_X86_64_IRELATIVE))
     {
-- 
2.17.1



* [PATCH 12/15] i386: Avoid lazy relocation of tlsdesc [BZ #27137]
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (10 preceding siblings ...)
  2021-02-15 12:02 ` [PATCH 11/15] x86_64: Avoid lazy relocation of tlsdesc [BZ #27137] Szabolcs Nagy
@ 2021-02-15 12:02 ` Szabolcs Nagy
  2021-02-15 12:02 ` [PATCH 13/15] x86_64: Remove lazy tlsdesc relocation related code Szabolcs Nagy
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 12:02 UTC (permalink / raw)
  To: libc-alpha

Lazy tlsdesc relocation is racy because the static tls optimization and
tlsdesc management operations are done without holding the dlopen lock.

This is similar to commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67
for aarch64, but it fixes a different race: bug 27137.

On i386 the code is a bit more complicated than on x86_64 because both
rel and rela relocs are supported.
---
 sysdeps/i386/dl-machine.h | 76 ++++++++++++++++++---------------------
 1 file changed, 34 insertions(+), 42 deletions(-)

diff --git a/sysdeps/i386/dl-machine.h b/sysdeps/i386/dl-machine.h
index 23e9cc3bfb..590b41d8d7 100644
--- a/sysdeps/i386/dl-machine.h
+++ b/sysdeps/i386/dl-machine.h
@@ -688,50 +688,32 @@ elf_machine_lazy_rel (struct link_map *map,
     }
   else if (__glibc_likely (r_type == R_386_TLS_DESC))
     {
-      struct tlsdesc volatile * __attribute__((__unused__)) td =
-	(struct tlsdesc volatile *)reloc_addr;
-
-      /* Handle relocations that reference the local *ABS* in a simple
-	 way, so as to preserve a potential addend.  */
-      if (ELF32_R_SYM (reloc->r_info) == 0)
-	td->entry = _dl_tlsdesc_resolve_abs_plus_addend;
-      /* Given a known-zero addend, we can store a pointer to the
-	 reloc in the arg position.  */
-      else if (td->arg == 0)
-	{
-	  td->arg = (void*)reloc;
-	  td->entry = _dl_tlsdesc_resolve_rel;
-	}
-      else
-	{
-	  /* We could handle non-*ABS* relocations with non-zero addends
-	     by allocating dynamically an arg to hold a pointer to the
-	     reloc, but that sounds pointless.  */
-	  const Elf32_Rel *const r = reloc;
-	  /* The code below was borrowed from elf_dynamic_do_rel().  */
-	  const ElfW(Sym) *const symtab =
-	    (const void *) D_PTR (map, l_info[DT_SYMTAB]);
+      const Elf32_Rel *const r = reloc;
+      /* The code below was borrowed from elf_dynamic_do_rel().  */
+      const ElfW(Sym) *const symtab =
+	(const void *) D_PTR (map, l_info[DT_SYMTAB]);
 
+      /* Always initialize TLS descriptors completely at load time, in
+	 case static TLS is allocated for it that requires locking.  */
 # ifdef RTLD_BOOTSTRAP
-	  /* The dynamic linker always uses versioning.  */
-	  assert (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL);
+      /* The dynamic linker always uses versioning.  */
+      assert (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL);
 # else
-	  if (map->l_info[VERSYMIDX (DT_VERSYM)])
+      if (map->l_info[VERSYMIDX (DT_VERSYM)])
 # endif
-	    {
-	      const ElfW(Half) *const version =
-		(const void *) D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
-	      ElfW(Half) ndx = version[ELFW(R_SYM) (r->r_info)] & 0x7fff;
-	      elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)],
-			       &map->l_versions[ndx],
-			       (void *) (l_addr + r->r_offset), skip_ifunc);
-	    }
+	{
+	  const ElfW(Half) *const version =
+	    (const void *) D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
+	  ElfW(Half) ndx = version[ELFW(R_SYM) (r->r_info)] & 0x7fff;
+	  elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)],
+			   &map->l_versions[ndx],
+			   (void *) (l_addr + r->r_offset), skip_ifunc);
+	}
 # ifndef RTLD_BOOTSTRAP
-	  else
-	    elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)], NULL,
-			     (void *) (l_addr + r->r_offset), skip_ifunc);
+      else
+	elf_machine_rel (map, r, &symtab[ELFW(R_SYM) (r->r_info)], NULL,
+			 (void *) (l_addr + r->r_offset), skip_ifunc);
 # endif
-	}
     }
   else if (__glibc_unlikely (r_type == R_386_IRELATIVE))
     {
@@ -758,11 +740,21 @@ elf_machine_lazy_rela (struct link_map *map,
     ;
   else if (__glibc_likely (r_type == R_386_TLS_DESC))
     {
-      struct tlsdesc volatile * __attribute__((__unused__)) td =
-	(struct tlsdesc volatile *)reloc_addr;
+      const Elf_Symndx symndx = ELFW (R_SYM) (reloc->r_info);
+      const ElfW (Sym) *symtab = (const void *)D_PTR (map, l_info[DT_SYMTAB]);
+      const ElfW (Sym) *sym = &symtab[symndx];
+      const struct r_found_version *version = NULL;
+
+      if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
+	{
+	  const ElfW (Half) *vernum =
+	    (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
+	  version = &map->l_versions[vernum[symndx] & 0x7fff];
+	}
 
-      td->arg = (void*)reloc;
-      td->entry = _dl_tlsdesc_resolve_rela;
+      /* Always initialize TLS descriptors completely at load time, in
+	 case static TLS is allocated for it that requires locking.  */
+      elf_machine_rela (map, reloc, sym, version, reloc_addr, skip_ifunc);
     }
   else if (__glibc_unlikely (r_type == R_386_IRELATIVE))
     {
-- 
2.17.1



* [PATCH 13/15] x86_64: Remove lazy tlsdesc relocation related code
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (11 preceding siblings ...)
  2021-02-15 12:02 ` [PATCH 12/15] i386: " Szabolcs Nagy
@ 2021-02-15 12:02 ` Szabolcs Nagy
  2021-02-15 12:03 ` [PATCH 14/15] i386: " Szabolcs Nagy
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 12:02 UTC (permalink / raw)
  To: libc-alpha

_dl_tlsdesc_resolve_rela and _dl_tlsdesc_resolve_hold are only used
for lazy tlsdesc relocation processing, which is no longer supported.
---
 sysdeps/x86_64/dl-tlsdesc.S | 104 ----------------------------------
 sysdeps/x86_64/dl-tlsdesc.h |   4 +-
 sysdeps/x86_64/tlsdesc.c    | 109 +-----------------------------------
 3 files changed, 2 insertions(+), 215 deletions(-)

diff --git a/sysdeps/x86_64/dl-tlsdesc.S b/sysdeps/x86_64/dl-tlsdesc.S
index 1d055adadd..ca9236bed8 100644
--- a/sysdeps/x86_64/dl-tlsdesc.S
+++ b/sysdeps/x86_64/dl-tlsdesc.S
@@ -144,107 +144,3 @@ _dl_tlsdesc_dynamic:
 	cfi_endproc
 	.size	_dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
 #endif /* SHARED */
-
-     /* This function is a wrapper for a lazy resolver for TLS_DESC
-	RELA relocations.  The incoming 0(%rsp) points to the caller's
-	link map, pushed by the dynamic object's internal lazy TLS
-	resolver front-end before tail-calling us.  We need to pop it
-	ourselves.  %rax points to a TLS descriptor, such that 0(%rax)
-	holds the address of the internal resolver front-end (unless
-	some other thread beat us to resolving it) and 8(%rax) holds a
-	pointer to the relocation.
-
-	When the actual resolver returns, it will have adjusted the
-	TLS descriptor such that we can tail-call it for it to return
-	the TP offset of the symbol.  */
-
-	.hidden _dl_tlsdesc_resolve_rela
-	.global	_dl_tlsdesc_resolve_rela
-	.type	_dl_tlsdesc_resolve_rela,@function
-	cfi_startproc
-	.align 16
-	/* The PLT entry will have pushed the link_map pointer.  */
-_dl_tlsdesc_resolve_rela:
-	_CET_ENDBR
-	cfi_adjust_cfa_offset (8)
-	/* Save all call-clobbered registers.  Add 8 bytes for push in
-	   the PLT entry to align the stack.  */
-	subq	$80, %rsp
-	cfi_adjust_cfa_offset (80)
-	movq	%rax, (%rsp)
-	movq	%rdi, 8(%rsp)
-	movq	%rax, %rdi	/* Pass tlsdesc* in %rdi.  */
-	movq	%rsi, 16(%rsp)
-	movq	80(%rsp), %rsi	/* Pass link_map* in %rsi.  */
-	movq	%r8, 24(%rsp)
-	movq	%r9, 32(%rsp)
-	movq	%r10, 40(%rsp)
-	movq	%r11, 48(%rsp)
-	movq	%rdx, 56(%rsp)
-	movq	%rcx, 64(%rsp)
-	call	_dl_tlsdesc_resolve_rela_fixup
-	movq	(%rsp), %rax
-	movq	8(%rsp), %rdi
-	movq	16(%rsp), %rsi
-	movq	24(%rsp), %r8
-	movq	32(%rsp), %r9
-	movq	40(%rsp), %r10
-	movq	48(%rsp), %r11
-	movq	56(%rsp), %rdx
-	movq	64(%rsp), %rcx
-	addq	$88, %rsp
-	cfi_adjust_cfa_offset (-88)
-	jmp	*(%rax)
-	cfi_endproc
-	.size	_dl_tlsdesc_resolve_rela, .-_dl_tlsdesc_resolve_rela
-
-     /* This function is a placeholder for lazy resolving of TLS
-	relocations.  Once some thread starts resolving a TLS
-	relocation, it sets up the TLS descriptor to use this
-	resolver, such that other threads that would attempt to
-	resolve it concurrently may skip the call to the original lazy
-	resolver and go straight to a condition wait.
-
-	When the actual resolver returns, it will have adjusted the
-	TLS descriptor such that we can tail-call it for it to return
-	the TP offset of the symbol.  */
-
-	.hidden _dl_tlsdesc_resolve_hold
-	.global	_dl_tlsdesc_resolve_hold
-	.type	_dl_tlsdesc_resolve_hold,@function
-	cfi_startproc
-	.align 16
-_dl_tlsdesc_resolve_hold:
-0:
-	_CET_ENDBR
-	/* Save all call-clobbered registers.  */
-	subq	$72, %rsp
-	cfi_adjust_cfa_offset (72)
-	movq	%rax, (%rsp)
-	movq	%rdi, 8(%rsp)
-	movq	%rax, %rdi	/* Pass tlsdesc* in %rdi.  */
-	movq	%rsi, 16(%rsp)
-	/* Pass _dl_tlsdesc_resolve_hold's address in %rsi.  */
-	leaq	. - _dl_tlsdesc_resolve_hold(%rip), %rsi
-	movq	%r8, 24(%rsp)
-	movq	%r9, 32(%rsp)
-	movq	%r10, 40(%rsp)
-	movq	%r11, 48(%rsp)
-	movq	%rdx, 56(%rsp)
-	movq	%rcx, 64(%rsp)
-	call	_dl_tlsdesc_resolve_hold_fixup
-1:
-	movq	(%rsp), %rax
-	movq	8(%rsp), %rdi
-	movq	16(%rsp), %rsi
-	movq	24(%rsp), %r8
-	movq	32(%rsp), %r9
-	movq	40(%rsp), %r10
-	movq	48(%rsp), %r11
-	movq	56(%rsp), %rdx
-	movq	64(%rsp), %rcx
-	addq	$72, %rsp
-	cfi_adjust_cfa_offset (-72)
-	jmp	*(%rax)
-	cfi_endproc
-	.size	_dl_tlsdesc_resolve_hold, .-_dl_tlsdesc_resolve_hold
diff --git a/sysdeps/x86_64/dl-tlsdesc.h b/sysdeps/x86_64/dl-tlsdesc.h
index d134b3f4db..03d5ac7a54 100644
--- a/sysdeps/x86_64/dl-tlsdesc.h
+++ b/sysdeps/x86_64/dl-tlsdesc.h
@@ -55,9 +55,7 @@ struct tlsdesc_dynamic_arg
 
 extern ptrdiff_t attribute_hidden
   _dl_tlsdesc_return(struct tlsdesc *on_rax),
-  _dl_tlsdesc_undefweak(struct tlsdesc *on_rax),
-  _dl_tlsdesc_resolve_rela(struct tlsdesc *on_rax),
-  _dl_tlsdesc_resolve_hold(struct tlsdesc *on_rax);
+  _dl_tlsdesc_undefweak(struct tlsdesc *on_rax);
 
 # ifdef SHARED
 extern void *_dl_make_tlsdesc_dynamic (struct link_map *map,
diff --git a/sysdeps/x86_64/tlsdesc.c b/sysdeps/x86_64/tlsdesc.c
index 4083849f22..ecf864d6ee 100644
--- a/sysdeps/x86_64/tlsdesc.c
+++ b/sysdeps/x86_64/tlsdesc.c
@@ -16,120 +16,13 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <link.h>
 #include <ldsodefs.h>
-#include <elf/dynamic-link.h>
 #include <tls.h>
 #include <dl-tlsdesc.h>
 #include <dl-unmap-segments.h>
+#define _dl_tlsdesc_resolve_hold 0
 #include <tlsdeschtab.h>
 
-/* The following 2 functions take a caller argument, that contains the
-   address expected to be in the TLS descriptor.  If it's changed, we
-   want to return immediately.  */
-
-/* This function is used to lazily resolve TLS_DESC RELA relocations.
-   The argument location is used to hold a pointer to the relocation.  */
-
-void
-attribute_hidden
-_dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td,
-				struct link_map *l)
-{
-  const ElfW(Rela) *reloc = td->arg;
-
-  if (_dl_tlsdesc_resolve_early_return_p
-      (td, (void*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_PLT)]) + l->l_addr)))
-    return;
-
-  /* The code below was borrowed from _dl_fixup().  */
-  const ElfW(Sym) *const symtab
-    = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
-  const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
-  const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
-  lookup_t result;
-
-   /* Look up the target symbol.  If the normal lookup rules are not
-      used don't look in the global scope.  */
-  if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
-      && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
-    {
-      const struct r_found_version *version = NULL;
-
-      if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
-	{
-	  const ElfW(Half) *vernum =
-	    (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
-	  ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
-	  version = &l->l_versions[ndx];
-	  if (version->hash == 0)
-	    version = NULL;
-	}
-
-      result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
-				    l->l_scope, version, ELF_RTYPE_CLASS_PLT,
-				    DL_LOOKUP_ADD_DEPENDENCY, NULL);
-    }
-  else
-    {
-      /* We already found the symbol.  The module (and therefore its load
-	 address) is also known.  */
-      result = l;
-    }
-
-  if (! sym)
-    {
-      td->arg = (void*)reloc->r_addend;
-      td->entry = _dl_tlsdesc_undefweak;
-    }
-  else
-    {
-#  ifndef SHARED
-      CHECK_STATIC_TLS (l, result);
-#  else
-      if (!TRY_STATIC_TLS (l, result))
-	{
-	  td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value
-					      + reloc->r_addend);
-	  td->entry = _dl_tlsdesc_dynamic;
-	}
-      else
-#  endif
-	{
-	  td->arg = (void*)(sym->st_value - result->l_tls_offset
-			    + reloc->r_addend);
-	  td->entry = _dl_tlsdesc_return;
-	}
-    }
-
-  _dl_tlsdesc_wake_up_held_fixups ();
-}
-
-/* This function is used to avoid busy waiting for other threads to
-   complete the lazy relocation.  Once another thread wins the race to
-   relocate a TLS descriptor, it sets the descriptor up such that this
-   function is called to wait until the resolver releases the
-   lock.  */
-
-void
-attribute_hidden
-_dl_tlsdesc_resolve_hold_fixup (struct tlsdesc volatile *td,
-				void *caller)
-{
-  /* Maybe we're lucky and can return early.  */
-  if (caller != td->entry)
-    return;
-
-  /* Locking here will stop execution until the running resolver runs
-     _dl_tlsdesc_wake_up_held_fixups(), releasing the lock.
-
-     FIXME: We'd be better off waiting on a condition variable, such
-     that we didn't have to hold the lock throughout the relocation
-     processing.  */
-  __rtld_lock_lock_recursive (GL(dl_load_lock));
-  __rtld_lock_unlock_recursive (GL(dl_load_lock));
-}
-
 /* Unmap the dynamic object, but also release its TLS descriptor table
    if there is one.  */
 
-- 
2.17.1



* [PATCH 14/15] i386: Remove lazy tlsdesc relocation related code
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (12 preceding siblings ...)
  2021-02-15 12:02 ` [PATCH 13/15] x86_64: Remove lazy tlsdesc relocation related code Szabolcs Nagy
@ 2021-02-15 12:03 ` Szabolcs Nagy
  2021-02-15 12:03 ` [PATCH 15/15] elf: " Szabolcs Nagy
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 12:03 UTC (permalink / raw)
  To: libc-alpha

Like in commit e75711ebfa976d5468ec292282566a18b07e4d67 for x86_64,
remove unused lazy tlsdesc relocation processing code:

  _dl_tlsdesc_resolve_abs_plus_addend
  _dl_tlsdesc_resolve_rel
  _dl_tlsdesc_resolve_rela
  _dl_tlsdesc_resolve_hold
---
 sysdeps/i386/dl-tlsdesc.S | 156 -------------------------
 sysdeps/i386/dl-tlsdesc.h |   6 +-
 sysdeps/i386/tlsdesc.c    | 231 +-------------------------------------
 3 files changed, 2 insertions(+), 391 deletions(-)

diff --git a/sysdeps/i386/dl-tlsdesc.S b/sysdeps/i386/dl-tlsdesc.S
index e781d973b7..255fe88651 100644
--- a/sysdeps/i386/dl-tlsdesc.S
+++ b/sysdeps/i386/dl-tlsdesc.S
@@ -134,159 +134,3 @@ _dl_tlsdesc_dynamic:
 	cfi_endproc
 	.size	_dl_tlsdesc_dynamic, .-_dl_tlsdesc_dynamic
 #endif /* SHARED */
-
-     /* This function is a wrapper for a lazy resolver for TLS_DESC
-	REL relocations that reference the *ABS* segment in their own
-	link maps.  %ebx points to the caller's GOT.  %eax points to a
-	TLS descriptor, such that 0(%eax) holds the address of the
-	resolver wrapper itself (unless some other thread beat us to
-	it) and 4(%eax) holds the addend in the relocation.
-
-	When the actual resolver returns, it will have adjusted the
-	TLS descriptor such that we can tail-call it for it to return
-	the TP offset of the symbol.  */
-
-	.hidden _dl_tlsdesc_resolve_abs_plus_addend
-	.global	_dl_tlsdesc_resolve_abs_plus_addend
-	.type	_dl_tlsdesc_resolve_abs_plus_addend,@function
-	cfi_startproc
-	.align 16
-_dl_tlsdesc_resolve_abs_plus_addend:
-0:
-	_CET_ENDBR
-	pushl	%eax
-	cfi_adjust_cfa_offset (4)
-	pushl	%ecx
-	cfi_adjust_cfa_offset (4)
-	pushl	%edx
-	cfi_adjust_cfa_offset (4)
-	movl	$1f - 0b, %ecx
-	movl	4(%ebx), %edx
-	call	_dl_tlsdesc_resolve_abs_plus_addend_fixup
-1:
-	popl	%edx
-	cfi_adjust_cfa_offset (-4)
-	popl	%ecx
-	cfi_adjust_cfa_offset (-4)
-	popl	%eax
-	cfi_adjust_cfa_offset (-4)
-	jmp	*(%eax)
-	cfi_endproc
-	.size	_dl_tlsdesc_resolve_abs_plus_addend, .-_dl_tlsdesc_resolve_abs_plus_addend
-
-     /* This function is a wrapper for a lazy resolver for TLS_DESC
-	REL relocations that had zero addends.  %ebx points to the
-	caller's GOT.  %eax points to a TLS descriptor, such that
-	0(%eax) holds the address of the resolver wrapper itself
-	(unless some other thread beat us to it) and 4(%eax) holds a
-	pointer to the relocation.
-
-	When the actual resolver returns, it will have adjusted the
-	TLS descriptor such that we can tail-call it for it to return
-	the TP offset of the symbol.  */
-
-	.hidden _dl_tlsdesc_resolve_rel
-	.global	_dl_tlsdesc_resolve_rel
-	.type	_dl_tlsdesc_resolve_rel,@function
-	cfi_startproc
-	.align 16
-_dl_tlsdesc_resolve_rel:
-0:
-	_CET_ENDBR
-	pushl	%eax
-	cfi_adjust_cfa_offset (4)
-	pushl	%ecx
-	cfi_adjust_cfa_offset (4)
-	pushl	%edx
-	cfi_adjust_cfa_offset (4)
-	movl	$1f - 0b, %ecx
-	movl	4(%ebx), %edx
-	call	_dl_tlsdesc_resolve_rel_fixup
-1:
-	popl	%edx
-	cfi_adjust_cfa_offset (-4)
-	popl	%ecx
-	cfi_adjust_cfa_offset (-4)
-	popl	%eax
-	cfi_adjust_cfa_offset (-4)
-	jmp	*(%eax)
-	cfi_endproc
-	.size	_dl_tlsdesc_resolve_rel, .-_dl_tlsdesc_resolve_rel
-
-     /* This function is a wrapper for a lazy resolver for TLS_DESC
-	RELA relocations.  %ebx points to the caller's GOT.  %eax
-	points to a TLS descriptor, such that 0(%eax) holds the
-	address of the resolver wrapper itself (unless some other
-	thread beat us to it) and 4(%eax) holds a pointer to the
-	relocation.
-
-	When the actual resolver returns, it will have adjusted the
-	TLS descriptor such that we can tail-call it for it to return
-	the TP offset of the symbol.  */
-
-	.hidden _dl_tlsdesc_resolve_rela
-	.global	_dl_tlsdesc_resolve_rela
-	.type	_dl_tlsdesc_resolve_rela,@function
-	cfi_startproc
-	.align 16
-_dl_tlsdesc_resolve_rela:
-0:
-	_CET_ENDBR
-	pushl	%eax
-	cfi_adjust_cfa_offset (4)
-	pushl	%ecx
-	cfi_adjust_cfa_offset (4)
-	pushl	%edx
-	cfi_adjust_cfa_offset (4)
-	movl	$1f - 0b, %ecx
-	movl	4(%ebx), %edx
-	call	_dl_tlsdesc_resolve_rela_fixup
-1:
-	popl	%edx
-	cfi_adjust_cfa_offset (-4)
-	popl	%ecx
-	cfi_adjust_cfa_offset (-4)
-	popl	%eax
-	cfi_adjust_cfa_offset (-4)
-	jmp	*(%eax)
-	cfi_endproc
-	.size	_dl_tlsdesc_resolve_rela, .-_dl_tlsdesc_resolve_rela
-
-     /* This function is a placeholder for lazy resolving of TLS
-	relocations.  Once some thread starts resolving a TLS
-	relocation, it sets up the TLS descriptor to use this
-	resolver, such that other threads that would attempt to
-	resolve it concurrently may skip the call to the original lazy
-	resolver and go straight to a condition wait.
-
-	When the actual resolver returns, it will have adjusted the
-	TLS descriptor such that we can tail-call it for it to return
-	the TP offset of the symbol.  */
-
-	.hidden _dl_tlsdesc_resolve_hold
-	.global	_dl_tlsdesc_resolve_hold
-	.type	_dl_tlsdesc_resolve_hold,@function
-	cfi_startproc
-	.align 16
-_dl_tlsdesc_resolve_hold:
-0:
-	_CET_ENDBR
-	pushl	%eax
-	cfi_adjust_cfa_offset (4)
-	pushl	%ecx
-	cfi_adjust_cfa_offset (4)
-	pushl	%edx
-	cfi_adjust_cfa_offset (4)
-	movl	$1f - 0b, %ecx
-	movl	4(%ebx), %edx
-	call	_dl_tlsdesc_resolve_hold_fixup
-1:
-	popl	%edx
-	cfi_adjust_cfa_offset (-4)
-	popl	%ecx
-	cfi_adjust_cfa_offset (-4)
-	popl	%eax
-	cfi_adjust_cfa_offset (-4)
-	jmp	*(%eax)
-	cfi_endproc
-	.size	_dl_tlsdesc_resolve_hold, .-_dl_tlsdesc_resolve_hold
diff --git a/sysdeps/i386/dl-tlsdesc.h b/sysdeps/i386/dl-tlsdesc.h
index 753c03e79c..12e90da3a8 100644
--- a/sysdeps/i386/dl-tlsdesc.h
+++ b/sysdeps/i386/dl-tlsdesc.h
@@ -43,11 +43,7 @@ struct tlsdesc_dynamic_arg
 
 extern ptrdiff_t attribute_hidden __attribute__ ((regparm (1)))
   _dl_tlsdesc_return (struct tlsdesc *),
-  _dl_tlsdesc_undefweak (struct tlsdesc *),
-  _dl_tlsdesc_resolve_abs_plus_addend (struct tlsdesc *),
-  _dl_tlsdesc_resolve_rel (struct tlsdesc *),
-  _dl_tlsdesc_resolve_rela (struct tlsdesc *),
-  _dl_tlsdesc_resolve_hold (struct tlsdesc *);
+  _dl_tlsdesc_undefweak (struct tlsdesc *);
 
 # ifdef SHARED
 extern void *_dl_make_tlsdesc_dynamic (struct link_map *map,
diff --git a/sysdeps/i386/tlsdesc.c b/sysdeps/i386/tlsdesc.c
index 0bc646541f..436a21f66b 100644
--- a/sysdeps/i386/tlsdesc.c
+++ b/sysdeps/i386/tlsdesc.c
@@ -16,242 +16,13 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <link.h>
 #include <ldsodefs.h>
-#include <elf/dynamic-link.h>
 #include <tls.h>
 #include <dl-tlsdesc.h>
 #include <dl-unmap-segments.h>
+#define _dl_tlsdesc_resolve_hold 0
 #include <tlsdeschtab.h>
 
-/* The following 4 functions take an entry_check_offset argument.
-   It's computed by the caller as an offset between its entry point
-   and the call site, such that by adding the built-in return address
-   that is implicitly passed to the function with this offset, we can
-   easily obtain the caller's entry point to compare with the entry
-   point given in the TLS descriptor.  If it's changed, we want to
-   return immediately.  */
-
-/* This function is used to lazily resolve TLS_DESC REL relocations
-   that reference the *ABS* segment in their own link maps.  The
-   argument is the addend originally stored there.  */
-
-void
-__attribute__ ((regparm (3))) attribute_hidden
-_dl_tlsdesc_resolve_abs_plus_addend_fixup (struct tlsdesc volatile *td,
-					   struct link_map *l,
-					   ptrdiff_t entry_check_offset)
-{
-  ptrdiff_t addend = (ptrdiff_t) td->arg;
-
-  if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
-					  - entry_check_offset))
-    return;
-
-#ifndef SHARED
-  CHECK_STATIC_TLS (l, l);
-#else
-  if (!TRY_STATIC_TLS (l, l))
-    {
-      td->arg = _dl_make_tlsdesc_dynamic (l, addend);
-      td->entry = _dl_tlsdesc_dynamic;
-    }
-  else
-#endif
-    {
-      td->arg = (void*) (addend - l->l_tls_offset);
-      td->entry = _dl_tlsdesc_return;
-    }
-
-  _dl_tlsdesc_wake_up_held_fixups ();
-}
-
-/* This function is used to lazily resolve TLS_DESC REL relocations
-   that originally had zero addends.  The argument location, that
-   originally held the addend, is used to hold a pointer to the
-   relocation, but it has to be restored before we call the function
-   that applies relocations.  */
-
-void
-__attribute__ ((regparm (3))) attribute_hidden
-_dl_tlsdesc_resolve_rel_fixup (struct tlsdesc volatile *td,
-			       struct link_map *l,
-			       ptrdiff_t entry_check_offset)
-{
-  const ElfW(Rel) *reloc = td->arg;
-
-  if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
-					  - entry_check_offset))
-    return;
-
-  /* The code below was borrowed from _dl_fixup(),
-     except for checking for STB_LOCAL.  */
-  const ElfW(Sym) *const symtab
-    = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
-  const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
-  const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
-  lookup_t result;
-
-   /* Look up the target symbol.  If the normal lookup rules are not
-      used don't look in the global scope.  */
-  if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
-      && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
-    {
-      const struct r_found_version *version = NULL;
-
-      if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
-	{
-	  const ElfW(Half) *vernum =
-	    (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
-	  ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
-	  version = &l->l_versions[ndx];
-	  if (version->hash == 0)
-	    version = NULL;
-	}
-
-      result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
-				    l->l_scope, version, ELF_RTYPE_CLASS_PLT,
-				    DL_LOOKUP_ADD_DEPENDENCY, NULL);
-    }
-  else
-    {
-      /* We already found the symbol.  The module (and therefore its load
-	 address) is also known.  */
-      result = l;
-    }
-
-  if (!sym)
-    {
-      td->arg = 0;
-      td->entry = _dl_tlsdesc_undefweak;
-    }
-  else
-    {
-#  ifndef SHARED
-      CHECK_STATIC_TLS (l, result);
-#  else
-      if (!TRY_STATIC_TLS (l, result))
-	{
-	  td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value);
-	  td->entry = _dl_tlsdesc_dynamic;
-	}
-      else
-#  endif
-	{
-	  td->arg = (void*)(sym->st_value - result->l_tls_offset);
-	  td->entry = _dl_tlsdesc_return;
-	}
-    }
-
-  _dl_tlsdesc_wake_up_held_fixups ();
-}
-
-/* This function is used to lazily resolve TLS_DESC RELA relocations.
-   The argument location is used to hold a pointer to the relocation.  */
-
-void
-__attribute__ ((regparm (3))) attribute_hidden
-_dl_tlsdesc_resolve_rela_fixup (struct tlsdesc volatile *td,
-				struct link_map *l,
-				ptrdiff_t entry_check_offset)
-{
-  const ElfW(Rela) *reloc = td->arg;
-
-  if (_dl_tlsdesc_resolve_early_return_p (td, __builtin_return_address (0)
-					  - entry_check_offset))
-    return;
-
-  /* The code below was borrowed from _dl_fixup(),
-     except for checking for STB_LOCAL.  */
-  const ElfW(Sym) *const symtab
-    = (const void *) D_PTR (l, l_info[DT_SYMTAB]);
-  const char *strtab = (const void *) D_PTR (l, l_info[DT_STRTAB]);
-  const ElfW(Sym) *sym = &symtab[ELFW(R_SYM) (reloc->r_info)];
-  lookup_t result;
-
-   /* Look up the target symbol.  If the normal lookup rules are not
-      used don't look in the global scope.  */
-  if (ELFW(ST_BIND) (sym->st_info) != STB_LOCAL
-      && __builtin_expect (ELFW(ST_VISIBILITY) (sym->st_other), 0) == 0)
-    {
-      const struct r_found_version *version = NULL;
-
-      if (l->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
-	{
-	  const ElfW(Half) *vernum =
-	    (const void *) D_PTR (l, l_info[VERSYMIDX (DT_VERSYM)]);
-	  ElfW(Half) ndx = vernum[ELFW(R_SYM) (reloc->r_info)] & 0x7fff;
-	  version = &l->l_versions[ndx];
-	  if (version->hash == 0)
-	    version = NULL;
-	}
-
-      result = _dl_lookup_symbol_x (strtab + sym->st_name, l, &sym,
-				    l->l_scope, version, ELF_RTYPE_CLASS_PLT,
-				    DL_LOOKUP_ADD_DEPENDENCY, NULL);
-    }
-  else
-    {
-      /* We already found the symbol.  The module (and therefore its load
-	 address) is also known.  */
-      result = l;
-    }
-
-  if (!sym)
-    {
-      td->arg = (void*) reloc->r_addend;
-      td->entry = _dl_tlsdesc_undefweak;
-    }
-  else
-    {
-#  ifndef SHARED
-      CHECK_STATIC_TLS (l, result);
-#  else
-      if (!TRY_STATIC_TLS (l, result))
-	{
-	  td->arg = _dl_make_tlsdesc_dynamic (result, sym->st_value
-					      + reloc->r_addend);
-	  td->entry = _dl_tlsdesc_dynamic;
-	}
-      else
-#  endif
-	{
-	  td->arg = (void*) (sym->st_value - result->l_tls_offset
-			     + reloc->r_addend);
-	  td->entry = _dl_tlsdesc_return;
-	}
-    }
-
-  _dl_tlsdesc_wake_up_held_fixups ();
-}
-
-/* This function is used to avoid busy waiting for other threads to
-   complete the lazy relocation.  Once another thread wins the race to
-   relocate a TLS descriptor, it sets the descriptor up such that this
-   function is called to wait until the resolver releases the
-   lock.  */
-
-void
-__attribute__ ((regparm (3))) attribute_hidden
-_dl_tlsdesc_resolve_hold_fixup (struct tlsdesc volatile *td,
-				struct link_map *l __attribute__((__unused__)),
-				ptrdiff_t entry_check_offset)
-{
-  /* Maybe we're lucky and can return early.  */
-  if (__builtin_return_address (0) - entry_check_offset != td->entry)
-    return;
-
-  /* Locking here will stop execution until the running resolver runs
-     _dl_tlsdesc_wake_up_held_fixups(), releasing the lock.
-
-     FIXME: We'd be better off waiting on a condition variable, such
-     that we didn't have to hold the lock throughout the relocation
-     processing.  */
-  __rtld_lock_lock_recursive (GL(dl_load_lock));
-  __rtld_lock_unlock_recursive (GL(dl_load_lock));
-}
-
-
 /* Unmap the dynamic object, but also release its TLS descriptor table
    if there is one.  */
 
-- 
2.17.1



* [PATCH 15/15] elf: Remove lazy tlsdesc relocation related code
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (13 preceding siblings ...)
  2021-02-15 12:03 ` [PATCH 14/15] i386: " Szabolcs Nagy
@ 2021-02-15 12:03 ` Szabolcs Nagy
  2021-02-15 12:08 ` [PATCH 03/15] Add test case for [BZ #19329] Szabolcs Nagy
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 12:03 UTC (permalink / raw)
  To: libc-alpha

Remove generic tlsdesc code related to lazy tlsdesc processing, since
lazy tlsdesc relocation is no longer supported.  This includes removing
GL(dl_load_lock) from _dl_make_tlsdesc_dynamic, which is only called at
load time when that lock is already held.

Added a documentation comment too.
---
 elf/tlsdeschtab.h         | 53 +++++----------------------------------
 sysdeps/aarch64/tlsdesc.c |  1 -
 sysdeps/arm/tlsdesc.c     |  1 -
 sysdeps/i386/tlsdesc.c    |  1 -
 sysdeps/x86_64/tlsdesc.c  |  1 -
 5 files changed, 6 insertions(+), 51 deletions(-)

diff --git a/elf/tlsdeschtab.h b/elf/tlsdeschtab.h
index 21695fd1e9..85bd0415e1 100644
--- a/elf/tlsdeschtab.h
+++ b/elf/tlsdeschtab.h
@@ -78,6 +78,10 @@ map_generation (struct link_map *map)
   return GL(dl_tls_generation) + 1;
 }
 
+/* Returns the data pointer for a given map and tls offset that is used
+   to fill in one of the GOT entries referenced by a TLSDESC relocation
+   when using dynamic TLS.  This requires allocation, returns NULL on
+   allocation failure.  */
 void *
 _dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset)
 {
@@ -85,18 +89,12 @@ _dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset)
   void **entry;
   struct tlsdesc_dynamic_arg *td, test;
 
-  /* FIXME: We could use a per-map lock here, but is it worth it?  */
-  __rtld_lock_lock_recursive (GL(dl_load_lock));
-
   ht = map->l_mach.tlsdesc_table;
   if (! ht)
     {
       ht = htab_create ();
       if (! ht)
-	{
-	  __rtld_lock_unlock_recursive (GL(dl_load_lock));
-	  return 0;
-	}
+	return 0;
       map->l_mach.tlsdesc_table = ht;
     }
 
@@ -104,15 +102,11 @@ _dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset)
   test.tlsinfo.ti_offset = ti_offset;
   entry = htab_find_slot (ht, &test, 1, hash_tlsdesc, eq_tlsdesc);
   if (! entry)
-    {
-      __rtld_lock_unlock_recursive (GL(dl_load_lock));
-      return 0;
-    }
+    return 0;
 
   if (*entry)
     {
       td = *entry;
-      __rtld_lock_unlock_recursive (GL(dl_load_lock));
       return td;
     }
 
@@ -122,44 +116,9 @@ _dl_make_tlsdesc_dynamic (struct link_map *map, size_t ti_offset)
      thread.  */
   td->gen_count = map_generation (map);
   td->tlsinfo = test.tlsinfo;
-
-  __rtld_lock_unlock_recursive (GL(dl_load_lock));
   return td;
 }
 
 # endif /* SHARED */
 
-/* The idea of the following two functions is to stop multiple threads
-   from attempting to resolve the same TLS descriptor without busy
-   waiting.  Ideally, we should be able to release the lock right
-   after changing td->entry, and then using say a condition variable
-   or a futex wake to wake up any waiting threads, but let's try to
-   avoid introducing such dependencies.  */
-
-static int
-__attribute__ ((unused))
-_dl_tlsdesc_resolve_early_return_p (struct tlsdesc volatile *td, void *caller)
-{
-  if (caller != atomic_load_relaxed (&td->entry))
-    return 1;
-
-  __rtld_lock_lock_recursive (GL(dl_load_lock));
-  if (caller != atomic_load_relaxed (&td->entry))
-    {
-      __rtld_lock_unlock_recursive (GL(dl_load_lock));
-      return 1;
-    }
-
-  atomic_store_relaxed (&td->entry, _dl_tlsdesc_resolve_hold);
-
-  return 0;
-}
-
-static void
-__attribute__ ((unused))
-_dl_tlsdesc_wake_up_held_fixups (void)
-{
-  __rtld_lock_unlock_recursive (GL(dl_load_lock));
-}
-
 #endif
diff --git a/sysdeps/aarch64/tlsdesc.c b/sysdeps/aarch64/tlsdesc.c
index 9b6a8b0ec4..3c1bcd8fe6 100644
--- a/sysdeps/aarch64/tlsdesc.c
+++ b/sysdeps/aarch64/tlsdesc.c
@@ -22,7 +22,6 @@
 #include <tls.h>
 #include <dl-tlsdesc.h>
 #include <dl-unmap-segments.h>
-#define _dl_tlsdesc_resolve_hold 0
 #include <tlsdeschtab.h>
 
 /* Unmap the dynamic object, but also release its TLS descriptor table
diff --git a/sysdeps/arm/tlsdesc.c b/sysdeps/arm/tlsdesc.c
index 2f70adef5c..b3d1735af4 100644
--- a/sysdeps/arm/tlsdesc.c
+++ b/sysdeps/arm/tlsdesc.c
@@ -20,7 +20,6 @@
 #include <tls.h>
 #include <dl-tlsdesc.h>
 #include <dl-unmap-segments.h>
-#define _dl_tlsdesc_resolve_hold 0
 #include <tlsdeschtab.h>
 
 /* Unmap the dynamic object, but also release its TLS descriptor table
diff --git a/sysdeps/i386/tlsdesc.c b/sysdeps/i386/tlsdesc.c
index 436a21f66b..364c7769ce 100644
--- a/sysdeps/i386/tlsdesc.c
+++ b/sysdeps/i386/tlsdesc.c
@@ -20,7 +20,6 @@
 #include <tls.h>
 #include <dl-tlsdesc.h>
 #include <dl-unmap-segments.h>
-#define _dl_tlsdesc_resolve_hold 0
 #include <tlsdeschtab.h>
 
 /* Unmap the dynamic object, but also release its TLS descriptor table
diff --git a/sysdeps/x86_64/tlsdesc.c b/sysdeps/x86_64/tlsdesc.c
index ecf864d6ee..3bb80517a3 100644
--- a/sysdeps/x86_64/tlsdesc.c
+++ b/sysdeps/x86_64/tlsdesc.c
@@ -20,7 +20,6 @@
 #include <tls.h>
 #include <dl-tlsdesc.h>
 #include <dl-unmap-segments.h>
-#define _dl_tlsdesc_resolve_hold 0
 #include <tlsdeschtab.h>
 
 /* Unmap the dynamic object, but also release its TLS descriptor table
-- 
2.17.1



* [PATCH 03/15] Add test case for [BZ #19329]
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (14 preceding siblings ...)
  2021-02-15 12:03 ` [PATCH 15/15] elf: " Szabolcs Nagy
@ 2021-02-15 12:08 ` Szabolcs Nagy
  2021-02-15 12:08 ` [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo Szabolcs Nagy
       [not found] ` <CGME20210215115731epcas5p45614957debad2f679230d0bd1efbd57f@epcms5p7>
  17 siblings, 0 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 12:08 UTC (permalink / raw)
  To: libc-alpha

Test concurrent dlopen and pthread_create when the loaded
modules have TLS.  This triggers dl-tls assertion failures
more reliably than the tst-stack4 test.

The dlopened module has 100 DT_NEEDED dependencies; the number of
concurrent threads created during dlopen depends on filesystem speed
and hardware.  At most 3 threads exist at a time to limit resource
usage.

Doing the test in a fork loop can make it more reliable.

---
v4:
- rebased, updated copyright year.
- moved to tests-internal because of <atomic.h>

v3:
- use the new test support code.
- better Makefile usage so modules are cleaned properly.

v2:
- undef NDEBUG.
- join nop threads so at most 3 threads exist at a time.
- remove stack size setting (resource usage is no longer a concern).
- stop creating threads after dlopen observably finished.
- print the number of threads created for debugging.

2016-12-13  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	* nptl/Makefile (tests): Add tst-tls7.
	(modules-names): Add tst-tls7mod, tst-tls7mod-dep.
	* nptl/tst-tls7.c: New file.
	* nptl/tst-tls7mod-dep.c: New file.
	* nptl/tst-tls7mod.c: New file.
---
 nptl/Makefile          | 17 +++++++++--
 nptl/tst-tls7.c        | 69 ++++++++++++++++++++++++++++++++++++++++++
 nptl/tst-tls7mod-dep.c |  1 +
 nptl/tst-tls7mod.c     |  1 +
 4 files changed, 85 insertions(+), 3 deletions(-)
 create mode 100644 nptl/tst-tls7.c
 create mode 100644 nptl/tst-tls7mod-dep.c
 create mode 100644 nptl/tst-tls7mod.c

diff --git a/nptl/Makefile b/nptl/Makefile
index 8fb7fee6db..208629876d 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -355,20 +355,25 @@ tests += tst-cancelx7 tst-cancelx17 tst-cleanupx4
 
 ifeq ($(build-shared),yes)
 tests += tst-compat-forwarder tst-audit-threads
-tests-internal += tst-tls3 tst-tls3-malloc tst-tls5 tst-stackguard1
+tests-internal += tst-tls3 tst-tls3-malloc tst-tls5 tst-stackguard1 tst-tls7
 ifeq ($(have-z-execstack),yes)
 tests += tst-execstack
 endif
 endif
 
+one-hundred = $(foreach x,0 1 2 3 4 5 6 7 8 9, \
+  0$x 1$x 2$x 3$x 4$x 5$x 6$x 7$x 8$x 9$x)
+tst-tls7mod-deps = $(one-hundred:%=tst-tls7mod-dep-%.so)
+
 modules-names = tst-tls3mod \
 		tst-tls5mod tst-tls5moda tst-tls5modb tst-tls5modc \
 		tst-tls5modd tst-tls5mode tst-tls5modf tst-stack4mod \
 		tst-execstack-mod \
 		tst-compat-forwarder-mod tst-audit-threads-mod1 \
-		tst-audit-threads-mod2
+		tst-audit-threads-mod2 \
+		tst-tls7mod tst-tls7mod-dep
 extra-test-objs += $(addsuffix .os,$(strip $(modules-names))) \
-		   tst-cleanup4aux.o tst-cleanupx4aux.o
+		   tst-cleanup4aux.o tst-cleanupx4aux.o $(tst-tls7mod-deps)
 test-extras += tst-cleanup4aux tst-cleanupx4aux
 
 tst-tls3mod.so-no-z-defs = yes
@@ -517,6 +522,12 @@ $(objpfx)tst-tls6.out: tst-tls6.sh $(objpfx)tst-tls5 \
 	$(evaluate-test)
 endif
 
+$(objpfx)tst-tls7: $(libdl) $(shared-thread-library)
+$(objpfx)tst-tls7.out: $(objpfx)tst-tls7mod.so
+$(objpfx)tst-tls7mod.so: $(tst-tls7mod-deps:%=$(objpfx)%)
+$(tst-tls7mod-deps:%=$(objpfx)%): $(objpfx)tst-tls7mod-dep.so
+	cp -f $< $@
+
 $(objpfx)tst-dlsym1: $(libdl) $(shared-thread-library)
 
 ifeq (yes,$(build-shared))
diff --git a/nptl/tst-tls7.c b/nptl/tst-tls7.c
new file mode 100644
index 0000000000..4b3f9005fe
--- /dev/null
+++ b/nptl/tst-tls7.c
@@ -0,0 +1,69 @@
+/* Test concurrent dlopen and pthread_create: BZ 19329.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <dlfcn.h>
+#include <pthread.h>
+#include <stdio.h>
+#include <atomic.h>
+
+#include <support/xthread.h>
+
+#define THREADS 10000
+
+static volatile int done;
+
+static void *
+start (void *a)
+{
+  if (dlopen ("tst-tls7mod.so", RTLD_LAZY) == NULL)
+    {
+      printf ("dlopen failed: %s\n", dlerror ());
+      exit (1);
+    }
+  atomic_store_relaxed (&done, 1);
+  return 0;
+}
+
+static void *
+nop (void *a)
+{
+  return 0;
+}
+
+static int
+do_test (void)
+{
+  pthread_t t1, t2;
+  int i;
+
+  /* Load a module with lots of dependencies and TLS.  */
+  t1 = xpthread_create (0, start, 0);
+
+  /* Concurrently create lots of threads until dlopen is observably done.  */
+  for (i = 0; i < THREADS && !atomic_load_relaxed (&done); i++)
+    {
+      t2 = xpthread_create (0, nop, 0);
+      xpthread_join (t2);
+    }
+
+  xpthread_join (t1);
+  printf ("threads created during dlopen: %d\n", i);
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/nptl/tst-tls7mod-dep.c b/nptl/tst-tls7mod-dep.c
new file mode 100644
index 0000000000..206ece4fb3
--- /dev/null
+++ b/nptl/tst-tls7mod-dep.c
@@ -0,0 +1 @@
+int __thread x;
diff --git a/nptl/tst-tls7mod.c b/nptl/tst-tls7mod.c
new file mode 100644
index 0000000000..206ece4fb3
--- /dev/null
+++ b/nptl/tst-tls7mod.c
@@ -0,0 +1 @@
+int __thread x;
-- 
2.17.1



* [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo
  2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
                   ` (15 preceding siblings ...)
  2021-02-15 12:08 ` [PATCH 03/15] Add test case for [BZ #19329] Szabolcs Nagy
@ 2021-02-15 12:08 ` Szabolcs Nagy
       [not found] ` <CGME20210215115731epcas5p45614957debad2f679230d0bd1efbd57f@epcms5p7>
  17 siblings, 0 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-02-15 12:08 UTC (permalink / raw)
  To: libc-alpha

Since

  commit a509eb117fac1d764b15eba64993f4bdb63d7f3c
  Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]

the generation counter update is not needed in the failure path.
---
 elf/dl-tls.c | 11 +----------
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/elf/dl-tls.c b/elf/dl-tls.c
index 79b93ad91b..24d00c14ef 100644
--- a/elf/dl-tls.c
+++ b/elf/dl-tls.c
@@ -998,16 +998,7 @@ _dl_add_to_slotinfo (struct link_map *l, bool do_add)
 		+ TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
       if (listp == NULL)
 	{
-	  /* We ran out of memory.  We will simply fail this
-	     call but don't undo anything we did so far.  The
-	     application will crash or be terminated anyway very
-	     soon.  */
-
-	  /* We have to do this since some entries in the dtv
-	     slotinfo array might already point to this
-	     generation.  */
-	  ++GL(dl_tls_generation);
-
+	  /* We ran out of memory while resizing the dtv slotinfo list.  */
 	  _dl_signal_error (ENOMEM, "dlopen", NULL, N_("\
 cannot create TLS data structures"));
 	}
-- 
2.17.1



* RE: [PATCH 02/15] elf: Fix data race in _dl_name_match_p [BZ #21349]
       [not found] ` <CGME20210215115731epcas5p45614957debad2f679230d0bd1efbd57f@epcms5p7>
@ 2021-02-15 12:11   ` Maninder Singh
  0 siblings, 0 replies; 40+ messages in thread
From: Maninder Singh @ 2021-02-15 12:11 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: libc-alpha, Vaneet Narang, AMIT SAHRAWAT

Hi Szabolcs,

Thanks for trying this patch again.

Please also add Vaneet Narang as a co-author.

>From: Maninder Singh <maninder1.s@samsung.com>
> 
>dlopen updates libname_list by writing to lastp->next, but concurrent
>reads in _dl_name_match_p were not synchronized when it was called
>without holding GL(dl_load_lock), which can happen during lazy symbol
>resolution.
> 
>This patch fixes the race between _dl_name_match_p reading lastp->next
>and add_name_to_object writing to it. This could cause a segfault on
>targets with weak memory ordering when lastp->next->name is read, which
>was observed on an arm system. Fixes bug 21349.
> 
>(Code is from Maninder Singh, comments and description are from Szabolcs
>Nagy.)
> 
>Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
Co-authored-by: Vaneet Narang <v.narang@samsung.com>

because the issue was first analysed by Vaneet.

Thanks Again,

Maninder Singh


* Re: [PATCH 01/15] aarch64: free tlsdesc data on dlclose [BZ #27403]
  2021-02-15 11:56 ` [PATCH 01/15] aarch64: free tlsdesc data on dlclose [BZ #27403] Szabolcs Nagy
@ 2021-04-01 12:57   ` Adhemerval Zanella
  2021-04-06 13:43     ` Szabolcs Nagy
  0 siblings, 1 reply; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-01 12:57 UTC (permalink / raw)
  To: libc-alpha, Szabolcs Nagy



On 15/02/2021 08:56, Szabolcs Nagy via Libc-alpha wrote:
> DL_UNMAP_IS_SPECIAL and DL_UNMAP were not defined. The definitions are
> now copied from arm, since the same is needed on aarch64. The cleanup
> of tlsdesc data is handled by the custom _dl_unmap.
> 
> Fixes bug 27403.
> ---
>  sysdeps/aarch64/dl-lookupcfg.h | 27 +++++++++++++++++++++++++++
>  1 file changed, 27 insertions(+)
>  create mode 100644 sysdeps/aarch64/dl-lookupcfg.h
> 
> diff --git a/sysdeps/aarch64/dl-lookupcfg.h b/sysdeps/aarch64/dl-lookupcfg.h
> new file mode 100644
> index 0000000000..c1ae676ae1
> --- /dev/null
> +++ b/sysdeps/aarch64/dl-lookupcfg.h
> @@ -0,0 +1,27 @@
> +/* Configuration of lookup functions.
> +   Copyright (C) 2006-2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library.  If not, see
> +   <https://www.gnu.org/licenses/>.  */
> +
> +#define DL_UNMAP_IS_SPECIAL
> +
> +#include_next <dl-lookupcfg.h>
> +
> +struct link_map;
> +
> +extern void _dl_unmap (struct link_map *map);
> +
> +#define DL_UNMAP(map) _dl_unmap (map)
> 

The fix looks ok to me (aarch64 supports tlsdesc, but it is not
calling htab_delete), but _dl_unmap is replicated over aarch64,
arm, i386, and x86_64 (the architectures that support tlsdesc).
I think it would be simpler to add a define in linkmap.h to
indicate the ABI supports tlsdesc and consolidate the _dl_unmap
code that calls htab_delete into _dl_unmap_segments.

That would allow removing all the tlsdesc.c implementations (and
dl-lookupcfg.h completely on some architectures) and simplify
the required code if any other architecture decides to support
tlsdesc.


* Re: [PATCH 02/15] elf: Fix data race in _dl_name_match_p [BZ #21349]
  2021-02-15 11:56 ` [PATCH 02/15] elf: Fix data race in _dl_name_match_p [BZ #21349] Szabolcs Nagy
@ 2021-04-01 14:01   ` Adhemerval Zanella
  2021-04-06 16:41     ` Szabolcs Nagy
  0 siblings, 1 reply; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-01 14:01 UTC (permalink / raw)
  To: Szabolcs Nagy, libc-alpha; +Cc: Maninder Singh



On 15/02/2021 08:56, Szabolcs Nagy via Libc-alpha wrote:
> From: Maninder Singh <maninder1.s@samsung.com>
> 
> dlopen updates libname_list by writing to lastp->next, but concurrent
> reads in _dl_name_match_p were not synchronized when it was called
> without holding GL(dl_load_lock), which can happen during lazy symbol
> resolution.
> 
> This patch fixes the race between _dl_name_match_p reading lastp->next
> and add_name_to_object writing to it. This could cause a segfault on
> targets with weak memory ordering when lastp->next->name is read, which
> was observed on an arm system. Fixes bug 21349.
> 
> (Code is from Maninder Singh, comments and description are from Szabolcs
> Nagy.)
> 
> Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>

I couldn't reproduce it with the example provided in the bugzilla (on
either aarch64 or arm machines), but the explanation and the fix sound
logical.  I guess a testcase that exercises the issue will be hard to
create.

LGTM, thanks.

Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

> ---
>  elf/dl-load.c | 18 +++++++++++++++++-
>  elf/dl-misc.c |  4 +++-
>  2 files changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/elf/dl-load.c b/elf/dl-load.c
> index 9e2089cfaa..be54bafad5 100644
> --- a/elf/dl-load.c
> +++ b/elf/dl-load.c
> @@ -438,7 +438,23 @@ add_name_to_object (struct link_map *l, const char *name)
>    newname->name = memcpy (newname + 1, name, name_len);
>    newname->next = NULL;
>    newname->dont_free = 0;
> -  lastp->next = newname;
> +  /* CONCURRENCY NOTES:
> +
> +     Make sure the initialization of newname happens before its address is
> +     read from the lastp->next store below.
> +
> +     GL(dl_load_lock) is held here (and by other writers, e.g. dlclose), so
> +     readers of libname_list->next (e.g. _dl_check_caller or the reads above)
> +     can use that for synchronization, however the read in _dl_name_match_p
> +     may be executed without holding the lock during _dl_runtime_resolve
> +     (i.e. lazy symbol resolution when a function of library l is called).
> +
> +     The release MO store below synchronizes with the acquire MO load in
> +     _dl_name_match_p.  Other writes need to synchronize with that load too,
> +     however those happen either early when the process is single threaded
> +     (dl_main) or when the library is unloaded (dlclose) and the user has to
> +     synchronize library calls with unloading.  */
> +  atomic_store_release (&lastp->next, newname);
>  }
>  
>  /* Standard search directories.  */
> diff --git a/elf/dl-misc.c b/elf/dl-misc.c
> index 082f75f459..d4803bba4e 100644
> --- a/elf/dl-misc.c
> +++ b/elf/dl-misc.c
> @@ -347,7 +347,9 @@ _dl_name_match_p (const char *name, const struct link_map *map)
>      if (strcmp (name, runp->name) == 0)
>        return 1;
>      else
> -      runp = runp->next;
> +      /* Synchronize with the release MO store in add_name_to_object.
> +	 See CONCURRENCY NOTES in add_name_to_object in dl-load.c.  */
> +      runp = atomic_load_acquire (&runp->next);
>  
>    return 0;
>  }
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 03/15] Add test case for [BZ #19329]
  2021-02-15 11:57 ` [PATCH 03/15] Add test case for [BZ #19329] Szabolcs Nagy
@ 2021-04-02 19:10   ` Adhemerval Zanella
  0 siblings, 0 replies; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-02 19:10 UTC (permalink / raw)
  To: libc-alpha, Szabolcs Nagy



On 15/02/2021 08:57, Szabolcs Nagy via Libc-alpha wrote:
> From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
> 
> Test concurrent dlopen and pthread_create when the loaded
> modules have TLS.  This triggers dl-tls assertion failures
> more reliably than the tst-stack4 test.
> 
> The dlopened module has 100 DT_NEEDED dependencies, the
> number of concurrent threads created during dlopen depends
> on filesystem speed and hardware. At most 3 threads exist
> at a time to limit resource usage.
> 
> Doing the test in a fork loop can make it more reliable.

I can easily reproduce it on x86_64 (8 cpus), powerpc64 (128
cpus), and aarch64 (160 cpus); however this test does pass
on a more limited armhf box (2 cpus).  In any case, this
should be ok.

Also, this patch should be moved after the BZ #19329 fix in
this set (otherwise it will generate make check failures).

> 
> ---
> v4:
> - rebased, updated copyright year.
> - moved to tests-internal because of <atomic.h>

glibc is built with -std=gnu11, so I think it should be ok
to use <stdatomic.h> and move this out of tests-internal.

> 
> v3:
> - use the new test support code.
> - better Makefile usage so modules are cleaned properly.
> 
> v2:
> - undef NDEBUG.
> - join nop threads so at most 3 threads exist at a time.
> - remove stack size setting (resource usage is no longer a concern).
> - stop creating threads after dlopen observably finished.
> - print the number of threads created for debugging.
> 
> 2016-12-13  Szabolcs Nagy  <szabolcs.nagy@arm.com>
> 
> 	* nptl/Makefile (tests): Add tst-tls7.
> 	(modules-names): Add tst-tls7mod, tst-tls7mod-dep.
> 	* nptl/tst-tls7.c: New file.
> 	* nptl/tst-tls7mod-dep.c: New file.
> 	* nptl/tst-tls7mod.c: New file.

There is no need to add ChangeLog entries anymore.

> ---
>  nptl/Makefile          | 17 +++++++++--
>  nptl/tst-tls7.c        | 69 ++++++++++++++++++++++++++++++++++++++++++
>  nptl/tst-tls7mod-dep.c |  1 +
>  nptl/tst-tls7mod.c     |  1 +
>  4 files changed, 85 insertions(+), 3 deletions(-)
>  create mode 100644 nptl/tst-tls7.c
>  create mode 100644 nptl/tst-tls7mod-dep.c
>  create mode 100644 nptl/tst-tls7mod.c
> 
> diff --git a/nptl/Makefile b/nptl/Makefile
> index 8fb7fee6db..208629876d 100644
> --- a/nptl/Makefile
> +++ b/nptl/Makefile
> @@ -355,20 +355,25 @@ tests += tst-cancelx7 tst-cancelx17 tst-cleanupx4
>  
>  ifeq ($(build-shared),yes)
>  tests += tst-compat-forwarder tst-audit-threads
> -tests-internal += tst-tls3 tst-tls3-malloc tst-tls5 tst-stackguard1
> +tests-internal += tst-tls3 tst-tls3-malloc tst-tls5 tst-stackguard1 tst-tls7
>  ifeq ($(have-z-execstack),yes)
>  tests += tst-execstack
>  endif
>  endif

Ok, although I think we can make it a non-internal test.

>  
> +one-hundred = $(foreach x,0 1 2 3 4 5 6 7 8 9, \
> +  0$x 1$x 2$x 3$x 4$x 5$x 6$x 7$x 8$x 9$x)
> +tst-tls7mod-deps = $(one-hundred:%=tst-tls7mod-dep-%.so)
> +
>  modules-names = tst-tls3mod \
>  		tst-tls5mod tst-tls5moda tst-tls5modb tst-tls5modc \
>  		tst-tls5modd tst-tls5mode tst-tls5modf tst-stack4mod \
>  		tst-execstack-mod \
>  		tst-compat-forwarder-mod tst-audit-threads-mod1 \
> -		tst-audit-threads-mod2
> +		tst-audit-threads-mod2 \
> +		tst-tls7mod tst-tls7mod-dep
>  extra-test-objs += $(addsuffix .os,$(strip $(modules-names))) \
> -		   tst-cleanup4aux.o tst-cleanupx4aux.o
> +		   tst-cleanup4aux.o tst-cleanupx4aux.o $(tst-tls7mod-deps)
>  test-extras += tst-cleanup4aux tst-cleanupx4aux
>  
>  tst-tls3mod.so-no-z-defs = yes

Ok.

> @@ -517,6 +522,12 @@ $(objpfx)tst-tls6.out: tst-tls6.sh $(objpfx)tst-tls5 \
>  	$(evaluate-test)
>  endif
>  
> +$(objpfx)tst-tls7: $(libdl) $(shared-thread-library)
> +$(objpfx)tst-tls7.out: $(objpfx)tst-tls7mod.so
> +$(objpfx)tst-tls7mod.so: $(tst-tls7mod-deps:%=$(objpfx)%)
> +$(tst-tls7mod-deps:%=$(objpfx)%): $(objpfx)tst-tls7mod-dep.so
> +	cp -f $< $@
> +
>  $(objpfx)tst-dlsym1: $(libdl) $(shared-thread-library)
>  
>  ifeq (yes,$(build-shared))

Ok.

> diff --git a/nptl/tst-tls7.c b/nptl/tst-tls7.c
> new file mode 100644
> index 0000000000..4b3f9005fe
> --- /dev/null
> +++ b/nptl/tst-tls7.c
> @@ -0,0 +1,69 @@
> +/* Test concurrent dlopen and pthread_create: BZ 19329.
> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <dlfcn.h>
> +#include <pthread.h>
> +#include <stdio.h>
> +#include <atomic.h>
> +
> +#include <support/xthread.h>
> +
> +#define THREADS 10000
> +
> +static volatile int done;

I think there is no need to define it volatile since it is always
accessed through atomic functions.

> +
> +static void *
> +start (void *a)
> +{
> +  if (dlopen ("tst-tls7mod.so", RTLD_LAZY) == NULL)

Use xdlopen.

> +    {
> +      printf ("dlopen failed: %s\n", dlerror ());
> +      exit (1);
> +    }
> +  atomic_store_relaxed (&done, 1);
> +  return 0;
> +}
> +
> +static void *
> +nop (void *a)
> +{
> +  return 0;
> +}
> +
> +static int
> +do_test (void)
> +{
> +  pthread_t t1, t2;
> +  int i;
> +
> +  /* Load a module with lots of dependencies and TLS.  */
> +  t1 = xpthread_create (0, start, 0);
> +
> +  /* Concurrently create lots of threads until dlopen is observably done.  */
> +  for (i = 0; i < THREADS && !atomic_load_relaxed (&done); i++)
> +    {
> +      t2 = xpthread_create (0, nop, 0);
> +      xpthread_join (t2);
> +    }
> +
> +  xpthread_join (t1);
> +  printf ("threads created during dlopen: %d\n", i);
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>

Ok.

> diff --git a/nptl/tst-tls7mod-dep.c b/nptl/tst-tls7mod-dep.c
> new file mode 100644
> index 0000000000..206ece4fb3
> --- /dev/null
> +++ b/nptl/tst-tls7mod-dep.c
> @@ -0,0 +1 @@
> +int __thread x;
> diff --git a/nptl/tst-tls7mod.c b/nptl/tst-tls7mod.c
> new file mode 100644
> index 0000000000..206ece4fb3
> --- /dev/null
> +++ b/nptl/tst-tls7mod.c
> @@ -0,0 +1 @@
> +int __thread x;
> 

Ok.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 04/15] Add a DTV setup test [BZ #27136]
  2021-02-15 11:59 ` [PATCH 04/15] Add a DTV setup test [BZ #27136] Szabolcs Nagy
@ 2021-04-02 19:35   ` Adhemerval Zanella
  0 siblings, 0 replies; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-02 19:35 UTC (permalink / raw)
  To: Szabolcs Nagy, libc-alpha



On 15/02/2021 08:59, Szabolcs Nagy via Libc-alpha wrote:
> The test relies on reusing slotinfo entries after dlclose which
> can result in non-monotonic increasing generation counters in
> the slotinfo list. It can trigger bug 27136.
> 
> The test requires large number of modules with TLS so the
> modules of tst-tls7 are used instead of duplicating them.

This patch needs to be moved after the one that actually fixes
BZ #27136.

I am consistently getting the error:

| mod[57]: &x = 0x7f1a340013f0
| mod[58]: &x = 0x7f1a340012b0
| mod[59]: &x = 0x7f1a340011f0
| mod[60]: &x = 0x7f1a340010b0
| mod[61]: &x = 0x7f1a340010f0
| mod[62]: &x = (nil)
| error: tst-tls8.c:62: dlsym failed: (null)
 
On multiple platforms (x86_64, arm, aarch64, and powerpc64le), and I
think this is the expected result on a platform where BZ #27136 is
not fixed.

Patch looks ok in general, some comments below.

> ---
>  nptl/Makefile          | 10 ++++-
>  nptl/tst-tls8.c        | 96 ++++++++++++++++++++++++++++++++++++++++++
>  nptl/tst-tls8mod-bad.c |  2 +
>  3 files changed, 106 insertions(+), 2 deletions(-)
>  create mode 100644 nptl/tst-tls8.c
>  create mode 100644 nptl/tst-tls8mod-bad.c
> 
> diff --git a/nptl/Makefile b/nptl/Makefile
> index 208629876d..a11b598efd 100644
> --- a/nptl/Makefile
> +++ b/nptl/Makefile
> @@ -354,7 +354,7 @@ LDFLAGS-pthread.so = -Wl,--enable-new-dtags,-z,nodelete,-z,initfirst
>  tests += tst-cancelx7 tst-cancelx17 tst-cleanupx4
>  
>  ifeq ($(build-shared),yes)
> -tests += tst-compat-forwarder tst-audit-threads
> +tests += tst-compat-forwarder tst-audit-threads tst-tls8
>  tests-internal += tst-tls3 tst-tls3-malloc tst-tls5 tst-stackguard1 tst-tls7
>  ifeq ($(have-z-execstack),yes)
>  tests += tst-execstack

Ok.

> @@ -371,7 +371,7 @@ modules-names = tst-tls3mod \
>  		tst-execstack-mod \
>  		tst-compat-forwarder-mod tst-audit-threads-mod1 \
>  		tst-audit-threads-mod2 \
> -		tst-tls7mod tst-tls7mod-dep
> +		tst-tls7mod tst-tls7mod-dep tst-tls8mod-bad
>  extra-test-objs += $(addsuffix .os,$(strip $(modules-names))) \
>  		   tst-cleanup4aux.o tst-cleanupx4aux.o $(tst-tls7mod-deps)
>  test-extras += tst-cleanup4aux tst-cleanupx4aux

Ok.

> @@ -528,6 +528,12 @@ $(objpfx)tst-tls7mod.so: $(tst-tls7mod-deps:%=$(objpfx)%)
>  $(tst-tls7mod-deps:%=$(objpfx)%): $(objpfx)tst-tls7mod-dep.so
>  	cp -f $< $@
>  
> +# Reuse tst-tls7mod-dep*.so.
> +tst-tls8mod-bad.so-no-z-defs = yes
> +$(objpfx)tst-tls8: $(libdl) $(shared-thread-library)
> +$(objpfx)tst-tls8.out: $(objpfx)tst-tls8mod-bad.so \
> +		       $(tst-tls7mod-deps:%=$(objpfx)%)
> +
>  $(objpfx)tst-dlsym1: $(libdl) $(shared-thread-library)
>  
>  ifeq (yes,$(build-shared))
> diff --git a/nptl/tst-tls8.c b/nptl/tst-tls8.c
> new file mode 100644
> index 0000000000..be7b64c9be
> --- /dev/null
> +++ b/nptl/tst-tls8.c
> @@ -0,0 +1,96 @@
> +/* Test dtv setup if entries don't have monoton increasing generation.

s/monoton/monotonic

> +   Copyright (C) 2021 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include <dlfcn.h>
> +#include <pthread.h>
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <support/check.h>
> +#include <support/xdlfcn.h>
> +#include <support/xthread.h>
> +
> +#define NMOD 100
> +static void *mod[NMOD];
> +
> +static void
> +load_fail (void)
> +{
> +  /* Expected to fail because of a missing symbol.  */
> +  void *m = dlopen ("tst-tls8mod-bad.so", RTLD_NOW);
> +  if (m != NULL)
> +    FAIL_EXIT1 ("dlopen of tst-tls8mod-bad.so succeeded\n");
> +}

Ok.

> +
> +static void
> +load_mod (int i)
> +{
> +  char buf[100];
> +  sprintf (buf, "tst-tls7mod-dep-%02d.so", i);

Maybe use snprintf or xasprintf here?

> +  mod[i] = xdlopen (buf, RTLD_LAZY);
> +}
> +
> +static void
> +unload_mod (int i)
> +{
> +  if (mod[i])

No implicit check.

> +    xdlclose (mod[i]);
> +  mod[i] = 0;

Use NULL here.

> +}
> +
> +static void
> +access (int i)
> +{
> +  dlerror ();
> +  int *p = dlsym (mod[i], "x");
> +  printf ("mod[%d]: &x = %p\n", i, p);
> +  if (p == NULL)
> +    FAIL_EXIT1 ("dlsym failed: %s\n", dlerror ());
> +  ++*p;
> +}
> +

Ok.

> +static void *
> +start (void *a)
> +{
> +  for (int i = 0; i < NMOD; i++)
> +    if (mod[i])

No implicit check.

> +      access (i);
> +  return 0;
> +}
> +
> +static int
> +do_test (void)
> +{
> +  int i;
> +
> +  for (i = 0; i < NMOD; i++)
> +    {
> +      load_mod (i);
> +      /* Bump the generation of mod[0] without using new dtv slot.  */
> +      unload_mod (0);
> +      load_fail (); /* Ensure GL(dl_tls_dtv_gaps) is true: see bug 27135.  */
> +      load_mod (0);
> +      /* Access TLS in all loaded modules.  */
> +      pthread_t t = xpthread_create (0, start, 0);
> +      xpthread_join (t);
> +    }
> +  for (i = 0; i < NMOD; i++)
> +    unload_mod (i);
> +  return 0;
> +}
> +
> +#include <support/test-driver.c>
Ok.

> diff --git a/nptl/tst-tls8mod-bad.c b/nptl/tst-tls8mod-bad.c
> new file mode 100644
> index 0000000000..c1aed8ea7d
> --- /dev/null
> +++ b/nptl/tst-tls8mod-bad.c
> @@ -0,0 +1,2 @@
> +void missing_symbol (void);
> +void f (void) {missing_symbol ();}

Use the usual format:

 void f (void)
 {
   missing_symbol ();
 }

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 05/15] elf: Fix a DTV setup issue [BZ #27136]
  2021-02-15 11:59 ` [PATCH 05/15] elf: Fix a DTV setup issue " Szabolcs Nagy
@ 2021-04-02 19:46   ` Adhemerval Zanella
  0 siblings, 0 replies; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-02 19:46 UTC (permalink / raw)
  To: libc-alpha, Szabolcs Nagy



On 15/02/2021 08:59, Szabolcs Nagy via Libc-alpha wrote:
> The max modid is a valid index in the dtv, it should not be skipped.
> 
> The bug is observable if the last module has modid == 64 and its
> generation is same or less than the max generation of the previous
> modules.  Then dtv[0].counter implies dtv[64] is initialized but
> it isn't. Fixes bug 27136.

LGTM, thanks.

I think it would be better to either squash the testcase into this patch
(which would require reworking the testcase make rules, since it uses
the objects from BZ #19329) or move the test after this patch.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

> ---
>  elf/dl-tls.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> index dd76829e74..79b93ad91b 100644
> --- a/elf/dl-tls.c
> +++ b/elf/dl-tls.c
> @@ -590,7 +590,7 @@ _dl_allocate_tls_init (void *result)
>  	}
>  
>        total += cnt;
> -      if (total >= GL(dl_tls_max_dtv_idx))
> +      if (total > GL(dl_tls_max_dtv_idx))
>  	break;
>  
>        listp = listp->next;
> 

Ok, it aligns with how dl_tls_max_dtv_idx is used in this file
(such as in _dl_count_modids).

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo
  2021-02-15 11:59 ` [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo Szabolcs Nagy
@ 2021-04-02 20:50   ` Adhemerval Zanella
  2021-04-06 15:48     ` Szabolcs Nagy
  0 siblings, 1 reply; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-02 20:50 UTC (permalink / raw)
  To: Szabolcs Nagy, libc-alpha



On 15/02/2021 08:59, Szabolcs Nagy via Libc-alpha wrote:
> From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
> 
> Since
> 
>   commit a509eb117fac1d764b15eba64993f4bdb63d7f3c
>   Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]
> 
> the generation counter update is not needed in the failure path.

It is not clear to me from just the commit reference why it would
be safe to remove the GL(dl_tls_generation) update on 
_dl_add_to_slotinfo.

The dl_open_worker calls update_tls_slotinfo, which in turn
might call _dl_add_to_slotinfo *after* the demarcation point.  Will
it terminate the process?

> ---
>  elf/dl-tls.c | 11 +----------
>  1 file changed, 1 insertion(+), 10 deletions(-)
> 
> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> index 79b93ad91b..24d00c14ef 100644
> --- a/elf/dl-tls.c
> +++ b/elf/dl-tls.c
> @@ -998,16 +998,7 @@ _dl_add_to_slotinfo (struct link_map *l, bool do_add)
>  		+ TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
>        if (listp == NULL)
>  	{
> -	  /* We ran out of memory.  We will simply fail this
> -	     call but don't undo anything we did so far.  The
> -	     application will crash or be terminated anyway very
> -	     soon.  */
> -
> -	  /* We have to do this since some entries in the dtv
> -	     slotinfo array might already point to this
> -	     generation.  */
> -	  ++GL(dl_tls_generation);
> -
> +	  /* We ran out of memory while resizing the dtv slotinfo list.  */
>  	  _dl_signal_error (ENOMEM, "dlopen", NULL, N_("\
>  cannot create TLS data structures"));
>  	}
> 

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 01/15] aarch64: free tlsdesc data on dlclose [BZ #27403]
  2021-04-01 12:57   ` Adhemerval Zanella
@ 2021-04-06 13:43     ` Szabolcs Nagy
  2021-04-06 16:52       ` Adhemerval Zanella
  0 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-04-06 13:43 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha

The 04/01/2021 09:57, Adhemerval Zanella wrote:
> On 15/02/2021 08:56, Szabolcs Nagy via Libc-alpha wrote:
> > DL_UNMAP_IS_SPECIAL and DL_UNMAP were not defined. The definitions are
> > now copied from arm, since the same is needed on aarch64. The cleanup
> > of tlsdesc data is handled by the custom _dl_unmap.
> > 
> > Fixes bug 27403.
> > ---
> >  sysdeps/aarch64/dl-lookupcfg.h | 27 +++++++++++++++++++++++++++
> >  1 file changed, 27 insertions(+)
> >  create mode 100644 sysdeps/aarch64/dl-lookupcfg.h
> > 
> > diff --git a/sysdeps/aarch64/dl-lookupcfg.h b/sysdeps/aarch64/dl-lookupcfg.h
> > new file mode 100644
> > index 0000000000..c1ae676ae1
> > --- /dev/null
> > +++ b/sysdeps/aarch64/dl-lookupcfg.h
> > @@ -0,0 +1,27 @@
> > +/* Configuration of lookup functions.
> > +   Copyright (C) 2006-2021 Free Software Foundation, Inc.
> > +   This file is part of the GNU C Library.
> > +
> > +   The GNU C Library is free software; you can redistribute it and/or
> > +   modify it under the terms of the GNU Lesser General Public
> > +   License as published by the Free Software Foundation; either
> > +   version 2.1 of the License, or (at your option) any later version.
> > +
> > +   The GNU C Library is distributed in the hope that it will be useful,
> > +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> > +   Lesser General Public License for more details.
> > +
> > +   You should have received a copy of the GNU Lesser General Public
> > +   License along with the GNU C Library.  If not, see
> > +   <https://www.gnu.org/licenses/>.  */
> > +
> > +#define DL_UNMAP_IS_SPECIAL
> > +
> > +#include_next <dl-lookupcfg.h>
> > +
> > +struct link_map;
> > +
> > +extern void _dl_unmap (struct link_map *map);
> > +
> > +#define DL_UNMAP(map) _dl_unmap (map)
> > 
> 
> The fix looks ok to me (aarch64 supports tlsdesc, but it does not
> call htab_delete), but _dl_unmap is replicated over aarch64,
> arm, i386, and x86_64 (the architectures that support tlsdesc).
> I think it would be simpler to add a define in linkmap.h to
> indicate the ABI supports tlsdesc and consolidate the _dl_unmap
> code that calls htab_delete into _dl_unmap_segments.
> 
> It would allow removing all the tlsdesc.c implementations (and
> dl-lookupcfg.h completely on some architectures) and simplify
> the required code if any other architecture decides to support
> tlsdesc.

i agree.

the last few patches allow even more refactoring
(by removing lazy tlsdesc code paths from x86)

i think consolidation should be separate from the bug
fixes, so i plan to commit this patch as is.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo
  2021-04-02 20:50   ` Adhemerval Zanella
@ 2021-04-06 15:48     ` Szabolcs Nagy
  2021-04-06 17:47       ` Adhemerval Zanella
  0 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-04-06 15:48 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha

The 04/02/2021 17:50, Adhemerval Zanella via Libc-alpha wrote:
> On 15/02/2021 08:59, Szabolcs Nagy via Libc-alpha wrote:
> > From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
> > 
> > Since
> > 
> >   commit a509eb117fac1d764b15eba64993f4bdb63d7f3c
> >   Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]
> > 
> > the generation counter update is not needed in the failure path.
> 
> It is not clear to me from just the commit reference why it would
> be safe to remove the GL(dl_tls_generation) update on 
> _dl_add_to_slotinfo.
> 
> The dl_open_worker calls update_tls_slotinfo, which in turn
> might call _dl_add_to_slotinfo *after* the demarcation point.  Will
> it terminate the process?

in that commit the logic got changed such that allocations
happen before the demarcation point in resize_tls_slotinfo.

this is the reason for the do_add bool argument in
_dl_add_to_slotinfo: it's called twice and the first call
with do_add==false is only there to ensure everything is
allocated before the demarcation point (so module loading
can be reverted, no need to bump the generation count).

i guess this is not visible by just looking at the
_dl_add_to_slotinfo code.

note that adding some asserts to ensure there is no allocation
when do_add==true does not work: rtld uses the same api, but
without the do_add==false preallocation step since at startup
time allocation failure is fatal anyway.

> > ---
> >  elf/dl-tls.c | 11 +----------
> >  1 file changed, 1 insertion(+), 10 deletions(-)
> > 
> > diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> > index 79b93ad91b..24d00c14ef 100644
> > --- a/elf/dl-tls.c
> > +++ b/elf/dl-tls.c
> > @@ -998,16 +998,7 @@ _dl_add_to_slotinfo (struct link_map *l, bool do_add)
> >  		+ TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
> >        if (listp == NULL)
> >  	{
> > -	  /* We ran out of memory.  We will simply fail this
> > -	     call but don't undo anything we did so far.  The
> > -	     application will crash or be terminated anyway very
> > -	     soon.  */
> > -
> > -	  /* We have to do this since some entries in the dtv
> > -	     slotinfo array might already point to this
> > -	     generation.  */
> > -	  ++GL(dl_tls_generation);
> > -
> > +	  /* We ran out of memory while resizing the dtv slotinfo list.  */
> >  	  _dl_signal_error (ENOMEM, "dlopen", NULL, N_("\
> >  cannot create TLS data structures"));
> >  	}

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 02/15] elf: Fix data race in _dl_name_match_p [BZ #21349]
  2021-04-01 14:01   ` Adhemerval Zanella
@ 2021-04-06 16:41     ` Szabolcs Nagy
  0 siblings, 0 replies; 40+ messages in thread
From: Szabolcs Nagy @ 2021-04-06 16:41 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha, Maninder Singh

The 04/01/2021 11:01, Adhemerval Zanella wrote:
> On 15/02/2021 08:56, Szabolcs Nagy via Libc-alpha wrote:
> > From: Maninder Singh <maninder1.s@samsung.com>
> > 
> > dlopen updates libname_list by writing to lastp->next, but concurrent
> > reads in _dl_name_match_p were not synchronized when it was called
> > without holding GL(dl_load_lock), which can happen during lazy symbol
> > resolution.
> > 
> > This patch fixes the race between _dl_name_match_p reading lastp->next
> > and add_name_to_object writing to it. This could cause segfault on
> > targets with weak memory order when lastp->next->name is read, which
> > was observed on an arm system. Fixes bug 21349.
> > 
> > (Code is from Maninder Singh, comments and description is from Szabolcs
> > Nagy.)
> > 
> > Co-authored-by: Szabolcs Nagy <szabolcs.nagy@arm.com>
> 
> I couldn't reproduce it with the example provided in the bugzilla (on both
> aarch64 and arm machines), but the explanation and the fix sound logical.
> I guess a testcase that exercises the issue will be hard to create.
> 
> LGTM, thanks.
> 
> Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>

thanks, committed at 395be7c2184645320c955b0ba214af9fa1ea9675

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 01/15] aarch64: free tlsdesc data on dlclose [BZ #27403]
  2021-04-06 13:43     ` Szabolcs Nagy
@ 2021-04-06 16:52       ` Adhemerval Zanella
  0 siblings, 0 replies; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-06 16:52 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: libc-alpha



On 06/04/2021 10:43, Szabolcs Nagy wrote:
> The 04/01/2021 09:57, Adhemerval Zanella wrote:
>> On 15/02/2021 08:56, Szabolcs Nagy via Libc-alpha wrote:
>>> DL_UNMAP_IS_SPECIAL and DL_UNMAP were not defined. The definitions are
>>> now copied from arm, since the same is needed on aarch64. The cleanup
>>> of tlsdesc data is handled by the custom _dl_unmap.
>>>
>>> Fixes bug 27403.
>>> ---
>>>  sysdeps/aarch64/dl-lookupcfg.h | 27 +++++++++++++++++++++++++++
>>>  1 file changed, 27 insertions(+)
>>>  create mode 100644 sysdeps/aarch64/dl-lookupcfg.h
>>>
>>> diff --git a/sysdeps/aarch64/dl-lookupcfg.h b/sysdeps/aarch64/dl-lookupcfg.h
>>> new file mode 100644
>>> index 0000000000..c1ae676ae1
>>> --- /dev/null
>>> +++ b/sysdeps/aarch64/dl-lookupcfg.h
>>> @@ -0,0 +1,27 @@
>>> +/* Configuration of lookup functions.
>>> +   Copyright (C) 2006-2021 Free Software Foundation, Inc.
>>> +   This file is part of the GNU C Library.
>>> +
>>> +   The GNU C Library is free software; you can redistribute it and/or
>>> +   modify it under the terms of the GNU Lesser General Public
>>> +   License as published by the Free Software Foundation; either
>>> +   version 2.1 of the License, or (at your option) any later version.
>>> +
>>> +   The GNU C Library is distributed in the hope that it will be useful,
>>> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
>>> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
>>> +   Lesser General Public License for more details.
>>> +
>>> +   You should have received a copy of the GNU Lesser General Public
>>> +   License along with the GNU C Library.  If not, see
>>> +   <https://www.gnu.org/licenses/>.  */
>>> +
>>> +#define DL_UNMAP_IS_SPECIAL
>>> +
>>> +#include_next <dl-lookupcfg.h>
>>> +
>>> +struct link_map;
>>> +
>>> +extern void _dl_unmap (struct link_map *map);
>>> +
>>> +#define DL_UNMAP(map) _dl_unmap (map)
>>>
>>
>> The fix looks ok to me (aarch64 supports tlsdesc, but it does not
>> call htab_delete), but _dl_unmap is replicated over aarch64,
>> arm, i386, and x86_64 (the architectures that support tlsdesc).
>> I think it would be simpler to add a define in linkmap.h to
>> indicate the ABI supports tlsdesc and consolidate the _dl_unmap
>> code that calls htab_delete into _dl_unmap_segments.
>>
>> It would allow removing all the tlsdesc.c implementations (and
>> dl-lookupcfg.h completely on some architectures) and simplify
>> the required code if any other architecture decides to support
>> tlsdesc.
> 
> i agree.
> 
> the last few patches allow even more refactoring
> (by removing lazy tlsdesc code paths from x86)
> 
> i think consolidation should be separate from the bug
> fixes, so i plan to commit this patch as is.

Right, but would you work on the possible refactor?  It should not be
that complicated and would prevent other architectures from incurring
the same aarch64 mistake.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo
  2021-04-06 15:48     ` Szabolcs Nagy
@ 2021-04-06 17:47       ` Adhemerval Zanella
  2021-04-07  7:57         ` Szabolcs Nagy
  0 siblings, 1 reply; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-06 17:47 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: libc-alpha



On 06/04/2021 12:48, Szabolcs Nagy wrote:
> The 04/02/2021 17:50, Adhemerval Zanella via Libc-alpha wrote:
>> On 15/02/2021 08:59, Szabolcs Nagy via Libc-alpha wrote:
>>> From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
>>>
>>> Since
>>>
>>>   commit a509eb117fac1d764b15eba64993f4bdb63d7f3c
>>>   Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]
>>>
>>> the generation counter update is not needed in the failure path.
>>
>> It is not clear to me from just the commit reference why it would
>> be safe to remove the GL(dl_tls_generation) update on 
>> _dl_add_to_slotinfo.
>>
>> The dl_open_worker calls update_tls_slotinfo, which in turn
>> might call _dl_add_to_slotinfo *after* the demarcation point.  Will
>> it terminate the process?
> 
> in that commit the logic got changed such that allocations
> happen before the demarcation point in resize_tls_slotinfo.
> 
> this is the reason for the do_add bool argument in
> _dl_add_to_slotinfo: it's called twice and the first call
> with do_add==false is only there to ensure everything is
> allocated before the demarcation point (so module loading
> can be reverted, no need to bump the generation count).
> 
> i guess this is not visible by just looking at the
> _dl_add_to_slotinfo code.

Right, so if I am reading correctly, once _dl_add_to_slotinfo (..., true)
is called by update_tls_slotinfo, the malloc that creates a new
dtv_slotinfo_list won't be called, since the entry was already
pre-allocated, and thus the search at lines 978-987 will find it.
Is that correct?

> 
> note that adding some asserts to ensure there is no allocation
> when do_add==true does not work: rtld uses the same api, but
> without the do_add==false preallocation step since at startup
> time allocation failure is fatal anyway.

Right, the _dl_signal_error will trigger a fatal error since
lcatch won't be overridden yet.

Thanks for the explanation.

> 
>>> ---
>>>  elf/dl-tls.c | 11 +----------
>>>  1 file changed, 1 insertion(+), 10 deletions(-)
>>>
>>> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
>>> index 79b93ad91b..24d00c14ef 100644
>>> --- a/elf/dl-tls.c
>>> +++ b/elf/dl-tls.c
>>> @@ -998,16 +998,7 @@ _dl_add_to_slotinfo (struct link_map *l, bool do_add)
>>>  		+ TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
>>>        if (listp == NULL)
>>>  	{
>>> -	  /* We ran out of memory.  We will simply fail this
>>> -	     call but don't undo anything we did so far.  The
>>> -	     application will crash or be terminated anyway very
>>> -	     soon.  */
>>> -
>>> -	  /* We have to do this since some entries in the dtv
>>> -	     slotinfo array might already point to this
>>> -	     generation.  */
>>> -	  ++GL(dl_tls_generation);
>>> -
>>> +	  /* We ran out of memory while resizing the dtv slotinfo list.  */
>>>  	  _dl_signal_error (ENOMEM, "dlopen", NULL, N_("\
>>>  cannot create TLS data structures"));
>>>  	}

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [PATCH 07/15] elf: Refactor _dl_update_slotinfo to avoid use after free
  2021-02-15 12:00 ` [PATCH 07/15] elf: Refactor _dl_update_slotinfo to avoid use after free Szabolcs Nagy
@ 2021-04-06 19:40   ` Adhemerval Zanella
  2021-04-07  8:01     ` Szabolcs Nagy
  2021-04-07 17:05   ` Adhemerval Zanella
  1 sibling, 1 reply; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-06 19:40 UTC (permalink / raw)
  To: Szabolcs Nagy, libc-alpha



On 15/02/2021 09:00, Szabolcs Nagy via Libc-alpha wrote:
> map is not valid to access here because it can be freed by a
> concurrent dlclose, so don't check the modid.

Won't it be protected by the recursive GL(dl_load_lock) in such case?
I think the concurrency issue is between dlopen and _dl_deallocate_tls
called by pthread stack handling (nptl/allocatestack.c).  Am I missing
something here?

> 
> The map == 0 and map != 0 code paths can be shared (avoiding
> the dtv resize in case of map == 0 is just an optimization:
> larger dtv than necessary would be fine too).
> ---
>  elf/dl-tls.c | 21 +++++----------------
>  1 file changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> index 24d00c14ef..f8b32b3ecb 100644
> --- a/elf/dl-tls.c
> +++ b/elf/dl-tls.c
> @@ -743,6 +743,8 @@ _dl_update_slotinfo (unsigned long int req_modid)
>  	{
>  	  for (size_t cnt = total == 0 ? 1 : 0; cnt < listp->len; ++cnt)
>  	    {
> +	      size_t modid = total + cnt;
> +
>  	      size_t gen = listp->slotinfo[cnt].gen;
>  
>  	      if (gen > new_gen)
> @@ -758,25 +760,12 @@ _dl_update_slotinfo (unsigned long int req_modid)
>  
>  	      /* If there is no map this means the entry is empty.  */
>  	      struct link_map *map = listp->slotinfo[cnt].map;
> -	      if (map == NULL)
> -		{
> -		  if (dtv[-1].counter >= total + cnt)
> -		    {
> -		      /* If this modid was used at some point the memory
> -			 might still be allocated.  */
> -		      free (dtv[total + cnt].pointer.to_free);
> -		      dtv[total + cnt].pointer.val = TLS_DTV_UNALLOCATED;
> -		      dtv[total + cnt].pointer.to_free = NULL;
> -		    }
> -
> -		  continue;
> -		}
> -
>  	      /* Check whether the current dtv array is large enough.  */
> -	      size_t modid = map->l_tls_modid;
> -	      assert (total + cnt == modid);
>  	      if (dtv[-1].counter < modid)
>  		{
> +		  if (map == NULL)
> +		    continue;
> +
>  		  /* Resize the dtv.  */
>  		  dtv = _dl_resize_dtv (dtv);
>  
> 


* Re: [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo
  2021-04-06 17:47       ` Adhemerval Zanella
@ 2021-04-07  7:57         ` Szabolcs Nagy
  2021-04-07 14:20           ` Adhemerval Zanella
  0 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-04-07  7:57 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha

The 04/06/2021 14:47, Adhemerval Zanella wrote:
> On 06/04/2021 12:48, Szabolcs Nagy wrote:
> > The 04/02/2021 17:50, Adhemerval Zanella via Libc-alpha wrote:
> >> On 15/02/2021 08:59, Szabolcs Nagy via Libc-alpha wrote:
> >>> From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
> >>>
> >>> Since
> >>>
> >>>   commit a509eb117fac1d764b15eba64993f4bdb63d7f3c
> >>>   Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]
> >>>
> >>> the generation counter update is not needed in the failure path.
> >>
> >> It is not clear to me from just the commit reference why it would
> >> be safe to remove the GL(dl_tls_generation) update on 
> >> _dl_add_to_slotinfo.
> >>
>> The dl_open_worker calls update_tls_slotinfo, which in turn
>> might call _dl_add_to_slotinfo *after* the demarcation point.  Will
> >> it terminate the process?
> > 
> > in that commit the logic got changed such that allocations
> > happen before the demarcation point in resize_tls_slotinfo.
> > 
> > this is the reason for the do_add bool argument in
> > _dl_add_to_slotinfo: it's called twice and the first call
> > with do_add==false is only there to ensure everything is
> > allocated before the demarcation point (so module loading
> > can be reverted, no need to bump the generation count).
> > 
> > i guess this is not visible by just looking at the
> > _dl_add_to_slotinfo code.
> 
> Right, so if I am reading this correctly: once _dl_add_to_slotinfo (..., true)
> is called by update_tls_slotinfo, the malloc that creates a new
> dtv_slotinfo_list won't be called, since the entry was already
> pre-allocated, and thus the search part at lines 978-987 will find it.
> Is that correct?

yes

if you have an idea how to make things clearer let me know.

> > 
> > note that adding some asserts to ensure there is no allocation
> > when do_add==true does not work: rtld uses the same api, but
> > without the do_add==false preallocation step since at startup
> > time allocation failure is fatal anyway.
> 
> Right, the _dl_signal_error will trigger a fatal error since
> lcatch won't be overridden yet.
> 
> Thanks for the explanation.
> 
> > 
> >>> ---
> >>>  elf/dl-tls.c | 11 +----------
> >>>  1 file changed, 1 insertion(+), 10 deletions(-)
> >>>
> >>> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> >>> index 79b93ad91b..24d00c14ef 100644
> >>> --- a/elf/dl-tls.c
> >>> +++ b/elf/dl-tls.c
> >>> @@ -998,16 +998,7 @@ _dl_add_to_slotinfo (struct link_map *l, bool do_add)
> >>>  		+ TLS_SLOTINFO_SURPLUS * sizeof (struct dtv_slotinfo));
> >>>        if (listp == NULL)
> >>>  	{
> >>> -	  /* We ran out of memory.  We will simply fail this
> >>> -	     call but don't undo anything we did so far.  The
> >>> -	     application will crash or be terminated anyway very
> >>> -	     soon.  */
> >>> -
> >>> -	  /* We have to do this since some entries in the dtv
> >>> -	     slotinfo array might already point to this
> >>> -	     generation.  */
> >>> -	  ++GL(dl_tls_generation);
> >>> -
> >>> +	  /* We ran out of memory while resizing the dtv slotinfo list.  */
> >>>  	  _dl_signal_error (ENOMEM, "dlopen", NULL, N_("\
> >>>  cannot create TLS data structures"));
> >>>  	}

-- 


* Re: [PATCH 07/15] elf: Refactor _dl_update_slotinfo to avoid use after free
  2021-04-06 19:40   ` Adhemerval Zanella
@ 2021-04-07  8:01     ` Szabolcs Nagy
  2021-04-07 14:28       ` Adhemerval Zanella
  0 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-04-07  8:01 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha

The 04/06/2021 16:40, Adhemerval Zanella wrote:
> On 15/02/2021 09:00, Szabolcs Nagy via Libc-alpha wrote:
> > map is not valid to access here because it can be freed by a
> > concurrent dlclose, so don't check the modid.
> 
> Won't it be protected by the recursive GL(dl_load_lock) in such case?
> I think the concurrency issue is between dlopen and _dl_deallocate_tls
> called by pthread stack handling (nptl/allocatestack.c).  Am I missing
> something here?


_dl_update_slotinfo is called both with and without
the dlopen lock held: during dynamic tls access
the lock is not held (see the __tls_get_addr path)

we cannot add a lock there because that would cause
new deadlocks; dealing with this is the tricky part
of the patchset.

> > 
> > The map == 0 and map != 0 code paths can be shared (avoiding
> > the dtv resize in case of map == 0 is just an optimization:
> > larger dtv than necessary would be fine too).
> > ---
> >  elf/dl-tls.c | 21 +++++----------------
> >  1 file changed, 5 insertions(+), 16 deletions(-)
> > 
> > diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> > index 24d00c14ef..f8b32b3ecb 100644
> > --- a/elf/dl-tls.c
> > +++ b/elf/dl-tls.c
> > @@ -743,6 +743,8 @@ _dl_update_slotinfo (unsigned long int req_modid)
> >  	{
> >  	  for (size_t cnt = total == 0 ? 1 : 0; cnt < listp->len; ++cnt)
> >  	    {
> > +	      size_t modid = total + cnt;
> > +
> >  	      size_t gen = listp->slotinfo[cnt].gen;
> >  
> >  	      if (gen > new_gen)
> > @@ -758,25 +760,12 @@ _dl_update_slotinfo (unsigned long int req_modid)
> >  
> >  	      /* If there is no map this means the entry is empty.  */
> >  	      struct link_map *map = listp->slotinfo[cnt].map;
> > -	      if (map == NULL)
> > -		{
> > -		  if (dtv[-1].counter >= total + cnt)
> > -		    {
> > -		      /* If this modid was used at some point the memory
> > -			 might still be allocated.  */
> > -		      free (dtv[total + cnt].pointer.to_free);
> > -		      dtv[total + cnt].pointer.val = TLS_DTV_UNALLOCATED;
> > -		      dtv[total + cnt].pointer.to_free = NULL;
> > -		    }
> > -
> > -		  continue;
> > -		}
> > -
> >  	      /* Check whether the current dtv array is large enough.  */
> > -	      size_t modid = map->l_tls_modid;
> > -	      assert (total + cnt == modid);
> >  	      if (dtv[-1].counter < modid)
> >  		{
> > +		  if (map == NULL)
> > +		    continue;
> > +
> >  		  /* Resize the dtv.  */
> >  		  dtv = _dl_resize_dtv (dtv);
> >  
> > 


* Re: [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo
  2021-04-07  7:57         ` Szabolcs Nagy
@ 2021-04-07 14:20           ` Adhemerval Zanella
  0 siblings, 0 replies; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-07 14:20 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: libc-alpha



On 07/04/2021 04:57, Szabolcs Nagy wrote:
> The 04/06/2021 14:47, Adhemerval Zanella wrote:
>> On 06/04/2021 12:48, Szabolcs Nagy wrote:
>>> The 04/02/2021 17:50, Adhemerval Zanella via Libc-alpha wrote:
>>>> On 15/02/2021 08:59, Szabolcs Nagy via Libc-alpha wrote:
>>>>> From: Szabolcs Nagy <szabolcs dot nagy at arm dot com>
>>>>>
>>>>> Since
>>>>>
>>>>>   commit a509eb117fac1d764b15eba64993f4bdb63d7f3c
>>>>>   Avoid late dlopen failure due to scope, TLS slotinfo updates [BZ #25112]
>>>>>
>>>>> the generation counter update is not needed in the failure path.
>>>>
>>>> It is not clear to me from just the commit reference why it would
>>>> be safe to remove the GL(dl_tls_generation) update on 
>>>> _dl_add_to_slotinfo.
>>>>
>>>> The dl_open_worker calls update_tls_slotinfo, which in turn
>>>> might call _dl_add_to_slotinfo *after* the demarcation point.  Will
>>>> it terminate the process?
>>>
>>> in that commit the logic got changed such that allocations
>>> happen before the demarcation point in resize_tls_slotinfo.
>>>
>>> this is the reason for the do_add bool argument in
>>> _dl_add_to_slotinfo: it's called twice and the first call
>>> with do_add==false is only there to ensure everything is
>>> allocated before the demarcation point (so module loading
>>> can be reverted, no need to bump the generation count).
>>>
>>> i guess this is not visible by just looking at the
>>> _dl_add_to_slotinfo code.
>>
>> Right, so if I am reading this correctly: once _dl_add_to_slotinfo (..., true)
>> is called by update_tls_slotinfo, the malloc that creates a new
>> dtv_slotinfo_list won't be called, since the entry was already
>> pre-allocated, and thus the search part at lines 978-987 will find it.
>> Is that correct?
> 
> yes
> 
> if you have an idea how to make things clearer let me know.
> 

Maybe add it to the commit message.  The patch looks OK to me,
thanks.


* Re: [PATCH 07/15] elf: Refactor _dl_update_slotinfo to avoid use after free
  2021-04-07  8:01     ` Szabolcs Nagy
@ 2021-04-07 14:28       ` Adhemerval Zanella
  2021-04-07 14:36         ` Adhemerval Zanella
  0 siblings, 1 reply; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-07 14:28 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: libc-alpha



On 07/04/2021 05:01, Szabolcs Nagy wrote:
> The 04/06/2021 16:40, Adhemerval Zanella wrote:
>> On 15/02/2021 09:00, Szabolcs Nagy via Libc-alpha wrote:
>>> map is not valid to access here because it can be freed by a
>>> concurrent dlclose, so don't check the modid.
>>
>> Won't it be protected by the recursive GL(dl_load_lock) in such case?
>> I think the concurrency issue is between dlopen and _dl_deallocate_tls
>> called by pthread stack handling (nptl/allocatestack.c).  Am I missing
>> something here?
> 
> 
> _dl_update_slotinfo is called both with and without
> the dlopen lock held: during dynamic tls access
> the lock is not held (see the __tls_get_addr path)


Right, revising the patch I mapped the calls (not sure if it is 
fully complete):

| _dl_open
|   __rtld_lock_lock_recursive (GL(dl_load_lock));
|   dl_open_worker
|     update_tls_slotinfo
|       _dl_update_slotinfo
|   __rtld_lock_unlock_recursive (GL(dl_load_lock));
  
| __tls_get_addr   
|   update_get_addr
|     _dl_update_slotinfo 

| rtld
|   _dl_resolve_conflicts
|     elf_machine_rela
|       TRY_STATIC_TLS
|         _dl_try_allocate_static_tls
|          _dl_update_slotinfo
|    
|    elf_machine_rela 
|      CHECK_STATIC_TLS   
|        _dl_allocate_static_tls
|          _dl_try_allocate_static_tls
|            _dl_update_slotinfo

The rtld part should not matter, since it runs before threads
are supported.

> 
> we cannot add a lock there because that would cause
> new deadlocks, dealing with this is the tricky part
> of the patchset.

I understand this patch from the previous discussion about it.  The
part that is confusing me is "because it can be freed by a concurrent
dlclose".  My understanding is that '_dl_deallocate_tls' might be called
at thread exit / deallocation without the GL(dl_load_lock) (which
is a potential issue); what I can't see is how a concurrent dlclose
might trigger this issue (since it should be synchronized with dlopen
through the lock).


> 
>>>
>>> The map == 0 and map != 0 code paths can be shared (avoiding
>>> the dtv resize in case of map == 0 is just an optimization:
>>> larger dtv than necessary would be fine too).
>>> ---
>>>  elf/dl-tls.c | 21 +++++----------------
>>>  1 file changed, 5 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
>>> index 24d00c14ef..f8b32b3ecb 100644
>>> --- a/elf/dl-tls.c
>>> +++ b/elf/dl-tls.c
>>> @@ -743,6 +743,8 @@ _dl_update_slotinfo (unsigned long int req_modid)
>>>  	{
>>>  	  for (size_t cnt = total == 0 ? 1 : 0; cnt < listp->len; ++cnt)
>>>  	    {
>>> +	      size_t modid = total + cnt;
>>> +
>>>  	      size_t gen = listp->slotinfo[cnt].gen;
>>>  
>>>  	      if (gen > new_gen)
>>> @@ -758,25 +760,12 @@ _dl_update_slotinfo (unsigned long int req_modid)
>>>  
>>>  	      /* If there is no map this means the entry is empty.  */
>>>  	      struct link_map *map = listp->slotinfo[cnt].map;
>>> -	      if (map == NULL)
>>> -		{
>>> -		  if (dtv[-1].counter >= total + cnt)
>>> -		    {
>>> -		      /* If this modid was used at some point the memory
>>> -			 might still be allocated.  */
>>> -		      free (dtv[total + cnt].pointer.to_free);
>>> -		      dtv[total + cnt].pointer.val = TLS_DTV_UNALLOCATED;
>>> -		      dtv[total + cnt].pointer.to_free = NULL;
>>> -		    }
>>> -
>>> -		  continue;
>>> -		}
>>> -
>>>  	      /* Check whether the current dtv array is large enough.  */
>>> -	      size_t modid = map->l_tls_modid;
>>> -	      assert (total + cnt == modid);
>>>  	      if (dtv[-1].counter < modid)
>>>  		{
>>> +		  if (map == NULL)
>>> +		    continue;
>>> +
>>>  		  /* Resize the dtv.  */
>>>  		  dtv = _dl_resize_dtv (dtv);
>>>  
>>>


* Re: [PATCH 07/15] elf: Refactor _dl_update_slotinfo to avoid use after free
  2021-04-07 14:28       ` Adhemerval Zanella
@ 2021-04-07 14:36         ` Adhemerval Zanella
  0 siblings, 0 replies; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-07 14:36 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: libc-alpha



On 07/04/2021 11:28, Adhemerval Zanella wrote:
> 
> 
> On 07/04/2021 05:01, Szabolcs Nagy wrote:
>> The 04/06/2021 16:40, Adhemerval Zanella wrote:
>>> On 15/02/2021 09:00, Szabolcs Nagy via Libc-alpha wrote:
>>>> map is not valid to access here because it can be freed by a
>>>> concurrent dlclose, so don't check the modid.
>>>
>>> Won't it be protected by the recursive GL(dl_load_lock) in such case?
>>> I think the concurrency issue is between dlopen and _dl_deallocate_tls
>>> called by pthread stack handling (nptl/allocatestack.c).  Am I missing
>>> something here?
>>
>>
>> _dl_update_slotinfo is called both with and without
>> the dlopen lock held: during dynamic tls access
>> the lock is not held (see the __tls_get_addr path)
> 
> 
> Right, revising the patch I mapped the calls (not sure if it is 
> fully complete):
> 
> | _dl_open
> |   __rtld_lock_lock_recursive (GL(dl_load_lock));
> |   dl_open_worker
> |     update_tls_slotinfo
> |       _dl_update_slotinfo
> |   __rtld_lock_unlock_recursive (GL(dl_load_lock));
>   
> | __tls_get_addr   
> |   update_get_addr
> |     _dl_update_slotinfo 
> 
> | rtld
> |   _dl_resolve_conflicts
> |     elf_machine_rela
> |       TRY_STATIC_TLS
> |         _dl_try_allocate_static_tls
> |          _dl_update_slotinfo
> |    
> |    elf_machine_rela 
> |      CHECK_STATIC_TLS   
> |        _dl_allocate_static_tls
> |          _dl_try_allocate_static_tls
> |            _dl_update_slotinfo
> 
> The rtld part should not matter, since it runs before threads
> are supported.
> 
>>
>> we cannot add a lock there because that would cause
>> new deadlocks, dealing with this is the tricky part
>> of the patchset.
> 
> I understand this patch from the previous discussion about it.  The
> part that is confusing me is "because it can be freed by a concurrent
> dlclose".  My understanding is that '_dl_deallocate_tls' might be called
> at thread exit / deallocation without the GL(dl_load_lock) (which
> is a potential issue); what I can't see is how a concurrent dlclose 
> might trigger this issue (since it should be synchronized with dlopen
> through the lock).

I think I got what you meant: the concurrency issue is not related to 
dlopen itself, but rather to __tls_get_addr and dlclose.  Maybe make this
explicit in the commit message.


* Re: [PATCH 07/15] elf: Refactor _dl_update_slotinfo to avoid use after free
  2021-02-15 12:00 ` [PATCH 07/15] elf: Refactor _dl_update_slotinfo to avoid use after free Szabolcs Nagy
  2021-04-06 19:40   ` Adhemerval Zanella
@ 2021-04-07 17:05   ` Adhemerval Zanella
  1 sibling, 0 replies; 40+ messages in thread
From: Adhemerval Zanella @ 2021-04-07 17:05 UTC (permalink / raw)
  To: libc-alpha



On 15/02/2021 09:00, Szabolcs Nagy via Libc-alpha wrote:
> map is not valid to access here because it can be freed by a
> concurrent dlclose, so don't check the modid.
> 
> The map == 0 and map != 0 code paths can be shared (avoiding
> the dtv resize in case of map == 0 is just an optimization:
> larger dtv than necessary would be fine too).

Please extend the patch description a bit and add that __tls_get_addr
is the public interface that shows the concurrency issue with dlclose.

The patch looks ok, thanks.

Reviewed-by: Adhemerval Zanella  <adhemerval.zanella@linaro.org>

> ---
>  elf/dl-tls.c | 21 +++++----------------
>  1 file changed, 5 insertions(+), 16 deletions(-)
> 
> diff --git a/elf/dl-tls.c b/elf/dl-tls.c
> index 24d00c14ef..f8b32b3ecb 100644
> --- a/elf/dl-tls.c
> +++ b/elf/dl-tls.c
> @@ -743,6 +743,8 @@ _dl_update_slotinfo (unsigned long int req_modid)
>  	{
>  	  for (size_t cnt = total == 0 ? 1 : 0; cnt < listp->len; ++cnt)
>  	    {
> +	      size_t modid = total + cnt;
> +
>  	      size_t gen = listp->slotinfo[cnt].gen;
>  
>  	      if (gen > new_gen)
> @@ -758,25 +760,12 @@ _dl_update_slotinfo (unsigned long int req_modid)
>  
>  	      /* If there is no map this means the entry is empty.  */
>  	      struct link_map *map = listp->slotinfo[cnt].map;
> -	      if (map == NULL)
> -		{
> -		  if (dtv[-1].counter >= total + cnt)
> -		    {
> -		      /* If this modid was used at some point the memory
> -			 might still be allocated.  */
> -		      free (dtv[total + cnt].pointer.to_free);
> -		      dtv[total + cnt].pointer.val = TLS_DTV_UNALLOCATED;
> -		      dtv[total + cnt].pointer.to_free = NULL;
> -		    }
> -
> -		  continue;
> -		}
> -
>  	      /* Check whether the current dtv array is large enough.  */
> -	      size_t modid = map->l_tls_modid;
> -	      assert (total + cnt == modid);
>  	      if (dtv[-1].counter < modid)
>  		{
> +		  if (map == NULL)
> +		    continue;
> +
>  		  /* Resize the dtv.  */
>  		  dtv = _dl_resize_dtv (dtv);
>  
> 


* Re: [PATCH 11/15] x86_64: Avoid lazy relocation of tlsdesc [BZ #27137]
  2021-02-15 12:02 ` [PATCH 11/15] x86_64: Avoid lazy relocation of tlsdesc [BZ #27137] Szabolcs Nagy
@ 2021-04-09  0:14   ` Ben Woodard
  2021-04-09 13:38     ` Szabolcs Nagy
  0 siblings, 1 reply; 40+ messages in thread
From: Ben Woodard @ 2021-04-09  0:14 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: libc-alpha

Don’t you also need to modify elf_machine_runtime_setup?  It also has a reference to _dl_tlsdesc_resolve_rela that becomes undefined when you try to compile with your patch set, including patch 13 where you remove the code.

To make a test build I just commented it out, but I think that this patch should remove that if statement as well.

diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
index 9a876a371e..2b1b36a739 100644
--- a/sysdeps/x86_64/dl-machine.h
+++ b/sysdeps/x86_64/dl-machine.h
@@ -127,9 +127,11 @@ elf_machine_runtime_setup (struct link_map *l, int lazy, int profile)
        }
     }

-  if (l->l_info[ADDRIDX (DT_TLSDESC_GOT)] && lazy)
-    *(ElfW(Addr)*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_GOT)]) + l->l_addr)
-      = (ElfW(Addr)) &_dl_tlsdesc_resolve_rela;
+  /* Lazy binding of TLSDESC relocations is no longer done so this logic
+     won't apply */
+  /* if (l->l_info[ADDRIDX (DT_TLSDESC_GOT)] && lazy) */
+  /*   *(ElfW(Addr)*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_GOT)]) + l->l_addr) */
+  /*     = (ElfW(Addr)) &_dl_tlsdesc_resolve_rela; */

   return lazy;
 }


> On Feb 15, 2021, at 4:02 AM, Szabolcs Nagy via Libc-alpha <libc-alpha@sourceware.org> wrote:
> 
> Lazy tlsdesc relocation is racy because the static tls optimization and
> tlsdesc management operations are done without holding the dlopen lock.
> 
> This is similar to commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67
> for aarch64, but it fixes a different race: bug 27137.
> ---
> sysdeps/x86_64/dl-machine.h | 19 ++++++++++++++-----
> 1 file changed, 14 insertions(+), 5 deletions(-)
> 
> diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
> index 103eee6c3f..9a876a371e 100644
> --- a/sysdeps/x86_64/dl-machine.h
> +++ b/sysdeps/x86_64/dl-machine.h
> @@ -570,12 +570,21 @@ elf_machine_lazy_rel (struct link_map *map,
>     }
>   else if (__glibc_likely (r_type == R_X86_64_TLSDESC))
>     {
> -      struct tlsdesc volatile * __attribute__((__unused__)) td =
> -	(struct tlsdesc volatile *)reloc_addr;
> +      const Elf_Symndx symndx = ELFW (R_SYM) (reloc->r_info);
> +      const ElfW (Sym) *symtab = (const void *)D_PTR (map, l_info[DT_SYMTAB]);
> +      const ElfW (Sym) *sym = &symtab[symndx];
> +      const struct r_found_version *version = NULL;
> 
> -      td->arg = (void*)reloc;
> -      td->entry = (void*)(D_PTR (map, l_info[ADDRIDX (DT_TLSDESC_PLT)])
> -			  + map->l_addr);
> +      if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
> +	{
> +	  const ElfW (Half) *vernum =
> +	    (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
> +	  version = &map->l_versions[vernum[symndx] & 0x7fff];
> +	}
> +
> +      /* Always initialize TLS descriptors completely at load time, in
> +	 case static TLS is allocated for it that requires locking.  */
> +      elf_machine_rela (map, reloc, sym, version, reloc_addr, skip_ifunc);
>     }
>   else if (__glibc_unlikely (r_type == R_X86_64_IRELATIVE))
>     {
> -- 
> 2.17.1
> 



* Re: [PATCH 11/15] x86_64: Avoid lazy relocation of tlsdesc [BZ #27137]
  2021-04-09  0:14   ` Ben Woodard
@ 2021-04-09 13:38     ` Szabolcs Nagy
  2021-04-09 14:55       ` Ben Woodard
  0 siblings, 1 reply; 40+ messages in thread
From: Szabolcs Nagy @ 2021-04-09 13:38 UTC (permalink / raw)
  To: Ben Woodard; +Cc: libc-alpha

The 04/08/2021 17:14, Ben Woodard wrote:
> Don’t you also need to modify elf_machine_runtime_setup?  It also has a reference to _dl_tlsdesc_resolve_rela that becomes undefined when you try to compile with your patch set, including patch 13 where you remove the code.
> 
> To make a test build I just commented it out, but I think that this patch should remove that if statement as well.

thanks,
indeed this was wrong, i tested the wrong branch on x86_64.

i will fix this and post a v2 set with the other feedback.

> 
> diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
> index 9a876a371e..2b1b36a739 100644
> --- a/sysdeps/x86_64/dl-machine.h
> +++ b/sysdeps/x86_64/dl-machine.h
> @@ -127,9 +127,11 @@ elf_machine_runtime_setup (struct link_map *l, int lazy, int profile)
>         }
>      }
> 
> -  if (l->l_info[ADDRIDX (DT_TLSDESC_GOT)] && lazy)
> -    *(ElfW(Addr)*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_GOT)]) + l->l_addr)
> -      = (ElfW(Addr)) &_dl_tlsdesc_resolve_rela;
> +  /* Lazy binding of TLSDESC relocations is no longer done so this logic
> +     won't apply */
> +  /* if (l->l_info[ADDRIDX (DT_TLSDESC_GOT)] && lazy) */
> +  /*   *(ElfW(Addr)*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_GOT)]) + l->l_addr) */
> +  /*     = (ElfW(Addr)) &_dl_tlsdesc_resolve_rela; */
> 
>    return lazy;
>  }
> 
> 
> > On Feb 15, 2021, at 4:02 AM, Szabolcs Nagy via Libc-alpha <libc-alpha@sourceware.org> wrote:
> > 
> > Lazy tlsdesc relocation is racy because the static tls optimization and
> > tlsdesc management operations are done without holding the dlopen lock.
> > 
> > This similar to the commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67
> > for aarch64, but it fixes a different race: bug 27137.
> > ---
> > sysdeps/x86_64/dl-machine.h | 19 ++++++++++++++-----
> > 1 file changed, 14 insertions(+), 5 deletions(-)
> > 
> > diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
> > index 103eee6c3f..9a876a371e 100644
> > --- a/sysdeps/x86_64/dl-machine.h
> > +++ b/sysdeps/x86_64/dl-machine.h
> > @@ -570,12 +570,21 @@ elf_machine_lazy_rel (struct link_map *map,
> >     }
> >   else if (__glibc_likely (r_type == R_X86_64_TLSDESC))
> >     {
> > -      struct tlsdesc volatile * __attribute__((__unused__)) td =
> > -	(struct tlsdesc volatile *)reloc_addr;
> > +      const Elf_Symndx symndx = ELFW (R_SYM) (reloc->r_info);
> > +      const ElfW (Sym) *symtab = (const void *)D_PTR (map, l_info[DT_SYMTAB]);
> > +      const ElfW (Sym) *sym = &symtab[symndx];
> > +      const struct r_found_version *version = NULL;
> > 
> > -      td->arg = (void*)reloc;
> > -      td->entry = (void*)(D_PTR (map, l_info[ADDRIDX (DT_TLSDESC_PLT)])
> > -			  + map->l_addr);
> > +      if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
> > +	{
> > +	  const ElfW (Half) *vernum =
> > +	    (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
> > +	  version = &map->l_versions[vernum[symndx] & 0x7fff];
> > +	}
> > +
> > +      /* Always initialize TLS descriptors completely at load time, in
> > +	 case static TLS is allocated for it that requires locking.  */
> > +      elf_machine_rela (map, reloc, sym, version, reloc_addr, skip_ifunc);
> >     }
> >   else if (__glibc_unlikely (r_type == R_X86_64_IRELATIVE))
> >     {
> > -- 
> > 2.17.1
> > 
> 

-- 


* Re: [PATCH 11/15] x86_64: Avoid lazy relocation of tlsdesc [BZ #27137]
  2021-04-09 13:38     ` Szabolcs Nagy
@ 2021-04-09 14:55       ` Ben Woodard
  0 siblings, 0 replies; 40+ messages in thread
From: Ben Woodard @ 2021-04-09 14:55 UTC (permalink / raw)
  To: Szabolcs Nagy; +Cc: libc-alpha



> On Apr 9, 2021, at 6:38 AM, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:
> 
> The 04/08/2021 17:14, Ben Woodard wrote:
>> Don’t you also need to modify elf_machine_runtime_setup?  It also has a reference to _dl_tlsdesc_resolve_rela that becomes undefined when you try to compile with your patch set, including patch 13 where you remove the code.
>> 
>> To make a test build I just commented it out, but I think that this patch should remove that if statement as well.
> 
> thanks,
> indeed this was wrong, i tested the wrong branch on x86_64.
> 
> i will fix this and post a v2 set with the other feedback.

On the positive side: I’ve been tracking down a problem where a library compiled with the gnu2 variant of TLS (in a way that I haven’t been able to reproduce yet) is crashing the dynamic loader when used with a performance tool that uses LD_AUDIT. 

This patch alone (with my tiny modification below) addresses the problem. I say “addresses” because it doesn’t exactly fix the problem; it makes it so that the buggy code isn’t run. Patch 13 in your patch set is what removes that code. 

I see that patches 1 and 2 of your patch set have already been committed. I would encourage you to consider committing V2 of patches 11 and 13 (or maybe 11-14) even before the rest of the patch set, since they address a bug that we are seeing in the wild.

-ben

> 
>> 
>> diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
>> index 9a876a371e..2b1b36a739 100644
>> --- a/sysdeps/x86_64/dl-machine.h
>> +++ b/sysdeps/x86_64/dl-machine.h
>> @@ -127,9 +127,11 @@ elf_machine_runtime_setup (struct link_map *l, int lazy, int profile)
>>        }
>>     }
>> 
>> -  if (l->l_info[ADDRIDX (DT_TLSDESC_GOT)] && lazy)
>> -    *(ElfW(Addr)*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_GOT)]) + l->l_addr)
>> -      = (ElfW(Addr)) &_dl_tlsdesc_resolve_rela;
>> +  /* Lazy binding of TLSDESC relocations is no longer done so this logic
>> +     won't apply */
>> +  /* if (l->l_info[ADDRIDX (DT_TLSDESC_GOT)] && lazy) */
>> +  /*   *(ElfW(Addr)*)(D_PTR (l, l_info[ADDRIDX (DT_TLSDESC_GOT)]) + l->l_addr) */
>> +  /*     = (ElfW(Addr)) &_dl_tlsdesc_resolve_rela; */
>> 
>>   return lazy;
>> }
>> 
>> 
>>> On Feb 15, 2021, at 4:02 AM, Szabolcs Nagy via Libc-alpha <libc-alpha@sourceware.org> wrote:
>>> 
>>> Lazy tlsdesc relocation is racy because the static tls optimization and
>>> tlsdesc management operations are done without holding the dlopen lock.
>>> 
>>> This similar to the commit b7cf203b5c17dd6d9878537d41e0c7cc3d270a67
>>> for aarch64, but it fixes a different race: bug 27137.
>>> ---
>>> sysdeps/x86_64/dl-machine.h | 19 ++++++++++++++-----
>>> 1 file changed, 14 insertions(+), 5 deletions(-)
>>> 
>>> diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h
>>> index 103eee6c3f..9a876a371e 100644
>>> --- a/sysdeps/x86_64/dl-machine.h
>>> +++ b/sysdeps/x86_64/dl-machine.h
>>> @@ -570,12 +570,21 @@ elf_machine_lazy_rel (struct link_map *map,
>>>    }
>>>  else if (__glibc_likely (r_type == R_X86_64_TLSDESC))
>>>    {
>>> -      struct tlsdesc volatile * __attribute__((__unused__)) td =
>>> -	(struct tlsdesc volatile *)reloc_addr;
>>> +      const Elf_Symndx symndx = ELFW (R_SYM) (reloc->r_info);
>>> +      const ElfW (Sym) *symtab = (const void *)D_PTR (map, l_info[DT_SYMTAB]);
>>> +      const ElfW (Sym) *sym = &symtab[symndx];
>>> +      const struct r_found_version *version = NULL;
>>> 
>>> -      td->arg = (void*)reloc;
>>> -      td->entry = (void*)(D_PTR (map, l_info[ADDRIDX (DT_TLSDESC_PLT)])
>>> -			  + map->l_addr);
>>> +      if (map->l_info[VERSYMIDX (DT_VERSYM)] != NULL)
>>> +	{
>>> +	  const ElfW (Half) *vernum =
>>> +	    (const void *)D_PTR (map, l_info[VERSYMIDX (DT_VERSYM)]);
>>> +	  version = &map->l_versions[vernum[symndx] & 0x7fff];
>>> +	}
>>> +
>>> +      /* Always initialize TLS descriptors completely at load time, in
>>> +	 case static TLS is allocated for it that requires locking.  */
>>> +      elf_machine_rela (map, reloc, sym, version, reloc_addr, skip_ifunc);
>>>    }
>>>  else if (__glibc_unlikely (r_type == R_X86_64_IRELATIVE))
>>>    {
>>> -- 
>>> 2.17.1
>>> 
>> 
> 
> -- 



Thread overview: 40+ messages
2021-02-15 11:56 [PATCH 00/15] Dynamic TLS related data race fixes Szabolcs Nagy
2021-02-15 11:56 ` [PATCH 01/15] aarch64: free tlsdesc data on dlclose [BZ #27403] Szabolcs Nagy
2021-04-01 12:57   ` Adhemerval Zanella
2021-04-06 13:43     ` Szabolcs Nagy
2021-04-06 16:52       ` Adhemerval Zanella
2021-02-15 11:56 ` [PATCH 02/15] elf: Fix data race in _dl_name_match_p [BZ #21349] Szabolcs Nagy
2021-04-01 14:01   ` Adhemerval Zanella
2021-04-06 16:41     ` Szabolcs Nagy
2021-02-15 11:57 ` [PATCH 03/15] Add test case for [BZ #19329] Szabolcs Nagy
2021-04-02 19:10   ` Adhemerval Zanella
2021-02-15 11:59 ` [PATCH 04/15] Add a DTV setup test [BZ #27136] Szabolcs Nagy
2021-04-02 19:35   ` Adhemerval Zanella
2021-02-15 11:59 ` [PATCH 05/15] elf: Fix a DTV setup issue " Szabolcs Nagy
2021-04-02 19:46   ` Adhemerval Zanella
2021-02-15 11:59 ` [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo Szabolcs Nagy
2021-04-02 20:50   ` Adhemerval Zanella
2021-04-06 15:48     ` Szabolcs Nagy
2021-04-06 17:47       ` Adhemerval Zanella
2021-04-07  7:57         ` Szabolcs Nagy
2021-04-07 14:20           ` Adhemerval Zanella
2021-02-15 12:00 ` [PATCH 07/15] elf: Refactor _dl_update_slotinfo to avoid use after free Szabolcs Nagy
2021-04-06 19:40   ` Adhemerval Zanella
2021-04-07  8:01     ` Szabolcs Nagy
2021-04-07 14:28       ` Adhemerval Zanella
2021-04-07 14:36         ` Adhemerval Zanella
2021-04-07 17:05   ` Adhemerval Zanella
2021-02-15 12:01 ` [PATCH 08/15] elf: Fix data races in pthread_create and TLS access [BZ #19329] Szabolcs Nagy
2021-02-15 12:01 ` [PATCH 09/15] elf: Use relaxed atomics for racy accesses " Szabolcs Nagy
2021-02-15 12:01 ` [PATCH 10/15] elf: Fix DTV gap reuse logic [BZ #27135] Szabolcs Nagy
2021-02-15 12:02 ` [PATCH 11/15] x86_64: Avoid lazy relocation of tlsdesc [BZ #27137] Szabolcs Nagy
2021-04-09  0:14   ` Ben Woodard
2021-04-09 13:38     ` Szabolcs Nagy
2021-04-09 14:55       ` Ben Woodard
2021-02-15 12:02 ` [PATCH 12/15] i386: " Szabolcs Nagy
2021-02-15 12:02 ` [PATCH 13/15] x86_64: Remove lazy tlsdesc relocation related code Szabolcs Nagy
2021-02-15 12:03 ` [PATCH 14/15] i386: " Szabolcs Nagy
2021-02-15 12:03 ` [PATCH 15/15] elf: " Szabolcs Nagy
2021-02-15 12:08 ` [PATCH 03/15] Add test case for [BZ #19329] Szabolcs Nagy
2021-02-15 12:08 ` [PATCH 06/15] elf: Fix comments and logic in _dl_add_to_slotinfo Szabolcs Nagy
     [not found] ` <CGME20210215115731epcas5p45614957debad2f679230d0bd1efbd57f@epcms5p7>
2021-02-15 12:11   ` [PATCH 02/15] elf: Fix data race in _dl_name_match_p [BZ #21349] Maninder Singh
