* [PATCH v2 00/19] Fix various NPTL synchronization issues
@ 2021-08-23 19:50 Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 01/19] nptl: Fix tst-cancel7 and tst-cancelx7 race condition (BZ #14232) Adhemerval Zanella
` (19 more replies)
0 siblings, 20 replies; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
This is an update of my previous set to fix some NPTL issues [1].
The main changes are:
- Rebased against master and adjusted the __clone_internal usage.
- Adapted Florian's ESRCH fixes [2]
- Add fixes for various function that access the 'tid'.
Patch 01 to 03 are general nptl fixes and they are independent of the
other fixes.
Patch 04 is the main change of this patchset, it uses a different
field instead of the pthread 'tid' to synchrnonize the internal
thread state (BZ#19951).
It allows to both move the thread setxid internal state out of
'cancelhandling' (used on setuid() call in multi-thread information),
and remove the EXITING_BIT and TERMINATED_BIT (since 'joinstate' now
track such it). This is done on patch 05 and 06.
Patches 08 and 09 fixes two long standing issues regarding
pthread_kill() and thread exit (BZ#12889 and BZ#19193). Now that]
'tid' is setting explicitly by pthread_create(), a simple lock can be
used instead of more complex futex operation.
Patches 10 to 18 extend the same 'tid' access fix to other pthread
functions that uses the member.
[1] https://patchwork.sourceware.org/project/glibc/list/?series=2304
[2] https://patchwork.sourceware.org/project/glibc/list/?series=2696
Adhemerval Zanella (16):
nptl: Fix tst-cancel7 and tst-cancelx7 race condition (BZ #14232)
nptl: Set cancellation type and state on pthread_exit
nptl: Handle robust PI mutexes for !__ASSUME_SET_ROBUST_LIST
nptl: Do not use pthread set_tid_address as state synchronization (BZ
#19951)
nptl: Move setxid flag out of cancelhandling
nptl: Replace struct thread cancelhandling field
nptl: Use tidlock when accessing TID on pthread_getaffinity_np
nptl: Use tidlock when accessing TID on pthread_setaffinity
nptl: Use tidlock when accessing TID on pthread_getcpuclockid
nptl: Use tidlock when accessing TID on pthread_getschedparam
nptl: Use tidlock when accessing TID on pthread_setschedparam
nptl: Use tidlock when accessing TID on pthread_getname_np
nptl: Use tidlock when accessing TID on pthread_setname_np
nptl: Use tidlock when accessing TID on pthread_sigqueue
nptl: Use tidlock when accessing TID on pthread_setschedprio
nptl: Remove INVALID_TD_P
Florian Weimer (3):
support: Add support_wait_for_thread_exit
nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193)
nptl: Fix race between pthread_kill and thread exit (bug 12889)
nptl/Makefile | 3 +-
nptl/Versions | 1 +
nptl/allocatestack.c | 6 +-
nptl/cancellation.c | 17 +-
nptl/descr.h | 48 +++---
nptl/nptl-stack.h | 2 +-
nptl/nptl_free_tcb.c | 22 +--
nptl/nptl_setxid.c | 49 ++----
nptl/pthread_cancel.c | 20 +--
nptl/pthread_clockjoin.c | 2 +-
nptl/pthread_create.c | 124 ++++++++------
nptl/pthread_detach.c | 36 ++--
nptl/pthread_exit.c | 4 +-
nptl/pthread_getaffinity.c | 22 ++-
nptl/pthread_getattr_np.c | 2 +-
nptl/pthread_getcpuclockid.c | 27 +--
nptl/pthread_getname.c | 45 +++--
nptl/pthread_getschedparam.c | 16 +-
nptl/pthread_join.c | 2 +-
nptl/pthread_join_common.c | 126 +++++---------
nptl/pthread_kill.c | 21 ++-
nptl/pthread_setaffinity.c | 22 ++-
nptl/pthread_setname.c | 34 ++--
nptl/pthread_setschedparam.c | 16 +-
nptl/pthread_setschedprio.c | 17 +-
nptl/pthread_sigqueue.c | 52 +++---
nptl/pthread_testcancel.c | 11 +-
nptl/pthread_timedjoin.c | 2 +-
nptl/pthread_tryjoin.c | 18 +-
nptl/tst-cancel7.c | 114 ++++++-------
nptl/tst-cleanup5.c | 157 ++++++++++++++++++
nptl_db/structs.def | 2 +-
nptl_db/td_thr_get_info.c | 16 +-
nptl_db/td_thr_getfpregs.c | 9 +-
nptl_db/td_thr_getgregs.c | 9 +-
nptl_db/td_thr_setfpregs.c | 9 +-
nptl_db/td_thr_setgregs.c | 9 +-
support/Makefile | 3 +-
support/support.h | 4 +
support/support_wait_for_thread_exit.c | 72 ++++++++
sysdeps/hppa/nptl/tcb-offsets.sym | 1 -
sysdeps/i386/nptl/tcb-offsets.sym | 1 -
sysdeps/nptl/dl-tls_init_tp.c | 4 +-
sysdeps/nptl/libc_start_call_main.h | 7 +
sysdeps/nptl/pthreadP.h | 27 +--
sysdeps/pthread/Makefile | 7 +-
sysdeps/pthread/tst-kill4.c | 90 ----------
sysdeps/pthread/tst-pthread_cancel-exited.c | 45 +++++
.../pthread/tst-pthread_cancel-select-loop.c | 87 ++++++++++
sysdeps/pthread/tst-pthread_kill-exited.c | 46 +++++
sysdeps/pthread/tst-pthread_kill-exiting.c | 110 ++++++++++++
sysdeps/pthread/tst-thrd-detach.c | 16 +-
sysdeps/sh/nptl/tcb-offsets.sym | 1 -
sysdeps/x86_64/nptl/tcb-offsets.sym | 4 -
54 files changed, 1037 insertions(+), 580 deletions(-)
create mode 100644 nptl/tst-cleanup5.c
create mode 100644 support/support_wait_for_thread_exit.c
delete mode 100644 sysdeps/pthread/tst-kill4.c
create mode 100644 sysdeps/pthread/tst-pthread_cancel-exited.c
create mode 100644 sysdeps/pthread/tst-pthread_cancel-select-loop.c
create mode 100644 sysdeps/pthread/tst-pthread_kill-exited.c
create mode 100644 sysdeps/pthread/tst-pthread_kill-exiting.c
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 01/19] nptl: Fix tst-cancel7 and tst-cancelx7 race condition (BZ #14232)
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 9:33 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 02/19] nptl: Set cancellation type and state on pthread_exit Adhemerval Zanella
` (18 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
A mapped temporary file and a semaphore is used to synchronize the
pid information on the created file, the semaphore is updated once
the file contents is flushed.
Checked on x86_64-linux-gnu.
---
nptl/tst-cancel7.c | 114 ++++++++++++++++++++++-----------------------
1 file changed, 57 insertions(+), 57 deletions(-)
diff --git a/nptl/tst-cancel7.c b/nptl/tst-cancel7.c
index 7a1870ac74..82ac9b9aac 100644
--- a/nptl/tst-cancel7.c
+++ b/nptl/tst-cancel7.c
@@ -18,44 +18,48 @@
#include <errno.h>
#include <fcntl.h>
-#include <pthread.h>
+#include <getopt.h>
#include <signal.h>
-#include <stdio.h>
#include <stdlib.h>
-#include <string.h>
-#include <unistd.h>
-#include <getopt.h>
-
+#include <semaphore.h>
+#include <sys/mman.h>
+
+#include <support/check.h>
+#include <support/support.h>
+#include <support/temp_file.h>
+#include <support/xstdio.h>
+#include <support/xunistd.h>
#include <support/xthread.h>
-const char *command;
-const char *pidfile;
-char pidfilename[] = "/tmp/tst-cancel7-XXXXXX";
+static const char *command;
+static const char *pidfile;
+static const char *semfile;
+static char *pidfilename;
+static char *semfilename;
+
+static sem_t *sem;
static void *
tf (void *arg)
{
- const char *args = " --direct --pidfile ";
- char *cmd = alloca (strlen (command) + strlen (args)
- + strlen (pidfilename) + 1);
-
- strcpy (stpcpy (stpcpy (cmd, command), args), pidfilename);
+ char *cmd = xasprintf ("%s --direct --sem %s --pidfile %s",
+ command, semfilename, pidfilename);
system (cmd);
/* This call should never return. */
return NULL;
}
-
static void
sl (void)
{
- FILE *f = fopen (pidfile, "w");
- if (f == NULL)
- exit (1);
+ FILE *f = xfopen (pidfile, "w");
fprintf (f, "%lld\n", (long long) getpid ());
fflush (f);
+ if (sem_post (sem) != 0)
+ FAIL_EXIT1 ("sem_post: %m");
+
struct flock fl =
{
.l_type = F_WRLCK,
@@ -64,7 +68,7 @@ sl (void)
.l_len = 1
};
if (fcntl (fileno (f), F_SETLK, &fl) != 0)
- exit (1);
+ FAIL_EXIT1 ("fcntl (F_SETFL): %m");
sigset_t ss;
sigfillset (&ss);
@@ -76,57 +80,57 @@ sl (void)
static void
do_prepare (int argc, char *argv[])
{
+ int semfd;
+ if (semfile == NULL)
+ semfd = create_temp_file ("tst-cancel7.", &semfilename);
+ else
+ semfd = open (semfile, O_RDWR);
+ TEST_VERIFY_EXIT (semfd != -1);
+
+ sem = xmmap (NULL, sizeof (sem_t), PROT_READ | PROT_WRITE, MAP_SHARED,
+ semfd);
+ TEST_VERIFY_EXIT (sem != SEM_FAILED);
+ if (semfile == NULL)
+ {
+ xftruncate (semfd, sizeof (sem_t));
+ TEST_VERIFY_EXIT (sem_init (sem, 1, 0) != -1);
+ }
+
if (command == NULL)
command = argv[0];
if (pidfile)
sl ();
- int fd = mkstemp (pidfilename);
+ int fd = create_temp_file ("tst-cancel7-pid-", &pidfilename);
if (fd == -1)
- {
- puts ("mkstemp failed");
- exit (1);
- }
+ FAIL_EXIT1 ("create_temp_file failed: %m");
- write (fd, " ", 1);
- close (fd);
+ xwrite (fd, " ", 1);
+ xclose (fd);
}
static int
do_test (void)
{
- pthread_t th;
- if (pthread_create (&th, NULL, tf, NULL) != 0)
- {
- puts ("pthread_create failed");
- return 1;
- }
+ pthread_t th = xpthread_create (NULL, tf, NULL);
do
- sleep (1);
+ nanosleep (&(struct timespec) { .tv_sec = 0, .tv_nsec = 100000000 }, NULL);
while (access (pidfilename, R_OK) != 0);
xpthread_cancel (th);
void *r = xpthread_join (th);
- sleep (1);
+ if (sem_wait (sem) != 0)
+ FAIL_EXIT1 ("sem_wait: %m");
- FILE *f = fopen (pidfilename, "r+");
- if (f == NULL)
- {
- puts ("no pidfile");
- return 1;
- }
+ FILE *f = xfopen (pidfilename, "r+");
long long ll;
if (fscanf (f, "%lld\n", &ll) != 1)
- {
- puts ("could not read pid");
- unlink (pidfilename);
- return 1;
- }
+ FAIL_EXIT1 ("fscanf: %m");
struct flock fl =
{
@@ -136,11 +140,7 @@ do_test (void)
.l_len = 1
};
if (fcntl (fileno (f), F_GETLK, &fl) != 0)
- {
- puts ("F_GETLK failed");
- unlink (pidfilename);
- return 1;
- }
+ FAIL_EXIT1 ("fcntl: %m");
if (fl.l_type != F_UNLCK)
{
@@ -148,13 +148,10 @@ do_test (void)
if (fl.l_pid == ll)
kill (fl.l_pid, SIGKILL);
- unlink (pidfilename);
return 1;
}
- fclose (f);
-
- unlink (pidfilename);
+ xfclose (f);
return r != PTHREAD_CANCELED;
}
@@ -180,15 +177,15 @@ do_cleanup (void)
fclose (f);
}
-
- unlink (pidfilename);
}
#define OPT_COMMAND 10000
#define OPT_PIDFILE 10001
+#define OPT_SEMFILE 10002
#define CMDLINE_OPTIONS \
{ "command", required_argument, NULL, OPT_COMMAND }, \
- { "pidfile", required_argument, NULL, OPT_PIDFILE },
+ { "pidfile", required_argument, NULL, OPT_PIDFILE }, \
+ { "sem", required_argument, NULL, OPT_SEMFILE },
static void
cmdline_process (int c)
{
@@ -200,6 +197,9 @@ cmdline_process (int c)
case OPT_PIDFILE:
pidfile = optarg;
break;
+ case OPT_SEMFILE:
+ semfile = optarg;
+ break;
}
}
#define CMDLINE_PROCESS cmdline_process
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 02/19] nptl: Set cancellation type and state on pthread_exit
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 01/19] nptl: Fix tst-cancel7 and tst-cancelx7 race condition (BZ #14232) Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 9:38 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 03/19] nptl: Handle robust PI mutexes for !__ASSUME_SET_ROBUST_LIST Adhemerval Zanella
` (17 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
It is required by POSIX XSH 2.9.5 Thread Cancellation under the heading
Thread Cancellation Cleanup Handlers.
Checked x86_64-linux-gnu.
---
nptl/Makefile | 3 +-
nptl/cancellation.c | 7 +-
nptl/pthread_cancel.c | 1 -
nptl/pthread_exit.c | 4 +-
nptl/pthread_testcancel.c | 1 -
nptl/tst-cleanup5.c | 157 ++++++++++++++++++++++++++++++++++++++
sysdeps/nptl/pthreadP.h | 18 ++++-
7 files changed, 180 insertions(+), 11 deletions(-)
create mode 100644 nptl/tst-cleanup5.c
diff --git a/nptl/Makefile b/nptl/Makefile
index ff4d590f11..e1e195cc15 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -306,7 +306,8 @@ tests = tst-attr2 tst-attr3 tst-default-attr \
tst-pthread-gdb-attach tst-pthread-gdb-attach-static \
tst-pthread_exit-nothreads \
tst-pthread_exit-nothreads-static \
- tst-thread-setspecific
+ tst-thread-setspecific \
+ tst-cleanup5
tests-nolibpthread = \
tst-pthread_exit-nothreads \
diff --git a/nptl/cancellation.c b/nptl/cancellation.c
index 05962784d5..6478c029de 100644
--- a/nptl/cancellation.c
+++ b/nptl/cancellation.c
@@ -42,7 +42,6 @@ __pthread_enable_asynccancel (void)
&& !(ch & EXITING_BITMASK)
&& !(ch & TERMINATED_BITMASK))
{
- THREAD_SETMEM (self, result, PTHREAD_CANCELED);
__do_cancel ();
}
@@ -64,3 +63,9 @@ __pthread_disable_asynccancel (int oldtype)
self->canceltype = PTHREAD_CANCEL_DEFERRED;
}
libc_hidden_def (__pthread_disable_asynccancel)
+
+void
+__do_cancel (void)
+{
+ __exit_thread (PTHREAD_CANCELED);
+}
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index cc25ff21f3..38f637c5cf 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -107,7 +107,6 @@ __pthread_cancel (pthread_t th)
__libc_multiple_threads = 1;
#endif
- THREAD_SETMEM (pd, result, PTHREAD_CANCELED);
if (pd->cancelstate == PTHREAD_CANCEL_ENABLE
&& pd->canceltype == PTHREAD_CANCEL_ASYNCHRONOUS)
__do_cancel ();
diff --git a/nptl/pthread_exit.c b/nptl/pthread_exit.c
index 6abf66463e..978f4042ef 100644
--- a/nptl/pthread_exit.c
+++ b/nptl/pthread_exit.c
@@ -32,9 +32,7 @@ __pthread_exit (void *value)
" must be installed for pthread_exit to work\n");
}
- THREAD_SETMEM (THREAD_SELF, result, value);
-
- __do_cancel ();
+ __exit_thread (value);
}
libc_hidden_def (__pthread_exit)
weak_alias (__pthread_exit, pthread_exit)
diff --git a/nptl/pthread_testcancel.c b/nptl/pthread_testcancel.c
index 31185d89f2..f728a35524 100644
--- a/nptl/pthread_testcancel.c
+++ b/nptl/pthread_testcancel.c
@@ -30,7 +30,6 @@ ___pthread_testcancel (void)
&& !(cancelhandling & EXITING_BITMASK)
&& !(cancelhandling & TERMINATED_BITMASK))
{
- THREAD_SETMEM (self, result, PTHREAD_CANCELED);
__do_cancel ();
}
}
diff --git a/nptl/tst-cleanup5.c b/nptl/tst-cleanup5.c
new file mode 100644
index 0000000000..6cc1c141e9
--- /dev/null
+++ b/nptl/tst-cleanup5.c
@@ -0,0 +1,157 @@
+/* Check if cancellation state and type is correctly set on thread exit.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <support/check.h>
+#include <support/xthread.h>
+#include <support/xunistd.h>
+
+static int pipefds[2];
+static pthread_barrier_t b;
+
+static void
+clh (void *arg)
+{
+ /* Although POSIX state setting either the cancellation state or type is
+ undefined during cleanup handler execution, both calls should be safe
+ since none has any side-effect (they should not change current state
+ neither trigger a pending cancellation). */
+
+ int state;
+ TEST_VERIFY (pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, &state) == 0);
+ TEST_COMPARE (state, PTHREAD_CANCEL_DISABLE);
+
+ int type;
+ TEST_VERIFY (pthread_setcanceltype (PTHREAD_CANCEL_DEFERRED, &type) == 0);
+ TEST_COMPARE (type, PTHREAD_CANCEL_DEFERRED);
+}
+
+/* Check if a thread with PTHREAD_CANCEL_DEFERRED cancellation on
+ pthread_cleanup_pop sets the correct state and type as pthread_exit. */
+static void *
+tf_cancel_deferred (void *arg)
+{
+ xpthread_barrier_wait (&b);
+
+ pthread_cleanup_push (clh, NULL);
+
+ char c;
+ read (pipefds[0], &c, 1);
+
+ pthread_cleanup_pop (1);
+
+ return NULL;
+}
+
+/* Check if a thread with PTHREAD_CANCEL_ASYNCHRONOUS cancellation on
+ blocked read() sets the correct state and type as pthread_exit. */
+static void *
+tf_cancel_async (void *arg)
+{
+ TEST_VERIFY (pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, NULL)
+ == 0);
+
+ xpthread_barrier_wait (&b);
+
+ pthread_cleanup_push (clh, NULL);
+
+ char c;
+ read (pipefds[0], &c, 1);
+
+ pthread_cleanup_pop (1);
+
+ return NULL;
+}
+
+/* Check if a thread with PTHREAD_CANCEL_DEFERRED cancellation on
+ blocked read() sets the correct state and type as pthread_exit. */
+static void *
+tf_testcancel (void *arg)
+{
+ xpthread_barrier_wait (&b);
+
+ pthread_cleanup_push (clh, NULL);
+
+ char c;
+ read (pipefds[0], &c, 1);
+
+ pthread_testcancel ();
+
+ pthread_cleanup_pop (1);
+
+ return NULL;
+}
+
+#define EXIT_EXPECTED_VALUE ((void *) 42)
+
+/* Check if a thread with PTHREAD_CANCEL_DEFERRED cancellation on
+ pthread_exit() sets the correct state and type. */
+static void *
+tf_exit (void *arg)
+{
+ xpthread_barrier_wait (&b);
+
+ pthread_cleanup_push (clh, NULL);
+
+ pthread_exit (EXIT_EXPECTED_VALUE);
+
+ pthread_cleanup_pop (1);
+
+ return NULL;
+}
+
+static int
+do_test (void)
+{
+ xpipe (pipefds);
+
+ xpthread_barrier_init (&b, NULL, 2);
+ {
+ pthread_t th = xpthread_create (NULL, tf_cancel_deferred, NULL);
+ xpthread_barrier_wait (&b);
+ xpthread_cancel (th);
+ void *r = xpthread_join (th);
+ TEST_VERIFY (r == PTHREAD_CANCELED);
+ }
+
+ {
+ pthread_t th = xpthread_create (NULL, tf_cancel_async, NULL);
+ xpthread_barrier_wait (&b);
+ xpthread_cancel (th);
+ void *r = xpthread_join (th);
+ TEST_VERIFY (r == PTHREAD_CANCELED);
+ }
+
+ {
+ pthread_t th = xpthread_create (NULL, tf_testcancel, NULL);
+ xpthread_barrier_wait (&b);
+ xpthread_cancel (th);
+ void *r = xpthread_join (th);
+ TEST_VERIFY (r == PTHREAD_CANCELED);
+ }
+
+ {
+ pthread_t th = xpthread_create (NULL, tf_exit, NULL);
+ xpthread_barrier_wait (&b);
+ void *r = xpthread_join (th);
+ TEST_VERIFY (r == EXIT_EXPECTED_VALUE);
+ }
+
+ return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index 374657a2fd..d968a91cfd 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -268,20 +268,30 @@ extern void __pthread_unregister_cancel (__pthread_unwind_buf_t *__buf)
__cleanup_fct_attribute;
libc_hidden_proto (__pthread_unregister_cancel)
-/* Called when a thread reacts on a cancellation request. */
-static inline void
-__attribute ((noreturn, always_inline))
-__do_cancel (void)
+static _Noreturn inline void
+__exit_thread (void *value)
{
struct pthread *self = THREAD_SELF;
/* Make sure we get no more cancellations. */
THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
+ THREAD_SETMEM (self, result, value);
+
+ /* It is required by POSIX XSH 2.9.5 Thread Cancellation under the heading
+ Thread Cancellation Cleanup Handlers and also avoid further cancellation
+ wrapper to act on cancellation. */
+ THREAD_SETMEM (self, cancelstate, PTHREAD_CANCEL_DISABLE);
+ THREAD_SETMEM (self, canceltype, PTHREAD_CANCEL_DEFERRED);
+
__pthread_unwind ((__pthread_unwind_buf_t *)
THREAD_GETMEM (self, cleanup_jmp_buf));
}
+/* It is a wrapper over __exit_thread (PTHREAD_CANCELED). It is has its own
+ implementation because it might be called by arch-specific asm code. */
+_Noreturn void __do_cancel (void) attribute_hidden;
+
/* Internal prototypes. */
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 03/19] nptl: Handle robust PI mutexes for !__ASSUME_SET_ROBUST_LIST
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 01/19] nptl: Fix tst-cancel7 and tst-cancelx7 race condition (BZ #14232) Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 02/19] nptl: Set cancellation type and state on pthread_exit Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 9:42 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 04/19] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951) Adhemerval Zanella
` (16 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
The robust PI mutexes are signaled by setting the LSB bit to 1, so
the code requires to take this consideration before access the
__pthread_mutex_s.
The code is also simplified: the initialization code is not really
required, PD->robust_head.list and PD->robust_list.__next are
essentially the same regardless of __PTHREAD_MUTEX_HAVE_PREV, the futex
wake is optimized to be issued only when required, and the futex shared
bit is set only when required.
Checked on a build for m68k-linux-gnu. I also checked on
x86_64-linux-gnu by removing the check for !__ASSUME_SET_ROBUST_LIST.
---
nptl/pthread_create.c | 53 ++++++++++++++++++++++---------------------
1 file changed, 27 insertions(+), 26 deletions(-)
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index d8ec299cb1..08e5189ad6 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -486,35 +486,36 @@ start_thread (void *arg)
exit (0);
#ifndef __ASSUME_SET_ROBUST_LIST
- /* If this thread has any robust mutexes locked, handle them now. */
-# if __PTHREAD_MUTEX_HAVE_PREV
- void *robust = pd->robust_head.list;
-# else
- __pthread_slist_t *robust = pd->robust_list.__next;
-# endif
- /* We let the kernel do the notification if it is able to do so.
- If we have to do it here there for sure are no PI mutexes involved
- since the kernel support for them is even more recent. */
- if (!__nptl_set_robust_list_avail
- && __builtin_expect (robust != (void *) &pd->robust_head, 0))
+ /* We let the kernel do the notification if it is able to do so on the exit
+ syscall. Otherwise we need to handle before the thread terminates. */
+ void **robust;
+ while ((robust = pd->robust_head.list)
+ && robust != (void *) &pd->robust_head)
{
- do
+ /* Note: robust PI futexes are signaled by setting bit 0. */
+ void **robustp = (void **) ((uintptr_t) robust & ~1UL);
+
+ struct __pthread_mutex_s *mtx = (struct __pthread_mutex_s *)
+ ((char *) robustp - offsetof (struct __pthread_mutex_s,
+ __list.__next));
+ unsigned int nusers = mtx->__nusers;
+ int shared = mtx->__kind & 128;
+
+ pd->robust_head.list_op_pending = robust;
+ pd->robust_head.list = *robustp;
+ /* Although the list will not be changed at this point, it follows the
+ expected kernel ABI. */
+ __asm ("" ::: "memory");
+
+ int lock = atomic_exchange_relaxed (&mtx->__lock, FUTEX_OWNER_DIED);
+ /* Wake any users if mutex is acquired with potential users. */
+ if (lock > 1 || nusers != 0)
{
- struct __pthread_mutex_s *this = (struct __pthread_mutex_s *)
- ((char *) robust - offsetof (struct __pthread_mutex_s,
- __list.__next));
- robust = *((void **) robust);
-
-# if __PTHREAD_MUTEX_HAVE_PREV
- this->__list.__prev = NULL;
-# endif
- this->__list.__next = NULL;
-
- atomic_or (&this->__lock, FUTEX_OWNER_DIED);
- futex_wake ((unsigned int *) &this->__lock, 1,
- /* XYZ */ FUTEX_SHARED);
+ if ((uintptr_t) robust & 1)
+ futex_unlock_pi ((unsigned int *) &mtx->__lock, shared);
+ else
+ futex_wake ((unsigned int *) &mtx->__lock, 1, shared);
}
- while (robust != (void *) &pd->robust_head);
}
#endif
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 04/19] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (2 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 03/19] nptl: Handle robust PI mutexes for !__ASSUME_SET_ROBUST_LIST Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 10:41 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 05/19] nptl: Move setxid flag out of cancelhandling Adhemerval Zanella
` (15 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
The use after free described in BZ#19951 is due the use of two different
PD fields, 'joinid' and 'cancelhandling', to describe the thread state
to synchronize the calls of pthread_join(), pthread_detach(),
pthread_exit(), and normal thread exit.
And any state change potentially requires to check for both fields
atomically to handle partial state (such as pthread_join() with a
cancellation handler to issue a 'joinstate' field rollback).
This patch uses a different PD member with 4 possible states (JOINABLE,
DETACHED, EXITING, and EXITED) instead of pthread 'tid' field:
1. On pthread_create() the inital state is set either to JOINABLE or
DETACHED depending of the pthread attribute used.
2. On pthread_detach(), a CAS is issued on the state. If the CAS
fails it means that thread is already detached (DETACHED) or is
being terminated (EXITING). For former an EINVAL is returned,
while for latter pthread_detach() should be reponsible to join the
thread (and deallocate any internal resources).
3. On pthread_create() exit phase (reached either if the thread
function has returned, pthread_exit() has being called, or
cancellation handled has been acted upon) we issue a CAS on state
to set to EXITING mode. If the thread is previously on DETACHED
mode it will be the thread itself responsible to deallocate any
resource, otherwise the threads needs to be joined.
4. The clear_tid_field on 'clone' call is changed to set the new
'state' field on thread exit (EXITED). This state ins only
reached at thread termination.
4. The pthread_join() implementation is now simpler: the futex wait
is done directly on thread state and there is no need to reset it
in case of timeout (since the state is now set either by
pthread_detach() or by the kernel on process termination).
The race condition on pthread_detach is avoided with only one atomic
operation on PD state: once the mode is set to THREAD_STATE_DETACHED
it is up to thread itself to deallocate its memory (done on the exit
phase at pthread_create()).
Also, the INVALID_NOT_TERMINATED_TD_P is removed since a a negative
tid is not possible and the macro is not used anywhere.
This change trigger an invalid C11 thread tests: it crates a thread,
which detaches itself, and after a timeout the creating thread checks
if the join fails. The issue is once thrd_join() is called the thread
lifetime has already expired. The test is changed so the sleep is done
by the thread itself, so the creating thread will try to join a valid
thread.
Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
arm-linux-gnueabihf, and powerpc64-linux-gnu.
---
nptl/descr.h | 26 +++---
nptl/nptl-stack.h | 2 +-
nptl/pthread_cancel.c | 3 +-
nptl/pthread_clockjoin.c | 2 +-
nptl/pthread_create.c | 54 ++++++++----
nptl/pthread_detach.c | 36 +++-----
nptl/pthread_getattr_np.c | 2 +-
nptl/pthread_join.c | 2 +-
nptl/pthread_join_common.c | 130 ++++++++++------------------
nptl/pthread_timedjoin.c | 2 +-
nptl/pthread_tryjoin.c | 18 ++--
sysdeps/nptl/dl-tls_init_tp.c | 4 +-
sysdeps/nptl/libc_start_call_main.h | 7 ++
sysdeps/nptl/pthreadP.h | 3 +-
sysdeps/pthread/tst-thrd-detach.c | 16 ++--
15 files changed, 144 insertions(+), 163 deletions(-)
diff --git a/nptl/descr.h b/nptl/descr.h
index c85778d449..4b2db6edab 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -124,6 +124,18 @@ struct priority_protection_data
};
+/* Define a possible thread state on 'joinstate' field. The value will be
+ cleared by the kernel when the thread terminates (CLONE_CHILD_CLEARTID),
+ so THREAD_STATE_EXITED must be 0. */
+enum
+ {
+ THREAD_STATE_EXITED = 0,
+ THREAD_STATE_EXITING,
+ THREAD_STATE_JOINABLE,
+ THREAD_STATE_DETACHED,
+ };
+
+
/* Thread descriptor data structure. */
struct pthread
{
@@ -166,8 +178,7 @@ struct pthread
GL (dl_stack_user) list. */
list_t list;
- /* Thread ID - which is also a 'is this thread descriptor (and
- therefore stack) used' flag. */
+ /* Thread ID set by the kernel with CLONE_PARENT_SETTID. */
pid_t tid;
/* Ununsed. */
@@ -335,15 +346,8 @@ struct pthread
hp_timing_t cpuclock_offset_ununsed;
#endif
- /* If the thread waits to join another one the ID of the latter is
- stored here.
-
- In case a thread is detached this field contains a pointer of the
- TCB if the thread itself. This is something which cannot happen
- in normal operation. */
- struct pthread *joinid;
- /* Check whether a thread is detached. */
-#define IS_DETACHED(pd) ((pd)->joinid == (pd))
+ /* The current thread state defined by the THREAD_STATE_* enumeration. */
+ unsigned int joinstate;
/* The result of the thread function. */
void *result;
diff --git a/nptl/nptl-stack.h b/nptl/nptl-stack.h
index 8bcfde7ec5..c6b2b3078f 100644
--- a/nptl/nptl-stack.h
+++ b/nptl/nptl-stack.h
@@ -32,7 +32,7 @@ extern size_t __nptl_stack_cache_maxsize attribute_hidden;
static inline bool
__nptl_stack_in_use (struct pthread *pd)
{
- return pd->tid <= 0;
+ return atomic_load_relaxed (&pd->joinstate) == THREAD_STATE_EXITED;
}
/* Remove the stack ELEM from its list. */
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index 38f637c5cf..67e00ef007 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -63,7 +63,8 @@ __pthread_cancel (pthread_t th)
volatile struct pthread *pd = (volatile struct pthread *) th;
/* Make sure the descriptor is valid. */
- if (INVALID_TD_P (pd))
+ int state = atomic_load_acquire (&pd->joinstate);
+ if (state == THREAD_STATE_EXITED || state == THREAD_STATE_EXITING)
/* Not a valid thread handle. */
return ESRCH;
diff --git a/nptl/pthread_clockjoin.c b/nptl/pthread_clockjoin.c
index 2d01ba03a2..76598ff0fa 100644
--- a/nptl/pthread_clockjoin.c
+++ b/nptl/pthread_clockjoin.c
@@ -30,7 +30,7 @@ ___pthread_clockjoin_np64 (pthread_t threadid, void **thread_return,
return EINVAL;
return __pthread_clockjoin_ex (threadid, thread_return,
- clockid, abstime, true);
+ clockid, abstime);
}
#if __TIMESIZE == 64
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 08e5189ad6..763e32bc3e 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -286,7 +286,7 @@ static int create_thread (struct pthread *pd, const struct pthread_attr *attr,
.flags = clone_flags,
.pidfd = (uintptr_t) &pd->tid,
.parent_tid = (uintptr_t) &pd->tid,
- .child_tid = (uintptr_t) &pd->tid,
+ .child_tid = (uintptr_t) &pd->joinstate,
.stack = (uintptr_t) stackaddr,
.stack_size = stacksize,
.tls = (uintptr_t) tp,
@@ -351,13 +351,16 @@ start_thread (void *arg)
and free any resource prior return to the pthread_create caller. */
setup_failed = pd->setup_failed == 1;
if (setup_failed)
- pd->joinid = NULL;
+ pd->joinstate = THREAD_STATE_JOINABLE;
/* And give it up right away. */
lll_unlock (pd->lock, LLL_PRIVATE);
if (setup_failed)
- goto out;
+ {
+ pd->tid = 0;
+ goto out;
+ }
}
/* Initialize resolver state pointer. */
@@ -481,9 +484,20 @@ start_thread (void *arg)
the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE. */
atomic_bit_set (&pd->cancelhandling, EXITING_BIT);
- if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
- /* This was the last thread. */
- exit (0);
+
+ /* CONCURRENCY NOTES:
+
+ Concurrent pthread_detach() will either set state to
+ THREAD_STATE_DETACHED or wait the thread to terminate. The exiting state
+ set here is set so a pthread_join() wait until all the required cleanup
+ steps are done.
+
+ The 'joinstate' field will be used to determine who is responsible to
+ call __nptl_free_tcb below. */
+
+ unsigned int joinstate = THREAD_STATE_JOINABLE;
+ atomic_compare_exchange_weak_acquire (&pd->joinstate, &joinstate,
+ THREAD_STATE_EXITING);
#ifndef __ASSUME_SET_ROBUST_LIST
/* We let the kernel do the notification if it is able to do so on the exit
@@ -519,6 +533,10 @@ start_thread (void *arg)
}
#endif
+ if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
+ /* This was the last thread. */
+ exit (0);
+
if (!pd->user_stack)
advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd,
pd->guardsize);
@@ -539,17 +557,17 @@ start_thread (void *arg)
pd->setxid_futex = 0;
}
- /* If the thread is detached free the TCB. */
- if (IS_DETACHED (pd))
- /* Free the TCB. */
+ if (joinstate == THREAD_STATE_DETACHED)
__nptl_free_tcb (pd);
+ pd->tid = 0;
+
out:
/* We cannot call '_exit' here. '_exit' will terminate the process.
The 'exit' implementation in the kernel will signal when the
process is really dead since 'clone' got passed the CLONE_CHILD_CLEARTID
- flag. The 'tid' field in the TCB will be set to zero.
+ flag. The 'joinstate' field in the TCB will be set to zero.
The exit code is zero since in case all threads exit by calling
'pthread_exit' the exit status must be 0 (zero). */
@@ -644,10 +662,9 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
pd->flags = ((iattr->flags & ~(ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET))
| (self->flags & (ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET)));
- /* Initialize the field for the ID of the thread which is waiting
- for us. This is a self-reference in case the thread is created
- detached. */
- pd->joinid = iattr->flags & ATTR_FLAG_DETACHSTATE ? pd : NULL;
+ pd->joinstate = iattr->flags & ATTR_FLAG_DETACHSTATE
+ ? THREAD_STATE_DETACHED
+ : THREAD_STATE_JOINABLE;
/* The debug events are inherited from the parent. */
pd->eventbuf = self->eventbuf;
@@ -806,10 +823,11 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
/* Similar to pthread_join, but since thread creation has failed at
startup there is no need to handle all the steps. */
- pid_t tid;
- while ((tid = atomic_load_acquire (&pd->tid)) != 0)
- __futex_abstimed_wait_cancelable64 ((unsigned int *) &pd->tid,
- tid, 0, NULL, LLL_SHARED);
+ unsigned int state;
+ while ((state = atomic_load_acquire (&pd->joinstate))
+ != THREAD_STATE_EXITED)
+ __futex_abstimed_wait_cancelable64 (&pd->joinstate, state, 0,
+ NULL, LLL_SHARED);
}
/* State (c) or (d) and we have ownership of PD (see CONCURRENCY
diff --git a/nptl/pthread_detach.c b/nptl/pthread_detach.c
index ac50db9b0e..97d292cab8 100644
--- a/nptl/pthread_detach.c
+++ b/nptl/pthread_detach.c
@@ -26,32 +26,24 @@ ___pthread_detach (pthread_t th)
{
struct pthread *pd = (struct pthread *) th;
- /* Make sure the descriptor is valid. */
- if (INVALID_NOT_TERMINATED_TD_P (pd))
- /* Not a valid thread handle. */
- return ESRCH;
+ /* CONCURRENCY NOTES:
- int result = 0;
+ Concurrent pthread_detach() will return EINVAL for the case the thread
+ is already detached (THREAD_STATE_DETACHED). POSIX states it is
+ undefined to call pthread_detach if TH refers to a non joinable thread.
- /* Mark the thread as detached. */
- if (atomic_compare_and_exchange_bool_acq (&pd->joinid, pd, NULL))
+ For the case the thread is being terminated (THREAD_STATE_EXITING),
+ pthread_detach() will responsible to clean up the stack. */
+
+ int curstate = THREAD_STATE_JOINABLE;
+ if (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &curstate,
+ THREAD_STATE_DETACHED))
{
- /* There are two possibilities here. First, the thread might
- already be detached. In this case we return EINVAL.
- Otherwise there might already be a waiter. The standard does
- not mention what happens in this case. */
- if (IS_DETACHED (pd))
- result = EINVAL;
+ if (curstate == THREAD_STATE_DETACHED)
+ return EINVAL;
+ return __pthread_join (th, 0);
}
- else
- /* Check whether the thread terminated meanwhile. In this case we
- will just free the TCB. */
- if ((pd->cancelhandling & EXITING_BITMASK) != 0)
- /* Note that the code in __free_tcb makes sure each thread
- control block is freed only once. */
- __nptl_free_tcb (pd);
-
- return result;
+ return 0;
}
versioned_symbol (libc, ___pthread_detach, pthread_detach, GLIBC_2_34);
libc_hidden_ver (___pthread_detach, __pthread_detach)
diff --git a/nptl/pthread_getattr_np.c b/nptl/pthread_getattr_np.c
index 25807cb529..560ba9ab98 100644
--- a/nptl/pthread_getattr_np.c
+++ b/nptl/pthread_getattr_np.c
@@ -53,7 +53,7 @@ __pthread_getattr_np (pthread_t thread_id, pthread_attr_t *attr)
iattr->flags = thread->flags;
/* The thread might be detached by now. */
- if (IS_DETACHED (thread))
+ if (atomic_load_acquire (&thread->joinstate) == THREAD_STATE_DETACHED)
iattr->flags |= ATTR_FLAG_DETACHSTATE;
/* This is the guardsize after adjusting it. */
diff --git a/nptl/pthread_join.c b/nptl/pthread_join.c
index d2b33de73d..195a537029 100644
--- a/nptl/pthread_join.c
+++ b/nptl/pthread_join.c
@@ -23,7 +23,7 @@ int
___pthread_join (pthread_t threadid, void **thread_return)
{
return __pthread_clockjoin_ex (threadid, thread_return, 0 /* Ignored */,
- NULL, true);
+ NULL);
}
versioned_symbol (libc, ___pthread_join, pthread_join, GLIBC_2_34);
libc_hidden_ver (___pthread_join, __pthread_join)
diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
index 7303069316..428ed35101 100644
--- a/nptl/pthread_join_common.c
+++ b/nptl/pthread_join_common.c
@@ -22,113 +22,73 @@
#include <time.h>
#include <futex-internal.h>
-static void
-cleanup (void *arg)
+/* Check for a possible deadlock situation where the threads are waiting for
+ each other to finish. Note that this is a "may" error. To be 100% sure we
+ catch this error we would have to lock the data structures but it is not
+ necessary. In the unlikely case that two threads are really caught in this
+ situation they will deadlock. It is the programmer's problem to figure
+ this out. */
+static inline bool
+check_for_deadlock (int state, struct pthread *pd)
{
- /* If we already changed the waiter ID, reset it. The call cannot
- fail for any reason but the thread not having done that yet so
- there is no reason for a loop. */
struct pthread *self = THREAD_SELF;
- atomic_compare_exchange_weak_acquire (&arg, &self, NULL);
+ return ((pd == self
+ || (atomic_load_acquire (&self->joinstate) == THREAD_STATE_DETACHED
+ && (pd->cancelhandling
+ & (CANCELED_BITMASK | EXITING_BITMASK
+ | TERMINATED_BITMASK)) == 0))
+ && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
+ && (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
+ | TERMINATED_BITMASK))
+ == CANCELED_BITMASK));
}
int
__pthread_clockjoin_ex (pthread_t threadid, void **thread_return,
clockid_t clockid,
- const struct __timespec64 *abstime, bool block)
+ const struct __timespec64 *abstime)
{
struct pthread *pd = (struct pthread *) threadid;
- /* Make sure the descriptor is valid. */
- if (INVALID_NOT_TERMINATED_TD_P (pd))
- /* Not a valid thread handle. */
- return ESRCH;
-
- /* Is the thread joinable?. */
- if (IS_DETACHED (pd))
- /* We cannot wait for the thread. */
- return EINVAL;
-
- struct pthread *self = THREAD_SELF;
- int result = 0;
-
LIBC_PROBE (pthread_join, 1, threadid);
- if ((pd == self
- || (self->joinid == pd
- && (pd->cancelhandling
- & (CANCELED_BITMASK | EXITING_BITMASK
- | TERMINATED_BITMASK)) == 0))
- && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
- && (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
- | TERMINATED_BITMASK))
- == CANCELED_BITMASK))
- /* This is a deadlock situation. The threads are waiting for each
- other to finish. Note that this is a "may" error. To be 100%
- sure we catch this error we would have to lock the data
- structures but it is not necessary. In the unlikely case that
- two threads are really caught in this situation they will
- deadlock. It is the programmer's problem to figure this
- out. */
- return EDEADLK;
-
- /* Wait for the thread to finish. If it is already locked something
- is wrong. There can only be one waiter. */
- else if (__glibc_unlikely (atomic_compare_exchange_weak_acquire (&pd->joinid,
- &self,
- NULL)))
- /* There is already somebody waiting for the thread. */
- return EINVAL;
-
- /* BLOCK waits either indefinitely or based on an absolute time. POSIX also
- states a cancellation point shall occur for pthread_join, and we use the
- same rationale for posix_timedjoin_np. Both clockwait_tid and the futex
- call use the cancellable variant. */
- if (block)
+ int result = 0;
+ unsigned int state;
+ while ((state = atomic_load_acquire (&pd->joinstate))
+ != THREAD_STATE_EXITED)
{
- /* During the wait we change to asynchronous cancellation. If we
- are cancelled the thread we are waiting for must be marked as
- un-wait-ed for again. */
- pthread_cleanup_push (cleanup, &pd->joinid);
-
- /* We need acquire MO here so that we synchronize with the
- kernel's store to 0 when the clone terminates. (see above) */
- pid_t tid;
- while ((tid = atomic_load_acquire (&pd->tid)) != 0)
- {
- /* The kernel notifies a process which uses CLONE_CHILD_CLEARTID via
- futex wake-up when the clone terminates. The memory location
- contains the thread ID while the clone is running and is reset to
- zero by the kernel afterwards. The kernel up to version 3.16.3
- does not use the private futex operations for futex wake-up when
- the clone terminates. */
- int ret = __futex_abstimed_wait_cancelable64 (
- (unsigned int *) &pd->tid, tid, clockid, abstime, LLL_SHARED);
- if (ret == ETIMEDOUT || ret == EOVERFLOW)
- {
- result = ret;
- break;
- }
+ if (check_for_deadlock (state, pd))
+ return EDEADLK;
+
+ /* POSIX states calling pthread_join() on a non joinable thread is
+ undefined. However, if PD is still in the cache we can still warn
+ the caller. */
+ if (state == THREAD_STATE_DETACHED)
+ return EINVAL;
+
+ /* pthread_join() is a cancellation entrypoint and we use the same
+ rationale for pthread_timedjoin_np().
+
+ The kernel notifies a process which uses CLONE_CHILD_CLEARTID via
+ a memory zerouing and futex wake up when the process terminates.
+ The futex operation is not private. */
+ int ret = __futex_abstimed_wait_cancelable64 (&pd->joinstate, state,
+ clockid, abstime,
+ LLL_SHARED);
+ if (ret == ETIMEDOUT || ret == EOVERFLOW)
+ {
+ result = ret;
+ break;
}
-
- pthread_cleanup_pop (0);
}
void *pd_result = pd->result;
- if (__glibc_likely (result == 0))
+ if (result == 0)
{
- /* We mark the thread as terminated and as joined. */
- pd->tid = -1;
-
- /* Store the return value if the caller is interested. */
if (thread_return != NULL)
*thread_return = pd_result;
-
- /* Free the TCB. */
__nptl_free_tcb (pd);
}
- else
- pd->joinid = NULL;
LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd_result);
diff --git a/nptl/pthread_timedjoin.c b/nptl/pthread_timedjoin.c
index 0b4026612f..5d561fb4bb 100644
--- a/nptl/pthread_timedjoin.c
+++ b/nptl/pthread_timedjoin.c
@@ -25,7 +25,7 @@ ___pthread_timedjoin_np64 (pthread_t threadid, void **thread_return,
const struct __timespec64 *abstime)
{
return __pthread_clockjoin_ex (threadid, thread_return,
- CLOCK_REALTIME, abstime, true);
+ CLOCK_REALTIME, abstime);
}
#if __TIMESIZE == 64
diff --git a/nptl/pthread_tryjoin.c b/nptl/pthread_tryjoin.c
index fd938e8780..726f2abc68 100644
--- a/nptl/pthread_tryjoin.c
+++ b/nptl/pthread_tryjoin.c
@@ -22,15 +22,17 @@
int
__pthread_tryjoin_np (pthread_t threadid, void **thread_return)
{
- /* Return right away if the thread hasn't terminated yet. */
- struct pthread *pd = (struct pthread *) threadid;
- if (pd->tid != 0)
- return EBUSY;
+ /* The joinable state (THREAD_STATE_JOINABLE) is straigthforward since the
+ thread hasn't finished yet and trying to join might block.
+ Both detached (THREAD_STATE_DETACHED) and exiting (THREAD_STATE_EXITING)
+ might also result in a possible blocking call: a detached thread might
+ change its state to exiting and a exiting thread my take some time to
+ exit (and thus let the kernel set the state to THREAD_STATE_EXITED). */
- /* If pd->tid == 0 then lll_wait_tid will not block on futex
- operation. */
- return __pthread_clockjoin_ex (threadid, thread_return, 0 /* Ignored */,
- NULL, false);
+ struct pthread *pd = (struct pthread *) threadid;
+ return atomic_load_acquire (&pd->joinstate) != THREAD_STATE_EXITED
+ ? EBUSY
+ : __pthread_clockjoin_ex (threadid, thread_return, 0, NULL);
}
versioned_symbol (libc, __pthread_tryjoin_np, pthread_tryjoin_np, GLIBC_2_34);
diff --git a/sysdeps/nptl/dl-tls_init_tp.c b/sysdeps/nptl/dl-tls_init_tp.c
index ca494dd3a5..35d54abd92 100644
--- a/sysdeps/nptl/dl-tls_init_tp.c
+++ b/sysdeps/nptl/dl-tls_init_tp.c
@@ -62,7 +62,7 @@ __tls_init_tp (void)
/* Early initialization of the TCB. */
struct pthread *pd = THREAD_SELF;
- pd->tid = INTERNAL_SYSCALL_CALL (set_tid_address, &pd->tid);
+ pd->tid = INTERNAL_SYSCALL_CALL (set_tid_address, &pd->joinstate);
THREAD_SETMEM (pd, specific[0], &pd->specific_1stblock[0]);
THREAD_SETMEM (pd, user_stack, true);
@@ -97,4 +97,6 @@ __tls_init_tp (void)
THREAD_SETMEM (pd, cancelstate, PTHREAD_CANCEL_ENABLE);
THREAD_SETMEM (pd, canceltype, PTHREAD_CANCEL_DEFERRED);
+
+ THREAD_SETMEM (pd, joinstate, THREAD_STATE_JOINABLE);
}
diff --git a/sysdeps/nptl/libc_start_call_main.h b/sysdeps/nptl/libc_start_call_main.h
index 06d72c1e38..e7a7f7b4c4 100644
--- a/sysdeps/nptl/libc_start_call_main.h
+++ b/sysdeps/nptl/libc_start_call_main.h
@@ -18,6 +18,7 @@
#include <atomic.h>
#include <pthreadP.h>
+#include <futex-internal.h>
_Noreturn static void
__libc_start_call_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
@@ -65,6 +66,12 @@ __libc_start_call_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
/* One less thread. Decrement the counter. If it is zero we
terminate the entire process. */
result = 0;
+
+ /* For the case a thread is waiting for the main thread to finish. */
+ struct pthread *self = THREAD_SELF;
+ atomic_store_release (&self->joinstate, THREAD_STATE_EXITED);
+ futex_wake (&self->joinstate, 1, FUTEX_SHARED);
+
if (! atomic_decrement_and_test (&__nptl_nthreads))
/* Not much left to do but to exit the thread, not the process. */
while (1)
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index d968a91cfd..e89c1b92ae 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -243,7 +243,6 @@ libc_hidden_proto (__pthread_current_priority)
nothing. And if the test triggers the thread descriptor is
guaranteed to be invalid. */
#define INVALID_TD_P(pd) __builtin_expect ((pd)->tid <= 0, 0)
-#define INVALID_NOT_TERMINATED_TD_P(pd) __builtin_expect ((pd)->tid < 0, 0)
extern void __pthread_unwind (__pthread_unwind_buf_t *__buf)
__cleanup_fct_attribute __attribute ((__noreturn__))
@@ -537,7 +536,7 @@ libc_hidden_proto (__pthread_setcanceltype)
extern void __pthread_testcancel (void);
libc_hidden_proto (__pthread_testcancel)
extern int __pthread_clockjoin_ex (pthread_t, void **, clockid_t,
- const struct __timespec64 *, bool)
+ const struct __timespec64 *)
attribute_hidden;
extern int __pthread_sigmask (int, const sigset_t *, sigset_t *);
libc_hidden_proto (__pthread_sigmask);
diff --git a/sysdeps/pthread/tst-thrd-detach.c b/sysdeps/pthread/tst-thrd-detach.c
index c844767748..e1906a0e10 100644
--- a/sysdeps/pthread/tst-thrd-detach.c
+++ b/sysdeps/pthread/tst-thrd-detach.c
@@ -20,14 +20,14 @@
#include <time.h>
#include <stdio.h>
#include <unistd.h>
-
+#include <limits.h>
#include <support/check.h>
static int
detach_thrd (void *arg)
{
- if (thrd_detach (thrd_current ()) != thrd_success)
- FAIL_EXIT1 ("thrd_detach failed");
+ thrd_sleep (&(struct timespec) { .tv_sec = INT_MAX }, NULL);
+
thrd_exit (thrd_success);
}
@@ -36,15 +36,11 @@ do_test (void)
{
thrd_t id;
- /* Create new thread. */
- if (thrd_create (&id, detach_thrd, NULL) != thrd_success)
- FAIL_EXIT1 ("thrd_create failed");
+ TEST_COMPARE (thrd_create (&id, detach_thrd, NULL), thrd_success);
- /* Give some time so the thread can finish. */
- thrd_sleep (&(struct timespec) {.tv_sec = 2}, NULL);
+ TEST_COMPARE (thrd_detach (id), thrd_success);
- if (thrd_join (id, NULL) == thrd_success)
- FAIL_EXIT1 ("thrd_join succeed where it should fail");
+ TEST_COMPARE (thrd_join (id, NULL), thrd_error);
return 0;
}
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 05/19] nptl: Move setxid flag out of cancelhandling
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (3 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 04/19] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951) Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 11:34 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 06/19] nptl: Replace struct thread cancelhandling field Adhemerval Zanella
` (14 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
Now that the thread state is tracked by 'joinstate' field, there is no
need to keep the setxid flag within the 'cancelhandling'. It simplifies
the atomic code to set and reset it (since there is no need to handle
the thread state concurrent update).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
---
nptl/allocatestack.c | 3 +++
nptl/descr.h | 5 ++---
nptl/nptl_setxid.c | 49 ++++++++++---------------------------------
nptl/pthread_create.c | 4 ++--
4 files changed, 18 insertions(+), 43 deletions(-)
diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index cfe37a3443..ccb1aa21c1 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -99,6 +99,7 @@ get_cached_stack (size_t *sizep, void **memp)
}
/* Don't allow setxid until cloned. */
+ result->setxid_flag = 0;
result->setxid_futex = -1;
/* Dequeue the entry. */
@@ -301,6 +302,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
#endif
/* Don't allow setxid until cloned. */
+ pd->setxid_flag = 0;
pd->setxid_futex = -1;
/* Allocate the DTV for this thread. */
@@ -422,6 +424,7 @@ allocate_stack (const struct pthread_attr *attr, struct pthread **pdp,
#endif
/* Don't allow setxid until cloned. */
+ pd->setxid_flag = 0;
pd->setxid_futex = -1;
/* Allocate the DTV for this thread. */
diff --git a/nptl/descr.h b/nptl/descr.h
index 4b2db6edab..563b152bff 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -297,9 +297,6 @@ struct pthread
/* Bit set if thread terminated and TCB is freed. */
#define TERMINATED_BIT 5
#define TERMINATED_BITMASK (0x01 << TERMINATED_BIT)
- /* Bit set if thread is supposed to change XID. */
-#define SETXID_BIT 6
-#define SETXID_BITMASK (0x01 << SETXID_BIT)
/* Flags. Including those copied from the thread attribute. */
int flags;
@@ -339,6 +336,8 @@ struct pthread
/* Lock to synchronize access to the descriptor. */
int lock;
+ /* Indicate whether thread is supposed to change XID. */
+ int setxid_flag;
/* Lock for synchronizing setxid calls. */
unsigned int setxid_futex;
diff --git a/nptl/nptl_setxid.c b/nptl/nptl_setxid.c
index 2f35772411..c337fa8577 100644
--- a/nptl/nptl_setxid.c
+++ b/nptl/nptl_setxid.c
@@ -76,14 +76,7 @@ __nptl_setxid_sighandler (int sig, siginfo_t *si, void *ctx)
/* Reset the SETXID flag. */
struct pthread *self = THREAD_SELF;
- int flags, newval;
- do
- {
- flags = THREAD_GETMEM (self, cancelhandling);
- newval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling,
- flags & ~SETXID_BITMASK, flags);
- }
- while (flags != newval);
+ atomic_store_release (&self->setxid_flag, 0);
/* And release the futex. */
self->setxid_futex = 1;
@@ -97,8 +90,6 @@ libc_hidden_def (__nptl_setxid_sighandler)
static void
setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
{
- int ch;
-
/* Wait until this thread is cloned. */
if (t->setxid_futex == -1
&& ! atomic_compare_and_exchange_bool_acq (&t->setxid_futex, -2, -1))
@@ -109,41 +100,23 @@ setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
/* Don't let the thread exit before the setxid handler runs. */
t->setxid_futex = 0;
- do
- {
- ch = t->cancelhandling;
+ /* If thread is exiting right now, ignore it. */
+ if (atomic_load_acquire (&t->joinstate) == THREAD_STATE_EXITING)
+ return;
- /* If the thread is exiting right now, ignore it. */
- if ((ch & EXITING_BITMASK) != 0)
- {
- /* Release the futex if there is no other setxid in
- progress. */
- if ((ch & SETXID_BITMASK) == 0)
- {
- t->setxid_futex = 1;
- futex_wake (&t->setxid_futex, 1, FUTEX_PRIVATE);
- }
- return;
- }
+ /* Release the futex if there is no other setxid in progress. */
+ if (atomic_exchange_acquire (&t->setxid_flag, 1) == 0)
+ {
+ t->setxid_futex = 1;
+ futex_wake (&t->setxid_futex, 1, FUTEX_PRIVATE);
}
- while (atomic_compare_and_exchange_bool_acq (&t->cancelhandling,
- ch | SETXID_BITMASK, ch));
}
static void
setxid_unmark_thread (struct xid_command *cmdp, struct pthread *t)
{
- int ch;
-
- do
- {
- ch = t->cancelhandling;
- if ((ch & SETXID_BITMASK) == 0)
- return;
- }
- while (atomic_compare_and_exchange_bool_acq (&t->cancelhandling,
- ch & ~SETXID_BITMASK, ch));
+ atomic_exchange_release (&t->setxid_flag, 0);
/* Release the futex just in case. */
t->setxid_futex = 1;
@@ -154,7 +127,7 @@ setxid_unmark_thread (struct xid_command *cmdp, struct pthread *t)
static int
setxid_signal_thread (struct xid_command *cmdp, struct pthread *t)
{
- if ((t->cancelhandling & SETXID_BITMASK) == 0)
+ if (atomic_load_relaxed (&t->setxid_flag) == 0)
return 0;
int val;
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 763e32bc3e..c00d6ae00c 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -541,7 +541,7 @@ start_thread (void *arg)
advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd,
pd->guardsize);
- if (__glibc_unlikely (pd->cancelhandling & SETXID_BITMASK))
+ if (__glibc_unlikely (atomic_load_relaxed (&pd->setxid_flag) == 1))
{
/* Some other thread might call any of the setXid functions and expect
us to reply. In this case wait until we did that. */
@@ -551,7 +551,7 @@ start_thread (void *arg)
condition used in the surrounding loop (cancelhandling). We need
to check and document why this is correct. */
futex_wait_simple (&pd->setxid_futex, 0, FUTEX_PRIVATE);
- while (pd->cancelhandling & SETXID_BITMASK);
+ while (atomic_load_relaxed (&pd->setxid_flag) == 1);
/* Reset the value so that the stack can be reused. */
pd->setxid_futex = 0;
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 06/19] nptl: Replace struct thread cancelhandling field
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (4 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 05/19] nptl: Move setxid flag out of cancelhandling Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 14:34 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 07/19] support: Add support_wait_for_thread_exit Adhemerval Zanella
` (13 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
Now that both thread state and setxid is being tracked by two different
fields ('joinstate' and 'setxid_flag'), there is no need to keep track
of exiting (EXITING_BIT) and terminated (TERMINATED_BIT) state in
'cancelhandling'.
The 'cancelhandling' field is renamed to 'cancelstatus' and it only
signals whether the thread has been cancelled (set by pthread_cancel()).
It allows simplify the atomic operation required to avoid CAS
operations.
There is no need to check the thread state on
__pthread_enable_asynccancel() nor on pthread_testcancel () anymore,
since cancellation is explicit disabled when thread start the exit code
on __do_cancel().
On __nptl_free_tcp(), the 'joinstate' now defines whether it is the
creating thread or the created thread that calls it. So there is
no concurrent call within the function and thus no need to set the
TERMINATED_BIT.
For SIGCANCEL handler, sigcancel_handler(), 'joinstate' is used instead
(pthread_cancel() might still be called concurrenty in assynchronous
mode).
Checked on x86_64-linux-gnu and aarch64-linux-gnu.
---
nptl/Versions | 1 +
nptl/allocatestack.c | 2 +-
nptl/cancellation.c | 10 ++--------
nptl/descr.h | 13 ++-----------
nptl/nptl_free_tcb.c | 22 ++++++----------------
nptl/pthread_cancel.c | 10 ++++------
nptl/pthread_create.c | 10 ++--------
nptl/pthread_join_common.c | 10 +++-------
nptl/pthread_testcancel.c | 10 ++--------
nptl_db/structs.def | 2 +-
nptl_db/td_thr_get_info.c | 16 +++++++---------
nptl_db/td_thr_getfpregs.c | 9 +++++----
nptl_db/td_thr_getgregs.c | 9 +++++----
nptl_db/td_thr_setfpregs.c | 9 +++++----
nptl_db/td_thr_setgregs.c | 9 +++++----
sysdeps/hppa/nptl/tcb-offsets.sym | 1 -
sysdeps/i386/nptl/tcb-offsets.sym | 1 -
sysdeps/nptl/pthreadP.h | 3 ---
sysdeps/sh/nptl/tcb-offsets.sym | 1 -
sysdeps/x86_64/nptl/tcb-offsets.sym | 4 ----
20 files changed, 51 insertions(+), 101 deletions(-)
diff --git a/nptl/Versions b/nptl/Versions
index 3221de89d1..3b23c6e248 100644
--- a/nptl/Versions
+++ b/nptl/Versions
@@ -413,6 +413,7 @@ libc {
_thread_db_pthread_eventbuf;
_thread_db_pthread_eventbuf_eventmask;
_thread_db_pthread_eventbuf_eventmask_event_bits;
+ _thread_db_pthread_joinstate;
_thread_db_pthread_key_data_data;
_thread_db_pthread_key_data_level2_data;
_thread_db_pthread_key_data_seq;
diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index ccb1aa21c1..a2a1fa7b81 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -119,7 +119,7 @@ get_cached_stack (size_t *sizep, void **memp)
*memp = result->stackblock;
/* Cancellation handling is back to the default. */
- result->cancelhandling = 0;
+ result->cancelstatus = 0;
result->cancelstate = PTHREAD_CANCEL_ENABLE;
result->canceltype = PTHREAD_CANCEL_DEFERRED;
result->cleanup = NULL;
diff --git a/nptl/cancellation.c b/nptl/cancellation.c
index 6478c029de..a244b3eeea 100644
--- a/nptl/cancellation.c
+++ b/nptl/cancellation.c
@@ -35,15 +35,9 @@ __pthread_enable_asynccancel (void)
int oldval = THREAD_GETMEM (self, canceltype);
THREAD_SETMEM (self, canceltype, PTHREAD_CANCEL_ASYNCHRONOUS);
- int ch = THREAD_GETMEM (self, cancelhandling);
-
if (self->cancelstate == PTHREAD_CANCEL_ENABLE
- && (ch & CANCELED_BITMASK)
- && !(ch & EXITING_BITMASK)
- && !(ch & TERMINATED_BITMASK))
- {
- __do_cancel ();
- }
+ && self->cancelstatus == 1)
+ __do_cancel ();
return oldval;
}
diff --git a/nptl/descr.h b/nptl/descr.h
index 563b152bff..1bfa2b9b52 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -286,17 +286,8 @@ struct pthread
struct pthread_unwind_buf *cleanup_jmp_buf;
#define HAVE_CLEANUP_JMP_BUF
- /* Flags determining processing of cancellation. */
- int cancelhandling;
- /* Bit set if canceled. */
-#define CANCELED_BIT 3
-#define CANCELED_BITMASK (0x01 << CANCELED_BIT)
- /* Bit set if thread is exiting. */
-#define EXITING_BIT 4
-#define EXITING_BITMASK (0x01 << EXITING_BIT)
- /* Bit set if thread terminated and TCB is freed. */
-#define TERMINATED_BIT 5
-#define TERMINATED_BITMASK (0x01 << TERMINATED_BIT)
+ /* Flag to determine whether the thread is signaled to be cancelled. */
+ int cancelstatus;
/* Flags. Including those copied from the thread attribute. */
int flags;
diff --git a/nptl/nptl_free_tcb.c b/nptl/nptl_free_tcb.c
index cbf3580f59..15e1a18562 100644
--- a/nptl/nptl_free_tcb.c
+++ b/nptl/nptl_free_tcb.c
@@ -24,22 +24,12 @@
void
__nptl_free_tcb (struct pthread *pd)
{
- /* The thread is exiting now. */
- if (atomic_bit_test_set (&pd->cancelhandling, TERMINATED_BIT) == 0)
- {
- /* Free TPP data. */
- if (pd->tpp != NULL)
- {
- struct priority_protection_data *tpp = pd->tpp;
+ free (pd->tpp);
+ pd->tpp = NULL;
- pd->tpp = NULL;
- free (tpp);
- }
-
- /* Queue the stack memory block for reuse and exit the process. The
- kernel will signal via writing to the address returned by
- QUEUE-STACK when the stack is available. */
- __nptl_deallocate_stack (pd);
- }
+ /* Queue the stack memory block for reuse and exit the process. The kernel
+ will signal via writing to the address returned by QUEUE-STACK when the
+ stack is available. */
+ __nptl_deallocate_stack (pd);
}
libc_hidden_def (__nptl_free_tcb)
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index 67e00ef007..aed6c1ea47 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -43,11 +43,8 @@ sigcancel_handler (int sig, siginfo_t *si, void *ctx)
struct pthread *self = THREAD_SELF;
- int ch = atomic_load_relaxed (&self->cancelhandling);
- /* Cancelation not enabled, not cancelled, or already exitting. */
if (self->cancelstate == PTHREAD_CANCEL_DISABLE
- || (ch & CANCELED_BITMASK) == 0
- || (ch & EXITING_BITMASK) != 0)
+ || atomic_load_relaxed (&self->joinstate) == THREAD_STATE_EXITING)
return;
/* Set the return value. */
@@ -93,8 +90,9 @@ __pthread_cancel (pthread_t th)
}
#endif
- int oldch = atomic_fetch_or_acquire (&pd->cancelhandling, CANCELED_BITMASK);
- if ((oldch & CANCELED_BITMASK) != 0)
+ /* If already cancelled just return (cancellation will be acted upon in next
+ cancellation entrypoint). */
+ if (atomic_exchange_acquire (&pd->cancelstatus, 1) == 1)
return 0;
if (pd == THREAD_SELF)
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index c00d6ae00c..b354f62995 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -479,12 +479,6 @@ start_thread (void *arg)
}
}
- /* The thread is exiting now. Don't set this bit until after we've hit
- the event-reporting breakpoint, so that td_thr_get_info on us while at
- the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE. */
- atomic_bit_set (&pd->cancelhandling, EXITING_BIT);
-
-
/* CONCURRENCY NOTES:
Concurrent pthread_detach() will either set state to
@@ -548,8 +542,8 @@ start_thread (void *arg)
do
/* XXX This differs from the typical futex_wait_simple pattern in that
the futex_wait condition (setxid_futex) is different from the
- condition used in the surrounding loop (cancelhandling). We need
- to check and document why this is correct. */
+ condition used in the surrounding loop. We need to check and
+ document why this is correct. */
futex_wait_simple (&pd->setxid_futex, 0, FUTEX_PRIVATE);
while (atomic_load_relaxed (&pd->setxid_flag) == 1);
diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
index 428ed35101..132a40a56e 100644
--- a/nptl/pthread_join_common.c
+++ b/nptl/pthread_join_common.c
@@ -33,14 +33,10 @@ check_for_deadlock (int state, struct pthread *pd)
{
struct pthread *self = THREAD_SELF;
return ((pd == self
- || (atomic_load_acquire (&self->joinstate) == THREAD_STATE_DETACHED
- && (pd->cancelhandling
- & (CANCELED_BITMASK | EXITING_BITMASK
- | TERMINATED_BITMASK)) == 0))
+ || (atomic_load_acquire (&self->joinstate)
+ == THREAD_STATE_DETACHED))
&& !(self->cancelstate == PTHREAD_CANCEL_ENABLE
- && (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
- | TERMINATED_BITMASK))
- == CANCELED_BITMASK));
+ && atomic_load_relaxed (&self->cancelstatus) == 1));
}
int
diff --git a/nptl/pthread_testcancel.c b/nptl/pthread_testcancel.c
index f728a35524..4623f9e7ae 100644
--- a/nptl/pthread_testcancel.c
+++ b/nptl/pthread_testcancel.c
@@ -24,14 +24,8 @@ void
___pthread_testcancel (void)
{
struct pthread *self = THREAD_SELF;
- int cancelhandling = THREAD_GETMEM (self, cancelhandling);
- if (self->cancelstate == PTHREAD_CANCEL_ENABLE
- && (cancelhandling & CANCELED_BITMASK)
- && !(cancelhandling & EXITING_BITMASK)
- && !(cancelhandling & TERMINATED_BITMASK))
- {
- __do_cancel ();
- }
+ if (self->cancelstate == PTHREAD_CANCEL_ENABLE && self->cancelstatus == 1)
+ __do_cancel ();
}
versioned_symbol (libc, ___pthread_testcancel, pthread_testcancel, GLIBC_2_34);
libc_hidden_ver (___pthread_testcancel, __pthread_testcancel)
diff --git a/nptl_db/structs.def b/nptl_db/structs.def
index 248ecf4335..65a068e9d1 100644
--- a/nptl_db/structs.def
+++ b/nptl_db/structs.def
@@ -53,7 +53,7 @@ DB_STRUCT_FIELD (pthread, list)
DB_STRUCT_FIELD (pthread, report_events)
DB_STRUCT_FIELD (pthread, tid)
DB_STRUCT_FIELD (pthread, start_routine)
-DB_STRUCT_FIELD (pthread, cancelhandling)
+DB_STRUCT_FIELD (pthread, joinstate)
DB_STRUCT_FIELD (pthread, schedpolicy)
DB_STRUCT_FIELD (pthread, schedparam_sched_priority)
DB_STRUCT_FIELD (pthread, specific)
diff --git a/nptl_db/td_thr_get_info.c b/nptl_db/td_thr_get_info.c
index 01af021d2a..82f1f2667c 100644
--- a/nptl_db/td_thr_get_info.c
+++ b/nptl_db/td_thr_get_info.c
@@ -28,7 +28,7 @@ td_thr_get_info (const td_thrhandle_t *th, td_thrinfo_t *infop)
{
td_err_e err;
void *copy;
- psaddr_t tls, schedpolicy, schedprio, cancelhandling, tid, report_events;
+ psaddr_t tls, schedpolicy, schedprio, joinstate, tid, report_events;
LOG ("td_thr_get_info");
@@ -37,7 +37,7 @@ td_thr_get_info (const td_thrhandle_t *th, td_thrinfo_t *infop)
/* Special case for the main thread before initialization. */
copy = NULL;
tls = 0;
- cancelhandling = 0;
+ joinstate = 0;
schedpolicy = SCHED_OTHER;
schedprio = 0;
tid = 0;
@@ -76,8 +76,8 @@ td_thr_get_info (const td_thrhandle_t *th, td_thrinfo_t *infop)
err = DB_GET_FIELD_LOCAL (tid, th->th_ta_p, copy, pthread, tid, 0);
if (err != TD_OK)
return err;
- err = DB_GET_FIELD_LOCAL (cancelhandling, th->th_ta_p, copy, pthread,
- cancelhandling, 0);
+ err = DB_GET_FIELD_LOCAL (joinstate, th->th_ta_p, copy, pthread,
+ joinstate, 0);
if (err != TD_OK)
return err;
err = DB_GET_FIELD_LOCAL (report_events, th->th_ta_p, copy, pthread,
@@ -96,13 +96,11 @@ td_thr_get_info (const td_thrhandle_t *th, td_thrinfo_t *infop)
? 0 : (uintptr_t) schedprio);
infop->ti_type = TD_THR_USER;
- if ((((int) (uintptr_t) cancelhandling) & EXITING_BITMASK) == 0)
- /* XXX For now there is no way to get more information. */
+ int js = (int) (uintptr_t) joinstate;
+ if (js == THREAD_STATE_JOINABLE || js == THREAD_STATE_DETACHED)
infop->ti_state = TD_THR_ACTIVE;
- else if ((((int) (uintptr_t) cancelhandling) & TERMINATED_BITMASK) == 0)
- infop->ti_state = TD_THR_ZOMBIE;
else
- infop->ti_state = TD_THR_UNKNOWN;
+ infop->ti_state = TD_THR_ZOMBIE;
/* Initialization which are the same in both cases. */
infop->ti_ta_p = th->th_ta_p;
diff --git a/nptl_db/td_thr_getfpregs.c b/nptl_db/td_thr_getfpregs.c
index 3d08aa3f60..23a8a215c2 100644
--- a/nptl_db/td_thr_getfpregs.c
+++ b/nptl_db/td_thr_getfpregs.c
@@ -23,7 +23,7 @@
td_err_e
td_thr_getfpregs (const td_thrhandle_t *th, prfpregset_t *regset)
{
- psaddr_t cancelhandling, tid;
+ psaddr_t joinstate, tid;
td_err_e err;
LOG ("td_thr_getfpregs");
@@ -34,13 +34,14 @@ td_thr_getfpregs (const td_thrhandle_t *th, prfpregset_t *regset)
regset) != PS_OK ? TD_ERR : TD_OK;
/* We have to get the state and the PID for this thread. */
- err = DB_GET_FIELD (cancelhandling, th->th_ta_p, th->th_unique, pthread,
- cancelhandling, 0);
+ err = DB_GET_FIELD (joinstate, th->th_ta_p, th->th_unique, pthread,
+ joinstate, 0);
if (err != TD_OK)
return err;
/* If the thread already terminated we return all zeroes. */
- if (((int) (uintptr_t) cancelhandling) & TERMINATED_BITMASK)
+ int js = (int) (uintptr_t) joinstate;
+ if (js == THREAD_STATE_EXITING || js == THREAD_STATE_EXITED)
memset (regset, '\0', sizeof (*regset));
/* Otherwise get the register content through the callback. */
else
diff --git a/nptl_db/td_thr_getgregs.c b/nptl_db/td_thr_getgregs.c
index 8f9fab096f..b92402670f 100644
--- a/nptl_db/td_thr_getgregs.c
+++ b/nptl_db/td_thr_getgregs.c
@@ -23,7 +23,7 @@
td_err_e
td_thr_getgregs (const td_thrhandle_t *th, prgregset_t regset)
{
- psaddr_t cancelhandling, tid;
+ psaddr_t joinstate, tid;
td_err_e err;
LOG ("td_thr_getgregs");
@@ -34,13 +34,14 @@ td_thr_getgregs (const td_thrhandle_t *th, prgregset_t regset)
regset) != PS_OK ? TD_ERR : TD_OK;
/* We have to get the state and the PID for this thread. */
- err = DB_GET_FIELD (cancelhandling, th->th_ta_p, th->th_unique, pthread,
- cancelhandling, 0);
+ err = DB_GET_FIELD (joinstate, th->th_ta_p, th->th_unique, pthread,
+ joinstate, 0);
if (err != TD_OK)
return err;
/* If the thread already terminated we return all zeroes. */
- if (((int) (uintptr_t) cancelhandling) & TERMINATED_BITMASK)
+ int js = (int) (uintptr_t) joinstate;
+ if (js == THREAD_STATE_EXITING || js == THREAD_STATE_EXITED)
memset (regset, '\0', sizeof (*regset));
/* Otherwise get the register content through the callback. */
else
diff --git a/nptl_db/td_thr_setfpregs.c b/nptl_db/td_thr_setfpregs.c
index bddb0359a8..73ab675ce6 100644
--- a/nptl_db/td_thr_setfpregs.c
+++ b/nptl_db/td_thr_setfpregs.c
@@ -23,7 +23,7 @@
td_err_e
td_thr_setfpregs (const td_thrhandle_t *th, const prfpregset_t *fpregs)
{
- psaddr_t cancelhandling, tid;
+ psaddr_t joinstate, tid;
td_err_e err;
LOG ("td_thr_setfpregs");
@@ -34,13 +34,14 @@ td_thr_setfpregs (const td_thrhandle_t *th, const prfpregset_t *fpregs)
fpregs) != PS_OK ? TD_ERR : TD_OK;
/* We have to get the state and the PID for this thread. */
- err = DB_GET_FIELD (cancelhandling, th->th_ta_p, th->th_unique, pthread,
- cancelhandling, 0);
+ err = DB_GET_FIELD (joinstate, th->th_ta_p, th->th_unique, pthread,
+ joinstate, 0);
if (err != TD_OK)
return err;
/* Only set the registers if the thread hasn't yet terminated. */
- if ((((int) (uintptr_t) cancelhandling) & TERMINATED_BITMASK) == 0)
+ int js = (int) (uintptr_t) joinstate;
+ if (js != THREAD_STATE_EXITING || js != THREAD_STATE_EXITED)
{
err = DB_GET_FIELD (tid, th->th_ta_p, th->th_unique, pthread, tid, 0);
if (err != TD_OK)
diff --git a/nptl_db/td_thr_setgregs.c b/nptl_db/td_thr_setgregs.c
index 2a76a10754..186df94cbc 100644
--- a/nptl_db/td_thr_setgregs.c
+++ b/nptl_db/td_thr_setgregs.c
@@ -23,7 +23,7 @@
td_err_e
td_thr_setgregs (const td_thrhandle_t *th, prgregset_t gregs)
{
- psaddr_t cancelhandling, tid;
+ psaddr_t joinstate, tid;
td_err_e err;
LOG ("td_thr_setgregs");
@@ -34,13 +34,14 @@ td_thr_setgregs (const td_thrhandle_t *th, prgregset_t gregs)
gregs) != PS_OK ? TD_ERR : TD_OK;
/* We have to get the state and the PID for this thread. */
- err = DB_GET_FIELD (cancelhandling, th->th_ta_p, th->th_unique, pthread,
- cancelhandling, 0);
+ err = DB_GET_FIELD (joinstate, th->th_ta_p, th->th_unique, pthread,
+ joinstate, 0);
if (err != TD_OK)
return err;
/* Only set the registers if the thread hasn't yet terminated. */
- if ((((int) (uintptr_t) cancelhandling) & TERMINATED_BITMASK) == 0)
+ int js = (int) (uintptr_t) joinstate;
+ if (js != THREAD_STATE_EXITING || js != THREAD_STATE_EXITED)
{
err = DB_GET_FIELD (tid, th->th_ta_p, th->th_unique, pthread, tid, 0);
if (err != TD_OK)
diff --git a/sysdeps/hppa/nptl/tcb-offsets.sym b/sysdeps/hppa/nptl/tcb-offsets.sym
index 6e852f35b1..8937f1ec21 100644
--- a/sysdeps/hppa/nptl/tcb-offsets.sym
+++ b/sysdeps/hppa/nptl/tcb-offsets.sym
@@ -3,7 +3,6 @@
RESULT offsetof (struct pthread, result)
TID offsetof (struct pthread, tid)
-CANCELHANDLING offsetof (struct pthread, cancelhandling)
CLEANUP_JMP_BUF offsetof (struct pthread, cleanup_jmp_buf)
MULTIPLE_THREADS_OFFSET offsetof (struct pthread, header.multiple_threads)
TLS_PRE_TCB_SIZE sizeof (struct pthread)
diff --git a/sysdeps/i386/nptl/tcb-offsets.sym b/sysdeps/i386/nptl/tcb-offsets.sym
index 2ec9e787c1..85a27dc29f 100644
--- a/sysdeps/i386/nptl/tcb-offsets.sym
+++ b/sysdeps/i386/nptl/tcb-offsets.sym
@@ -4,7 +4,6 @@
RESULT offsetof (struct pthread, result)
TID offsetof (struct pthread, tid)
-CANCELHANDLING offsetof (struct pthread, cancelhandling)
CLEANUP_JMP_BUF offsetof (struct pthread, cleanup_jmp_buf)
MULTIPLE_THREADS_OFFSET offsetof (tcbhead_t, multiple_threads)
SYSINFO_OFFSET offsetof (tcbhead_t, sysinfo)
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index e89c1b92ae..d366d691cd 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -272,9 +272,6 @@ __exit_thread (void *value)
{
struct pthread *self = THREAD_SELF;
- /* Make sure we get no more cancellations. */
- THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
-
THREAD_SETMEM (self, result, value);
/* It is required by POSIX XSH 2.9.5 Thread Cancellation under the heading
diff --git a/sysdeps/sh/nptl/tcb-offsets.sym b/sysdeps/sh/nptl/tcb-offsets.sym
index 234207779d..60c9e40b72 100644
--- a/sysdeps/sh/nptl/tcb-offsets.sym
+++ b/sysdeps/sh/nptl/tcb-offsets.sym
@@ -4,7 +4,6 @@
RESULT offsetof (struct pthread, result)
TID offsetof (struct pthread, tid)
-CANCELHANDLING offsetof (struct pthread, cancelhandling)
CLEANUP_JMP_BUF offsetof (struct pthread, cleanup_jmp_buf)
MULTIPLE_THREADS_OFFSET offsetof (struct pthread, header.multiple_threads)
TLS_PRE_TCB_SIZE sizeof (struct pthread)
diff --git a/sysdeps/x86_64/nptl/tcb-offsets.sym b/sysdeps/x86_64/nptl/tcb-offsets.sym
index 2bbd563a6c..6cc845f7ed 100644
--- a/sysdeps/x86_64/nptl/tcb-offsets.sym
+++ b/sysdeps/x86_64/nptl/tcb-offsets.sym
@@ -4,7 +4,6 @@
RESULT offsetof (struct pthread, result)
TID offsetof (struct pthread, tid)
-CANCELHANDLING offsetof (struct pthread, cancelhandling)
CLEANUP_JMP_BUF offsetof (struct pthread, cleanup_jmp_buf)
CLEANUP offsetof (struct pthread, cleanup)
CLEANUP_PREV offsetof (struct _pthread_cleanup_buffer, __prev)
@@ -13,6 +12,3 @@ MULTIPLE_THREADS_OFFSET offsetof (tcbhead_t, multiple_threads)
POINTER_GUARD offsetof (tcbhead_t, pointer_guard)
FEATURE_1_OFFSET offsetof (tcbhead_t, feature_1)
SSP_BASE_OFFSET offsetof (tcbhead_t, ssp_base)
-
--- Not strictly offsets, but these values are also used in the TCB.
-TCB_CANCELED_BITMASK CANCELED_BITMASK
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 07/19] support: Add support_wait_for_thread_exit
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (5 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 06/19] nptl: Replace struct thread cancelhandling field Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 9:31 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 08/19] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) Adhemerval Zanella
` (12 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
From: Florian Weimer <fweimer@redhat.com>
Wait until all threads except the current thread has exited.
---
support/Makefile | 3 +-
support/support.h | 4 ++
support/support_wait_for_thread_exit.c | 72 ++++++++++++++++++++++++++
3 files changed, 78 insertions(+), 1 deletion(-)
create mode 100644 support/support_wait_for_thread_exit.c
diff --git a/support/Makefile b/support/Makefile
index a462781718..ef2b1a980a 100644
--- a/support/Makefile
+++ b/support/Makefile
@@ -82,9 +82,10 @@ libsupport-routines = \
support_test_compare_blob \
support_test_compare_failure \
support_test_compare_string \
- support_write_file_string \
support_test_main \
support_test_verify_impl \
+ support_wait_for_thread_exit \
+ support_write_file_string \
temp_file \
timespec \
timespec-time64 \
diff --git a/support/support.h b/support/support.h
index 834dba9097..a5978b939a 100644
--- a/support/support.h
+++ b/support/support.h
@@ -174,6 +174,10 @@ timer_t support_create_timer (uint64_t sec, long int nsec, bool repeat,
/* Disable the timer TIMER. */
void support_delete_timer (timer_t timer);
+/* Wait until all threads except the current thread have exited (as
+ far as the kernel is concerned). */
+void support_wait_for_thread_exit (void);
+
struct support_stack
{
void *stack;
diff --git a/support/support_wait_for_thread_exit.c b/support/support_wait_for_thread_exit.c
new file mode 100644
index 0000000000..658a813810
--- /dev/null
+++ b/support/support_wait_for_thread_exit.c
@@ -0,0 +1,72 @@
+/* Wait until all threads except the current thread has exited.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <dirent.h>
+#include <errno.h>
+#include <string.h>
+#include <support/check.h>
+#include <support/support.h>
+#include <unistd.h>
+
+void
+support_wait_for_thread_exit (void)
+{
+#ifdef __linux__
+ DIR *proc_self_task = opendir ("/proc/self/task");
+ TEST_VERIFY_EXIT (proc_self_task != NULL);
+
+ while (true)
+ {
+ errno = 0;
+ struct dirent *e = readdir (proc_self_task);
+ if (e == NULL && errno != 0)
+ FAIL_EXIT1 ("readdir: %m");
+ if (e == NULL)
+ {
+ /* Only the main thread remains. Testing may continue. */
+ closedir (proc_self_task);
+ return;
+ }
+
+ if (strcmp (e->d_name, ".") == 0 || strcmp (e->d_name, "..") == 0)
+ continue;
+
+ int task_tid = atoi (e->d_name);
+ if (task_tid <= 0)
+ FAIL_EXIT1 ("Invalid /proc/self/task entry: %s", e->d_name);
+
+ if (task_tid == gettid ())
+ /* The current thread. Keep scanning for other
+ threads. */
+ continue;
+
+ /* task_tid does not refer to this thread here, i.e., there is
+ another running thread. */
+
+ /* Small timeout to give the thread a chance to exit. */
+ usleep (50 * 1000);
+
+ /* Start scanning the directory from the start. */
+ rewinddir (proc_self_task);
+ }
+#else
+ /* Use a large timeout because we cannot verify that the thread has
+ exited. */
+ usleep (5 * 1000 * 1000);
+#endif
+}
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 08/19] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193)
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (6 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 07/19] support: Add support_wait_for_thread_exit Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 10:03 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 09/19] nptl: Fix race between pthread_kill and thread exit (bug 12889) Adhemerval Zanella
` (11 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
From: Florian Weimer <fweimer@redhat.com>
This closes one remaining race condition related to bug 12889: if
the thread already exited on the kernel side, returning ESRCH
is not correct because that error is reserved for the thread IDs
(pthread_t values) whose lifetime has ended. In case of a
kernel-side exit and a valid thread ID, no signal needs to be sent
and cancellation does not have an effect, so just return 0.
sysdeps/pthread/tst-kill4.c triggers undefined behavior and is
removed with this commit.
Checked on x86_64-linux-gnu and i686-linux-gnu.
---
nptl/pthread_cancel.c | 6 ++-
nptl/pthread_kill.c | 7 +++-
sysdeps/pthread/Makefile | 5 ++-
sysdeps/pthread/tst-pthread_cancel-exited.c | 45 ++++++++++++++++++++
sysdeps/pthread/tst-pthread_kill-exited.c | 46 +++++++++++++++++++++
5 files changed, 105 insertions(+), 4 deletions(-)
create mode 100644 sysdeps/pthread/tst-pthread_cancel-exited.c
create mode 100644 sysdeps/pthread/tst-pthread_kill-exited.c
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index aed6c1ea47..6f6c989dc0 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -62,8 +62,10 @@ __pthread_cancel (pthread_t th)
/* Make sure the descriptor is valid. */
int state = atomic_load_acquire (&pd->joinstate);
if (state == THREAD_STATE_EXITED || state == THREAD_STATE_EXITING)
- /* Not a valid thread handle. */
- return ESRCH;
+ /* The thread has already either exited on the kernel side or started the
+ exit phase. In eitehr case its outcome (regular exit, other
+ cancelation) has already been determined. */
+ return 0;
static int init_sigcancel = 0;
if (atomic_load_relaxed (&init_sigcancel) == 0)
diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c
index f79a2b26fc..5d4c86f920 100644
--- a/nptl/pthread_kill.c
+++ b/nptl/pthread_kill.c
@@ -46,7 +46,12 @@ __pthread_kill_internal (pthread_t threadid, int signo)
? INTERNAL_SYSCALL_ERRNO (val) : 0);
}
else
- val = ESRCH;
+ /* The kernel reports that the thread has exited. POSIX specifies
+ the ESRCH error only for the case when the lifetime of a thread
+ ID has ended, but calling pthread_kill on such a thread ID is
+ undefined in glibc. Therefore, do not treat kernel thread exit
+ as an error. */
+ val = 0;
return val;
}
diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile
index 42f9fc5072..dedfa0d290 100644
--- a/sysdeps/pthread/Makefile
+++ b/sysdeps/pthread/Makefile
@@ -89,7 +89,7 @@ tests += tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \
tst-join8 tst-join9 tst-join10 tst-join11 tst-join12 tst-join13 \
tst-join14 tst-join15 \
tst-key1 tst-key2 tst-key3 tst-key4 \
- tst-kill1 tst-kill2 tst-kill3 tst-kill4 tst-kill5 tst-kill6 \
+ tst-kill1 tst-kill2 tst-kill3 tst-kill5 tst-kill6 \
tst-locale1 tst-locale2 \
tst-memstream \
tst-mutex-errorcheck tst-mutex1 tst-mutex2 tst-mutex3 tst-mutex4 \
@@ -118,6 +118,9 @@ tests += tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \
tst-unload \
tst-unwind-thread \
tst-pt-vfork1 tst-pt-vfork2 tst-vfork1x tst-vfork2x \
+ tst-pthread_cancel-exited \
+ tst-pthread_kill-exited \
+ # tests
tests-time64 := \
tst-abstime-time64 \
diff --git a/sysdeps/pthread/tst-pthread_cancel-exited.c b/sysdeps/pthread/tst-pthread_cancel-exited.c
new file mode 100644
index 0000000000..811c9bee07
--- /dev/null
+++ b/sysdeps/pthread/tst-pthread_cancel-exited.c
@@ -0,0 +1,45 @@
+/* Test that pthread_kill succeeds for an exited thread.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+/* This test verifies that pthread_kill returns 0 (and not ESRCH) for
+ a thread that has exited on the kernel side. */
+
+#include <stddef.h>
+#include <support/support.h>
+#include <support/xthread.h>
+
+static void *
+noop_thread (void *closure)
+{
+ return NULL;
+}
+
+static int
+do_test (void)
+{
+ pthread_t thr = xpthread_create (NULL, noop_thread, NULL);
+
+ support_wait_for_thread_exit ();
+
+ xpthread_cancel (thr);
+ xpthread_join (thr);
+
+ return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/pthread/tst-pthread_kill-exited.c b/sysdeps/pthread/tst-pthread_kill-exited.c
new file mode 100644
index 0000000000..7575fb6d58
--- /dev/null
+++ b/sysdeps/pthread/tst-pthread_kill-exited.c
@@ -0,0 +1,46 @@
+/* Test that pthread_kill succeeds for an exited thread.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+/* This test verifies that pthread_kill returns 0 (and not ESRCH) for
+ a thread that has exited on the kernel side. */
+
+#include <signal.h>
+#include <stddef.h>
+#include <support/support.h>
+#include <support/xthread.h>
+
+static void *
+noop_thread (void *closure)
+{
+ return NULL;
+}
+
+static int
+do_test (void)
+{
+ pthread_t thr = xpthread_create (NULL, noop_thread, NULL);
+
+ support_wait_for_thread_exit ();
+
+ xpthread_kill (thr, SIGUSR1);
+ xpthread_join (thr);
+
+ return 0;
+}
+
+#include <support/test-driver.c>
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 09/19] nptl: Fix race between pthread_kill and thread exit (bug 12889)
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (7 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 08/19] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 14:23 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 10/19] nptl: Use tidlock when accessing TID on pthread_getaffinity_np Adhemerval Zanella
` (10 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
From: Florian Weimer <fweimer@redhat.com>
Now that the 'tid' pthread member is not used to advertise the thread
lifetime through set_tid_address syscall, we can use a simple lock to
guarantee it is valid on pthread_kill().
This patch adds a new lock, tidlock, which is used to synchronize the
pthread 'tid' field access. Also, there is no need to access the
'tid' member using atomic operations.
Checked on x86_64-linux-gnu.
Co-authored-by: Adhemerval Zanella <adhemerval.zanella@linaro.org>
---
nptl/allocatestack.c | 1 +
nptl/descr.h | 4 +
nptl/pthread_create.c | 7 ++
nptl/pthread_kill.c | 14 ++-
sysdeps/pthread/Makefile | 2 +
sysdeps/pthread/tst-kill4.c | 90 --------------
.../pthread/tst-pthread_cancel-select-loop.c | 87 ++++++++++++++
sysdeps/pthread/tst-pthread_kill-exiting.c | 110 ++++++++++++++++++
8 files changed, 221 insertions(+), 94 deletions(-)
delete mode 100644 sysdeps/pthread/tst-kill4.c
create mode 100644 sysdeps/pthread/tst-pthread_cancel-select-loop.c
create mode 100644 sysdeps/pthread/tst-pthread_kill-exiting.c
diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index a2a1fa7b81..63e6fe9d60 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -124,6 +124,7 @@ get_cached_stack (size_t *sizep, void **memp)
result->canceltype = PTHREAD_CANCEL_DEFERRED;
result->cleanup = NULL;
result->setup_failed = 0;
+ result->tidlock = LLL_LOCK_INITIALIZER;
/* No pending event. */
result->nextevent = NULL;
diff --git a/nptl/descr.h b/nptl/descr.h
index 1bfa2b9b52..f9bb58d7cc 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -181,6 +181,10 @@ struct pthread
/* Thread ID set by the kernel with CLONE_PARENT_SETTID. */
pid_t tid;
+ /* Lock to synchronize access to TID value (used on pthread_kill and thread
+ exit). */
+ int tidlock;
+
/* Ununsed. */
pid_t pid_ununsed;
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index b354f62995..a0f3ddf5fa 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -531,6 +531,12 @@ start_thread (void *arg)
/* This was the last thread. */
exit (0);
+ /* This prevents sending a signal from this thread to itself during its
+ final stages. This must come after the exit call above because
+ atexit handlers must not run with signals blocked. */
+ __libc_signal_block_all (NULL);
+ lll_lock (pd->tidlock, LLL_PRIVATE);
+
if (!pd->user_stack)
advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd,
pd->guardsize);
@@ -555,6 +561,7 @@ start_thread (void *arg)
__nptl_free_tcb (pd);
pd->tid = 0;
+ lll_unlock (pd->tidlock, LLL_PRIVATE);
out:
/* We cannot call '_exit' here. '_exit' will terminate the process.
diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c
index 5d4c86f920..63fe8c44ca 100644
--- a/nptl/pthread_kill.c
+++ b/nptl/pthread_kill.c
@@ -26,15 +26,18 @@ __pthread_kill_internal (pthread_t threadid, int signo)
pid_t tid;
struct pthread *pd = (struct pthread *) threadid;
+ /* Block all signal, since the lock is recursive and used on pthread_cancel
+ (which should be async-signal-safe). */
+ sigset_t oldmask;
+ __libc_signal_block_all (&oldmask);
+ lll_lock (pd->tidlock, LLL_PRIVATE);
+
if (pd == THREAD_SELF)
/* It is a special case to handle raise() implementation after a vfork
call (which does not update the PD tid field). */
tid = INLINE_SYSCALL_CALL (gettid);
else
- /* Force load of pd->tid into local variable or register. Otherwise
- if a thread exits between ESRCH test and tgkill, we might return
- EINVAL, because pd->tid would be cleared by the kernel. */
- tid = atomic_forced_read (pd->tid);
+ tid = pd->tid;
int val;
if (__glibc_likely (tid > 0))
@@ -53,6 +56,9 @@ __pthread_kill_internal (pthread_t threadid, int signo)
as an error. */
val = 0;
+ lll_unlock (pd->tidlock, LLL_PRIVATE);
+ __libc_signal_restore_set (&oldmask);
+
return val;
}
diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile
index dedfa0d290..48dba717a1 100644
--- a/sysdeps/pthread/Makefile
+++ b/sysdeps/pthread/Makefile
@@ -119,7 +119,9 @@ tests += tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \
tst-unwind-thread \
tst-pt-vfork1 tst-pt-vfork2 tst-vfork1x tst-vfork2x \
tst-pthread_cancel-exited \
+ tst-pthread_cancel-select-loop \
tst-pthread_kill-exited \
+ tst-pthread_kill-exiting \
# tests
tests-time64 := \
diff --git a/sysdeps/pthread/tst-kill4.c b/sysdeps/pthread/tst-kill4.c
deleted file mode 100644
index 9563939792..0000000000
--- a/sysdeps/pthread/tst-kill4.c
+++ /dev/null
@@ -1,90 +0,0 @@
-/* Copyright (C) 2003-2021 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
- Contributed by Ulrich Drepper <drepper@redhat.com>, 2003.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <errno.h>
-#include <pthread.h>
-#include <signal.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <unistd.h>
-
-
-static void *
-tf (void *a)
-{
- return NULL;
-}
-
-
-int
-do_test (void)
-{
- pthread_attr_t at;
- if (pthread_attr_init (&at) != 0)
- {
- puts ("attr_create failed");
- exit (1);
- }
-
- /* Limit thread stack size, because if it is too large, pthread_join
- will free it immediately rather than put it into stack cache. */
- if (pthread_attr_setstacksize (&at, 2 * 1024 * 1024) != 0)
- {
- puts ("setstacksize failed");
- exit (1);
- }
-
- pthread_t th;
- if (pthread_create (&th, &at, tf, NULL) != 0)
- {
- puts ("create failed");
- exit (1);
- }
-
- pthread_attr_destroy (&at);
-
- if (pthread_join (th, NULL) != 0)
- {
- puts ("join failed");
- exit (1);
- }
-
- /* The following only works because we assume here something about
- the implementation. Namely, that the memory allocated for the
- thread descriptor is not going away, that the TID field is
- cleared and therefore the signal is sent to process 0, and that
- we can savely assume there is no other process with this ID at
- that time. */
- int e = pthread_kill (th, 0);
- if (e == 0)
- {
- puts ("pthread_kill succeeded");
- exit (1);
- }
- if (e != ESRCH)
- {
- puts ("pthread_kill didn't return ESRCH");
- exit (1);
- }
-
- return 0;
-}
-
-
-#define TEST_FUNCTION do_test ()
-#include "../test-skeleton.c"
diff --git a/sysdeps/pthread/tst-pthread_cancel-select-loop.c b/sysdeps/pthread/tst-pthread_cancel-select-loop.c
new file mode 100644
index 0000000000..a62087589c
--- /dev/null
+++ b/sysdeps/pthread/tst-pthread_cancel-select-loop.c
@@ -0,0 +1,87 @@
+/* Test that pthread_cancel succeeds during thread exit.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+/* This test tries to trigger an internal race condition in
+ pthread_cancel, where the cancellation signal is sent after the
+ thread has begun the cancellation process. This can result in a
+ spurious ESRCH error. For the original bug 12889, the window is
+ quite small, so the bug was not reproduced in every run. */
+
+#include <stdbool.h>
+#include <stddef.h>
+#include <support/check.h>
+#include <support/xthread.h>
+#include <support/xunistd.h>
+#include <sys/select.h>
+#include <unistd.h>
+
+/* Set to true by timeout_thread_function when the test should
+ terminate. */
+static bool timeout;
+
+static void *
+timeout_thread_function (void *unused)
+{
+ usleep (5 * 1000 * 1000);
+ __atomic_store_n (&timeout, true, __ATOMIC_RELAXED);
+ return NULL;
+}
+
+/* Used for blocking the select function below. */
+static int pipe_fds[2];
+
+static void *
+canceled_thread_function (void *unused)
+{
+ while (true)
+ {
+ fd_set rfs;
+ fd_set wfs;
+ fd_set efs;
+ FD_ZERO (&rfs);
+ FD_ZERO (&wfs);
+ FD_ZERO (&efs);
+ FD_SET (pipe_fds[0], &rfs);
+
+ /* If the cancellation request is recognized early, the thread
+ begins exiting while the cancellation signal arrives. */
+ select (FD_SETSIZE, &rfs, &wfs, &efs, NULL);
+ }
+ return NULL;
+}
+
+static int
+do_test (void)
+{
+ xpipe (pipe_fds);
+ pthread_t thr_timeout = xpthread_create (NULL, timeout_thread_function, NULL);
+
+ while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED))
+ {
+ pthread_t thr = xpthread_create (NULL, canceled_thread_function, NULL);
+ xpthread_cancel (thr);
+ TEST_VERIFY (xpthread_join (thr) == PTHREAD_CANCELED);
+ }
+
+ xpthread_join (thr_timeout);
+ xclose (pipe_fds[0]);
+ xclose (pipe_fds[1]);
+ return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/pthread/tst-pthread_kill-exiting.c b/sysdeps/pthread/tst-pthread_kill-exiting.c
new file mode 100644
index 0000000000..15550d2310
--- /dev/null
+++ b/sysdeps/pthread/tst-pthread_kill-exiting.c
@@ -0,0 +1,110 @@
+/* Test that pthread_kill succeeds during thread exit.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+/* This test verifies that pthread_kill for a thread that is exiting
+ succeeds (with or without actually delivering the signal). */
+
+#include <array_length.h>
+#include <stdbool.h>
+#include <stddef.h>
+#include <support/xsignal.h>
+#include <support/xthread.h>
+#include <unistd.h>
+
+/* Set to true by timeout_thread_function when the test should
+ terminate. */
+static bool timeout;
+
+static void *
+timeout_thread_function (void *unused)
+{
+ usleep (1000 * 1000);
+ __atomic_store_n (&timeout, true, __ATOMIC_RELAXED);
+ return NULL;
+}
+
+/* Used to synchronize the sending threads with the target thread and
+ main thread. */
+static pthread_barrier_t barrier_1;
+static pthread_barrier_t barrier_2;
+
+/* The target thread to which signals are to be sent. */
+static pthread_t target_thread;
+
+static void *
+sender_thread_function (void *unused)
+{
+ while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED))
+ {
+ /* Wait until target_thread has been initialized. The target
+ thread and main thread participate in this barrier. */
+ xpthread_barrier_wait (&barrier_1);
+
+ xpthread_kill (target_thread, SIGUSR1);
+
+ /* Communicate that the signal has been sent. The main thread
+ participates in this barrier. */
+ xpthread_barrier_wait (&barrier_2);
+ }
+ return NULL;
+}
+
+static void *
+target_thread_function (void *unused)
+{
+ target_thread = pthread_self ();
+ xpthread_barrier_wait (&barrier_1);
+ return NULL;
+}
+
+static int
+do_test (void)
+{
+ xsignal (SIGUSR1, SIG_IGN);
+
+ pthread_t thr_timeout = xpthread_create (NULL, timeout_thread_function, NULL);
+
+ pthread_t threads[4];
+ xpthread_barrier_init (&barrier_1, NULL, array_length (threads) + 2);
+ xpthread_barrier_init (&barrier_2, NULL, array_length (threads) + 1);
+
+ for (int i = 0; i < array_length (threads); ++i)
+ threads[i] = xpthread_create (NULL, sender_thread_function, NULL);
+
+ while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED))
+ {
+ xpthread_create (NULL, target_thread_function, NULL);
+
+ /* Wait for the target thread to be set up and signal sending to
+ start. */
+ xpthread_barrier_wait (&barrier_1);
+
+ /* Wait for signal sending to complete. */
+ xpthread_barrier_wait (&barrier_2);
+
+ xpthread_join (target_thread);
+ }
+
+ for (int i = 0; i < array_length (threads); ++i)
+ xpthread_join (threads[i]);
+ xpthread_join (thr_timeout);
+
+ return 0;
+}
+
+#include <support/test-driver.c>
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 10/19] nptl: Use tidlock when accessing TID on pthread_getaffinity_np
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (8 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 09/19] nptl: Fix race between pthread_kill and thread exit (bug 12889) Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 14:24 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 11/19] nptl: Use tidlock when accessing TID on pthread_setaffinity Adhemerval Zanella
` (9 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
Checked on x86_64-linux-gnu.
---
nptl/pthread_getaffinity.c | 22 +++++++++++++++++-----
1 file changed, 17 insertions(+), 5 deletions(-)
diff --git a/nptl/pthread_getaffinity.c b/nptl/pthread_getaffinity.c
index 18261ddae0..5268d86e6e 100644
--- a/nptl/pthread_getaffinity.c
+++ b/nptl/pthread_getaffinity.c
@@ -29,12 +29,24 @@
int
__pthread_getaffinity_np (pthread_t th, size_t cpusetsize, cpu_set_t *cpuset)
{
- const struct pthread *pd = (const struct pthread *) th;
+ struct pthread *pd = (struct pthread *) th;
- int res = INTERNAL_SYSCALL_CALL (sched_getaffinity, pd->tid,
- MIN (INT_MAX, cpusetsize), cpuset);
- if (INTERNAL_SYSCALL_ERROR_P (res))
- return INTERNAL_SYSCALL_ERRNO (res);
+ /* Block all signal, since the lock is recursive and used on pthread_cancel
+ (which should be async-signal-safe). */
+ sigset_t oldmask;
+ __libc_signal_block_all (&oldmask);
+ lll_lock (pd->tidlock, LLL_PRIVATE);
+
+ int res = pd->tid == 0
+ ? -ESRCH
+ : INTERNAL_SYSCALL_CALL (sched_getaffinity, pd->tid,
+ MIN (INT_MAX, cpusetsize), cpuset);
+
+ lll_unlock (pd->tidlock, LLL_PRIVATE);
+ __libc_signal_restore_set (&oldmask);
+
+ if (res < 0)
+ return -res;
/* Clean the rest of the memory the kernel didn't do. */
memset ((char *) cpuset + res, '\0', cpusetsize - res);
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 11/19] nptl: Use tidlock when accessing TID on pthread_setaffinity
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (9 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 10/19] nptl: Use tidlock when accessing TID on pthread_getaffinity_np Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 14:25 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 12/19] nptl: Use tidlock when accessing TID on pthread_getcpuclockid Adhemerval Zanella
` (8 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
Checked on x86_64-linux-gnu.
---
nptl/pthread_setaffinity.c | 22 +++++++++++++++-------
1 file changed, 15 insertions(+), 7 deletions(-)
diff --git a/nptl/pthread_setaffinity.c b/nptl/pthread_setaffinity.c
index 3bfdc63e19..beb61836a6 100644
--- a/nptl/pthread_setaffinity.c
+++ b/nptl/pthread_setaffinity.c
@@ -27,15 +27,23 @@ int
__pthread_setaffinity_new (pthread_t th, size_t cpusetsize,
const cpu_set_t *cpuset)
{
- const struct pthread *pd = (const struct pthread *) th;
- int res;
+ struct pthread *pd = (struct pthread *) th;
- res = INTERNAL_SYSCALL_CALL (sched_setaffinity, pd->tid, cpusetsize,
- cpuset);
+ /* Block all signal, since the lock is recursive and used on pthread_cancel
+ (which should be async-signal-safe). */
+ sigset_t oldmask;
+ __libc_signal_block_all (&oldmask);
+ lll_lock (pd->tidlock, LLL_PRIVATE);
- return (INTERNAL_SYSCALL_ERROR_P (res)
- ? INTERNAL_SYSCALL_ERRNO (res)
- : 0);
+ int res = pd->tid == 0
+ ? ESRCH
+ : INTERNAL_SYSCALL_CALL (sched_setaffinity, pd->tid, cpusetsize,
+ cpuset);
+
+ lll_unlock (pd->tidlock, LLL_PRIVATE);
+ __libc_signal_restore_set (&oldmask);
+
+ return res;
}
versioned_symbol (libc, __pthread_setaffinity_new,
pthread_setaffinity_np, GLIBC_2_34);
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 12/19] nptl: Use tidlock when accessing TID on pthread_getcpuclockid
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (10 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 11/19] nptl: Use tidlock when accessing TID on pthread_setaffinity Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 14:27 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 13/19] nptl: Use tidlock when accessing TID on pthread_getschedparam Adhemerval Zanella
` (7 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
Checked on x86_64-linux-gnu.
---
nptl/pthread_getcpuclockid.c | 27 +++++++++++++++++----------
1 file changed, 17 insertions(+), 10 deletions(-)
diff --git a/nptl/pthread_getcpuclockid.c b/nptl/pthread_getcpuclockid.c
index 0a6656ea4c..8c0db0b9ba 100644
--- a/nptl/pthread_getcpuclockid.c
+++ b/nptl/pthread_getcpuclockid.c
@@ -29,16 +29,23 @@ __pthread_getcpuclockid (pthread_t threadid, clockid_t *clockid)
struct pthread *pd = (struct pthread *) threadid;
/* Make sure the descriptor is valid. */
- if (INVALID_TD_P (pd))
- /* Not a valid thread handle. */
- return ESRCH;
-
- /* The clockid_t value is a simple computation from the TID. */
-
- const clockid_t tidclock = MAKE_THREAD_CPUCLOCK (pd->tid, CPUCLOCK_SCHED);
-
- *clockid = tidclock;
- return 0;
+ sigset_t oldmask;
+ __libc_signal_block_all (&oldmask);
+ lll_lock (pd->tidlock, LLL_PRIVATE);
+
+ int res;
+ if (pd->tid != 0)
+ {
+ *clockid = MAKE_THREAD_CPUCLOCK (pd->tid, CPUCLOCK_SCHED);
+ res = 0;
+ }
+ else
+ res = -ESRCH;
+
+ lll_unlock (pd->tidlock, LLL_PRIVATE);
+ __libc_signal_restore_set (&oldmask);
+
+ return res;
}
versioned_symbol (libc, __pthread_getcpuclockid, pthread_getcpuclockid,
GLIBC_2_34);
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 13/19] nptl: Use tidlock when accessing TID on pthread_getschedparam
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (11 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 12/19] nptl: Use tidlock when accessing TID on pthread_getcpuclockid Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 15:00 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 14/19] nptl: Use tidlock when accessing TID on pthread_setschedparam Adhemerval Zanella
` (6 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
Checked on x86_64-linux-gnu.
---
nptl/pthread_getschedparam.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/nptl/pthread_getschedparam.c b/nptl/pthread_getschedparam.c
index 94316cf897..c41d2a8341 100644
--- a/nptl/pthread_getschedparam.c
+++ b/nptl/pthread_getschedparam.c
@@ -29,12 +29,18 @@ __pthread_getschedparam (pthread_t threadid, int *policy,
struct pthread *pd = (struct pthread *) threadid;
/* Make sure the descriptor is valid. */
- if (INVALID_TD_P (pd))
- /* Not a valid thread handle. */
- return ESRCH;
+ sigset_t oldmask;
+ __libc_signal_block_all (&oldmask);
+ lll_lock (pd->tidlock, LLL_PRIVATE);
int result = 0;
+ if (pd->tid == 0)
+ {
+ result = ESRCH;
+ goto out;
+ }
+
/* See CREATE THREAD NOTES in nptl/pthread_create.c. */
lll_lock (pd->lock, LLL_PRIVATE);
@@ -68,6 +74,10 @@ __pthread_getschedparam (pthread_t threadid, int *policy,
lll_unlock (pd->lock, LLL_PRIVATE);
+out:
+ lll_unlock (pd->tidlock, LLL_PRIVATE);
+ __libc_signal_restore_set (&oldmask);
+
return result;
}
libc_hidden_def (__pthread_getschedparam)
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 14/19] nptl: Use tidlock when accessing TID on pthread_setschedparam
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (12 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 13/19] nptl: Use tidlock when accessing TID on pthread_getschedparam Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 14:35 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 15/19] nptl: Use tidlock when accessing TID on pthread_getname_np Adhemerval Zanella
` (5 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
Checked on x86_64-linux-gnu.
---
nptl/pthread_setschedparam.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/nptl/pthread_setschedparam.c b/nptl/pthread_setschedparam.c
index 70fa8378b8..75c796d9b8 100644
--- a/nptl/pthread_setschedparam.c
+++ b/nptl/pthread_setschedparam.c
@@ -30,12 +30,18 @@ __pthread_setschedparam (pthread_t threadid, int policy,
struct pthread *pd = (struct pthread *) threadid;
/* Make sure the descriptor is valid. */
- if (INVALID_TD_P (pd))
- /* Not a valid thread handle. */
- return ESRCH;
+ sigset_t oldmask;
+ __libc_signal_block_all (&oldmask);
+ lll_lock (pd->tidlock, LLL_PRIVATE);
int result = 0;
+ if (pd->tid == 0)
+ {
+ result = ESRCH;
+ goto out;
+ }
+
/* See CREATE THREAD NOTES in nptl/pthread_create.c. */
lll_lock (pd->lock, LLL_PRIVATE);
@@ -67,6 +73,10 @@ __pthread_setschedparam (pthread_t threadid, int policy,
lll_unlock (pd->lock, LLL_PRIVATE);
+out:
+ lll_unlock (pd->tidlock, LLL_PRIVATE);
+ __libc_signal_restore_set (&oldmask);
+
return result;
}
strong_alias (__pthread_setschedparam, pthread_setschedparam)
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 15/19] nptl: Use tidlock when accessing TID on pthread_getname_np
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (13 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 14/19] nptl: Use tidlock when accessing TID on pthread_setschedparam Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 14:38 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 16/19] nptl: Use tidlock when accessing TID on pthread_setname_np Adhemerval Zanella
` (4 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
Checked on x86_64-linux-gnu.
---
nptl/pthread_getname.c | 45 ++++++++++++++++++++++++++----------------
1 file changed, 28 insertions(+), 17 deletions(-)
diff --git a/nptl/pthread_getname.c b/nptl/pthread_getname.c
index 8ac814366a..37dd6a360e 100644
--- a/nptl/pthread_getname.c
+++ b/nptl/pthread_getname.c
@@ -22,6 +22,7 @@
#include <stdio.h>
#include <string.h>
#include <unistd.h>
+#include <intprops.h>
#include <sys/prctl.h>
#include <not-cancel.h>
#include <shlib-compat.h>
@@ -29,7 +30,7 @@
int
__pthread_getname_np (pthread_t th, char *buf, size_t len)
{
- const struct pthread *pd = (const struct pthread *) th;
+ struct pthread *pd = (struct pthread *) th;
/* Unfortunately the kernel headers do not export the TASK_COMM_LEN
macro. So we have to define it here. */
@@ -40,29 +41,39 @@ __pthread_getname_np (pthread_t th, char *buf, size_t len)
if (pd == THREAD_SELF)
return __prctl (PR_GET_NAME, buf) ? errno : 0;
-#define FMT "/proc/self/task/%u/comm"
- char fname[sizeof (FMT) + 8];
- sprintf (fname, FMT, (unsigned int) pd->tid);
+ /* Block all signal, since the lock is recursive and used on pthread_cancel
+ (which should be async-signal-safe). */
+ sigset_t oldmask;
+ __libc_signal_block_all (&oldmask);
+ lll_lock (pd->tidlock, LLL_PRIVATE);
- int fd = __open64_nocancel (fname, O_RDONLY);
- if (fd == -1)
- return errno;
+ char fname[sizeof ("/proc/self/task//comm" ) + INT_BUFSIZE_BOUND (pid_t)];
+ __snprintf (fname, sizeof (fname), "/proc/self/task/%d/comm", pd->tid);
int res = 0;
- ssize_t n = TEMP_FAILURE_RETRY (__read_nocancel (fd, buf, len));
- if (n < 0)
- res = errno;
- else
+ int fd = __open64_nocancel (fname, O_RDONLY);
+ if (fd != -1)
{
- if (buf[n - 1] == '\n')
- buf[n - 1] = '\0';
- else if (n == len)
- res = ERANGE;
+ ssize_t n = TEMP_FAILURE_RETRY (__read_nocancel (fd, buf, len));
+ if (n > 0)
+ {
+ if (buf[n - 1] == '\n')
+ buf[n - 1] = '\0';
+ else if (n == len)
+ res = ERANGE;
+ else
+ buf[n] = '\0';
+ }
else
- buf[n] = '\0';
+ res = errno;
+
+ __close_nocancel_nostatus (fd);
}
+ else
+ res = errno;
- __close_nocancel_nostatus (fd);
+ lll_unlock (pd->tidlock, LLL_PRIVATE);
+ __libc_signal_restore_set (&oldmask);
return res;
}
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 16/19] nptl: Use tidlock when accessing TID on pthread_setname_np
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (14 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 15/19] nptl: Use tidlock when accessing TID on pthread_getname_np Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 17/19] nptl: Use tidlock when accessing TID on pthread_sigqueue Adhemerval Zanella
` (3 subsequent siblings)
19 siblings, 0 replies; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
Checked on x86_64-linux-gnu.
---
nptl/pthread_setname.c | 34 ++++++++++++++++++++++------------
1 file changed, 22 insertions(+), 12 deletions(-)
diff --git a/nptl/pthread_setname.c b/nptl/pthread_setname.c
index 6d2d8a1723..0f541b4534 100644
--- a/nptl/pthread_setname.c
+++ b/nptl/pthread_setname.c
@@ -23,14 +23,14 @@
#include <string.h>
#include <unistd.h>
#include <sys/prctl.h>
-
+#include <intprops.h>
#include <not-cancel.h>
int
__pthread_setname_np (pthread_t th, const char *name)
{
- const struct pthread *pd = (const struct pthread *) th;
+ struct pthread *pd = (struct pthread *) th;
/* Unfortunately the kernel headers do not export the TASK_COMM_LEN
macro. So we have to define it here. */
@@ -42,23 +42,33 @@ __pthread_setname_np (pthread_t th, const char *name)
if (pd == THREAD_SELF)
return __prctl (PR_SET_NAME, name) ? errno : 0;
-#define FMT "/proc/self/task/%u/comm"
- char fname[sizeof (FMT) + 8];
- sprintf (fname, FMT, (unsigned int) pd->tid);
+ /* Block all signal, since the lock is recursive and used on pthread_cancel
+ (which should be async-signal-safe). */
+ sigset_t oldmask;
+ __libc_signal_block_all (&oldmask);
+ lll_lock (pd->tidlock, LLL_PRIVATE);
- int fd = __open64_nocancel (fname, O_RDWR);
- if (fd == -1)
- return errno;
+ char fname[sizeof ("/proc/self/task//comm" ) + INT_BUFSIZE_BOUND (pid_t)];
+ __snprintf (fname, sizeof (fname), "/proc/self/task/%d/comm", pd->tid);
int res = 0;
- ssize_t n = TEMP_FAILURE_RETRY (__write_nocancel (fd, name, name_len));
- if (n < 0)
+ int fd = __open64_nocancel (fname, O_RDWR);
+ if (fd == -1)
+ {
+ ssize_t n = TEMP_FAILURE_RETRY (__write_nocancel (fd, name, name_len));
+ if (n < 0)
+ res = errno;
+ else if (n != name_len)
+ res = EIO;
+ }
+ else
res = errno;
- else if (n != name_len)
- res = EIO;
__close_nocancel_nostatus (fd);
+ lll_unlock (pd->tidlock, LLL_PRIVATE);
+ __libc_signal_restore_set (&oldmask);
+
return res;
}
versioned_symbol (libc, __pthread_setname_np, pthread_setname_np,
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 17/19] nptl: Use tidlock when accessing TID on pthread_sigqueue
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (15 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 16/19] nptl: Use tidlock when accessing TID on pthread_setname_np Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 14:43 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 18/19] nptl: Use tidlock when accessing TID on pthread_setschedprio Adhemerval Zanella
` (2 subsequent siblings)
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
Checked on x86_64-linux-gnu.
---
nptl/pthread_sigqueue.c | 52 +++++++++++++++++++++--------------------
1 file changed, 27 insertions(+), 25 deletions(-)
diff --git a/nptl/pthread_sigqueue.c b/nptl/pthread_sigqueue.c
index 2a0467ad7a..731e83884d 100644
--- a/nptl/pthread_sigqueue.c
+++ b/nptl/pthread_sigqueue.c
@@ -28,41 +28,43 @@
int
__pthread_sigqueue (pthread_t threadid, int signo, const union sigval value)
{
-#ifdef __NR_rt_tgsigqueueinfo
struct pthread *pd = (struct pthread *) threadid;
- /* Force load of pd->tid into local variable or register. Otherwise
- if a thread exits between ESRCH test and tgkill, we might return
- EINVAL, because pd->tid would be cleared by the kernel. */
- pid_t tid = atomic_forced_read (pd->tid);
- if (__glibc_unlikely (tid <= 0))
- /* Not a valid thread handle. */
- return ESRCH;
-
/* Disallow sending the signal we use for cancellation, timers,
for the setxid implementation. */
if (signo == SIGCANCEL || signo == SIGTIMER || signo == SIGSETXID)
return EINVAL;
- pid_t pid = getpid ();
+ /* Block all signal, since the lock is recursive and used on pthread_cancel
+ (which should be async-signal-safe). */
+ sigset_t oldmask;
+ __libc_signal_block_all (&oldmask);
+ lll_lock (pd->tidlock, LLL_PRIVATE);
+
+ int res;
+ if (pd->tid == 0)
+ {
+ pid_t pid = getpid ();
- /* Set up the siginfo_t structure. */
- siginfo_t info;
- memset (&info, '\0', sizeof (siginfo_t));
- info.si_signo = signo;
- info.si_code = SI_QUEUE;
- info.si_pid = pid;
- info.si_uid = __getuid ();
- info.si_value = value;
+ /* Set up the siginfo_t structure. */
+ siginfo_t info;
+ memset (&info, '\0', sizeof (siginfo_t));
+ info.si_signo = signo;
+ info.si_code = SI_QUEUE;
+ info.si_pid = pid;
+ info.si_uid = __getuid ();
+ info.si_value = value;
- /* We have a special syscall to do the work. */
- int val = INTERNAL_SYSCALL_CALL (rt_tgsigqueueinfo, pid, tid, signo,
+ res = INTERNAL_SYSCALL_CALL (rt_tgsigqueueinfo, pid, pd->tid, signo,
&info);
- return (INTERNAL_SYSCALL_ERROR_P (val)
- ? INTERNAL_SYSCALL_ERRNO (val) : 0);
-#else
- return ENOSYS;
-#endif
+ }
+ else
+ res = -ESRCH;
+
+ lll_unlock (pd->tidlock, LLL_PRIVATE);
+ __libc_signal_restore_set (&oldmask);
+
+ return -res;
}
versioned_symbol (libc, __pthread_sigqueue, pthread_sigqueue, GLIBC_2_34);
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 18/19] nptl: Use tidlock when accessing TID on pthread_setschedprio
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (16 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 17/19] nptl: Use tidlock when accessing TID on pthread_sigqueue Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 19/19] nptl: Remove INVALID_TD_P Adhemerval Zanella
2021-08-26 14:47 ` [PATCH v2 00/19] Fix various NPTL synchronization issues Florian Weimer
19 siblings, 0 replies; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
Checked on x86_64-linux-gnu.
---
nptl/pthread_setschedprio.c | 17 ++++++++++++++---
1 file changed, 14 insertions(+), 3 deletions(-)
diff --git a/nptl/pthread_setschedprio.c b/nptl/pthread_setschedprio.c
index 7bb68d3231..f6efdcad8e 100644
--- a/nptl/pthread_setschedprio.c
+++ b/nptl/pthread_setschedprio.c
@@ -30,11 +30,18 @@ __pthread_setschedprio (pthread_t threadid, int prio)
struct pthread *pd = (struct pthread *) threadid;
/* Make sure the descriptor is valid. */
- if (INVALID_TD_P (pd))
- /* Not a valid thread handle. */
- return ESRCH;
+ sigset_t oldmask;
+ __libc_signal_block_all (&oldmask);
+ lll_lock (pd->tidlock, LLL_PRIVATE);
int result = 0;
+
+ if (pd->tid == 0)
+ {
+ result = ESRCH;
+ goto out;
+ }
+
struct sched_param param;
param.sched_priority = prio;
@@ -60,6 +67,10 @@ __pthread_setschedprio (pthread_t threadid, int prio)
lll_unlock (pd->lock, LLL_PRIVATE);
+out:
+ lll_unlock (pd->tidlock, LLL_PRIVATE);
+ __libc_signal_restore_set (&oldmask);
+
return result;
}
versioned_symbol (libc, __pthread_setschedprio, pthread_setschedprio,
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* [PATCH v2 19/19] nptl: Remove INVALID_TD_P
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (17 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 18/19] nptl: Use tidlock when accessing TID on pthread_setschedprio Adhemerval Zanella
@ 2021-08-23 19:50 ` Adhemerval Zanella
2021-08-26 9:30 ` Florian Weimer
2021-08-26 14:47 ` [PATCH v2 00/19] Fix various NPTL synchronization issues Florian Weimer
19 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-23 19:50 UTC (permalink / raw)
To: libc-alpha; +Cc: Florian Weimer
It is not used anymore.
---
sysdeps/nptl/pthreadP.h | 5 -----
1 file changed, 5 deletions(-)
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index d366d691cd..67a9887da3 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -239,11 +239,6 @@ libc_hidden_proto (__pthread_tpp_change_priority)
extern int __pthread_current_priority (void);
libc_hidden_proto (__pthread_current_priority)
-/* This will not catch all invalid descriptors but is better than
- nothing. And if the test triggers the thread descriptor is
- guaranteed to be invalid. */
-#define INVALID_TD_P(pd) __builtin_expect ((pd)->tid <= 0, 0)
-
extern void __pthread_unwind (__pthread_unwind_buf_t *__buf)
__cleanup_fct_attribute __attribute ((__noreturn__))
#if !defined SHARED && !IS_IN (libpthread)
--
2.30.2
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 19/19] nptl: Remove INVALID_TD_P
2021-08-23 19:50 ` [PATCH v2 19/19] nptl: Remove INVALID_TD_P Adhemerval Zanella
@ 2021-08-26 9:30 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 9:30 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> It is not used anymore.
> ---
> sysdeps/nptl/pthreadP.h | 5 -----
> 1 file changed, 5 deletions(-)
>
> diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
> index d366d691cd..67a9887da3 100644
> --- a/sysdeps/nptl/pthreadP.h
> +++ b/sysdeps/nptl/pthreadP.h
> @@ -239,11 +239,6 @@ libc_hidden_proto (__pthread_tpp_change_priority)
> extern int __pthread_current_priority (void);
> libc_hidden_proto (__pthread_current_priority)
>
> -/* This will not catch all invalid descriptors but is better than
> - nothing. And if the test triggers the thread descriptor is
> - guaranteed to be invalid. */
> -#define INVALID_TD_P(pd) __builtin_expect ((pd)->tid <= 0, 0)
> -
> extern void __pthread_unwind (__pthread_unwind_buf_t *__buf)
> __cleanup_fct_attribute __attribute ((__noreturn__))
> #if !defined SHARED && !IS_IN (libpthread)
Looks okay to me.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 07/19] support: Add support_wait_for_thread_exit
2021-08-23 19:50 ` [PATCH v2 07/19] support: Add support_wait_for_thread_exit Adhemerval Zanella
@ 2021-08-26 9:31 ` Florian Weimer
2021-08-26 16:49 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 9:31 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> From: Florian Weimer <fweimer@redhat.com>
>
> Wait until all threads except the current thread has exited.
Should I commit this separately?
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 01/19] nptl: Fix tst-cancel7 and tst-cancelx7 race condition (BZ #14232)
2021-08-23 19:50 ` [PATCH v2 01/19] nptl: Fix tst-cancel7 and tst-cancelx7 race condition (BZ #14232) Adhemerval Zanella
@ 2021-08-26 9:33 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 9:33 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> A mapped temporary file and a semaphore is used to synchronize the
> pid information on the created file, the semaphore is updated once
> the file contents is flushed.
>
> Checked on x86_64-linux-gnu.
Patch looks good to me.
Reviewed-by: Florian Weimer <fweimer@redhat.com>
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 02/19] nptl: Set cancellation type and state on pthread_exit
2021-08-23 19:50 ` [PATCH v2 02/19] nptl: Set cancellation type and state on pthread_exit Adhemerval Zanella
@ 2021-08-26 9:38 ` Florian Weimer
2021-08-26 9:42 ` Florian Weimer
2021-08-26 11:52 ` Adhemerval Zanella
0 siblings, 2 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 9:38 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> +/* Check if a thread with PTHREAD_CANCEL_ASYNCHRONOUS cancellation on
> + blocked read() sets the correct state and type as pthread_exit. */
> +static void *
> +tf_cancel_async (void *arg)
> +{
> + TEST_VERIFY (pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, NULL)
> + == 0);
> +
> + xpthread_barrier_wait (&b);
> +
> + pthread_cleanup_push (clh, NULL);
This assumes that pthread_cleanup_push is async-cancel-safe. According
to this thread:
Async cacellation and pthread_cleanup_push
<https://sourceware.org/pipermail/libc-alpha/2021-August/129816.html>
this is not always true.
Should we build this test with -fno-exceptions?
Rest of the patch looks okay to me.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 03/19] nptl: Handle robust PI mutexes for !__ASSUME_SET_ROBUST_LIST
2021-08-23 19:50 ` [PATCH v2 03/19] nptl: Handle robust PI mutexes for !__ASSUME_SET_ROBUST_LIST Adhemerval Zanella
@ 2021-08-26 9:42 ` Florian Weimer
2021-08-26 12:14 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 9:42 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> The robust PI mutexes are signaled by setting the LSB bit to 1, so
> the code requires to take this consideration before access the
> __pthread_mutex_s.
>
> The code is also simplified: the initialization code is not really
> required, PD->robust_head.list and PD->robust_list.__next are
> essentially the same regardless of __PTHREAD_MUTEX_HAVE_PREV, the futex
> wake is optimized to be issued only when required, and the futex shared
> bit is set only when required.
Is this a user-visible bug? Should it have a bug reference?
> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
> index d8ec299cb1..08e5189ad6 100644
> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
> @@ -486,35 +486,36 @@ start_thread (void *arg)
> exit (0);
>
> #ifndef __ASSUME_SET_ROBUST_LIST
> - /* If this thread has any robust mutexes locked, handle them now. */
> -# if __PTHREAD_MUTEX_HAVE_PREV
> - void *robust = pd->robust_head.list;
> -# else
> - __pthread_slist_t *robust = pd->robust_list.__next;
> -# endif
> - /* We let the kernel do the notification if it is able to do so.
> - If we have to do it here there for sure are no PI mutexes involved
> - since the kernel support for them is even more recent. */
> - if (!__nptl_set_robust_list_avail
> - && __builtin_expect (robust != (void *) &pd->robust_head, 0))
> + /* We let the kernel do the notification if it is able to do so on the exit
> + syscall. Otherwise we need to handle before the thread terminates. */
> + void **robust;
> + while ((robust = pd->robust_head.list)
> + && robust != (void *) &pd->robust_head)
> {
> - do
> + /* Note: robust PI futexes are signaled by setting bit 0. */
> + void **robustp = (void **) ((uintptr_t) robust & ~1UL);
> +
> + struct __pthread_mutex_s *mtx = (struct __pthread_mutex_s *)
> + ((char *) robustp - offsetof (struct __pthread_mutex_s,
> + __list.__next));
> + unsigned int nusers = mtx->__nusers;
> + int shared = mtx->__kind & 128;
> +
> + pd->robust_head.list_op_pending = robust;
> + pd->robust_head.list = *robustp;
> + /* Although the list will not be changed at this point, it follows the
> + expected kernel ABI. */
> + __asm ("" ::: "memory");
> +
> + int lock = atomic_exchange_relaxed (&mtx->__lock, FUTEX_OWNER_DIED);
> + /* Wake any users if mutex is acquired with potential users. */
> + if (lock > 1 || nusers != 0)
Why the check for nusers? Isn't that racy?
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 02/19] nptl: Set cancellation type and state on pthread_exit
2021-08-26 9:38 ` Florian Weimer
@ 2021-08-26 9:42 ` Florian Weimer
2021-08-26 11:56 ` Adhemerval Zanella
2021-08-26 11:52 ` Adhemerval Zanella
1 sibling, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 9:42 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Florian Weimer:
> * Adhemerval Zanella:
>
>> +/* Check if a thread with PTHREAD_CANCEL_ASYNCHRONOUS cancellation on
>> + blocked read() sets the correct state and type as pthread_exit. */
>> +static void *
>> +tf_cancel_async (void *arg)
>> +{
>> + TEST_VERIFY (pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, NULL)
>> + == 0);
>> +
>> + xpthread_barrier_wait (&b);
>> +
>> + pthread_cleanup_push (clh, NULL);
>
> This assumes that pthread_cleanup_push is async-cancel-safe. According
> to this thread:
>
> Async cacellation and pthread_cleanup_push
> <https://sourceware.org/pipermail/libc-alpha/2021-August/129816.html>
>
> this is not always true.
>
> Should we build this test with -fno-exceptions?
>
> Rest of the patch looks okay to me.
And this should probably reference a Bugzilla bug.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 08/19] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193)
2021-08-23 19:50 ` [PATCH v2 08/19] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) Adhemerval Zanella
@ 2021-08-26 10:03 ` Florian Weimer
2021-08-26 16:49 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 10:03 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> From: Florian Weimer <fweimer@redhat.com>
>
> This closes one remaining race condition related to bug 12889: if
> the thread already exited on the kernel side, returning ESRCH
> is not correct because that error is reserved for the thread IDs
> (pthread_t values) whose lifetime has ended. In case of a
> kernel-side exit and a valid thread ID, no signal needs to be sent
> and cancellation does not have an effect, so just return 0.
>
> sysdeps/pthread/tst-kill4.c triggers undefined behavior and is
> removed with this commit.
Wrong commit subject: “should [not] fail”
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 04/19] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
2021-08-23 19:50 ` [PATCH v2 04/19] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951) Adhemerval Zanella
@ 2021-08-26 10:41 ` Florian Weimer
2021-08-26 14:58 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 10:41 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> The use after free described in BZ#19951 is due the use of two different
> PD fields, 'joinid' and 'cancelhandling', to describe the thread state
> to synchronize the calls of pthread_join(), pthread_detach(),
> pthread_exit(), and normal thread exit.
>
> And any state change potentially requires to check for both fields
> atomically to handle partial state (such as pthread_join() with a
> cancellation handler to issue a 'joinstate' field rollback).
>
> This patch uses a different PD member with 4 possible states (JOINABLE,
> DETACHED, EXITING, and EXITED) instead of pthread 'tid' field:
>
> 1. On pthread_create() the inital state is set either to JOINABLE or
> DETACHED depending of the pthread attribute used.
>
> 2. On pthread_detach(), a CAS is issued on the state. If the CAS
> fails it means that thread is already detached (DETACHED) or is
> being terminated (EXITING). For former an EINVAL is returned,
> while for latter pthread_detach() should be reponsible to join the
> thread (and deallocate any internal resources).
>
> 3. On pthread_create() exit phase (reached either if the thread
Suggest: “In the exit phase of the wrapper function for the thread start
routine (reached …”
> function has returned, pthread_exit() has being called, or
> cancellation handled has been acted upon) we issue a CAS on state
> to set to EXITING mode. If the thread is previously on DETACHED
> mode it will be the thread itself responsible to deallocate any
Suggest: “the thread itself is responsible for arranging the
deallocation of any resource”
(detached threads cannot immediately deallocate themselves)
> resource, otherwise the threads needs to be joined.
“the thread[] needs to be joined”
> 4. The clear_tid_field on 'clone' call is changed to set the new
> 'state' field on thread exit (EXITED). This state ins only
> reached at thread termination.
>
> 4. The pthread_join() implementation is now simpler: the futex wait
> is done directly on thread state and there is no need to reset it
> in case of timeout (since the state is now set either by
> pthread_detach() or by the kernel on process termination).
Duplicate 4.
> The race condition on pthread_detach is avoided with only one atomic
> operation on PD state: once the mode is set to THREAD_STATE_DETACHED
> it is up to thread itself to deallocate its memory (done on the exit
> phase at pthread_create()).
See above regarding thread self-deallocation.
The design as described above looks sound to me, those are just nits.
> This change trigger an invalid C11 thread tests: it crates a thread,
> which detaches itself, and after a timeout the creating thread checks
> if the join fails. The issue is once thrd_join() is called the thread
> lifetime has already expired. The test is changed so the sleep is done
> by the thread itself, so the creating thread will try to join a valid
> thread.
This could be a separate commit.
> diff --git a/nptl/nptl-stack.h b/nptl/nptl-stack.h
> index 8bcfde7ec5..c6b2b3078f 100644
> --- a/nptl/nptl-stack.h
> +++ b/nptl/nptl-stack.h
> @@ -32,7 +32,7 @@ extern size_t __nptl_stack_cache_maxsize attribute_hidden;
> static inline bool
> __nptl_stack_in_use (struct pthread *pd)
> {
> - return pd->tid <= 0;
> + return atomic_load_relaxed (&pd->joinstate) == THREAD_STATE_EXITED;
> }
Maybe this should be an acquire load, to synchronize with the thread
exit as communicated by the kernel?
> diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
> index 38f637c5cf..67e00ef007 100644
> --- a/nptl/pthread_cancel.c
> +++ b/nptl/pthread_cancel.c
> @@ -63,7 +63,8 @@ __pthread_cancel (pthread_t th)
> volatile struct pthread *pd = (volatile struct pthread *) th;
>
> /* Make sure the descriptor is valid. */
> - if (INVALID_TD_P (pd))
> + int state = atomic_load_acquire (&pd->joinstate);
> + if (state == THREAD_STATE_EXITED || state == THREAD_STATE_EXITING)
> /* Not a valid thread handle. */
> return ESRCH;
This is fixed in patch 08/19, so it's okay.
> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
> index 08e5189ad6..763e32bc3e 100644
> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
> @@ -286,7 +286,7 @@ static int create_thread (struct pthread *pd,
> const struct
> @@ -351,13 +351,16 @@ start_thread (void *arg)
> and free any resource prior return to the pthread_create caller. */
> setup_failed = pd->setup_failed == 1;
> if (setup_failed)
> - pd->joinid = NULL;
> + pd->joinstate = THREAD_STATE_JOINABLE;
>
> /* And give it up right away. */
> lll_unlock (pd->lock, LLL_PRIVATE);
>
> if (setup_failed)
> - goto out;
> + {
> + pd->tid = 0;
> + goto out;
> + }
> }
What's the advantage of setting pd->tid here and below in start_thread?
It destroys information that could be useful for debugging purposes,
answering the question what the TID was before the thread exited.
> @@ -481,9 +484,20 @@ start_thread (void *arg)
> the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE. */
> atomic_bit_set (&pd->cancelhandling, EXITING_BIT);
>
> - if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
> - /* This was the last thread. */
> - exit (0);
> +
> + /* CONCURRENCY NOTES:
> +
> + Concurrent pthread_detach() will either set state to
> + THREAD_STATE_DETACHED or wait the thread to terminate. The exiting state
“wait [for] the thread to terminate”
> + set here is set so a pthread_join() wait until all the required cleanup
> + steps are done.
> +
> + The 'joinstate' field will be used to determine who is responsible to
> + call __nptl_free_tcb below. */
> +
> + unsigned int joinstate = THREAD_STATE_JOINABLE;
> + atomic_compare_exchange_weak_acquire (&pd->joinstate, &joinstate,
> + THREAD_STATE_EXITING);
I think you need a strong CAS here. We don't have, so you'll have to
add a loop.
> diff --git a/nptl/pthread_detach.c b/nptl/pthread_detach.c
> index ac50db9b0e..97d292cab8 100644
> --- a/nptl/pthread_detach.c
> +++ b/nptl/pthread_detach.c
> @@ -26,32 +26,24 @@ ___pthread_detach (pthread_t th)
> + For the case the thread is being terminated (THREAD_STATE_EXITING),
> + pthread_detach() will responsible to clean up the stack. */
> +
> + int curstate = THREAD_STATE_JOINABLE;
> + if (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &curstate,
> + THREAD_STATE_DETACHED))
This should be a strong CAS, so this needs a loop, I think.
> diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
> index 7303069316..428ed35101 100644
> --- a/nptl/pthread_join_common.c
> +++ b/nptl/pthread_join_common.c
> @@ -22,113 +22,73 @@
> #include <time.h>
> #include <futex-internal.h>
>
> -static void
> -cleanup (void *arg)
> +/* Check for a possible deadlock situation where the threads are waiting for
> + each other to finish. Note that this is a "may" error. To be 100% sure we
> + catch this error we would have to lock the data structures but it is not
> + necessary. In the unlikely case that two threads are really caught in this
> + situation they will deadlock. It is the programmer's problem to figure
> + this out. */
> +static inline bool
> +check_for_deadlock (int state, struct pthread *pd)
> {
> - /* If we already changed the waiter ID, reset it. The call cannot
> - fail for any reason but the thread not having done that yet so
> - there is no reason for a loop. */
> struct pthread *self = THREAD_SELF;
> - atomic_compare_exchange_weak_acquire (&arg, &self, NULL);
> + return ((pd == self
> + || (atomic_load_acquire (&self->joinstate) == THREAD_STATE_DETACHED
> + && (pd->cancelhandling
> + & (CANCELED_BITMASK | EXITING_BITMASK
> + | TERMINATED_BITMASK)) == 0))
> + && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
> + && (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
> + | TERMINATED_BITMASK))
> + == CANCELED_BITMASK));
> }
The state argument to check_for_deadlock appears to be unused.
> + /* pthread_join() is a cancellation entrypoint and we use the same
> + rationale for pthread_timedjoin_np().
> +
> + The kernel notifies a process which uses CLONE_CHILD_CLEARTID via
> + a memory zerouing and futex wake up when the process terminates.
“memory zero[]ing and futex wake[-]up”
> #if __TIMESIZE == 64
> diff --git a/nptl/pthread_tryjoin.c b/nptl/pthread_tryjoin.c
> index fd938e8780..726f2abc68 100644
> --- a/nptl/pthread_tryjoin.c
> +++ b/nptl/pthread_tryjoin.c
> @@ -22,15 +22,17 @@
> int
> __pthread_tryjoin_np (pthread_t threadid, void **thread_return)
> {
> - /* Return right away if the thread hasn't terminated yet. */
> - struct pthread *pd = (struct pthread *) threadid;
> - if (pd->tid != 0)
> - return EBUSY;
> + /* The joinable state (THREAD_STATE_JOINABLE) is straigthforward since the
> + thread hasn't finished yet and trying to join might block.
> + Both detached (THREAD_STATE_DETACHED) and exiting (THREAD_STATE_EXITING)
> + might also result in a possible blocking call: a detached thread might
> + change its state to exiting and a exiting thread my take some time to
> + exit (and thus let the kernel set the state to THREAD_STATE_EXITED). */
pthread_tryjoin_np on a thread which is THREAD_STATE_DETACHED is
invalid, so that case doesn't matter, I think.
> + struct pthread *pd = (struct pthread *) threadid;
> + return atomic_load_acquire (&pd->joinstate) != THREAD_STATE_EXITED
> + ? EBUSY
> + : __pthread_clockjoin_ex (threadid, thread_return, 0, NULL);
> }
But the code is correct, I believe.
> diff --git a/sysdeps/pthread/tst-thrd-detach.c b/sysdeps/pthread/tst-thrd-detach.c
> index c844767748..e1906a0e10 100644
> --- a/sysdeps/pthread/tst-thrd-detach.c
> +++ b/sysdeps/pthread/tst-thrd-detach.c
> - if (thrd_join (id, NULL) == thrd_success)
> - FAIL_EXIT1 ("thrd_join succeed where it should fail");
> + TEST_COMPARE (thrd_join (id, NULL), thrd_error);
This is still a user-after-free bug, right?
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 05/19] nptl: Move setxid flag out of cancelhandling
2021-08-23 19:50 ` [PATCH v2 05/19] nptl: Move setxid flag out of cancelhandling Adhemerval Zanella
@ 2021-08-26 11:34 ` Florian Weimer
2021-08-26 15:11 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 11:34 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> Now that the thread state is tracked by 'joinstate' field, there is no
> need to keep the setxid flag within the 'cancelhandling'. It simplifies
> the atomic code to set and reset it (since there is no need to handle
> the thread state concurrent update).
Some of this functionality reimplements the tidlock.
There is some functionality around thread creation, but I do not know if
the synchronization is entirely correct. This makes it rather difficult
to review changes.
It's also not clear to me if we need to do anything to prevent early
setxid calls:
> /* Don't allow setxid until cloned. */
I suspect this should just look at the TID instead. We create the
thread with all signals blocked, so once there is a TID (and the thread
has not exited), it is safe to send the signal.
> diff --git a/nptl/descr.h b/nptl/descr.h
> index 4b2db6edab..563b152bff 100644
> --- a/nptl/descr.h
> +++ b/nptl/descr.h
> + /* Indicate whether thread is supposed to change XID. */
> + int setxid_flag;
This should document the possible values.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 02/19] nptl: Set cancellation type and state on pthread_exit
2021-08-26 9:38 ` Florian Weimer
2021-08-26 9:42 ` Florian Weimer
@ 2021-08-26 11:52 ` Adhemerval Zanella
2021-08-26 12:08 ` Florian Weimer
1 sibling, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 11:52 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 06:38, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> +/* Check if a thread with PTHREAD_CANCEL_ASYNCHRONOUS cancellation on
>> + blocked read() sets the correct state and type as pthread_exit. */
>> +static void *
>> +tf_cancel_async (void *arg)
>> +{
>> + TEST_VERIFY (pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, NULL)
>> + == 0);
>> +
>> + xpthread_barrier_wait (&b);
>> +
>> + pthread_cleanup_push (clh, NULL);
>
> This assumes that pthread_cleanup_push is async-cancel-safe. According
> to this thread:
>
> Async cacellation and pthread_cleanup_push
> <https://sourceware.org/pipermail/libc-alpha/2021-August/129816.html>
>
> this is not always true.
>
> Should we build this test with -fno-exceptions?
Or I can't remove the aync test part and only test the deferred part,
I don't have a strong opinion.
>
> Rest of the patch looks okay to me.
>
> Thanks,
> Florian
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 02/19] nptl: Set cancellation type and state on pthread_exit
2021-08-26 9:42 ` Florian Weimer
@ 2021-08-26 11:56 ` Adhemerval Zanella
0 siblings, 0 replies; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 11:56 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 06:42, Florian Weimer wrote:
> * Florian Weimer:
>
>> * Adhemerval Zanella:
>>
>>> +/* Check if a thread with PTHREAD_CANCEL_ASYNCHRONOUS cancellation on
>>> + blocked read() sets the correct state and type as pthread_exit. */
>>> +static void *
>>> +tf_cancel_async (void *arg)
>>> +{
>>> + TEST_VERIFY (pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, NULL)
>>> + == 0);
>>> +
>>> + xpthread_barrier_wait (&b);
>>> +
>>> + pthread_cleanup_push (clh, NULL);
>>
>> This assumes that pthread_cleanup_push is async-cancel-safe. According
>> to this thread:
>>
>> Async cacellation and pthread_cleanup_push
>> <https://sourceware.org/pipermail/libc-alpha/2021-August/129816.html>
>>
>> this is not always true.
>>
>> Should we build this test with -fno-exceptions?
>>
>> Rest of the patch looks okay to me.
>
> And this should probably reference a Bugzilla bug.
I have opened https://sourceware.org/bugzilla/show_bug.cgi?id=28267
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 02/19] nptl: Set cancellation type and state on pthread_exit
2021-08-26 11:52 ` Adhemerval Zanella
@ 2021-08-26 12:08 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 12:08 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> On 26/08/2021 06:38, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>> +/* Check if a thread with PTHREAD_CANCEL_ASYNCHRONOUS cancellation on
>>> + blocked read() sets the correct state and type as pthread_exit. */
>>> +static void *
>>> +tf_cancel_async (void *arg)
>>> +{
>>> + TEST_VERIFY (pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, NULL)
>>> + == 0);
>>> +
>>> + xpthread_barrier_wait (&b);
>>> +
>>> + pthread_cleanup_push (clh, NULL);
>>
>> This assumes that pthread_cleanup_push is async-cancel-safe. According
>> to this thread:
>>
>> Async cacellation and pthread_cleanup_push
>> <https://sourceware.org/pipermail/libc-alpha/2021-August/129816.html>
>>
>> this is not always true.
>>
>> Should we build this test with -fno-exceptions?
>
> Or I can't remove the aync test part and only test the deferred part,
> I don't have a strong opinion.
Testing deferred cancellation only is fine with me.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 03/19] nptl: Handle robust PI mutexes for !__ASSUME_SET_ROBUST_LIST
2021-08-26 9:42 ` Florian Weimer
@ 2021-08-26 12:14 ` Adhemerval Zanella
0 siblings, 0 replies; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 12:14 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 06:42, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> The robust PI mutexes are signaled by setting the LSB bit to 1, so
>> the code requires to take this consideration before access the
>> __pthread_mutex_s.
>>
>> The code is also simplified: the initialization code is not really
>> required, PD->robust_head.list and PD->robust_list.__next are
>> essentially the same regardless of __PTHREAD_MUTEX_HAVE_PREV, the futex
>> wake is optimized to be issued only when required, and the futex shared
>> bit is set only when required.
>
> Is this a user-visible bug? Should it have a bug reference?
This a bug on only visible on a handful of architectures where
__NR_set_robust_list is not support due the missing CONFIG_HAVE_FUTEX_CMPXCHG.
I have opened https://sourceware.org/bugzilla/show_bug.cgi?id=28268
>
>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
>> index d8ec299cb1..08e5189ad6 100644
>> --- a/nptl/pthread_create.c
>> +++ b/nptl/pthread_create.c
>> @@ -486,35 +486,36 @@ start_thread (void *arg)
>> exit (0);
>>
>> #ifndef __ASSUME_SET_ROBUST_LIST
>> - /* If this thread has any robust mutexes locked, handle them now. */
>> -# if __PTHREAD_MUTEX_HAVE_PREV
>> - void *robust = pd->robust_head.list;
>> -# else
>> - __pthread_slist_t *robust = pd->robust_list.__next;
>> -# endif
>> - /* We let the kernel do the notification if it is able to do so.
>> - If we have to do it here there for sure are no PI mutexes involved
>> - since the kernel support for them is even more recent. */
>> - if (!__nptl_set_robust_list_avail
>> - && __builtin_expect (robust != (void *) &pd->robust_head, 0))
>> + /* We let the kernel do the notification if it is able to do so on the exit
>> + syscall. Otherwise we need to handle before the thread terminates. */
>> + void **robust;
>> + while ((robust = pd->robust_head.list)
>> + && robust != (void *) &pd->robust_head)
>> {
>> - do
>> + /* Note: robust PI futexes are signaled by setting bit 0. */
>> + void **robustp = (void **) ((uintptr_t) robust & ~1UL);
>> +
>> + struct __pthread_mutex_s *mtx = (struct __pthread_mutex_s *)
>> + ((char *) robustp - offsetof (struct __pthread_mutex_s,
>> + __list.__next));
>> + unsigned int nusers = mtx->__nusers;
>> + int shared = mtx->__kind & 128;
>> +
>> + pd->robust_head.list_op_pending = robust;
>> + pd->robust_head.list = *robustp;
>> + /* Although the list will not be changed at this point, it follows the
>> + expected kernel ABI. */
>> + __asm ("" ::: "memory");
>> +
>> + int lock = atomic_exchange_relaxed (&mtx->__lock, FUTEX_OWNER_DIED);
>> + /* Wake any users if mutex is acquired with potential users. */
>> + if (lock > 1 || nusers != 0)
>
> Why the check for nusers? Isn't that racy?
I was small optimization to avoid calling the futex, but I think you are
correct that is racy. I think it would be better to let the kernel handle
it.
>
> Thanks,
> Florian
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 09/19] nptl: Fix race between pthread_kill and thread exit (bug 12889)
2021-08-23 19:50 ` [PATCH v2 09/19] nptl: Fix race between pthread_kill and thread exit (bug 12889) Adhemerval Zanella
@ 2021-08-26 14:23 ` Florian Weimer
2021-08-26 17:06 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 14:23 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c
> index 5d4c86f920..63fe8c44ca 100644
> --- a/nptl/pthread_kill.c
> +++ b/nptl/pthread_kill.c
> @@ -26,15 +26,18 @@ __pthread_kill_internal (pthread_t threadid, int signo)
> pid_t tid;
> struct pthread *pd = (struct pthread *) threadid;
>
> + /* Block all signal, since the lock is recursive and used on pthread_cancel
> + (which should be async-signal-safe). */
> + sigset_t oldmask;
> + __libc_signal_block_all (&oldmask);
> + lll_lock (pd->tidlock, LLL_PRIVATE);
> +
> if (pd == THREAD_SELF)
> /* It is a special case to handle raise() implementation after a vfork
> call (which does not update the PD tid field). */
> tid = INLINE_SYSCALL_CALL (gettid);
> else
> - /* Force load of pd->tid into local variable or register. Otherwise
> - if a thread exits between ESRCH test and tgkill, we might return
> - EINVAL, because pd->tid would be cleared by the kernel. */
> - tid = atomic_forced_read (pd->tid);
> + tid = pd->tid;
>
> int val;
> if (__glibc_likely (tid > 0))
> @@ -53,6 +56,9 @@ __pthread_kill_internal (pthread_t threadid, int signo)
> as an error. */
> val = 0;
>
> + lll_unlock (pd->tidlock, LLL_PRIVATE);
> + __libc_signal_restore_set (&oldmask);
> +
> return val;
> }
This needs a comment explaining that *all* pending signals are delivered
at the point of the __libc_signal_restore_set call. I hope that this is
actually what happens in Linux; POSIX only guarantees that for one
pending signal that is unblocked.
The problem here is that pthread_kill (pthread_self (), …) must generate
synchronous signal, and due to the signal-blocking, it is not
immediately obvious if that's the case.
If we aren't sure that all signals are flushed, we need to check for the
send-signal-to-self case and avoid blocking the signal there. We don't
need the tidlock for that because the running thread won't go away.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 10/19] nptl: Use tidlock when accessing TID on pthread_getaffinity_np
2021-08-23 19:50 ` [PATCH v2 10/19] nptl: Use tidlock when accessing TID on pthread_getaffinity_np Adhemerval Zanella
@ 2021-08-26 14:24 ` Florian Weimer
2021-08-26 17:29 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 14:24 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> Checked on x86_64-linux-gnu.
> ---
> nptl/pthread_getaffinity.c | 22 +++++++++++++++++-----
> 1 file changed, 17 insertions(+), 5 deletions(-)
>
> diff --git a/nptl/pthread_getaffinity.c b/nptl/pthread_getaffinity.c
> index 18261ddae0..5268d86e6e 100644
> --- a/nptl/pthread_getaffinity.c
> +++ b/nptl/pthread_getaffinity.c
> @@ -29,12 +29,24 @@
> int
> __pthread_getaffinity_np (pthread_t th, size_t cpusetsize, cpu_set_t *cpuset)
> {
> - const struct pthread *pd = (const struct pthread *) th;
> + struct pthread *pd = (struct pthread *) th;
>
> - int res = INTERNAL_SYSCALL_CALL (sched_getaffinity, pd->tid,
> - MIN (INT_MAX, cpusetsize), cpuset);
> - if (INTERNAL_SYSCALL_ERROR_P (res))
> - return INTERNAL_SYSCALL_ERRNO (res);
> + /* Block all signal, since the lock is recursive and used on pthread_cancel
> + (which should be async-signal-safe). */
The pthread_cancel reference looks like a cut-and-paste-bug.
> + sigset_t oldmask;
> + __libc_signal_block_all (&oldmask);
> + lll_lock (pd->tidlock, LLL_PRIVATE);
> +
> + int res = pd->tid == 0
> + ? -ESRCH
> + : INTERNAL_SYSCALL_CALL (sched_getaffinity, pd->tid,
> + MIN (INT_MAX, cpusetsize), cpuset);
> +
> + lll_unlock (pd->tidlock, LLL_PRIVATE);
> + __libc_signal_restore_set (&oldmask);
> +
> + if (res < 0)
> + return -res;
ESRCH doesn't look like the right error code here. Should we return an
affinity mask without any bits set?
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 11/19] nptl: Use tidlock when accessing TID on pthread_setaffinity
2021-08-23 19:50 ` [PATCH v2 11/19] nptl: Use tidlock when accessing TID on pthread_setaffinity Adhemerval Zanella
@ 2021-08-26 14:25 ` Florian Weimer
2021-08-26 17:31 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 14:25 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> Checked on x86_64-linux-gnu.
> ---
> nptl/pthread_setaffinity.c | 22 +++++++++++++++-------
> 1 file changed, 15 insertions(+), 7 deletions(-)
>
> diff --git a/nptl/pthread_setaffinity.c b/nptl/pthread_setaffinity.c
> index 3bfdc63e19..beb61836a6 100644
> --- a/nptl/pthread_setaffinity.c
> +++ b/nptl/pthread_setaffinity.c
> @@ -27,15 +27,23 @@ int
> __pthread_setaffinity_new (pthread_t th, size_t cpusetsize,
> const cpu_set_t *cpuset)
> {
> - const struct pthread *pd = (const struct pthread *) th;
> - int res;
> + struct pthread *pd = (struct pthread *) th;
>
> - res = INTERNAL_SYSCALL_CALL (sched_setaffinity, pd->tid, cpusetsize,
> - cpuset);
> + /* Block all signal, since the lock is recursive and used on pthread_cancel
> + (which should be async-signal-safe). */
> + sigset_t oldmask;
> + __libc_signal_block_all (&oldmask);
> + lll_lock (pd->tidlock, LLL_PRIVATE);
>
> - return (INTERNAL_SYSCALL_ERROR_P (res)
> - ? INTERNAL_SYSCALL_ERRNO (res)
> - : 0);
> + int res = pd->tid == 0
> + ? ESRCH
> + : INTERNAL_SYSCALL_CALL (sched_setaffinity, pd->tid, cpusetsize,
> + cpuset);
> +
> + lll_unlock (pd->tidlock, LLL_PRIVATE);
> + __libc_signal_restore_set (&oldmask);
> +
> + return res;
Same issue regarding ESRCH and pthread_cancel. Here we can just return
0 in case the thread is gone, I think.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 12/19] nptl: Use tidlock when accessing TID on pthread_getcpuclockid
2021-08-23 19:50 ` [PATCH v2 12/19] nptl: Use tidlock when accessing TID on pthread_getcpuclockid Adhemerval Zanella
@ 2021-08-26 14:27 ` Florian Weimer
2021-08-26 17:41 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 14:27 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> Checked on x86_64-linux-gnu.
> ---
> nptl/pthread_getcpuclockid.c | 27 +++++++++++++++++----------
> 1 file changed, 17 insertions(+), 10 deletions(-)
>
> diff --git a/nptl/pthread_getcpuclockid.c b/nptl/pthread_getcpuclockid.c
> index 0a6656ea4c..8c0db0b9ba 100644
> --- a/nptl/pthread_getcpuclockid.c
> +++ b/nptl/pthread_getcpuclockid.c
> @@ -29,16 +29,23 @@ __pthread_getcpuclockid (pthread_t threadid, clockid_t *clockid)
> struct pthread *pd = (struct pthread *) threadid;
>
> /* Make sure the descriptor is valid. */
> - if (INVALID_TD_P (pd))
> - /* Not a valid thread handle. */
> - return ESRCH;
> -
> - /* The clockid_t value is a simple computation from the TID. */
> -
> - const clockid_t tidclock = MAKE_THREAD_CPUCLOCK (pd->tid, CPUCLOCK_SCHED);
> -
> - *clockid = tidclock;
> - return 0;
> + sigset_t oldmask;
> + __libc_signal_block_all (&oldmask);
> + lll_lock (pd->tidlock, LLL_PRIVATE);
> +
> + int res;
> + if (pd->tid != 0)
> + {
> + *clockid = MAKE_THREAD_CPUCLOCK (pd->tid, CPUCLOCK_SCHED);
> + res = 0;
> + }
> + else
> + res = -ESRCH;
> +
> + lll_unlock (pd->tidlock, LLL_PRIVATE);
> + __libc_signal_restore_set (&oldmask);
> +
> + return res;
This doesn't really solve the race, does it? The caller cannot use the
clock ID safely.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 06/19] nptl: Replace struct thread cancelhandling field
2021-08-23 19:50 ` [PATCH v2 06/19] nptl: Replace struct thread cancelhandling field Adhemerval Zanella
@ 2021-08-26 14:34 ` Florian Weimer
2021-08-26 16:48 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 14:34 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> Now that both thread state and setxid is being tracked by two different
> fields ('joinstate' and 'setxid_flag'), there is no need to keep track
> of exiting (EXITING_BIT) and terminated (TERMINATED_BIT) state in
> 'cancelhandling'.
>
> The 'cancelhandling' field is renamed to 'cancelstatus' and it only
> signals whether the thread has been cancelled (set by pthread_cancel()).
> It allows simplify the atomic operation required to avoid CAS
> operations.
“cancelstatus” is a bad name because “cancel state” is POSIX terminology
with different meaning. Please choose a different name.
I think pending_cancel or cancel_requested would work.
> There is no need to check the thread state on
> __pthread_enable_asynccancel() nor on pthread_testcancel () anymore,
> since cancellation is explicit disabled when thread start the exit code
> on __do_cancel().
“explict[ly]”
>
> On __nptl_free_tcp(), the 'joinstate' now defines whether it is the
“__nptl_free_tcb”.
> creating thread or the created thread that calls it. So there is
> no concurrent call within the function and thus no need to set the
> TERMINATED_BIT.
>
> For SIGCANCEL handler, sigcancel_handler(), 'joinstate' is used instead
> (pthread_cancel() might still be called concurrenty in assynchronous
> mode).
“as[]ynchronous”
> diff --git a/nptl/nptl_free_tcb.c b/nptl/nptl_free_tcb.c
> index cbf3580f59..15e1a18562 100644
> --- a/nptl/nptl_free_tcb.c
> +++ b/nptl/nptl_free_tcb.c
> @@ -24,22 +24,12 @@
> void
> __nptl_free_tcb (struct pthread *pd)
> {
> - /* The thread is exiting now. */
> - if (atomic_bit_test_set (&pd->cancelhandling, TERMINATED_BIT) == 0)
> - {
> - /* Free TPP data. */
> - if (pd->tpp != NULL)
> - {
> - struct priority_protection_data *tpp = pd->tpp;
> + free (pd->tpp);
> + pd->tpp = NULL;
For detached threads, this is called on the thread after signals have
been blocked. I don't think this is entirely correct. And things
become rather interesting if malloc needs pd->tpp.
It's not clear to me whether this could happen before.
> + /* Queue the stack memory block for reuse and exit the process. The kernel
> + will signal via writing to the address returned by QUEUE-STACK when the
> + stack is available. */
> + __nptl_deallocate_stack (pd);
> }
That old comment seems to be out-of-date.
> libc_hidden_def (__nptl_free_tcb)
> diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
> index 67e00ef007..aed6c1ea47 100644
> --- a/nptl/pthread_cancel.c
> +++ b/nptl/pthread_cancel.c
> @@ -93,8 +90,9 @@ __pthread_cancel (pthread_t th)
> }
> #endif
>
> - int oldch = atomic_fetch_or_acquire (&pd->cancelhandling, CANCELED_BITMASK);
> - if ((oldch & CANCELED_BITMASK) != 0)
> + /* If already cancelled just return (cancellation will be acted upon in next
> + cancellation entrypoint). */
> + if (atomic_exchange_acquire (&pd->cancelstatus, 1) == 1)
> return 0;
This should probably use relaxed MO.
The libthreaddb changes look okay to me.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 14/19] nptl: Use tidlock when accessing TID on pthread_setschedparam
2021-08-23 19:50 ` [PATCH v2 14/19] nptl: Use tidlock when accessing TID on pthread_setschedparam Adhemerval Zanella
@ 2021-08-26 14:35 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 14:35 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> Checked on x86_64-linux-gnu.
> ---
> nptl/pthread_setschedparam.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/nptl/pthread_setschedparam.c b/nptl/pthread_setschedparam.c
> index 70fa8378b8..75c796d9b8 100644
> --- a/nptl/pthread_setschedparam.c
> +++ b/nptl/pthread_setschedparam.c
> @@ -30,12 +30,18 @@ __pthread_setschedparam (pthread_t threadid, int policy,
> struct pthread *pd = (struct pthread *) threadid;
>
> /* Make sure the descriptor is valid. */
> - if (INVALID_TD_P (pd))
> - /* Not a valid thread handle. */
> - return ESRCH;
> + sigset_t oldmask;
> + __libc_signal_block_all (&oldmask);
> + lll_lock (pd->tidlock, LLL_PRIVATE);
>
> int result = 0;
>
> + if (pd->tid == 0)
> + {
> + result = ESRCH;
> + goto out;
> + }
This case should probably just return 0.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 15/19] nptl: Use tidlock when accessing TID on pthread_getname_np
2021-08-23 19:50 ` [PATCH v2 15/19] nptl: Use tidlock when accessing TID on pthread_getname_np Adhemerval Zanella
@ 2021-08-26 14:38 ` Florian Weimer
2021-08-26 17:45 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 14:38 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> + /* Block all signal, since the lock is recursive and used on pthread_cancel
> + (which should be async-signal-safe). */
> + sigset_t oldmask;
> + __libc_signal_block_all (&oldmask);
> + lll_lock (pd->tidlock, LLL_PRIVATE);
>
> - int fd = __open64_nocancel (fname, O_RDONLY);
> - if (fd == -1)
> - return errno;
> + char fname[sizeof ("/proc/self/task//comm" ) + INT_BUFSIZE_BOUND (pid_t)];
> + __snprintf (fname, sizeof (fname), "/proc/self/task/%d/comm", pd->tid);
>
> int res = 0;
> - ssize_t n = TEMP_FAILURE_RETRY (__read_nocancel (fd, buf, len));
> - if (n < 0)
> - res = errno;
> - else
> + int fd = __open64_nocancel (fname, O_RDONLY);
> + if (fd != -1)
> {
> - if (buf[n - 1] == '\n')
> - buf[n - 1] = '\0';
> - else if (n == len)
> - res = ERANGE;
> + ssize_t n = TEMP_FAILURE_RETRY (__read_nocancel (fd, buf, len));
> + if (n > 0)
> + {
> + if (buf[n - 1] == '\n')
> + buf[n - 1] = '\0';
> + else if (n == len)
> + res = ERANGE;
> + else
> + buf[n] = '\0';
> + }
> else
> - buf[n] = '\0';
> + res = errno;
> +
> + __close_nocancel_nostatus (fd);
> }
> + else
> + res = errno;
ENOENT for exited threads is inconsistent with ESRCH (or 0) in other
cases.
I wonder if we should cache the thread name in the TCB. That would be
nice for debugging coredumps, too.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 17/19] nptl: Use tidlock when accessing TID on pthread_sigqueue
2021-08-23 19:50 ` [PATCH v2 17/19] nptl: Use tidlock when accessing TID on pthread_sigqueue Adhemerval Zanella
@ 2021-08-26 14:43 ` Florian Weimer
2021-08-26 17:49 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 14:43 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> + /* Block all signal, since the lock is recursive and used on pthread_cancel
> + (which should be async-signal-safe). */
> + sigset_t oldmask;
> + __libc_signal_block_all (&oldmask);
> + lll_lock (pd->tidlock, LLL_PRIVATE);
> +
> + int res;
> + if (pd->tid == 0)
> + {
> + pid_t pid = getpid ();
Huh, that can't be right, should be pd->tid != 0.
Don't we have test coverage for this?
> + else
> + res = -ESRCH;
We can return 0 in this case, I think.
It's possibly that the same issue regarding synchronous signal delivery
for pthread_jill applies here.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 00/19] Fix various NPTL synchronization issues
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
` (18 preceding siblings ...)
2021-08-23 19:50 ` [PATCH v2 19/19] nptl: Remove INVALID_TD_P Adhemerval Zanella
@ 2021-08-26 14:47 ` Florian Weimer
2021-08-26 18:19 ` Adhemerval Zanella
19 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 14:47 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> This is an update of my previous set to fix some NPTL issues [1].
> The main changes are:
>
> - Rebased against master and adjusted the __clone_internal usage.
> - Adapted Florian's ESRCH fixes [2]
> - Add fixes for various function that access the 'tid'.
>
> Patch 01 to 03 are general nptl fixes and they are independent of the
> other fixes.
>
> Patch 04 is the main change of this patchset, it uses a different
> field instead of the pthread 'tid' to synchrnonize the internal
> thread state (BZ#19951).
>
> It allows to both move the thread setxid internal state out of
> 'cancelhandling' (used on setuid() call in multi-thread information),
> and remove the EXITING_BIT and TERMINATED_BIT (since 'joinstate' now
> track such it). This is done on patch 05 and 06.
>
> Patches 08 and 09 fixes two long standing issues regarding
> pthread_kill() and thread exit (BZ#12889 and BZ#19193). Now that]
> 'tid' is setting explicitly by pthread_create(), a simple lock can be
> used instead of more complex futex operation.
>
> Patches 10 to 18 extend the same 'tid' access fix to other pthread
> functions that uses the member.
I don't think this series of patches is suitable for backport to glibc
2.34 once completed. The libpthreaddb changes look particularly
cumbersome because you'll need two versions of the library depending
which coredumps you are investigating. However, I expect that we need
to fix the pthread_cancel race in glibc 2.34.
I can send my previous attempt with a straightforward lock (and perhaps
with the callback-based function removed).
However, I'd like to know what people think about relying on signal
unblocking delivering the signal that was sent to the thread itself.
Do we need to special-case the pthread_self case or not?
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 04/19] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
2021-08-26 10:41 ` Florian Weimer
@ 2021-08-26 14:58 ` Adhemerval Zanella
2021-08-26 15:06 ` Florian Weimer
0 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 14:58 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 07:41, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> The use after free described in BZ#19951 is due the use of two different
>> PD fields, 'joinid' and 'cancelhandling', to describe the thread state
>> to synchronize the calls of pthread_join(), pthread_detach(),
>> pthread_exit(), and normal thread exit.
>>
>> And any state change potentially requires to check for both fields
>> atomically to handle partial state (such as pthread_join() with a
>> cancellation handler to issue a 'joinstate' field rollback).
>>
>> This patch uses a different PD member with 4 possible states (JOINABLE,
>> DETACHED, EXITING, and EXITED) instead of pthread 'tid' field:
>>
>> 1. On pthread_create() the inital state is set either to JOINABLE or
>> DETACHED depending of the pthread attribute used.
>>
>> 2. On pthread_detach(), a CAS is issued on the state. If the CAS
>> fails it means that thread is already detached (DETACHED) or is
>> being terminated (EXITING). For former an EINVAL is returned,
>> while for latter pthread_detach() should be reponsible to join the
>> thread (and deallocate any internal resources).
>>
>> 3. On pthread_create() exit phase (reached either if the thread
>
> Suggest: “In the exit phase of the wrapper function for the thread start
> routine (reached …”
Ack.
>
>> function has returned, pthread_exit() has being called, or
>> cancellation handled has been acted upon) we issue a CAS on state
>> to set to EXITING mode. If the thread is previously on DETACHED
>> mode it will be the thread itself responsible to deallocate any
>
> Suggest: “the thread itself is responsible for arranging the
> deallocation of any resource”
>
> (detached threads cannot immediately deallocate themselves)
Ack.
>
>> resource, otherwise the threads needs to be joined.
>
> “the thread[] needs to be joined”
Ack.
>
>
>> 4. The clear_tid_field on 'clone' call is changed to set the new
>> 'state' field on thread exit (EXITED). This state ins only
>> reached at thread termination.
>>
>> 4. The pthread_join() implementation is now simpler: the futex wait
>> is done directly on thread state and there is no need to reset it
>> in case of timeout (since the state is now set either by
>> pthread_detach() or by the kernel on process termination).
>
> Duplicate 4.
Ack.
>
>> The race condition on pthread_detach is avoided with only one atomic
>> operation on PD state: once the mode is set to THREAD_STATE_DETACHED
>> it is up to thread itself to deallocate its memory (done on the exit
>> phase at pthread_create()).
>
> See above regarding thread self-deallocation.
>
> The design as described above looks sound to me, those are just nits.
Right, should I change this paragraph as well (it is not clear the
suggestion here).
>
>> This change trigger an invalid C11 thread tests: it crates a thread,
>> which detaches itself, and after a timeout the creating thread checks
>> if the join fails. The issue is once thrd_join() is called the thread
>> lifetime has already expired. The test is changed so the sleep is done
>> by the thread itself, so the creating thread will try to join a valid
>> thread.
>
> This could be a separate commit.
Ack.
>
>> diff --git a/nptl/nptl-stack.h b/nptl/nptl-stack.h
>> index 8bcfde7ec5..c6b2b3078f 100644
>> --- a/nptl/nptl-stack.h
>> +++ b/nptl/nptl-stack.h
>> @@ -32,7 +32,7 @@ extern size_t __nptl_stack_cache_maxsize attribute_hidden;
>> static inline bool
>> __nptl_stack_in_use (struct pthread *pd)
>> {
>> - return pd->tid <= 0;
>> + return atomic_load_relaxed (&pd->joinstate) == THREAD_STATE_EXITED;
>> }
>
> Maybe this should be an acquire load, to synchronize with the thread
> exit as communicated by the kernel?
It seems ok, I also added the comment:
/* It synchronize with the thread exit as communicated by the kernel with
set_tid_address set at clone(). */
>
>> diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
>> index 38f637c5cf..67e00ef007 100644
>> --- a/nptl/pthread_cancel.c
>> +++ b/nptl/pthread_cancel.c
>> @@ -63,7 +63,8 @@ __pthread_cancel (pthread_t th)
>> volatile struct pthread *pd = (volatile struct pthread *) th;
>>
>> /* Make sure the descriptor is valid. */
>> - if (INVALID_TD_P (pd))
>> + int state = atomic_load_acquire (&pd->joinstate);
>> + if (state == THREAD_STATE_EXITED || state == THREAD_STATE_EXITING)
>> /* Not a valid thread handle. */
>> return ESRCH;
>
> This is fixed in patch 08/19, so it's okay.
>
>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
>> index 08e5189ad6..763e32bc3e 100644
>> --- a/nptl/pthread_create.c
>> +++ b/nptl/pthread_create.c
>> @@ -286,7 +286,7 @@ static int create_thread (struct pthread *pd,
>> const struct
>> @@ -351,13 +351,16 @@ start_thread (void *arg)
>> and free any resource prior return to the pthread_create caller. */
>> setup_failed = pd->setup_failed == 1;
>> if (setup_failed)
>> - pd->joinid = NULL;
>> + pd->joinstate = THREAD_STATE_JOINABLE;
>>
>> /* And give it up right away. */
>> lll_unlock (pd->lock, LLL_PRIVATE);
>>
>> if (setup_failed)
>> - goto out;
>> + {
>> + pd->tid = 0;
>> + goto out;
>> + }
>> }
>
> What's the advantage of setting pd->tid here and below in start_thread?
We don't really need to clear the tid on setup_failed case in fact, since
in this case no pthread_t will be returned to caller. I remove it.
>
> It destroys information that could be useful for debugging purposes,
> answering the question what the TID was before the thread exited.
>
>> @@ -481,9 +484,20 @@ start_thread (void *arg)
>> the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE. */
>> atomic_bit_set (&pd->cancelhandling, EXITING_BIT);
>>
>> - if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
>> - /* This was the last thread. */
>> - exit (0);
>> +
>> + /* CONCURRENCY NOTES:
>> +
>> + Concurrent pthread_detach() will either set state to
>> + THREAD_STATE_DETACHED or wait the thread to terminate. The exiting state
>
> “wait [for] the thread to terminate”
Ack.
>
>> + set here is set so a pthread_join() wait until all the required cleanup
>> + steps are done.
>> +
>> + The 'joinstate' field will be used to determine who is responsible to
>> + call __nptl_free_tcb below. */
>> +
>> + unsigned int joinstate = THREAD_STATE_JOINABLE;
>> + atomic_compare_exchange_weak_acquire (&pd->joinstate, &joinstate,
>> + THREAD_STATE_EXITING);
>
> I think you need a strong CAS here. We don't have, so you'll have to
> add a loop.
Yeah, it seems right. I changed to:
unsigned int prevstate;
while (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &prevstate,
THREAD_STATE_EXITING))
prevstate = atomic_load_relaxed (&pd->joinstate);
>
>> diff --git a/nptl/pthread_detach.c b/nptl/pthread_detach.c
>> index ac50db9b0e..97d292cab8 100644
>> --- a/nptl/pthread_detach.c
>> +++ b/nptl/pthread_detach.c
>> @@ -26,32 +26,24 @@ ___pthread_detach (pthread_t th)
>
>> + For the case the thread is being terminated (THREAD_STATE_EXITING),
>> + pthread_detach() will responsible to clean up the stack. */
>> +
>> + int curstate = THREAD_STATE_JOINABLE;
>> + if (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &curstate,
>> + THREAD_STATE_DETACHED))
>
> This should be a strong CAS, so this needs a loop, I think.
I changed to:
unsigned int prevstate = atomic_load_relaxed (&pd->joinstate);
do
{
if (prevstate != THREAD_STATE_JOINABLE)
return EINVAL;
return __pthread_join (th, 0);
}
while (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &prevstate,
THREAD_STATE_DETACHED));
>
>> diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
>> index 7303069316..428ed35101 100644
>> --- a/nptl/pthread_join_common.c
>> +++ b/nptl/pthread_join_common.c
>> @@ -22,113 +22,73 @@
>> #include <time.h>
>> #include <futex-internal.h>
>>
>> -static void
>> -cleanup (void *arg)
>> +/* Check for a possible deadlock situation where the threads are waiting for
>> + each other to finish. Note that this is a "may" error. To be 100% sure we
>> + catch this error we would have to lock the data structures but it is not
>> + necessary. In the unlikely case that two threads are really caught in this
>> + situation they will deadlock. It is the programmer's problem to figure
>> + this out. */
>> +static inline bool
>> +check_for_deadlock (int state, struct pthread *pd)
>> {
>> - /* If we already changed the waiter ID, reset it. The call cannot
>> - fail for any reason but the thread not having done that yet so
>> - there is no reason for a loop. */
>> struct pthread *self = THREAD_SELF;
>> - atomic_compare_exchange_weak_acquire (&arg, &self, NULL);
>> + return ((pd == self
>> + || (atomic_load_acquire (&self->joinstate) == THREAD_STATE_DETACHED
>> + && (pd->cancelhandling
>> + & (CANCELED_BITMASK | EXITING_BITMASK
>> + | TERMINATED_BITMASK)) == 0))
>> + && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
>> + && (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
>> + | TERMINATED_BITMASK))
>> + == CANCELED_BITMASK));
>> }
>
> The state argument to check_for_deadlock appears to be unused.
I removed it.
>
>> + /* pthread_join() is a cancellation entrypoint and we use the same
>> + rationale for pthread_timedjoin_np().
>> +
>> + The kernel notifies a process which uses CLONE_CHILD_CLEARTID via
>> + a memory zerouing and futex wake up when the process terminates.
>
> “memory zero[]ing and futex wake[-]up”
Ack.
>
>> #if __TIMESIZE == 64
>> diff --git a/nptl/pthread_tryjoin.c b/nptl/pthread_tryjoin.c
>> index fd938e8780..726f2abc68 100644
>> --- a/nptl/pthread_tryjoin.c
>> +++ b/nptl/pthread_tryjoin.c
>> @@ -22,15 +22,17 @@
>> int
>> __pthread_tryjoin_np (pthread_t threadid, void **thread_return)
>> {
>> - /* Return right away if the thread hasn't terminated yet. */
>> - struct pthread *pd = (struct pthread *) threadid;
>> - if (pd->tid != 0)
>> - return EBUSY;
>> + /* The joinable state (THREAD_STATE_JOINABLE) is straigthforward since the
>> + thread hasn't finished yet and trying to join might block.
>> + Both detached (THREAD_STATE_DETACHED) and exiting (THREAD_STATE_EXITING)
>> + might also result in a possible blocking call: a detached thread might
>> + change its state to exiting and a exiting thread my take some time to
>> + exit (and thus let the kernel set the state to THREAD_STATE_EXITED). */
>
> pthread_tryjoin_np on a thread which is THREAD_STATE_DETACHED is
> invalid, so that case doesn't matter, I think.
I changed the comment to:
/* The joinable state (THREAD_STATE_JOINABLE) is straigthforward since the
thread hasn't finished yet and trying to join might block.
The exiting thread (THREAD_STATE_EXITING) also mgith result in ablocking
call: a detached thread might change its state to exiting and a exiting
thread my take some time to exit (and thus let the kernel set the state
to THREAD_STATE_EXITED). */
>
>> + struct pthread *pd = (struct pthread *) threadid;
>> + return atomic_load_acquire (&pd->joinstate) != THREAD_STATE_EXITED
>> + ? EBUSY
>> + : __pthread_clockjoin_ex (threadid, thread_return, 0, NULL);
>> }
>
> But the code is correct, I believe.
>
>> diff --git a/sysdeps/pthread/tst-thrd-detach.c b/sysdeps/pthread/tst-thrd-detach.c
>> index c844767748..e1906a0e10 100644
>> --- a/sysdeps/pthread/tst-thrd-detach.c
>> +++ b/sysdeps/pthread/tst-thrd-detach.c
>
>> - if (thrd_join (id, NULL) == thrd_success)
>> - FAIL_EXIT1 ("thrd_join succeed where it should fail");
>> + TEST_COMPARE (thrd_join (id, NULL), thrd_error);
>
> This is still a user-after-free bug, right?
Indeed, I think it would be better to just remove this test.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 13/19] nptl: Use tidlock when accessing TID on pthread_getschedparam
2021-08-23 19:50 ` [PATCH v2 13/19] nptl: Use tidlock when accessing TID on pthread_getschedparam Adhemerval Zanella
@ 2021-08-26 15:00 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 15:00 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> Checked on x86_64-linux-gnu.
> ---
> nptl/pthread_getschedparam.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/nptl/pthread_getschedparam.c b/nptl/pthread_getschedparam.c
> index 94316cf897..c41d2a8341 100644
> --- a/nptl/pthread_getschedparam.c
> +++ b/nptl/pthread_getschedparam.c
> @@ -29,12 +29,18 @@ __pthread_getschedparam (pthread_t threadid, int *policy,
> struct pthread *pd = (struct pthread *) threadid;
>
> /* Make sure the descriptor is valid. */
> - if (INVALID_TD_P (pd))
> - /* Not a valid thread handle. */
> - return ESRCH;
> + sigset_t oldmask;
> + __libc_signal_block_all (&oldmask);
> + lll_lock (pd->tidlock, LLL_PRIVATE);
>
> int result = 0;
>
> + if (pd->tid == 0)
> + {
> + result = ESRCH;
> + goto out;
> + }
We can avoid returning ESRCH by storing the parameters unconditionally
in the TCB. That should also simplify the implementation, potentially
at the cost of a few extra system calls.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 04/19] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
2021-08-26 14:58 ` Adhemerval Zanella
@ 2021-08-26 15:06 ` Florian Weimer
2021-08-26 16:16 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 15:06 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
>>> The race condition on pthread_detach is avoided with only one atomic
>>> operation on PD state: once the mode is set to THREAD_STATE_DETACHED
>>> it is up to thread itself to deallocate its memory (done on the exit
>>> phase at pthread_create()).
>>
>> See above regarding thread self-deallocation.
>>
>> The design as described above looks sound to me, those are just nits.
>
> Right, should I change this paragraph as well (it is not clear the
> suggestion here).
Maybe “up to [the] thread itself to [trigger deallocation of] its memory”?
>>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
>>> index 08e5189ad6..763e32bc3e 100644
>>> --- a/nptl/pthread_create.c
>>> +++ b/nptl/pthread_create.c
>>> @@ -286,7 +286,7 @@ static int create_thread (struct pthread *pd,
>>> const struct
>>> @@ -351,13 +351,16 @@ start_thread (void *arg)
>>> and free any resource prior return to the pthread_create caller. */
>>> setup_failed = pd->setup_failed == 1;
>>> if (setup_failed)
>>> - pd->joinid = NULL;
>>> + pd->joinstate = THREAD_STATE_JOINABLE;
>>>
>>> /* And give it up right away. */
>>> lll_unlock (pd->lock, LLL_PRIVATE);
>>>
>>> if (setup_failed)
>>> - goto out;
>>> + {
>>> + pd->tid = 0;
>>> + goto out;
>>> + }
>>> }
>>
>> What's the advantage of setting pd->tid here and below in start_thread?
>
> We don't really need to clear the tid on setup_failed case in fact, since
> in this case no pthread_t will be returned to caller. I remove it.
What about the change in start_thread?
The subsequent changes look at the tid member, but they could equally
well look at joinstate, I think.
>> I think you need a strong CAS here. We don't have, so you'll have to
>> add a loop.
>
> Yeah, it seems right. I changed to:
>
> unsigned int prevstate;
> while (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &prevstate,
> THREAD_STATE_EXITING))
> prevstate = atomic_load_relaxed (&pd->joinstate);
Isn't prevstate uninitialized? Why no do-while loop?
>> pthread_tryjoin_np on a thread which is THREAD_STATE_DETACHED is
>> invalid, so that case doesn't matter, I think.
>
> I changed the comment to:
>
> /* The joinable state (THREAD_STATE_JOINABLE) is straigthforward since the
> thread hasn't finished yet and trying to join might block.
> The exiting thread (THREAD_STATE_EXITING) also mgith result in ablocking
> call: a detached thread might change its state to exiting and a exiting
> thread my take some time to exit (and thus let the kernel set the state
> to THREAD_STATE_EXITED). */
Typo: mgith
Rest looks okay to me.
>>> diff --git a/sysdeps/pthread/tst-thrd-detach.c b/sysdeps/pthread/tst-thrd-detach.c
>>> index c844767748..e1906a0e10 100644
>>> --- a/sysdeps/pthread/tst-thrd-detach.c
>>> +++ b/sysdeps/pthread/tst-thrd-detach.c
>>
>>> - if (thrd_join (id, NULL) == thrd_success)
>>> - FAIL_EXIT1 ("thrd_join succeed where it should fail");
>>> + TEST_COMPARE (thrd_join (id, NULL), thrd_error);
>>
>> This is still a user-after-free bug, right?
>
> Indeed, I think it would be better to just remove this test.
Agreed.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 05/19] nptl: Move setxid flag out of cancelhandling
2021-08-26 11:34 ` Florian Weimer
@ 2021-08-26 15:11 ` Adhemerval Zanella
2021-08-26 15:21 ` Florian Weimer
0 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 15:11 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 08:34, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> Now that the thread state is tracked by 'joinstate' field, there is no
>> need to keep the setxid flag within the 'cancelhandling'. It simplifies
>> the atomic code to set and reset it (since there is no need to handle
>> the thread state concurrent update).
>
> Some of this functionality reimplements the tidlock.
>
> There is some functionality around thread creation, but I do not know if
> the synchronization is entirely correct. This makes it rather difficult
> to review changes.
>
> It's also not clear to me if we need to do anything to prevent early
> setxid calls:
>
>> /* Don't allow setxid until cloned. */
>
> I suspect this should just look at the TID instead. We create the
> thread with all signals blocked, so once there is a TID (and the thread
> has not exited), it is safe to send the signal.
I agree that we might improve the setxid code, however what I intended
with this changes is to keep its semantic as-is and just decouple it
from cancelhandling. I might revise it on a later patch to follow your
suggestions.
>
>> diff --git a/nptl/descr.h b/nptl/descr.h
>> index 4b2db6edab..563b152bff 100644
>> --- a/nptl/descr.h
>> +++ b/nptl/descr.h
>
>> + /* Indicate whether thread is supposed to change XID. */
>> + int setxid_flag;
>
> This should document the possible values.
I changed to:
/* Indicate whether thread is supposed to change XID, it acts a boolean
variable (0 means no operation pending). */
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 05/19] nptl: Move setxid flag out of cancelhandling
2021-08-26 15:11 ` Adhemerval Zanella
@ 2021-08-26 15:21 ` Florian Weimer
2021-08-26 16:39 ` Adhemerval Zanella
0 siblings, 1 reply; 68+ messages in thread
From: Florian Weimer @ 2021-08-26 15:21 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> On 26/08/2021 08:34, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>> Now that the thread state is tracked by 'joinstate' field, there is no
>>> need to keep the setxid flag within the 'cancelhandling'. It simplifies
>>> the atomic code to set and reset it (since there is no need to handle
>>> the thread state concurrent update).
>>
>> Some of this functionality reimplements the tidlock.
>>
>> There is some functionality around thread creation, but I do not know if
>> the synchronization is entirely correct. This makes it rather difficult
>> to review changes.
>>
>> It's also not clear to me if we need to do anything to prevent early
>> setxid calls:
>>
>>> /* Don't allow setxid until cloned. */
>>
>> I suspect this should just look at the TID instead. We create the
>> thread with all signals blocked, so once there is a TID (and the thread
>> has not exited), it is safe to send the signal.
>
> I agree that we might improve the setxid code, however what I intended
> with this changes is to keep its semantic as-is and just decouple it
> from cancelhandling. I might revise it on a later patch to follow your
> suggestions.
I wonder if there is a new race here:
@@ -109,41 +100,23 @@ setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
/* Don't let the thread exit before the setxid handler runs. */
t->setxid_futex = 0;
+ /* If thread is exiting right now, ignore it. */
+ if (atomic_load_acquire (&t->joinstate) == THREAD_STATE_EXITING)
+ return;
+ /* Release the futex if there is no other setxid in progress. */
+ if (atomic_exchange_acquire (&t->setxid_flag, 1) == 0)
+ {
+ t->setxid_futex = 1;
+ futex_wake (&t->setxid_futex, 1, FUTEX_PRIVATE);
}
That is, if the setxid-sending thread can now wait for a futex wake that
never comes.
There's also a use of atomic_exchange_release whose result is not used:
+ atomic_exchange_release (&t->setxid_flag, 0);
That should probably be store.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 04/19] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
2021-08-26 15:06 ` Florian Weimer
@ 2021-08-26 16:16 ` Adhemerval Zanella
2021-08-30 10:42 ` Florian Weimer
0 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 16:16 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 12:06, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>>>> The race condition on pthread_detach is avoided with only one atomic
>>>> operation on PD state: once the mode is set to THREAD_STATE_DETACHED
>>>> it is up to thread itself to deallocate its memory (done on the exit
>>>> phase at pthread_create()).
>>>
>>> See above regarding thread self-deallocation.
>>>
>>> The design as described above looks sound to me, those are just nits.
>>
>> Right, should I change this paragraph as well (it is not clear the
>> suggestion here).
>
> Maybe “up to [the] thread itself to [trigger deallocation of] its memory”?
Ack.
>
>>>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
>>>> index 08e5189ad6..763e32bc3e 100644
>>>> --- a/nptl/pthread_create.c
>>>> +++ b/nptl/pthread_create.c
>>>> @@ -286,7 +286,7 @@ static int create_thread (struct pthread *pd,
>>>> const struct
>>>> @@ -351,13 +351,16 @@ start_thread (void *arg)
>>>> and free any resource prior return to the pthread_create caller. */
>>>> setup_failed = pd->setup_failed == 1;
>>>> if (setup_failed)
>>>> - pd->joinid = NULL;
>>>> + pd->joinstate = THREAD_STATE_JOINABLE;
>>>>
>>>> /* And give it up right away. */
>>>> lll_unlock (pd->lock, LLL_PRIVATE);
>>>>
>>>> if (setup_failed)
>>>> - goto out;
>>>> + {
>>>> + pd->tid = 0;
>>>> + goto out;
>>>> + }
>>>> }
>>>
>>> What's the advantage of setting pd->tid here and below in start_thread?
>>
>> We don't really need to clear the tid on setup_failed case in fact, since
>> in this case no pthread_t will be returned to caller. I remove it.
>
> What about the change in start_thread?
>
> The subsequent changes look at the tid member, but they could equally
> well look at joinstate, I think.
My understanding it we still need to clear the tid member to avoid the
unintentional usage, since kernel will clear it anymore. For instance
on pthread_kill() (or any other usage), there is still an windows
where the joinstate test and the tid read might result in an invalid
value (either a tid reuse or an invalid value).
>
>
>>> I think you need a strong CAS here. We don't have, so you'll have to
>>> add a loop.
>>
>> Yeah, it seems right. I changed to:
>>
>> unsigned int prevstate;
>> while (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &prevstate,
>> THREAD_STATE_EXITING))
>> prevstate = atomic_load_relaxed (&pd->joinstate);
>
> Isn't prevstate uninitialized? Why no do-while loop?
It is, I have changed to:
unsigned int prevstate;
do
prevstate = atomic_load_relaxed (&pd->joinstate);
while (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &prevstate,
THREAD_STATE_EXITING));
>
>>> pthread_tryjoin_np on a thread which is THREAD_STATE_DETACHED is
>>> invalid, so that case doesn't matter, I think.
>>
>> I changed the comment to:
>>
>> /* The joinable state (THREAD_STATE_JOINABLE) is straigthforward since the
>> thread hasn't finished yet and trying to join might block.
>> The exiting thread (THREAD_STATE_EXITING) also mgith result in ablocking
>> call: a detached thread might change its state to exiting and a exiting
>> thread my take some time to exit (and thus let the kernel set the state
>> to THREAD_STATE_EXITED). */
>
> Typo: mgith
Ack.
>
> Rest looks okay to me.
>
>>>> diff --git a/sysdeps/pthread/tst-thrd-detach.c b/sysdeps/pthread/tst-thrd-detach.c
>>>> index c844767748..e1906a0e10 100644
>>>> --- a/sysdeps/pthread/tst-thrd-detach.c
>>>> +++ b/sysdeps/pthread/tst-thrd-detach.c
>>>
>>>> - if (thrd_join (id, NULL) == thrd_success)
>>>> - FAIL_EXIT1 ("thrd_join succeed where it should fail");
>>>> + TEST_COMPARE (thrd_join (id, NULL), thrd_error);
>>>
>>> This is still a user-after-free bug, right?
>>
>> Indeed, I think it would be better to just remove this test.
>
> Agreed.
>
> Thanks,
> Florian
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 05/19] nptl: Move setxid flag out of cancelhandling
2021-08-26 15:21 ` Florian Weimer
@ 2021-08-26 16:39 ` Adhemerval Zanella
2021-08-30 11:27 ` Florian Weimer
0 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 16:39 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 12:21, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> On 26/08/2021 08:34, Florian Weimer wrote:
>>> * Adhemerval Zanella:
>>>
>>>> Now that the thread state is tracked by 'joinstate' field, there is no
>>>> need to keep the setxid flag within the 'cancelhandling'. It simplifies
>>>> the atomic code to set and reset it (since there is no need to handle
>>>> the thread state concurrent update).
>>>
>>> Some of this functionality reimplements the tidlock.
>>>
>>> There is some functionality around thread creation, but I do not know if
>>> the synchronization is entirely correct. This makes it rather difficult
>>> to review changes.
>>>
>>> It's also not clear to me if we need to do anything to prevent early
>>> setxid calls:
>>>
>>>> /* Don't allow setxid until cloned. */
>>>
>>> I suspect this should just look at the TID instead. We create the
>>> thread with all signals blocked, so once there is a TID (and the thread
>>> has not exited), it is safe to send the signal.
>>
>> I agree that we might improve the setxid code, however what I intended
>> with this changes is to keep its semantic as-is and just decouple it
>> from cancelhandling. I might revise it on a later patch to follow your
>> suggestions.
>
> I wonder if there is a new race here:
>
> @@ -109,41 +100,23 @@ setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
> /* Don't let the thread exit before the setxid handler runs. */
> t->setxid_futex = 0;
>
> + /* If thread is exiting right now, ignore it. */
> + if (atomic_load_acquire (&t->joinstate) == THREAD_STATE_EXITING)
> + return;
>
> + /* Release the futex if there is no other setxid in progress. */
> + if (atomic_exchange_acquire (&t->setxid_flag, 1) == 0)
> + {
> + t->setxid_futex = 1;
> + futex_wake (&t->setxid_futex, 1, FUTEX_PRIVATE);
> }
>
> That is, if the setxid-sending thread can now wait for a futex wake that
> never comes.
Wouldn't be handled by setxid_unmark_thread, which wakes the futex
unconditionally?
>
> There's also a use of atomic_exchange_release whose result is not used:
>
> + atomic_exchange_release (&t->setxid_flag, 0);
>
> That should probably be store.
Ack.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 06/19] nptl: Replace struct thread cancelhandling field
2021-08-26 14:34 ` Florian Weimer
@ 2021-08-26 16:48 ` Adhemerval Zanella
2021-08-30 10:36 ` Florian Weimer
0 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 16:48 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 11:34, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> Now that both thread state and setxid is being tracked by two different
>> fields ('joinstate' and 'setxid_flag'), there is no need to keep track
>> of exiting (EXITING_BIT) and terminated (TERMINATED_BIT) state in
>> 'cancelhandling'.
>>
>> The 'cancelhandling' field is renamed to 'cancelstatus' and it only
>> signals whether the thread has been cancelled (set by pthread_cancel()).
>> It allows simplify the atomic operation required to avoid CAS
>> operations.
>
> “cancelstatus” is a bad name because “cancel state” is POSIX terminology
> with different meaning. Please choose a different name.
>
> I think pending_cancel or cancel_requested would work.
The cancel_requested sounds ok.
>
>> There is no need to check the thread state on
>> __pthread_enable_asynccancel() nor on pthread_testcancel () anymore,
>> since cancellation is explicit disabled when thread start the exit code
>> on __do_cancel().
>
> “explict[ly]”
Ack.
>
>>
>> On __nptl_free_tcp(), the 'joinstate' now defines whether it is the
>
> “__nptl_free_tcb”.
Ack.
>
>> creating thread or the created thread that calls it. So there is
>> no concurrent call within the function and thus no need to set the
>> TERMINATED_BIT.
>>
>> For SIGCANCEL handler, sigcancel_handler(), 'joinstate' is used instead
>> (pthread_cancel() might still be called concurrenty in assynchronous
>> mode).
>
> “as[]ynchronous”
>
Ack.
>> diff --git a/nptl/nptl_free_tcb.c b/nptl/nptl_free_tcb.c
>> index cbf3580f59..15e1a18562 100644
>> --- a/nptl/nptl_free_tcb.c
>> +++ b/nptl/nptl_free_tcb.c
>> @@ -24,22 +24,12 @@
>> void
>> __nptl_free_tcb (struct pthread *pd)
>> {
>> - /* The thread is exiting now. */
>> - if (atomic_bit_test_set (&pd->cancelhandling, TERMINATED_BIT) == 0)
>> - {
>> - /* Free TPP data. */
>> - if (pd->tpp != NULL)
>> - {
>> - struct priority_protection_data *tpp = pd->tpp;
>> + free (pd->tpp);
>> + pd->tpp = NULL;
>
> For detached threads, this is called on the thread after signals have
> been blocked. I don't think this is entirely correct. And things
> become rather interesting if malloc needs pd->tpp.
Why do you think this is not correct? This is accessed through
mutex code, so I don't think this would matter on the detached
exit phase.
And what do you mean by malloc needing?
>
> It's not clear to me whether this could happen before.
>
>> + /* Queue the stack memory block for reuse and exit the process. The kernel
>> + will signal via writing to the address returned by QUEUE-STACK when the
>> + stack is available. */
>> + __nptl_deallocate_stack (pd);
>> }
>
> That old comment seems to be out-of-date.
It is still correct I think, __nptl_stack_in_use() now checks the joinstate
which is still advertise by the kernel.
>
>> libc_hidden_def (__nptl_free_tcb)
>> diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
>> index 67e00ef007..aed6c1ea47 100644
>> --- a/nptl/pthread_cancel.c
>> +++ b/nptl/pthread_cancel.c
>
>> @@ -93,8 +90,9 @@ __pthread_cancel (pthread_t th)
>> }
>> #endif
>>
>> - int oldch = atomic_fetch_or_acquire (&pd->cancelhandling, CANCELED_BITMASK);
>> - if ((oldch & CANCELED_BITMASK) != 0)
>> + /* If already cancelled just return (cancellation will be acted upon in next
>> + cancellation entrypoint). */
>> + if (atomic_exchange_acquire (&pd->cancelstatus, 1) == 1)
>> return 0;
>
> This should probably use relaxed MO.
Ack.
>
> The libthreaddb changes look okay to me.
>
> Thanks,
> Florian
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 07/19] support: Add support_wait_for_thread_exit
2021-08-26 9:31 ` Florian Weimer
@ 2021-08-26 16:49 ` Adhemerval Zanella
2021-08-30 11:46 ` Florian Weimer
0 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 16:49 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 06:31, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> From: Florian Weimer <fweimer@redhat.com>
>>
>> Wait until all threads except the current thread has exited.
>
> Should I commit this separately?
Yes, I think it will be better.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 08/19] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193)
2021-08-26 10:03 ` Florian Weimer
@ 2021-08-26 16:49 ` Adhemerval Zanella
0 siblings, 0 replies; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 16:49 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 07:03, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> From: Florian Weimer <fweimer@redhat.com>
>>
>> This closes one remaining race condition related to bug 12889: if
>> the thread already exited on the kernel side, returning ESRCH
>> is not correct because that error is reserved for the thread IDs
>> (pthread_t values) whose lifetime has ended. In case of a
>> kernel-side exit and a valid thread ID, no signal needs to be sent
>> and cancellation does not have an effect, so just return 0.
>>
>> sysdeps/pthread/tst-kill4.c triggers undefined behavior and is
>> removed with this commit.
>
> Wrong commit subject: “should [not] fail”
Ack.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 09/19] nptl: Fix race between pthread_kill and thread exit (bug 12889)
2021-08-26 14:23 ` Florian Weimer
@ 2021-08-26 17:06 ` Adhemerval Zanella
2021-08-30 9:25 ` Florian Weimer
0 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 17:06 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 11:23, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c
>> index 5d4c86f920..63fe8c44ca 100644
>> --- a/nptl/pthread_kill.c
>> +++ b/nptl/pthread_kill.c
>> @@ -26,15 +26,18 @@ __pthread_kill_internal (pthread_t threadid, int signo)
>> pid_t tid;
>> struct pthread *pd = (struct pthread *) threadid;
>>
>> + /* Block all signal, since the lock is recursive and used on pthread_cancel
>> + (which should be async-signal-safe). */
>> + sigset_t oldmask;
>> + __libc_signal_block_all (&oldmask);
>> + lll_lock (pd->tidlock, LLL_PRIVATE);
>> +
>> if (pd == THREAD_SELF)
>> /* It is a special case to handle raise() implementation after a vfork
>> call (which does not update the PD tid field). */
>> tid = INLINE_SYSCALL_CALL (gettid);
>> else
>> - /* Force load of pd->tid into local variable or register. Otherwise
>> - if a thread exits between ESRCH test and tgkill, we might return
>> - EINVAL, because pd->tid would be cleared by the kernel. */
>> - tid = atomic_forced_read (pd->tid);
>> + tid = pd->tid;
>>
>> int val;
>> if (__glibc_likely (tid > 0))
>> @@ -53,6 +56,9 @@ __pthread_kill_internal (pthread_t threadid, int signo)
>> as an error. */
>> val = 0;
>>
>> + lll_unlock (pd->tidlock, LLL_PRIVATE);
>> + __libc_signal_restore_set (&oldmask);
>> +
>> return val;
>> }
>
> This needs a comment explaining that *all* pending signals are delivered
> at the point of the __libc_signal_restore_set call. I hope that this is
> actually what happens in Linux; POSIX only guarantees that for one
> pending signal that is unblocked.
My understanding from kernel/signal.c all pending signals are delivered
with the signal mask restore, but only real-time one are queued (and
subjected to RLIMIT_SIGPENDING, which causes another issue [1]).
>
> The problem here is that pthread_kill (pthread_self (), …) must generate
> synchronous signal, and due to the signal-blocking, it is not
> immediately obvious if that's the case.
>
> If we aren't sure that all signals are flushed, we need to check for the
> send-signal-to-self case and avoid blocking the signal there. We don't
> need the tidlock for that because the running thread won't go away.
>
> Thanks,
> Florian
>
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=21108
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 10/19] nptl: Use tidlock when accessing TID on pthread_getaffinity_np
2021-08-26 14:24 ` Florian Weimer
@ 2021-08-26 17:29 ` Adhemerval Zanella
2021-08-30 9:30 ` Florian Weimer
0 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 17:29 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 11:24, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> Checked on x86_64-linux-gnu.
>> ---
>> nptl/pthread_getaffinity.c | 22 +++++++++++++++++-----
>> 1 file changed, 17 insertions(+), 5 deletions(-)
>>
>> diff --git a/nptl/pthread_getaffinity.c b/nptl/pthread_getaffinity.c
>> index 18261ddae0..5268d86e6e 100644
>> --- a/nptl/pthread_getaffinity.c
>> +++ b/nptl/pthread_getaffinity.c
>> @@ -29,12 +29,24 @@
>> int
>> __pthread_getaffinity_np (pthread_t th, size_t cpusetsize, cpu_set_t *cpuset)
>> {
>> - const struct pthread *pd = (const struct pthread *) th;
>> + struct pthread *pd = (struct pthread *) th;
>>
>> - int res = INTERNAL_SYSCALL_CALL (sched_getaffinity, pd->tid,
>> - MIN (INT_MAX, cpusetsize), cpuset);
>> - if (INTERNAL_SYSCALL_ERROR_P (res))
>> - return INTERNAL_SYSCALL_ERRNO (res);
>> + /* Block all signal, since the lock is recursive and used on pthread_cancel
>> + (which should be async-signal-safe). */
>
> The pthread_cancel reference looks like a cut-and-paste-bug.
It is, but I think it is applicable. Maybe:
/* Block all signal, since the lock is not recursive and used on
async-signal-safe functions. */
>
>> + sigset_t oldmask;
>> + __libc_signal_block_all (&oldmask);
>> + lll_lock (pd->tidlock, LLL_PRIVATE);
>> +
>> + int res = pd->tid == 0
>> + ? -ESRCH
>> + : INTERNAL_SYSCALL_CALL (sched_getaffinity, pd->tid,
>> + MIN (INT_MAX, cpusetsize), cpuset);
>> +
>> + lll_unlock (pd->tidlock, LLL_PRIVATE);
>> + __libc_signal_restore_set (&oldmask);
>> +
>> + if (res < 0)
>> + return -res;
>
> ESRCH doesn't look like the right error code here. Should we return an
> affinity mask without any bits set?
Why not? Returning anything but an error does not improve things here
since the information won't make much sense. Also it follows what other
symbols is already doing (such as pthread_cancel()).
>
> Thanks,
> Florian
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 11/19] nptl: Use tidlock when accessing TID on pthread_setaffinity
2021-08-26 14:25 ` Florian Weimer
@ 2021-08-26 17:31 ` Adhemerval Zanella
0 siblings, 0 replies; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 17:31 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 11:25, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> Checked on x86_64-linux-gnu.
>> ---
>> nptl/pthread_setaffinity.c | 22 +++++++++++++++-------
>> 1 file changed, 15 insertions(+), 7 deletions(-)
>>
>> diff --git a/nptl/pthread_setaffinity.c b/nptl/pthread_setaffinity.c
>> index 3bfdc63e19..beb61836a6 100644
>> --- a/nptl/pthread_setaffinity.c
>> +++ b/nptl/pthread_setaffinity.c
>> @@ -27,15 +27,23 @@ int
>> __pthread_setaffinity_new (pthread_t th, size_t cpusetsize,
>> const cpu_set_t *cpuset)
>> {
>> - const struct pthread *pd = (const struct pthread *) th;
>> - int res;
>> + struct pthread *pd = (struct pthread *) th;
>>
>> - res = INTERNAL_SYSCALL_CALL (sched_setaffinity, pd->tid, cpusetsize,
>> - cpuset);
>> + /* Block all signal, since the lock is recursive and used on pthread_cancel
>> + (which should be async-signal-safe). */
>> + sigset_t oldmask;
>> + __libc_signal_block_all (&oldmask);
>> + lll_lock (pd->tidlock, LLL_PRIVATE);
>>
>> - return (INTERNAL_SYSCALL_ERROR_P (res)
>> - ? INTERNAL_SYSCALL_ERRNO (res)
>> - : 0);
>> + int res = pd->tid == 0
>> + ? ESRCH
>> + : INTERNAL_SYSCALL_CALL (sched_setaffinity, pd->tid, cpusetsize,
>> + cpuset);
>> +
>> + lll_unlock (pd->tidlock, LLL_PRIVATE);
>> + __libc_signal_restore_set (&oldmask);
>> +
>> + return res;
>
> Same issue regarding ESRCH and pthread_cancel.
Ack.
> Here we can just return > 0 in case the thread is gone, I think.
I still think ESRCH is the correct returned code here.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 12/19] nptl: Use tidlock when accessing TID on pthread_getcpuclockid
2021-08-26 14:27 ` Florian Weimer
@ 2021-08-26 17:41 ` Adhemerval Zanella
2021-08-30 9:34 ` Florian Weimer
0 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 17:41 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 11:27, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> Checked on x86_64-linux-gnu.
>> ---
>> nptl/pthread_getcpuclockid.c | 27 +++++++++++++++++----------
>> 1 file changed, 17 insertions(+), 10 deletions(-)
>>
>> diff --git a/nptl/pthread_getcpuclockid.c b/nptl/pthread_getcpuclockid.c
>> index 0a6656ea4c..8c0db0b9ba 100644
>> --- a/nptl/pthread_getcpuclockid.c
>> +++ b/nptl/pthread_getcpuclockid.c
>> @@ -29,16 +29,23 @@ __pthread_getcpuclockid (pthread_t threadid, clockid_t *clockid)
>> struct pthread *pd = (struct pthread *) threadid;
>>
>> /* Make sure the descriptor is valid. */
>> - if (INVALID_TD_P (pd))
>> - /* Not a valid thread handle. */
>> - return ESRCH;
>> -
>> - /* The clockid_t value is a simple computation from the TID. */
>> -
>> - const clockid_t tidclock = MAKE_THREAD_CPUCLOCK (pd->tid, CPUCLOCK_SCHED);
>> -
>> - *clockid = tidclock;
>> - return 0;
>> + sigset_t oldmask;
>> + __libc_signal_block_all (&oldmask);
>> + lll_lock (pd->tidlock, LLL_PRIVATE);
>> +
>> + int res;
>> + if (pd->tid != 0)
>> + {
>> + *clockid = MAKE_THREAD_CPUCLOCK (pd->tid, CPUCLOCK_SCHED);
>> + res = 0;
>> + }
>> + else
>> + res = -ESRCH;
>> +
>> + lll_unlock (pd->tidlock, LLL_PRIVATE);
>> + __libc_signal_restore_set (&oldmask);
>> +
>> + return res;
>
> This doesn't really solve the race, does it? The caller cannot use the
> clock ID safely.
Not really if the callee intention is to use along clock_gettime()
or any other routine. I added this patch more for consistency.
In fact, it makes me wonder if the requested improvement such as
a pthread pid accessor [1] does really make sense, since it is
inherent racy in a way that there is no guarantee that the tid
will be valid when it is actually used.
[1] https://sourceware.org/bugzilla/show_bug.cgi?id=27880
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 15/19] nptl: Use tidlock when accessing TID on pthread_getname_np
2021-08-26 14:38 ` Florian Weimer
@ 2021-08-26 17:45 ` Adhemerval Zanella
2021-08-30 9:37 ` Florian Weimer
0 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 17:45 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 11:38, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> + /* Block all signal, since the lock is recursive and used on pthread_cancel
>> + (which should be async-signal-safe). */
>> + sigset_t oldmask;
>> + __libc_signal_block_all (&oldmask);
>> + lll_lock (pd->tidlock, LLL_PRIVATE);
>>
>> - int fd = __open64_nocancel (fname, O_RDONLY);
>> - if (fd == -1)
>> - return errno;
>> + char fname[sizeof ("/proc/self/task//comm" ) + INT_BUFSIZE_BOUND (pid_t)];
>> + __snprintf (fname, sizeof (fname), "/proc/self/task/%d/comm", pd->tid);
>>
>> int res = 0;
>> - ssize_t n = TEMP_FAILURE_RETRY (__read_nocancel (fd, buf, len));
>> - if (n < 0)
>> - res = errno;
>> - else
>> + int fd = __open64_nocancel (fname, O_RDONLY);
>> + if (fd != -1)
>> {
>> - if (buf[n - 1] == '\n')
>> - buf[n - 1] = '\0';
>> - else if (n == len)
>> - res = ERANGE;
>> + ssize_t n = TEMP_FAILURE_RETRY (__read_nocancel (fd, buf, len));
>> + if (n > 0)
>> + {
>> + if (buf[n - 1] == '\n')
>> + buf[n - 1] = '\0';
>> + else if (n == len)
>> + res = ERANGE;
>> + else
>> + buf[n] = '\0';
>> + }
>> else
>> - buf[n] = '\0';
>> + res = errno;
>> +
>> + __close_nocancel_nostatus (fd);
>> }
>> + else
>> + res = errno;
>
> ENOENT for exited threads is inconsistent with ESRCH (or 0) in other
> cases.
Right, I think we will need to map ENOENT to ESRCH. But I also think
this is another fix.
>
> I wonder if we should cache the thread name in the TCB. That would be
> nice for debugging coredumps, too.
This does not make sense, although it would require some memory
allocation.
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 17/19] nptl: Use tidlock when accessing TID on pthread_sigqueue
2021-08-26 14:43 ` Florian Weimer
@ 2021-08-26 17:49 ` Adhemerval Zanella
2021-08-30 9:26 ` Florian Weimer
0 siblings, 1 reply; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 17:49 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 11:43, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> + /* Block all signal, since the lock is recursive and used on pthread_cancel
>> + (which should be async-signal-safe). */
>> + sigset_t oldmask;
>> + __libc_signal_block_all (&oldmask);
>> + lll_lock (pd->tidlock, LLL_PRIVATE);
>> +
>> + int res;
>> + if (pd->tid == 0)
>> + {
>> + pid_t pid = getpid ();
>
> Huh, that can't be right, should be pd->tid != 0.
>
It is definitely not, I have fixed.
> Don't we have test coverage for this?
Nops, I will add one in the next version.
>
>> + else
>> + res = -ESRCH;
>
> We can return 0 in this case, I think.
No sure about this, we can return 0 but it means signal won't be potentially
delivered.
>
> It's possibly that the same issue regarding synchronous signal delivery
> for pthread_jill applies here.
>
> Thanks,
> Florian
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 00/19] Fix various NPTL synchronization issues
2021-08-26 14:47 ` [PATCH v2 00/19] Fix various NPTL synchronization issues Florian Weimer
@ 2021-08-26 18:19 ` Adhemerval Zanella
0 siblings, 0 replies; 68+ messages in thread
From: Adhemerval Zanella @ 2021-08-26 18:19 UTC (permalink / raw)
To: Florian Weimer; +Cc: libc-alpha
On 26/08/2021 11:47, Florian Weimer wrote:
> * Adhemerval Zanella:
>
>> This is an update of my previous set to fix some NPTL issues [1].
>> The main changes are:
>>
>> - Rebased against master and adjusted the __clone_internal usage.
>> - Adapted Florian's ESRCH fixes [2]
>> - Add fixes for various function that access the 'tid'.
>>
>> Patch 01 to 03 are general nptl fixes and they are independent of the
>> other fixes.
>>
>> Patch 04 is the main change of this patchset, it uses a different
>> field instead of the pthread 'tid' to synchrnonize the internal
>> thread state (BZ#19951).
>>
>> It allows to both move the thread setxid internal state out of
>> 'cancelhandling' (used on setuid() call in multi-thread information),
>> and remove the EXITING_BIT and TERMINATED_BIT (since 'joinstate' now
>> track such it). This is done on patch 05 and 06.
>>
>> Patches 08 and 09 fixes two long standing issues regarding
>> pthread_kill() and thread exit (BZ#12889 and BZ#19193). Now that]
>> 'tid' is setting explicitly by pthread_create(), a simple lock can be
>> used instead of more complex futex operation.
>>
>> Patches 10 to 18 extend the same 'tid' access fix to other pthread
>> functions that uses the member.
>
> I don't think this series of patches is suitable for backport to glibc
> 2.34 once completed. The libpthreaddb changes look particularly
> cumbersome because you'll need two versions of the library depending
> which coredumps you are investigating. However, I expect that we need
> to fix the pthread_cancel race in glibc 2.34.
I thin for fix the pthread_cancel race on glibc 2.34 only the fix
for BZ#19951 along with your patch should be suffice. I don't think
this is cumbersome, but I didn't considered the coredump (which I
also don't think it should be blocker though).
>
> I can send my previous attempt with a straightforward lock (and perhaps
> with the callback-based function removed).
>
> However, I'd like to know what people think about relying on signal
> unblocking delivering the signal that was sent to the thread itself.
> Do we need to special-case the pthread_self case or not?
>
> Thanks,
> Florian
>
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 09/19] nptl: Fix race between pthread_kill and thread exit (bug 12889)
2021-08-26 17:06 ` Adhemerval Zanella
@ 2021-08-30 9:25 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-30 9:25 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
>> This needs a comment explaining that *all* pending signals are delivered
>> at the point of the __libc_signal_restore_set call. I hope that this is
>> actually what happens in Linux; POSIX only guarantees that for one
>> pending signal that is unblocked.
>
> My understanding from kernel/signal.c all pending signals are delivered
> with the signal mask restore, but only real-time one are queued (and
> subjected to RLIMIT_SIGPENDING, which causes another issue [1]).
If real-time signals are special, I think we need to change the way we
handle the self-signal case and avoid the lock there.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 17/19] nptl: Use tidlock when accessing TID on pthread_sigqueue
2021-08-26 17:49 ` Adhemerval Zanella
@ 2021-08-30 9:26 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-30 9:26 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
>>> + else
>>> + res = -ESRCH;
>>
>> We can return 0 in this case, I think.
>
> No sure about this, we can return 0 but it means signal won't be potentially
> delivered.
The thread may have already started exiting and blocked the signals when
pthread_sigqueue is called. In that case, it would return 0 even though
the signal is never delivered.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 10/19] nptl: Use tidlock when accessing TID on pthread_getaffinity_np
2021-08-26 17:29 ` Adhemerval Zanella
@ 2021-08-30 9:30 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-30 9:30 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
>> The pthread_cancel reference looks like a cut-and-paste-bug.
>
> It is, but I think it is applicable. Maybe:
>
> /* Block all signal, since the lock is not recursive and used on
> async-signal-safe functions. */
Just mention async-signal-safe, please. I do not think recursive locks
can achieve async-signal-safety.
>>> + sigset_t oldmask;
>>> + __libc_signal_block_all (&oldmask);
>>> + lll_lock (pd->tidlock, LLL_PRIVATE);
>>> +
>>> + int res = pd->tid == 0
>>> + ? -ESRCH
>>> + : INTERNAL_SYSCALL_CALL (sched_getaffinity, pd->tid,
>>> + MIN (INT_MAX, cpusetsize), cpuset);
>>> +
>>> + lll_unlock (pd->tidlock, LLL_PRIVATE);
>>> + __libc_signal_restore_set (&oldmask);
>>> +
>>> + if (res < 0)
>>> + return -res;
>>
>> ESRCH doesn't look like the right error code here. Should we return an
>> affinity mask without any bits set?
>
> Why not? Returning anything but an error does not improve things here
> since the information won't make much sense. Also it follows what other
> symbols is already doing (such as pthread_cancel()).
POSIX reserves ESRCH for using a pthread_t thread ID outside of the
thread lifetime. In glibc, this is always undefined. In contrast, the
case above involves a valid thread ID of a thread that has exited.
Returning ESRCH suggests that the thread lifetime has ended, which is
not true.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 12/19] nptl: Use tidlock when accessing TID on pthread_getcpuclockid
2021-08-26 17:41 ` Adhemerval Zanella
@ 2021-08-30 9:34 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-30 9:34 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> On 26/08/2021 11:27, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>> Checked on x86_64-linux-gnu.
>>> ---
>>> nptl/pthread_getcpuclockid.c | 27 +++++++++++++++++----------
>>> 1 file changed, 17 insertions(+), 10 deletions(-)
>>>
>>> diff --git a/nptl/pthread_getcpuclockid.c b/nptl/pthread_getcpuclockid.c
>>> index 0a6656ea4c..8c0db0b9ba 100644
>>> --- a/nptl/pthread_getcpuclockid.c
>>> +++ b/nptl/pthread_getcpuclockid.c
>>> @@ -29,16 +29,23 @@ __pthread_getcpuclockid (pthread_t threadid, clockid_t *clockid)
>>> struct pthread *pd = (struct pthread *) threadid;
>>>
>>> /* Make sure the descriptor is valid. */
>>> - if (INVALID_TD_P (pd))
>>> - /* Not a valid thread handle. */
>>> - return ESRCH;
>>> -
>>> - /* The clockid_t value is a simple computation from the TID. */
>>> -
>>> - const clockid_t tidclock = MAKE_THREAD_CPUCLOCK (pd->tid, CPUCLOCK_SCHED);
>>> -
>>> - *clockid = tidclock;
>>> - return 0;
>>> + sigset_t oldmask;
>>> + __libc_signal_block_all (&oldmask);
>>> + lll_lock (pd->tidlock, LLL_PRIVATE);
>>> +
>>> + int res;
>>> + if (pd->tid != 0)
>>> + {
>>> + *clockid = MAKE_THREAD_CPUCLOCK (pd->tid, CPUCLOCK_SCHED);
>>> + res = 0;
>>> + }
>>> + else
>>> + res = -ESRCH;
>>> +
>>> + lll_unlock (pd->tidlock, LLL_PRIVATE);
>>> + __libc_signal_restore_set (&oldmask);
>>> +
>>> + return res;
>>
>> This doesn't really solve the race, does it? The caller cannot use the
>> clock ID safely.
>
> Not really if the callee intention is to use along clock_gettime()
> or any other routine. I added this patch more for consistency.
>
> In fact, it makes me wonder if the requested improvement such as
> a pthread pid accessor [1] does really make sense, since it is
> inherent racy in a way that there is no guarantee that the tid
> will be valid when it is actually used.
>
> [1] https://sourceware.org/bugzilla/show_bug.cgi?id=27880
I think it still makes sense because the caller can ensure that the
thread does not exit, so the TID remains valid while it is used.
The original TID (after the thread has exited) may also be useful for
debug logs.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 15/19] nptl: Use tidlock when accessing TID on pthread_getname_np
2021-08-26 17:45 ` Adhemerval Zanella
@ 2021-08-30 9:37 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-30 9:37 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
>> I wonder if we should cache the thread name in the TCB. That would be
>> nice for debugging coredumps, too.
>
> This does not make sense, although it would require some memory
> allocation.
Isn't the name limited to 16 bytes? So it wouldn't need a separate
allocation.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 06/19] nptl: Replace struct thread cancelhandling field
2021-08-26 16:48 ` Adhemerval Zanella
@ 2021-08-30 10:36 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-30 10:36 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
>>> diff --git a/nptl/nptl_free_tcb.c b/nptl/nptl_free_tcb.c
>>> index cbf3580f59..15e1a18562 100644
>>> --- a/nptl/nptl_free_tcb.c
>>> +++ b/nptl/nptl_free_tcb.c
>>> @@ -24,22 +24,12 @@
>>> void
>>> __nptl_free_tcb (struct pthread *pd)
>>> {
>>> - /* The thread is exiting now. */
>>> - if (atomic_bit_test_set (&pd->cancelhandling, TERMINATED_BIT) == 0)
>>> - {
>>> - /* Free TPP data. */
>>> - if (pd->tpp != NULL)
>>> - {
>>> - struct priority_protection_data *tpp = pd->tpp;
>>> + free (pd->tpp);
>>> + pd->tpp = NULL;
>>
>> For detached threads, this is called on the thread after signals have
>> been blocked. I don't think this is entirely correct. And things
>> become rather interesting if malloc needs pd->tpp.
>
> Why do you think this is not correct? This is accessed through
> mutex code, so I don't think this would matter on the detached
> exit phase.
>
> And what do you mean by malloc needing?
free is essentially user code, and we should call user code with all
signals disabled.
malloc locking could need the TPP facility.
>> It's not clear to me whether this could happen before.
>>
>>> + /* Queue the stack memory block for reuse and exit the process. The kernel
>>> + will signal via writing to the address returned by QUEUE-STACK when the
>>> + stack is available. */
>>> + __nptl_deallocate_stack (pd);
>>> }
>>
>> That old comment seems to be out-of-date.
>
> It is still correct I think, __nptl_stack_in_use() now checks the joinstate
> which is still advertise by the kernel.
I meant: there is no QUEUE-STACK.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 04/19] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
2021-08-26 16:16 ` Adhemerval Zanella
@ 2021-08-30 10:42 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-30 10:42 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
>>>>> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
>>>>> index 08e5189ad6..763e32bc3e 100644
>>>>> --- a/nptl/pthread_create.c
>>>>> +++ b/nptl/pthread_create.c
>>>>> @@ -286,7 +286,7 @@ static int create_thread (struct pthread *pd,
>>>>> const struct
>>>>> @@ -351,13 +351,16 @@ start_thread (void *arg)
>>>>> and free any resource prior return to the pthread_create caller. */
>>>>> setup_failed = pd->setup_failed == 1;
>>>>> if (setup_failed)
>>>>> - pd->joinid = NULL;
>>>>> + pd->joinstate = THREAD_STATE_JOINABLE;
>>>>>
>>>>> /* And give it up right away. */
>>>>> lll_unlock (pd->lock, LLL_PRIVATE);
>>>>>
>>>>> if (setup_failed)
>>>>> - goto out;
>>>>> + {
>>>>> + pd->tid = 0;
>>>>> + goto out;
>>>>> + }
>>>>> }
>>>>
>>>> What's the advantage of setting pd->tid here and below in start_thread?
>>>
>>> We don't really need to clear the tid on setup_failed case in fact, since
>>> in this case no pthread_t will be returned to caller. I remove it.
>>
>> What about the change in start_thread?
>>
>> The subsequent changes look at the tid member, but they could equally
>> well look at joinstate, I think.
>
> My understanding it we still need to clear the tid member to avoid the
> unintentional usage, since kernel will clear it anymore. For instance
> on pthread_kill() (or any other usage), there is still an windows
> where the joinstate test and the tid read might result in an invalid
> value (either a tid reuse or an invalid value).
It's more a matter of the final state. I think we should let the tid
field be after the other changes went in.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 05/19] nptl: Move setxid flag out of cancelhandling
2021-08-26 16:39 ` Adhemerval Zanella
@ 2021-08-30 11:27 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-30 11:27 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> On 26/08/2021 12:21, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>> On 26/08/2021 08:34, Florian Weimer wrote:
>>>> * Adhemerval Zanella:
>>>>
>>>>> Now that the thread state is tracked by 'joinstate' field, there is no
>>>>> need to keep the setxid flag within the 'cancelhandling'. It simplifies
>>>>> the atomic code to set and reset it (since there is no need to handle
>>>>> the thread state concurrent update).
>>>>
>>>> Some of this functionality reimplements the tidlock.
>>>>
>>>> There is some functionality around thread creation, but I do not know if
>>>> the synchronization is entirely correct. This makes it rather difficult
>>>> to review changes.
>>>>
>>>> It's also not clear to me if we need to do anything to prevent early
>>>> setxid calls:
>>>>
>>>>> /* Don't allow setxid until cloned. */
>>>>
>>>> I suspect this should just look at the TID instead. We create the
>>>> thread with all signals blocked, so once there is a TID (and the thread
>>>> has not exited), it is safe to send the signal.
>>>
>>> I agree that we might improve the setxid code, however what I intended
>>> with this changes is to keep its semantic as-is and just decouple it
>>> from cancelhandling. I might revise it on a later patch to follow your
>>> suggestions.
>>
>> I wonder if there is a new race here:
>>
>> @@ -109,41 +100,23 @@ setxid_mark_thread (struct xid_command *cmdp, struct pthread *t)
>> /* Don't let the thread exit before the setxid handler runs. */
>> t->setxid_futex = 0;
>>
>> + /* If thread is exiting right now, ignore it. */
>> + if (atomic_load_acquire (&t->joinstate) == THREAD_STATE_EXITING)
>> + return;
>>
>> + /* Release the futex if there is no other setxid in progress. */
>> + if (atomic_exchange_acquire (&t->setxid_flag, 1) == 0)
>> + {
>> + t->setxid_futex = 1;
>> + futex_wake (&t->setxid_futex, 1, FUTEX_PRIVATE);
>> }
>>
>> That is, if the setxid-sending thread can now wait for a futex wake that
>> never comes.
>
> Wouldn't be handled by setxid_unmark_thread, which wakes the futex
> unconditionally?
But that code only runs after the loop around signalled has concluded.
That's why I'm worried.
Thanks,
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [PATCH v2 07/19] support: Add support_wait_for_thread_exit
2021-08-26 16:49 ` Adhemerval Zanella
@ 2021-08-30 11:46 ` Florian Weimer
0 siblings, 0 replies; 68+ messages in thread
From: Florian Weimer @ 2021-08-30 11:46 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> On 26/08/2021 06:31, Florian Weimer wrote:
>> * Adhemerval Zanella:
>>
>>> From: Florian Weimer <fweimer@redhat.com>
>>>
>>> Wait until all threads except the current thread has exited.
>>
>> Should I commit this separately?
>
> Yes, I think it will be better.
Done.
Florian
^ permalink raw reply [flat|nested] 68+ messages in thread
end of thread, other threads:[~2021-08-30 11:46 UTC | newest]
Thread overview: 68+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-23 19:50 [PATCH v2 00/19] Fix various NPTL synchronization issues Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 01/19] nptl: Fix tst-cancel7 and tst-cancelx7 race condition (BZ #14232) Adhemerval Zanella
2021-08-26 9:33 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 02/19] nptl: Set cancellation type and state on pthread_exit Adhemerval Zanella
2021-08-26 9:38 ` Florian Weimer
2021-08-26 9:42 ` Florian Weimer
2021-08-26 11:56 ` Adhemerval Zanella
2021-08-26 11:52 ` Adhemerval Zanella
2021-08-26 12:08 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 03/19] nptl: Handle robust PI mutexes for !__ASSUME_SET_ROBUST_LIST Adhemerval Zanella
2021-08-26 9:42 ` Florian Weimer
2021-08-26 12:14 ` Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 04/19] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951) Adhemerval Zanella
2021-08-26 10:41 ` Florian Weimer
2021-08-26 14:58 ` Adhemerval Zanella
2021-08-26 15:06 ` Florian Weimer
2021-08-26 16:16 ` Adhemerval Zanella
2021-08-30 10:42 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 05/19] nptl: Move setxid flag out of cancelhandling Adhemerval Zanella
2021-08-26 11:34 ` Florian Weimer
2021-08-26 15:11 ` Adhemerval Zanella
2021-08-26 15:21 ` Florian Weimer
2021-08-26 16:39 ` Adhemerval Zanella
2021-08-30 11:27 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 06/19] nptl: Replace struct thread cancelhandling field Adhemerval Zanella
2021-08-26 14:34 ` Florian Weimer
2021-08-26 16:48 ` Adhemerval Zanella
2021-08-30 10:36 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 07/19] support: Add support_wait_for_thread_exit Adhemerval Zanella
2021-08-26 9:31 ` Florian Weimer
2021-08-26 16:49 ` Adhemerval Zanella
2021-08-30 11:46 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 08/19] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) Adhemerval Zanella
2021-08-26 10:03 ` Florian Weimer
2021-08-26 16:49 ` Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 09/19] nptl: Fix race between pthread_kill and thread exit (bug 12889) Adhemerval Zanella
2021-08-26 14:23 ` Florian Weimer
2021-08-26 17:06 ` Adhemerval Zanella
2021-08-30 9:25 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 10/19] nptl: Use tidlock when accessing TID on pthread_getaffinity_np Adhemerval Zanella
2021-08-26 14:24 ` Florian Weimer
2021-08-26 17:29 ` Adhemerval Zanella
2021-08-30 9:30 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 11/19] nptl: Use tidlock when accessing TID on pthread_setaffinity Adhemerval Zanella
2021-08-26 14:25 ` Florian Weimer
2021-08-26 17:31 ` Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 12/19] nptl: Use tidlock when accessing TID on pthread_getcpuclockid Adhemerval Zanella
2021-08-26 14:27 ` Florian Weimer
2021-08-26 17:41 ` Adhemerval Zanella
2021-08-30 9:34 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 13/19] nptl: Use tidlock when accessing TID on pthread_getschedparam Adhemerval Zanella
2021-08-26 15:00 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 14/19] nptl: Use tidlock when accessing TID on pthread_setschedparam Adhemerval Zanella
2021-08-26 14:35 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 15/19] nptl: Use tidlock when accessing TID on pthread_getname_np Adhemerval Zanella
2021-08-26 14:38 ` Florian Weimer
2021-08-26 17:45 ` Adhemerval Zanella
2021-08-30 9:37 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 16/19] nptl: Use tidlock when accessing TID on pthread_setname_np Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 17/19] nptl: Use tidlock when accessing TID on pthread_sigqueue Adhemerval Zanella
2021-08-26 14:43 ` Florian Weimer
2021-08-26 17:49 ` Adhemerval Zanella
2021-08-30 9:26 ` Florian Weimer
2021-08-23 19:50 ` [PATCH v2 18/19] nptl: Use tidlock when accessing TID on pthread_setschedprio Adhemerval Zanella
2021-08-23 19:50 ` [PATCH v2 19/19] nptl: Remove INVALID_TD_P Adhemerval Zanella
2021-08-26 9:30 ` Florian Weimer
2021-08-26 14:47 ` [PATCH v2 00/19] Fix various NPTL synchronization issues Florian Weimer
2021-08-26 18:19 ` Adhemerval Zanella
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).