* [PATCH 0/3] Fix ESRCH issues in pthread_cancel, pthread_kill
@ 2021-08-17 13:50 Florian Weimer
2021-08-17 13:50 ` [PATCH 1/3] support: Add support_wait_for_thread_exit Florian Weimer
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Florian Weimer @ 2021-08-17 13:50 UTC (permalink / raw)
To: libc-alpha
This mini-series came out of a downstream bug report:
glibc: pthread_cancel fails with ESRCH yet subsequent pthread_join passes
<https://bugzilla.redhat.com/show_bug.cgi?id=1994068>
Apparently the race is much easier to trigger due to the changes in
glibc 2.34.
Tested on i686-linux-gnu, x86_64-linux-gnu. Built with
build-many-glibcs.py.
Thanks,
Florian
Florian Weimer (3):
support: Add support_wait_for_thread_exit
nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193)
nptl: Fix race between pthread_kill and thread exit (bug 12889)
nptl/allocatestack.c | 1 +
nptl/descr.h | 12 ++
nptl/pthread_cancel.c | 9 +-
nptl/pthread_create.c | 20 ++++
nptl/pthread_kill.c | 30 ++++-
support/Makefile | 3 +-
support/support.h | 4 +
support/support_wait_for_thread_exit.c | 67 +++++++++++
sysdeps/pthread/Makefile | 7 +-
sysdeps/pthread/tst-kill4.c | 90 --------------
sysdeps/pthread/tst-pthread_cancel-exited.c | 45 +++++++
.../pthread/tst-pthread_cancel-select-loop.c | 87 ++++++++++++++
sysdeps/pthread/tst-pthread_kill-exited.c | 46 ++++++++
sysdeps/pthread/tst-pthread_kill-exiting.c | 110 ++++++++++++++++++
14 files changed, 431 insertions(+), 100 deletions(-)
create mode 100644 support/support_wait_for_thread_exit.c
delete mode 100644 sysdeps/pthread/tst-kill4.c
create mode 100644 sysdeps/pthread/tst-pthread_cancel-exited.c
create mode 100644 sysdeps/pthread/tst-pthread_cancel-select-loop.c
create mode 100644 sysdeps/pthread/tst-pthread_kill-exited.c
create mode 100644 sysdeps/pthread/tst-pthread_kill-exiting.c
--
2.31.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/3] support: Add support_wait_for_thread_exit
2021-08-17 13:50 [PATCH 0/3] Fix ESRCH issues in pthread_cancel, pthread_kill Florian Weimer
@ 2021-08-17 13:50 ` Florian Weimer
2021-08-19 16:12 ` Adhemerval Zanella
2021-08-17 13:51 ` [PATCH 2/3] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) Florian Weimer
2021-08-17 13:51 ` [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) Florian Weimer
2 siblings, 1 reply; 10+ messages in thread
From: Florian Weimer @ 2021-08-17 13:50 UTC (permalink / raw)
To: libc-alpha
---
support/support.h | 4 ++
support/support_wait_for_thread_exit.c | 67 ++++++++++++++++++++++++++
2 files changed, 71 insertions(+)
create mode 100644 support/support_wait_for_thread_exit.c
diff --git a/support/support.h b/support/support.h
index 834dba9097..a5978b939a 100644
--- a/support/support.h
+++ b/support/support.h
@@ -174,6 +174,10 @@ timer_t support_create_timer (uint64_t sec, long int nsec, bool repeat,
/* Disable the timer TIMER. */
void support_delete_timer (timer_t timer);
+/* Wait until all threads except the current thread have exited (as
+ far as the kernel is concerned). */
+void support_wait_for_thread_exit (void);
+
struct support_stack
{
void *stack;
diff --git a/support/support_wait_for_thread_exit.c b/support/support_wait_for_thread_exit.c
new file mode 100644
index 0000000000..808901f746
--- /dev/null
+++ b/support/support_wait_for_thread_exit.c
@@ -0,0 +1,67 @@
+/* Wait until all threads except the current thread has exited.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+#include <dirent.h>
+#include <errno.h>
+#include <string.h>
+#include <support/check.h>
+#include <support/support.h>
+#include <unistd.h>
+
+void
+support_wait_for_thread_exit (void)
+{
+ DIR *proc_self_task = opendir ("/proc/self/task");
+ if (proc_self_task == NULL)
+ /* Use a large timeout because we cannot verify that the thread
+ has exited. */
+ usleep (5 * 1000 * 1000);
+ else
+ {
+ int count;
+
+ wait_again:
+ count = 0;
+ rewinddir (proc_self_task);
+ while (true)
+ {
+ errno = 0;
+ struct dirent *e = readdir (proc_self_task);
+ if (e == NULL && errno != 0)
+ FAIL_EXIT1 ("readdir: %m");
+ if (e == NULL)
+ {
+ /* Only the main thread remains. Testing may continue. */
+ TEST_VERIFY (count >= 1);
+ closedir (proc_self_task);
+ return;
+ }
+
+ if (strcmp (e->d_name, ".") == 0 || strcmp (e->d_name, "..") == 0)
+ continue;
+
+ ++count;
+ if (count > 1)
+ {
+ /* Small timeout to give the thread a chance to exit. */
+ usleep (50 * 1000);
+ goto wait_again;
+ }
+ }
+ }
+}
--
2.31.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/3] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193)
2021-08-17 13:50 [PATCH 0/3] Fix ESRCH issues in pthread_cancel, pthread_kill Florian Weimer
2021-08-17 13:50 ` [PATCH 1/3] support: Add support_wait_for_thread_exit Florian Weimer
@ 2021-08-17 13:51 ` Florian Weimer
2021-08-17 13:51 ` [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) Florian Weimer
2 siblings, 0 replies; 10+ messages in thread
From: Florian Weimer @ 2021-08-17 13:51 UTC (permalink / raw)
To: libc-alpha
This closes one remaining race condition related to bug 12889: if
the thread already exited on the kernel side, returning ESRCH
is not correct because that error is reserved for the thread IDs
(pthread_t values) whose lifetime has ended. In case of a
kernel-side exit and a valid thread ID, no signal needs to be sent
and cancellation does not have an effect, so just return 0.
sysdeps/pthread/tst-kill4.c triggers undefined behavior and is
removed with this commit.
---
nptl/pthread_cancel.c | 9 ++--
nptl/pthread_kill.c | 7 +++-
sysdeps/pthread/Makefile | 5 ++-
sysdeps/pthread/tst-pthread_cancel-exited.c | 45 ++++++++++++++++++++
sysdeps/pthread/tst-pthread_kill-exited.c | 46 +++++++++++++++++++++
5 files changed, 106 insertions(+), 6 deletions(-)
create mode 100644 sysdeps/pthread/tst-pthread_cancel-exited.c
create mode 100644 sysdeps/pthread/tst-pthread_kill-exited.c
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index cc25ff21f3..9bac6e3b76 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -62,10 +62,11 @@ __pthread_cancel (pthread_t th)
{
volatile struct pthread *pd = (volatile struct pthread *) th;
- /* Make sure the descriptor is valid. */
- if (INVALID_TD_P (pd))
- /* Not a valid thread handle. */
- return ESRCH;
+ if (pd->tid == 0)
+ /* The thread has already exited on the kernel side. Its outcome
+ (regular exit, other cancelation) has already been
+ determined. */
+ return 0;
static int init_sigcancel = 0;
if (atomic_load_relaxed (&init_sigcancel) == 0)
diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c
index f79a2b26fc..5d4c86f920 100644
--- a/nptl/pthread_kill.c
+++ b/nptl/pthread_kill.c
@@ -46,7 +46,12 @@ __pthread_kill_internal (pthread_t threadid, int signo)
? INTERNAL_SYSCALL_ERRNO (val) : 0);
}
else
- val = ESRCH;
+ /* The kernel reports that the thread has exited. POSIX specifies
+ the ESRCH error only for the case when the lifetime of a thread
+ ID has ended, but calling pthread_kill on such a thread ID is
+ undefined in glibc. Therefore, do not treat kernel thread exit
+ as an error. */
+ val = 0;
return val;
}
diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile
index 42f9fc5072..dedfa0d290 100644
--- a/sysdeps/pthread/Makefile
+++ b/sysdeps/pthread/Makefile
@@ -89,7 +89,7 @@ tests += tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \
tst-join8 tst-join9 tst-join10 tst-join11 tst-join12 tst-join13 \
tst-join14 tst-join15 \
tst-key1 tst-key2 tst-key3 tst-key4 \
- tst-kill1 tst-kill2 tst-kill3 tst-kill4 tst-kill5 tst-kill6 \
+ tst-kill1 tst-kill2 tst-kill3 tst-kill5 tst-kill6 \
tst-locale1 tst-locale2 \
tst-memstream \
tst-mutex-errorcheck tst-mutex1 tst-mutex2 tst-mutex3 tst-mutex4 \
@@ -118,6 +118,9 @@ tests += tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \
tst-unload \
tst-unwind-thread \
tst-pt-vfork1 tst-pt-vfork2 tst-vfork1x tst-vfork2x \
+ tst-pthread_cancel-exited \
+ tst-pthread_kill-exited \
+ # tests
tests-time64 := \
tst-abstime-time64 \
diff --git a/sysdeps/pthread/tst-pthread_cancel-exited.c b/sysdeps/pthread/tst-pthread_cancel-exited.c
new file mode 100644
index 0000000000..811c9bee07
--- /dev/null
+++ b/sysdeps/pthread/tst-pthread_cancel-exited.c
@@ -0,0 +1,45 @@
+/* Test that pthread_kill succeeds for an exited thread.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+/* This test verifies that pthread_kill returns 0 (and not ESRCH) for
+ a thread that has exited on the kernel side. */
+
+#include <stddef.h>
+#include <support/support.h>
+#include <support/xthread.h>
+
+static void *
+noop_thread (void *closure)
+{
+ return NULL;
+}
+
+static int
+do_test (void)
+{
+ pthread_t thr = xpthread_create (NULL, noop_thread, NULL);
+
+ support_wait_for_thread_exit ();
+
+ xpthread_cancel (thr);
+ xpthread_join (thr);
+
+ return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/pthread/tst-pthread_kill-exited.c b/sysdeps/pthread/tst-pthread_kill-exited.c
new file mode 100644
index 0000000000..7575fb6d58
--- /dev/null
+++ b/sysdeps/pthread/tst-pthread_kill-exited.c
@@ -0,0 +1,46 @@
+/* Test that pthread_kill succeeds for an exited thread.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+/* This test verifies that pthread_kill returns 0 (and not ESRCH) for
+ a thread that has exited on the kernel side. */
+
+#include <signal.h>
+#include <stddef.h>
+#include <support/support.h>
+#include <support/xthread.h>
+
+static void *
+noop_thread (void *closure)
+{
+ return NULL;
+}
+
+static int
+do_test (void)
+{
+ pthread_t thr = xpthread_create (NULL, noop_thread, NULL);
+
+ support_wait_for_thread_exit ();
+
+ xpthread_kill (thr, SIGUSR1);
+ xpthread_join (thr);
+
+ return 0;
+}
+
+#include <support/test-driver.c>
--
2.31.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889)
2021-08-17 13:50 [PATCH 0/3] Fix ESRCH issues in pthread_cancel, pthread_kill Florian Weimer
2021-08-17 13:50 ` [PATCH 1/3] support: Add support_wait_for_thread_exit Florian Weimer
2021-08-17 13:51 ` [PATCH 2/3] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) Florian Weimer
@ 2021-08-17 13:51 ` Florian Weimer
2021-08-19 16:12 ` Adhemerval Zanella
2 siblings, 1 reply; 10+ messages in thread
From: Florian Weimer @ 2021-08-17 13:51 UTC (permalink / raw)
To: libc-alpha; +Cc: Marek Polacek
Thread exit is delayed until all pending pthread_kill operations
on the thread have completed. This avoids sending signals to the
wrong thread or a spurious ESRCH error.
The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived
from a downstream test originally written by Marek Polacek.
---
nptl/allocatestack.c | 1 +
nptl/descr.h | 12 ++
nptl/pthread_create.c | 20 ++++
nptl/pthread_kill.c | 23 +++-
support/Makefile | 3 +-
sysdeps/pthread/Makefile | 2 +
sysdeps/pthread/tst-kill4.c | 90 --------------
.../pthread/tst-pthread_cancel-select-loop.c | 87 ++++++++++++++
sysdeps/pthread/tst-pthread_kill-exiting.c | 110 ++++++++++++++++++
9 files changed, 254 insertions(+), 94 deletions(-)
delete mode 100644 sysdeps/pthread/tst-kill4.c
create mode 100644 sysdeps/pthread/tst-pthread_cancel-select-loop.c
create mode 100644 sysdeps/pthread/tst-pthread_kill-exiting.c
diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
index cfe37a3443..344f7f8ec5 100644
--- a/nptl/allocatestack.c
+++ b/nptl/allocatestack.c
@@ -127,6 +127,7 @@ get_cached_stack (size_t *sizep, void **memp)
/* No pending event. */
result->nextevent = NULL;
+ result->exit_signal_futex = 0;
result->tls_state = (struct tls_internal_t) { 0 };
/* Clear the DTV. */
diff --git a/nptl/descr.h b/nptl/descr.h
index c85778d449..7255d1dc77 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -396,6 +396,18 @@ struct pthread
PTHREAD_CANCEL_ASYNCHRONOUS). */
unsigned char canceltype;
+ /* Futex to prevent thread exit during concurrent pthread_kill. The
+ least-significant bit is an exit flag. The remaining bits count
+ the number of pthread_kill operations in progress.
+
+ Upon exit, the thread bit sets the LSB, and waits until the futex
+ variable is 1.
+
+ pthread_kill atomically adds 2 to the futex variable before the
+ signal-sending operation, subtracts 2 after it, and skips sending
+ the signal if the least-significant bit is set. */
+ unsigned int exit_signal_futex;
+
/* Used on strsignal. */
struct tls_internal_t tls_state;
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index d8ec299cb1..7e1a3cfcdd 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -37,6 +37,7 @@
#include <sys/single_threaded.h>
#include <version.h>
#include <clone_internal.h>
+#include <futex-internal.h>
#include <shlib-compat.h>
@@ -485,6 +486,25 @@ start_thread (void *arg)
/* This was the last thread. */
exit (0);
+ /* Blocking signals is required to make pthread_kill
+ async-signal-safe. It prevents sending a signal from this thread
+ to itself during its final stages. This must come after the exit
+ call above because atexit handlers must not run with signals
+ blocked. */
+ __libc_signal_block_all (NULL);
+
+ /* Delay exiting this thread until there are no concurrent
+ pthread_kill operations in progress. */
+ {
+ /* Indicate that no further signals should be sent. */
+ atomic_fetch_or_release (&pd->exit_signal_futex, 1);
+
+ /* Wait until there are no pthread_kill operations left. */
+ unsigned int val;
+ while ((val = atomic_load_acquire (&pd->exit_signal_futex)) != 1)
+ futex_wait (&pd->exit_signal_futex, val, FUTEX_PRIVATE);
+ }
+
#ifndef __ASSUME_SET_ROBUST_LIST
/* If this thread has any robust mutexes locked, handle them now. */
# if __PTHREAD_MUTEX_HAVE_PREV
diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c
index 5d4c86f920..b67bd4b769 100644
--- a/nptl/pthread_kill.c
+++ b/nptl/pthread_kill.c
@@ -18,6 +18,7 @@
#include <unistd.h>
#include <pthreadP.h>
+#include <futex-internal.h>
#include <shlib-compat.h>
int
@@ -41,9 +42,25 @@ __pthread_kill_internal (pthread_t threadid, int signo)
{
pid_t pid = __getpid ();
- val = INTERNAL_SYSCALL_CALL (tgkill, pid, tid, signo);
- val = (INTERNAL_SYSCALL_ERROR_P (val)
- ? INTERNAL_SYSCALL_ERRNO (val) : 0);
+ if (atomic_fetch_add_acq_rel (&pd->exit_signal_futex, 2) & 1)
+ /* If the thread is exiting, it has already blocked all
+ signals, so sending the signal does not have any effect.
+ Not sending the signal avoids a spurious ESRCH from tgkill
+ if the thread manages to complete its exit before the
+ signal can be sent. */
+ val = 0;
+ else
+ {
+ val = INTERNAL_SYSCALL_CALL (tgkill, pid, tid, signo);
+ val = (INTERNAL_SYSCALL_ERROR_P (val)
+ ? INTERNAL_SYSCALL_ERRNO (val) : 0);
+ }
+
+ /* A value 3 before the update indicates that the thread is
+ exiting, and this is the last remaining concurrent
+ pthread_kill operation. */
+ if (atomic_fetch_add_acq_rel (&pd->exit_signal_futex, -2) == 3)
+ futex_wake (&pd->exit_signal_futex, 1, FUTEX_PRIVATE);
}
else
/* The kernel reports that the thread has exited. POSIX specifies
diff --git a/support/Makefile b/support/Makefile
index a462781718..ef2b1a980a 100644
--- a/support/Makefile
+++ b/support/Makefile
@@ -82,9 +82,10 @@ libsupport-routines = \
support_test_compare_blob \
support_test_compare_failure \
support_test_compare_string \
- support_write_file_string \
support_test_main \
support_test_verify_impl \
+ support_wait_for_thread_exit \
+ support_write_file_string \
temp_file \
timespec \
timespec-time64 \
diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile
index dedfa0d290..48dba717a1 100644
--- a/sysdeps/pthread/Makefile
+++ b/sysdeps/pthread/Makefile
@@ -119,7 +119,9 @@ tests += tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \
tst-unwind-thread \
tst-pt-vfork1 tst-pt-vfork2 tst-vfork1x tst-vfork2x \
tst-pthread_cancel-exited \
+ tst-pthread_cancel-select-loop \
tst-pthread_kill-exited \
+ tst-pthread_kill-exiting \
# tests
tests-time64 := \
diff --git a/sysdeps/pthread/tst-kill4.c b/sysdeps/pthread/tst-kill4.c
deleted file mode 100644
index 9563939792..0000000000
--- a/sysdeps/pthread/tst-kill4.c
+++ /dev/null
@@ -1,90 +0,0 @@
-/* Copyright (C) 2003-2021 Free Software Foundation, Inc.
- This file is part of the GNU C Library.
- Contributed by Ulrich Drepper <drepper@redhat.com>, 2003.
-
- The GNU C Library is free software; you can redistribute it and/or
- modify it under the terms of the GNU Lesser General Public
- License as published by the Free Software Foundation; either
- version 2.1 of the License, or (at your option) any later version.
-
- The GNU C Library is distributed in the hope that it will be useful,
- but WITHOUT ANY WARRANTY; without even the implied warranty of
- MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
- Lesser General Public License for more details.
-
- You should have received a copy of the GNU Lesser General Public
- License along with the GNU C Library; if not, see
- <https://www.gnu.org/licenses/>. */
-
-#include <errno.h>
-#include <pthread.h>
-#include <signal.h>
-#include <stdio.h>
-#include <stdlib.h>
-#include <unistd.h>
-
-
-static void *
-tf (void *a)
-{
- return NULL;
-}
-
-
-int
-do_test (void)
-{
- pthread_attr_t at;
- if (pthread_attr_init (&at) != 0)
- {
- puts ("attr_create failed");
- exit (1);
- }
-
- /* Limit thread stack size, because if it is too large, pthread_join
- will free it immediately rather than put it into stack cache. */
- if (pthread_attr_setstacksize (&at, 2 * 1024 * 1024) != 0)
- {
- puts ("setstacksize failed");
- exit (1);
- }
-
- pthread_t th;
- if (pthread_create (&th, &at, tf, NULL) != 0)
- {
- puts ("create failed");
- exit (1);
- }
-
- pthread_attr_destroy (&at);
-
- if (pthread_join (th, NULL) != 0)
- {
- puts ("join failed");
- exit (1);
- }
-
- /* The following only works because we assume here something about
- the implementation. Namely, that the memory allocated for the
- thread descriptor is not going away, that the TID field is
- cleared and therefore the signal is sent to process 0, and that
- we can savely assume there is no other process with this ID at
- that time. */
- int e = pthread_kill (th, 0);
- if (e == 0)
- {
- puts ("pthread_kill succeeded");
- exit (1);
- }
- if (e != ESRCH)
- {
- puts ("pthread_kill didn't return ESRCH");
- exit (1);
- }
-
- return 0;
-}
-
-
-#define TEST_FUNCTION do_test ()
-#include "../test-skeleton.c"
diff --git a/sysdeps/pthread/tst-pthread_cancel-select-loop.c b/sysdeps/pthread/tst-pthread_cancel-select-loop.c
new file mode 100644
index 0000000000..a62087589c
--- /dev/null
+++ b/sysdeps/pthread/tst-pthread_cancel-select-loop.c
@@ -0,0 +1,87 @@
+/* Test that pthread_cancel succeeds during thread exit.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+/* This test tries to trigger an internal race condition in
+ pthread_cancel, where the cancellation signal is sent after the
+ thread has begun the cancellation process. This can result in a
+ spurious ESRCH error. For the original bug 12889, the window is
+ quite small, so the bug was not reproduced in every run. */
+
+#include <stdbool.h>
+#include <stddef.h>
+#include <support/check.h>
+#include <support/xthread.h>
+#include <support/xunistd.h>
+#include <sys/select.h>
+#include <unistd.h>
+
+/* Set to true by timeout_thread_function when the test should
+ terminate. */
+static bool timeout;
+
+static void *
+timeout_thread_function (void *unused)
+{
+ usleep (5 * 1000 * 1000);
+ __atomic_store_n (&timeout, true, __ATOMIC_RELAXED);
+ return NULL;
+}
+
+/* Used for blocking the select function below. */
+static int pipe_fds[2];
+
+static void *
+canceled_thread_function (void *unused)
+{
+ while (true)
+ {
+ fd_set rfs;
+ fd_set wfs;
+ fd_set efs;
+ FD_ZERO (&rfs);
+ FD_ZERO (&wfs);
+ FD_ZERO (&efs);
+ FD_SET (pipe_fds[0], &rfs);
+
+ /* If the cancellation request is recognized early, the thread
+ begins exiting while the cancellation signal arrives. */
+ select (FD_SETSIZE, &rfs, &wfs, &efs, NULL);
+ }
+ return NULL;
+}
+
+static int
+do_test (void)
+{
+ xpipe (pipe_fds);
+ pthread_t thr_timeout = xpthread_create (NULL, timeout_thread_function, NULL);
+
+ while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED))
+ {
+ pthread_t thr = xpthread_create (NULL, canceled_thread_function, NULL);
+ xpthread_cancel (thr);
+ TEST_VERIFY (xpthread_join (thr) == PTHREAD_CANCELED);
+ }
+
+ xpthread_join (thr_timeout);
+ xclose (pipe_fds[0]);
+ xclose (pipe_fds[1]);
+ return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/pthread/tst-pthread_kill-exiting.c b/sysdeps/pthread/tst-pthread_kill-exiting.c
new file mode 100644
index 0000000000..15550d2310
--- /dev/null
+++ b/sysdeps/pthread/tst-pthread_kill-exiting.c
@@ -0,0 +1,110 @@
+/* Test that pthread_kill succeeds during thread exit.
+ Copyright (C) 2021 Free Software Foundation, Inc.
+ This file is part of the GNU C Library.
+
+ The GNU C Library is free software; you can redistribute it and/or
+ modify it under the terms of the GNU Lesser General Public
+ License as published by the Free Software Foundation; either
+ version 2.1 of the License, or (at your option) any later version.
+
+ The GNU C Library is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ Lesser General Public License for more details.
+
+ You should have received a copy of the GNU Lesser General Public
+ License along with the GNU C Library; if not, see
+ <https://www.gnu.org/licenses/>. */
+
+/* This test verifies that pthread_kill for a thread that is exiting
+ succeeds (with or without actually delivering the signal). */
+
+#include <array_length.h>
+#include <stdbool.h>
+#include <stddef.h>
+#include <support/xsignal.h>
+#include <support/xthread.h>
+#include <unistd.h>
+
+/* Set to true by timeout_thread_function when the test should
+ terminate. */
+static bool timeout;
+
+static void *
+timeout_thread_function (void *unused)
+{
+ usleep (1000 * 1000);
+ __atomic_store_n (&timeout, true, __ATOMIC_RELAXED);
+ return NULL;
+}
+
+/* Used to synchronize the sending threads with the target thread and
+ main thread. */
+static pthread_barrier_t barrier_1;
+static pthread_barrier_t barrier_2;
+
+/* The target thread to which signals are to be sent. */
+static pthread_t target_thread;
+
+static void *
+sender_thread_function (void *unused)
+{
+ while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED))
+ {
+ /* Wait until target_thread has been initialized. The target
+ thread and main thread participate in this barrier. */
+ xpthread_barrier_wait (&barrier_1);
+
+ xpthread_kill (target_thread, SIGUSR1);
+
+ /* Communicate that the signal has been sent. The main thread
+ participates in this barrier. */
+ xpthread_barrier_wait (&barrier_2);
+ }
+ return NULL;
+}
+
+static void *
+target_thread_function (void *unused)
+{
+ target_thread = pthread_self ();
+ xpthread_barrier_wait (&barrier_1);
+ return NULL;
+}
+
+static int
+do_test (void)
+{
+ xsignal (SIGUSR1, SIG_IGN);
+
+ pthread_t thr_timeout = xpthread_create (NULL, timeout_thread_function, NULL);
+
+ pthread_t threads[4];
+ xpthread_barrier_init (&barrier_1, NULL, array_length (threads) + 2);
+ xpthread_barrier_init (&barrier_2, NULL, array_length (threads) + 1);
+
+ for (int i = 0; i < array_length (threads); ++i)
+ threads[i] = xpthread_create (NULL, sender_thread_function, NULL);
+
+ while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED))
+ {
+ xpthread_create (NULL, target_thread_function, NULL);
+
+ /* Wait for the target thread to be set up and signal sending to
+ start. */
+ xpthread_barrier_wait (&barrier_1);
+
+ /* Wait for signal sending to complete. */
+ xpthread_barrier_wait (&barrier_2);
+
+ xpthread_join (target_thread);
+ }
+
+ for (int i = 0; i < array_length (threads); ++i)
+ xpthread_join (threads[i]);
+ xpthread_join (thr_timeout);
+
+ return 0;
+}
+
+#include <support/test-driver.c>
--
2.31.1
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889)
2021-08-17 13:51 ` [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) Florian Weimer
@ 2021-08-19 16:12 ` Adhemerval Zanella
2021-08-19 16:37 ` Florian Weimer
0 siblings, 1 reply; 10+ messages in thread
From: Adhemerval Zanella @ 2021-08-19 16:12 UTC (permalink / raw)
To: libc-alpha
On 17/08/2021 10:51, Florian Weimer via Libc-alpha wrote:
> Thread exit is delayed until all pending pthread_kill operations
> on the thread have completed. This avoids sending signals to the
> wrong thread or a spurious ESRCH error.
>
> The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived
> from a downstream test originally written by Marek Polacek.
And I really hope we could first sort out the BZ#19951 and move away
of using pthread::tid as the synchronization member to check thread
termination. I sent a possible fix [1], but it does not fully handle
the pthread::tid access, it would require to a proper lock to
synchronize between thread exit and pthread_kill (as Rich has suggested
on BZ#12889).
I will need to send an updated version of my pthread fixes patchset,
since clone3 changes broke it and INVALID_TD_P needs fixing as well.
[1] https://patchwork.sourceware.org/project/glibc/patch/20210610193639.3650754-5-adhemerval.zanella@linaro.org/
> ---
> nptl/allocatestack.c | 1 +
> nptl/descr.h | 12 ++
> nptl/pthread_create.c | 20 ++++
> nptl/pthread_kill.c | 23 +++-
> support/Makefile | 3 +-
> sysdeps/pthread/Makefile | 2 +
> sysdeps/pthread/tst-kill4.c | 90 --------------
> .../pthread/tst-pthread_cancel-select-loop.c | 87 ++++++++++++++
> sysdeps/pthread/tst-pthread_kill-exiting.c | 110 ++++++++++++++++++
> 9 files changed, 254 insertions(+), 94 deletions(-)
> delete mode 100644 sysdeps/pthread/tst-kill4.c
> create mode 100644 sysdeps/pthread/tst-pthread_cancel-select-loop.c
> create mode 100644 sysdeps/pthread/tst-pthread_kill-exiting.c
>
> diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c
> index cfe37a3443..344f7f8ec5 100644
> --- a/nptl/allocatestack.c
> +++ b/nptl/allocatestack.c
> @@ -127,6 +127,7 @@ get_cached_stack (size_t *sizep, void **memp)
> /* No pending event. */
> result->nextevent = NULL;
>
> + result->exit_signal_futex = 0;
> result->tls_state = (struct tls_internal_t) { 0 };
>
> /* Clear the DTV. */
> diff --git a/nptl/descr.h b/nptl/descr.h
> index c85778d449..7255d1dc77 100644
> --- a/nptl/descr.h
> +++ b/nptl/descr.h
> @@ -396,6 +396,18 @@ struct pthread
> PTHREAD_CANCEL_ASYNCHRONOUS). */
> unsigned char canceltype;
>
> + /* Futex to prevent thread exit during concurrent pthread_kill. The
> + least-significant bit is an exit flag. The remaining bits count
> + the number of pthread_kill operations in progress.
> +
> + Upon exit, the thread bit sets the LSB, and waits until the futex
> + variable is 1.
> +
> + pthread_kill atomically adds 2 to the futex variable before the
> + signal-sending operation, subtracts 2 after it, and skips sending
> + the signal if the least-significant bit is set. */
> + unsigned int exit_signal_futex;
> +
> /* Used on strsignal. */
> struct tls_internal_t tls_state;
>
> diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
> index d8ec299cb1..7e1a3cfcdd 100644
> --- a/nptl/pthread_create.c
> +++ b/nptl/pthread_create.c
> @@ -37,6 +37,7 @@
> #include <sys/single_threaded.h>
> #include <version.h>
> #include <clone_internal.h>
> +#include <futex-internal.h>
>
> #include <shlib-compat.h>
>
> @@ -485,6 +486,25 @@ start_thread (void *arg)
> /* This was the last thread. */
> exit (0);
>
> + /* Blocking signals is required to make pthread_kill
> + async-signal-safe. It prevents sending a signal from this thread
> + to itself during its final stages. This must come after the exit
> + call above because atexit handlers must not run with signals
> + blocked. */
> + __libc_signal_block_all (NULL);
> +
> + /* Delay exiting this thread until there are no concurrent
> + pthread_kill operations in progress. */
> + {
> + /* Indicate that no further signals should be sent. */
> + atomic_fetch_or_release (&pd->exit_signal_futex, 1);
> +
> + /* Wait until there are no pthread_kill operations left. */
> + unsigned int val;
> + while ((val = atomic_load_acquire (&pd->exit_signal_futex)) != 1)
> + futex_wait (&pd->exit_signal_futex, val, FUTEX_PRIVATE);
> + }
> +
> #ifndef __ASSUME_SET_ROBUST_LIST
> /* If this thread has any robust mutexes locked, handle them now. */
> # if __PTHREAD_MUTEX_HAVE_PREV
> diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c
> index 5d4c86f920..b67bd4b769 100644
> --- a/nptl/pthread_kill.c
> +++ b/nptl/pthread_kill.c
> @@ -18,6 +18,7 @@
>
> #include <unistd.h>
> #include <pthreadP.h>
> +#include <futex-internal.h>
> #include <shlib-compat.h>
>
> int
> @@ -41,9 +42,25 @@ __pthread_kill_internal (pthread_t threadid, int signo)
> {
> pid_t pid = __getpid ();
>
> - val = INTERNAL_SYSCALL_CALL (tgkill, pid, tid, signo);
> - val = (INTERNAL_SYSCALL_ERROR_P (val)
> - ? INTERNAL_SYSCALL_ERRNO (val) : 0);
> + if (atomic_fetch_add_acq_rel (&pd->exit_signal_futex, 2) & 1)
> + /* If the thread is exiting, it has already blocked all
> + signals, so sending the signal does not have any effect.
> + Not sending the signal avoids a spurious ESRCH from tgkill
> + if the thread manages to complete its exit before the
> + signal can be sent. */
> + val = 0;
> + else
> + {
> + val = INTERNAL_SYSCALL_CALL (tgkill, pid, tid, signo);
> + val = (INTERNAL_SYSCALL_ERROR_P (val)
> + ? INTERNAL_SYSCALL_ERRNO (val) : 0);
> + }
> +
> + /* A value 3 before the update indicates that the thread is
> + exiting, and this is the last remaining concurrent
> + pthread_kill operation. */
> + if (atomic_fetch_add_acq_rel (&pd->exit_signal_futex, -2) == 3)
> + futex_wake (&pd->exit_signal_futex, 1, FUTEX_PRIVATE);
> }
> else
> /* The kernel reports that the thread has exited. POSIX specifies
> diff --git a/support/Makefile b/support/Makefile
> index a462781718..ef2b1a980a 100644
> --- a/support/Makefile
> +++ b/support/Makefile
> @@ -82,9 +82,10 @@ libsupport-routines = \
> support_test_compare_blob \
> support_test_compare_failure \
> support_test_compare_string \
> - support_write_file_string \
> support_test_main \
> support_test_verify_impl \
> + support_wait_for_thread_exit \
> + support_write_file_string \
> temp_file \
> timespec \
> timespec-time64 \
> diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile
> index dedfa0d290..48dba717a1 100644
> --- a/sysdeps/pthread/Makefile
> +++ b/sysdeps/pthread/Makefile
> @@ -119,7 +119,9 @@ tests += tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \
> tst-unwind-thread \
> tst-pt-vfork1 tst-pt-vfork2 tst-vfork1x tst-vfork2x \
> tst-pthread_cancel-exited \
> + tst-pthread_cancel-select-loop \
> tst-pthread_kill-exited \
> + tst-pthread_kill-exiting \
> # tests
>
> tests-time64 := \
> diff --git a/sysdeps/pthread/tst-kill4.c b/sysdeps/pthread/tst-kill4.c
> deleted file mode 100644
> index 9563939792..0000000000
> --- a/sysdeps/pthread/tst-kill4.c
> +++ /dev/null
> @@ -1,90 +0,0 @@
> -/* Copyright (C) 2003-2021 Free Software Foundation, Inc.
> - This file is part of the GNU C Library.
> - Contributed by Ulrich Drepper <drepper@redhat.com>, 2003.
> -
> - The GNU C Library is free software; you can redistribute it and/or
> - modify it under the terms of the GNU Lesser General Public
> - License as published by the Free Software Foundation; either
> - version 2.1 of the License, or (at your option) any later version.
> -
> - The GNU C Library is distributed in the hope that it will be useful,
> - but WITHOUT ANY WARRANTY; without even the implied warranty of
> - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> - Lesser General Public License for more details.
> -
> - You should have received a copy of the GNU Lesser General Public
> - License along with the GNU C Library; if not, see
> - <https://www.gnu.org/licenses/>. */
> -
> -#include <errno.h>
> -#include <pthread.h>
> -#include <signal.h>
> -#include <stdio.h>
> -#include <stdlib.h>
> -#include <unistd.h>
> -
> -
> -static void *
> -tf (void *a)
> -{
> - return NULL;
> -}
> -
> -
> -int
> -do_test (void)
> -{
> - pthread_attr_t at;
> - if (pthread_attr_init (&at) != 0)
> - {
> - puts ("attr_create failed");
> - exit (1);
> - }
> -
> - /* Limit thread stack size, because if it is too large, pthread_join
> - will free it immediately rather than put it into stack cache. */
> - if (pthread_attr_setstacksize (&at, 2 * 1024 * 1024) != 0)
> - {
> - puts ("setstacksize failed");
> - exit (1);
> - }
> -
> - pthread_t th;
> - if (pthread_create (&th, &at, tf, NULL) != 0)
> - {
> - puts ("create failed");
> - exit (1);
> - }
> -
> - pthread_attr_destroy (&at);
> -
> - if (pthread_join (th, NULL) != 0)
> - {
> - puts ("join failed");
> - exit (1);
> - }
> -
> - /* The following only works because we assume here something about
> - the implementation. Namely, that the memory allocated for the
> - thread descriptor is not going away, that the TID field is
> - cleared and therefore the signal is sent to process 0, and that
> - we can savely assume there is no other process with this ID at
> - that time. */
> - int e = pthread_kill (th, 0);
> - if (e == 0)
> - {
> - puts ("pthread_kill succeeded");
> - exit (1);
> - }
> - if (e != ESRCH)
> - {
> - puts ("pthread_kill didn't return ESRCH");
> - exit (1);
> - }
> -
> - return 0;
> -}
> -
> -
> -#define TEST_FUNCTION do_test ()
> -#include "../test-skeleton.c"
> diff --git a/sysdeps/pthread/tst-pthread_cancel-select-loop.c b/sysdeps/pthread/tst-pthread_cancel-select-loop.c
> new file mode 100644
> index 0000000000..a62087589c
> --- /dev/null
> +++ b/sysdeps/pthread/tst-pthread_cancel-select-loop.c
> @@ -0,0 +1,87 @@
> +/* Test that pthread_cancel succeeds during thread exit.
> + Copyright (C) 2021 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <https://www.gnu.org/licenses/>. */
> +
> +/* This test tries to trigger an internal race condition in
> + pthread_cancel, where the cancellation signal is sent after the
> + thread has begun the cancellation process. This can result in a
> + spurious ESRCH error. For the original bug 12889, the window is
> + quite small, so the bug was not reproduced in every run. */
> +
> +#include <stdbool.h>
> +#include <stddef.h>
> +#include <support/check.h>
> +#include <support/xthread.h>
> +#include <support/xunistd.h>
> +#include <sys/select.h>
> +#include <unistd.h>
> +
> +/* Set to true by timeout_thread_function when the test should
> + terminate. */
> +static bool timeout;
> +
> +static void *
> +timeout_thread_function (void *unused)
> +{
> + usleep (5 * 1000 * 1000);
> + __atomic_store_n (&timeout, true, __ATOMIC_RELAXED);
> + return NULL;
> +}
> +
> +/* Used for blocking the select function below. */
> +static int pipe_fds[2];
> +
> +static void *
> +canceled_thread_function (void *unused)
> +{
> + while (true)
> + {
> + fd_set rfs;
> + fd_set wfs;
> + fd_set efs;
> + FD_ZERO (&rfs);
> + FD_ZERO (&wfs);
> + FD_ZERO (&efs);
> + FD_SET (pipe_fds[0], &rfs);
> +
> + /* If the cancellation request is recognized early, the thread
> + begins exiting while the cancellation signal arrives. */
> + select (FD_SETSIZE, &rfs, &wfs, &efs, NULL);
> + }
> + return NULL;
> +}
> +
> +static int
> +do_test (void)
> +{
> + xpipe (pipe_fds);
> + pthread_t thr_timeout = xpthread_create (NULL, timeout_thread_function, NULL);
> +
> + while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED))
> + {
> + pthread_t thr = xpthread_create (NULL, canceled_thread_function, NULL);
> + xpthread_cancel (thr);
> + TEST_VERIFY (xpthread_join (thr) == PTHREAD_CANCELED);
> + }
> +
> + xpthread_join (thr_timeout);
> + xclose (pipe_fds[0]);
> + xclose (pipe_fds[1]);
> + return 0;
> +}
> +
> +#include <support/test-driver.c>
> diff --git a/sysdeps/pthread/tst-pthread_kill-exiting.c b/sysdeps/pthread/tst-pthread_kill-exiting.c
> new file mode 100644
> index 0000000000..15550d2310
> --- /dev/null
> +++ b/sysdeps/pthread/tst-pthread_kill-exiting.c
> @@ -0,0 +1,110 @@
> +/* Test that pthread_kill succeeds during thread exit.
> + Copyright (C) 2021 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <https://www.gnu.org/licenses/>. */
> +
> +/* This test verifies that pthread_kill for a thread that is exiting
> + succeeds (with or without actually delivering the signal). */
> +
> +#include <array_length.h>
> +#include <stdbool.h>
> +#include <stddef.h>
> +#include <support/xsignal.h>
> +#include <support/xthread.h>
> +#include <unistd.h>
> +
> +/* Set to true by timeout_thread_function when the test should
> + terminate. */
> +static bool timeout;
> +
> +static void *
> +timeout_thread_function (void *unused)
> +{
> + usleep (1000 * 1000);
> + __atomic_store_n (&timeout, true, __ATOMIC_RELAXED);
> + return NULL;
> +}
> +
> +/* Used to synchronize the sending threads with the target thread and
> + main thread. */
> +static pthread_barrier_t barrier_1;
> +static pthread_barrier_t barrier_2;
> +
> +/* The target thread to which signals are to be sent. */
> +static pthread_t target_thread;
> +
> +static void *
> +sender_thread_function (void *unused)
> +{
> + while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED))
> + {
> + /* Wait until target_thread has been initialized. The target
> + thread and main thread participate in this barrier. */
> + xpthread_barrier_wait (&barrier_1);
> +
> + xpthread_kill (target_thread, SIGUSR1);
> +
> + /* Communicate that the signal has been sent. The main thread
> + participates in this barrier. */
> + xpthread_barrier_wait (&barrier_2);
> + }
> + return NULL;
> +}
> +
> +static void *
> +target_thread_function (void *unused)
> +{
> + target_thread = pthread_self ();
> + xpthread_barrier_wait (&barrier_1);
> + return NULL;
> +}
> +
> +static int
> +do_test (void)
> +{
> + xsignal (SIGUSR1, SIG_IGN);
> +
> + pthread_t thr_timeout = xpthread_create (NULL, timeout_thread_function, NULL);
> +
> + pthread_t threads[4];
> + xpthread_barrier_init (&barrier_1, NULL, array_length (threads) + 2);
> + xpthread_barrier_init (&barrier_2, NULL, array_length (threads) + 1);
> +
> + for (int i = 0; i < array_length (threads); ++i)
> + threads[i] = xpthread_create (NULL, sender_thread_function, NULL);
> +
> + while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED))
> + {
> + xpthread_create (NULL, target_thread_function, NULL);
> +
> + /* Wait for the target thread to be set up and signal sending to
> + start. */
> + xpthread_barrier_wait (&barrier_1);
> +
> + /* Wait for signal sending to complete. */
> + xpthread_barrier_wait (&barrier_2);
> +
> + xpthread_join (target_thread);
> + }
> +
> + for (int i = 0; i < array_length (threads); ++i)
> + xpthread_join (threads[i]);
> + xpthread_join (thr_timeout);
> +
> + return 0;
> +}
> +
> +#include <support/test-driver.c>
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] support: Add support_wait_for_thread_exit
2021-08-17 13:50 ` [PATCH 1/3] support: Add support_wait_for_thread_exit Florian Weimer
@ 2021-08-19 16:12 ` Adhemerval Zanella
2021-08-19 16:22 ` Florian Weimer
0 siblings, 1 reply; 10+ messages in thread
From: Adhemerval Zanella @ 2021-08-19 16:12 UTC (permalink / raw)
To: Florian Weimer, libc-alpha
On 17/08/2021 10:50, Florian Weimer via Libc-alpha wrote:
> ---
> support/support.h | 4 ++
> support/support_wait_for_thread_exit.c | 67 ++++++++++++++++++++++++++
Missing the Makefile entry for support_wait_for_thread_exit
> 2 files changed, 71 insertions(+)
> create mode 100644 support/support_wait_for_thread_exit.c
>
> diff --git a/support/support.h b/support/support.h
> index 834dba9097..a5978b939a 100644
> --- a/support/support.h
> +++ b/support/support.h
> @@ -174,6 +174,10 @@ timer_t support_create_timer (uint64_t sec, long int nsec, bool repeat,
> /* Disable the timer TIMER. */
> void support_delete_timer (timer_t timer);
>
> +/* Wait until all threads except the current thread have exited (as
> + far as the kernel is concerned). */
> +void support_wait_for_thread_exit (void);
> +
> struct support_stack
> {
> void *stack;
Ok.
> diff --git a/support/support_wait_for_thread_exit.c b/support/support_wait_for_thread_exit.c
> new file mode 100644
> index 0000000000..808901f746
> --- /dev/null
> +++ b/support/support_wait_for_thread_exit.c
> @@ -0,0 +1,67 @@
> +/* Wait until all threads except the current thread has exited.
> + Copyright (C) 2021 Free Software Foundation, Inc.
> + This file is part of the GNU C Library.
> +
> + The GNU C Library is free software; you can redistribute it and/or
> + modify it under the terms of the GNU Lesser General Public
> + License as published by the Free Software Foundation; either
> + version 2.1 of the License, or (at your option) any later version.
> +
> + The GNU C Library is distributed in the hope that it will be useful,
> + but WITHOUT ANY WARRANTY; without even the implied warranty of
> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> + Lesser General Public License for more details.
> +
> + You should have received a copy of the GNU Lesser General Public
> + License along with the GNU C Library; if not, see
> + <https://www.gnu.org/licenses/>. */
> +
> +#include <dirent.h>
> +#include <errno.h>
> +#include <string.h>
> +#include <support/check.h>
> +#include <support/support.h>
> +#include <unistd.h>
> +
> +void
> +support_wait_for_thread_exit (void)
> +{
> + DIR *proc_self_task = opendir ("/proc/self/task");
> + if (proc_self_task == NULL)
> + /* Use a large timeout because we cannot verify that the thread
> + has exited. */
> + usleep (5 * 1000 * 1000);
> + else
> + {
> + int count;
> +
> + wait_again:
> + count = 0;
> + rewinddir (proc_self_task);
> + while (true)
> + {
> + errno = 0;
> + struct dirent *e = readdir (proc_self_task);
> + if (e == NULL && errno != 0)
> + FAIL_EXIT1 ("readdir: %m");
> + if (e == NULL)
> + {
> + /* Only the main thread remains. Testing may continue. */
> + TEST_VERIFY (count >= 1);
> + closedir (proc_self_task);
> + return;
> + }
> +
> + if (strcmp (e->d_name, ".") == 0 || strcmp (e->d_name, "..") == 0)
> + continue;
> +
> + ++count;
> + if (count > 1)
> + {
> + /* Small timeout to give the thread a chance to exit. */
> + usleep (50 * 1000);
> + goto wait_again;
> + }
> + }
> + }
> +}
>
I am not very found of interfaces that poll kernel resources using timeouts,
but I don't think we have a better Linux interface to handle this.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] support: Add support_wait_for_thread_exit
2021-08-19 16:12 ` Adhemerval Zanella
@ 2021-08-19 16:22 ` Florian Weimer
0 siblings, 0 replies; 10+ messages in thread
From: Florian Weimer @ 2021-08-19 16:22 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: libc-alpha
* Adhemerval Zanella:
> On 17/08/2021 10:50, Florian Weimer via Libc-alpha wrote:
>> ---
>> support/support.h | 4 ++
>> support/support_wait_for_thread_exit.c | 67 ++++++++++++++++++++++++++
>
> Missing the Makefile entry for support_wait_for_thread_exit
Huh, right. I must have put that into one of the other patches. Will
reshuffle.
>> +void
>> +support_wait_for_thread_exit (void)
>> +{
>> + DIR *proc_self_task = opendir ("/proc/self/task");
>> + if (proc_self_task == NULL)
>> + /* Use a large timeout because we cannot verify that the thread
>> + has exited. */
>> + usleep (5 * 1000 * 1000);
>> + else
>> + {
>> + int count;
>> +
>> + wait_again:
>> + count = 0;
>> + rewinddir (proc_self_task);
>> + while (true)
>> + {
>> + errno = 0;
>> + struct dirent *e = readdir (proc_self_task);
>> + if (e == NULL && errno != 0)
>> + FAIL_EXIT1 ("readdir: %m");
>> + if (e == NULL)
>> + {
>> + /* Only the main thread remains. Testing may continue. */
>> + TEST_VERIFY (count >= 1);
>> + closedir (proc_self_task);
>> + return;
>> + }
>> +
>> + if (strcmp (e->d_name, ".") == 0 || strcmp (e->d_name, "..") == 0)
>> + continue;
>> +
>> + ++count;
>> + if (count > 1)
>> + {
>> + /* Small timeout to give the thread a chance to exit. */
>> + usleep (50 * 1000);
>> + goto wait_again;
>> + }
>> + }
>> + }
>> +}
>>
>
> I am not very found of interfaces that poll kernel resources using timeouts,
> but I don't think we have a better Linux interface to handle this.
I think we could use pidfd_open and poll once we have a wrapper for
pidfd_open. Not sure if it's possible to get a pollable descriptor via
/proc.
Thanks,
Florian
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889)
2021-08-19 16:12 ` Adhemerval Zanella
@ 2021-08-19 16:37 ` Florian Weimer
2021-08-19 16:49 ` Adhemerval Zanella
0 siblings, 1 reply; 10+ messages in thread
From: Florian Weimer @ 2021-08-19 16:37 UTC (permalink / raw)
To: Adhemerval Zanella via Libc-alpha
* Adhemerval Zanella via Libc-alpha:
> On 17/08/2021 10:51, Florian Weimer via Libc-alpha wrote:
>> Thread exit is delayed until all pending pthread_kill operations
>> on the thread have completed. This avoids sending signals to the
>> wrong thread or a spurious ESRCH error.
>>
>> The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived
>> from a downstream test originally written by Marek Polacek.
>
> And I really hope we could first sort out the BZ#19951 and move away
> of using pthread::tid as the synchronization member to check thread
> termination. I sent a possible fix [1], but it does not fully handle
> the pthread::tid access, it would require to a proper lock to
> synchronize between thread exit and pthread_kill (as Rich has suggested
> on BZ#12889).
>
> I will need to send an updated version of my pthread fixes patchset,
> since clone3 changes broke it and INVALID_TD_P needs fixing as well.
>
> [1] https://patchwork.sourceware.org/project/glibc/patch/20210610193639.3650754-5-adhemerval.zanella@linaro.org/
I think my patch is independent of the other fixes. Its advantage over
other proposals is that it does not add waiting to pthread_kill, so
there is no signal-safety issue.
(But I probably should test if cancellation from a signal handler that
is invoked synchronously via pthread_kill still works.)
Thanks,
Florian
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889)
2021-08-19 16:37 ` Florian Weimer
@ 2021-08-19 16:49 ` Adhemerval Zanella
2021-08-20 16:24 ` Florian Weimer
0 siblings, 1 reply; 10+ messages in thread
From: Adhemerval Zanella @ 2021-08-19 16:49 UTC (permalink / raw)
To: Florian Weimer, Adhemerval Zanella via Libc-alpha
On 19/08/2021 13:37, Florian Weimer wrote:
> * Adhemerval Zanella via Libc-alpha:
>
>> On 17/08/2021 10:51, Florian Weimer via Libc-alpha wrote:
>>> Thread exit is delayed until all pending pthread_kill operations
>>> on the thread have completed. This avoids sending signals to the
>>> wrong thread or a spurious ESRCH error.
>>>
>>> The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived
>>> from a downstream test originally written by Marek Polacek.
>>
>> And I really hope we could first sort out the BZ#19951 and move away
>> of using pthread::tid as the synchronization member to check thread
>> termination. I sent a possible fix [1], but it does not fully handle
>> the pthread::tid access, it would require to a proper lock to
>> synchronize between thread exit and pthread_kill (as Rich has suggested
>> on BZ#12889).
>>
>> I will need to send an updated version of my pthread fixes patchset,
>> since clone3 changes broke it and INVALID_TD_P needs fixing as well.
>>
>> [1] https://patchwork.sourceware.org/project/glibc/patch/20210610193639.3650754-5-adhemerval.zanella@linaro.org/
>
> I think my patch is independent of the other fixes. Its advantage over
> other proposals is that it does not add waiting to pthread_kill, so
> there is no signal-safety issue.
I am not sure, the way I think we should fix to also fix BZ#19951
would to add a lock to access 'tid' and uses not only on pthread_kill,
but also on pthread_getcpuclockid, pthread_getschedparam,
pthread_setschedparam, and pthread_setschedprio. The lock also
simplifies the code, there is no need to special handling of futex
internal state. I also think pthread_kill would also be simplified.
The lock is just to simplify the code, if we really need to add
some scalability we might use a more clever code as you are proposing.
>
> (But I probably should test if cancellation from a signal handler that
> is invoked synchronously via pthread_kill still works.)
>
> Thanks,
> Florian
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889)
2021-08-19 16:49 ` Adhemerval Zanella
@ 2021-08-20 16:24 ` Florian Weimer
0 siblings, 0 replies; 10+ messages in thread
From: Florian Weimer @ 2021-08-20 16:24 UTC (permalink / raw)
To: Adhemerval Zanella; +Cc: Adhemerval Zanella via Libc-alpha
* Adhemerval Zanella:
> On 19/08/2021 13:37, Florian Weimer wrote:
>> * Adhemerval Zanella via Libc-alpha:
>>
>>> On 17/08/2021 10:51, Florian Weimer via Libc-alpha wrote:
>>>> Thread exit is delayed until all pending pthread_kill operations
>>>> on the thread have completed. This avoids sending signals to the
>>>> wrong thread or a spurious ESRCH error.
>>>>
>>>> The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived
>>>> from a downstream test originally written by Marek Polacek.
>>>
>>> And I really hope we could first sort out the BZ#19951 and move away
>>> of using pthread::tid as the synchronization member to check thread
>>> termination. I sent a possible fix [1], but it does not fully handle
>>> the pthread::tid access, it would require to a proper lock to
>>> synchronize between thread exit and pthread_kill (as Rich has suggested
>>> on BZ#12889).
>>>
>>> I will need to send an updated version of my pthread fixes patchset,
>>> since clone3 changes broke it and INVALID_TD_P needs fixing as well.
>>>
>>> [1] https://patchwork.sourceware.org/project/glibc/patch/20210610193639.3650754-5-adhemerval.zanella@linaro.org/
>>
>> I think my patch is independent of the other fixes. Its advantage over
>> other proposals is that it does not add waiting to pthread_kill, so
>> there is no signal-safety issue.
>
> I am not sure, the way I think we should fix to also fix BZ#19951
> would to add a lock to access 'tid' and uses not only on pthread_kill,
> but also on pthread_getcpuclockid, pthread_getschedparam,
> pthread_setschedparam, and pthread_setschedprio. The lock also
> simplifies the code, there is no need to special handling of futex
> internal state. I also think pthread_kill would also be simplified.
>
> The lock is just to simplify the code, if we really need to add
> some scalability we might use a more clever code as you are proposing.
I don't think we can wait for the fix for bug 19951 because glibc 2.34
has a real regression in pthread_cancel: it can fail spuriously with
ESRCH. And it can happen with a single pthread_cancel call in the
program, you don't even need concurrent pthread_cancel to trigger this
bug now.
I have moved the magic no-TID-reuse functionality into a separate
function in the new version, so we can replace it once we have something
better.
Thanks,
Florian
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2021-08-20 16:24 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-17 13:50 [PATCH 0/3] Fix ESRCH issues in pthread_cancel, pthread_kill Florian Weimer
2021-08-17 13:50 ` [PATCH 1/3] support: Add support_wait_for_thread_exit Florian Weimer
2021-08-19 16:12 ` Adhemerval Zanella
2021-08-19 16:22 ` Florian Weimer
2021-08-17 13:51 ` [PATCH 2/3] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) Florian Weimer
2021-08-17 13:51 ` [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) Florian Weimer
2021-08-19 16:12 ` Adhemerval Zanella
2021-08-19 16:37 ` Florian Weimer
2021-08-19 16:49 ` Adhemerval Zanella
2021-08-20 16:24 ` Florian Weimer
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).