* [PATCH 0/3] Fix ESRCH issues in pthread_cancel, pthread_kill @ 2021-08-17 13:50 Florian Weimer 2021-08-17 13:50 ` [PATCH 1/3] support: Add support_wait_for_thread_exit Florian Weimer ` (2 more replies) 0 siblings, 3 replies; 10+ messages in thread From: Florian Weimer @ 2021-08-17 13:50 UTC (permalink / raw) To: libc-alpha This mini-series came out of a downstream bug report: glibc: pthread_cancel fails with ESRCH yet subsequent pthread_join passes <https://bugzilla.redhat.com/show_bug.cgi?id=1994068> Apparently the race is much easier to trigger due to the changes in glibc 2.34. Tested on i686-linux-gnu, x86_64-linux-gnu. Built with build-many-glibcs.py. Thanks, Florian Florian Weimer (3): support: Add support_wait_for_thread_exit nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) nptl: Fix race between pthread_kill and thread exit (bug 12889) nptl/allocatestack.c | 1 + nptl/descr.h | 12 ++ nptl/pthread_cancel.c | 9 +- nptl/pthread_create.c | 20 ++++ nptl/pthread_kill.c | 30 ++++- support/Makefile | 3 +- support/support.h | 4 + support/support_wait_for_thread_exit.c | 67 +++++++++++ sysdeps/pthread/Makefile | 7 +- sysdeps/pthread/tst-kill4.c | 90 -------------- sysdeps/pthread/tst-pthread_cancel-exited.c | 45 +++++++ .../pthread/tst-pthread_cancel-select-loop.c | 87 ++++++++++++++ sysdeps/pthread/tst-pthread_kill-exited.c | 46 ++++++++ sysdeps/pthread/tst-pthread_kill-exiting.c | 110 ++++++++++++++++++ 14 files changed, 431 insertions(+), 100 deletions(-) create mode 100644 support/support_wait_for_thread_exit.c delete mode 100644 sysdeps/pthread/tst-kill4.c create mode 100644 sysdeps/pthread/tst-pthread_cancel-exited.c create mode 100644 sysdeps/pthread/tst-pthread_cancel-select-loop.c create mode 100644 sysdeps/pthread/tst-pthread_kill-exited.c create mode 100644 sysdeps/pthread/tst-pthread_kill-exiting.c -- 2.31.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 1/3] support: Add support_wait_for_thread_exit 2021-08-17 13:50 [PATCH 0/3] Fix ESRCH issues in pthread_cancel, pthread_kill Florian Weimer @ 2021-08-17 13:50 ` Florian Weimer 2021-08-19 16:12 ` Adhemerval Zanella 2021-08-17 13:51 ` [PATCH 2/3] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) Florian Weimer 2021-08-17 13:51 ` [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) Florian Weimer 2 siblings, 1 reply; 10+ messages in thread From: Florian Weimer @ 2021-08-17 13:50 UTC (permalink / raw) To: libc-alpha --- support/support.h | 4 ++ support/support_wait_for_thread_exit.c | 67 ++++++++++++++++++++++++++ 2 files changed, 71 insertions(+) create mode 100644 support/support_wait_for_thread_exit.c diff --git a/support/support.h b/support/support.h index 834dba9097..a5978b939a 100644 --- a/support/support.h +++ b/support/support.h @@ -174,6 +174,10 @@ timer_t support_create_timer (uint64_t sec, long int nsec, bool repeat, /* Disable the timer TIMER. */ void support_delete_timer (timer_t timer); +/* Wait until all threads except the current thread have exited (as + far as the kernel is concerned). */ +void support_wait_for_thread_exit (void); + struct support_stack { void *stack; diff --git a/support/support_wait_for_thread_exit.c b/support/support_wait_for_thread_exit.c new file mode 100644 index 0000000000..808901f746 --- /dev/null +++ b/support/support_wait_for_thread_exit.c @@ -0,0 +1,67 @@ +/* Wait until all threads except the current thread has exited. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#include <dirent.h> +#include <errno.h> +#include <string.h> +#include <support/check.h> +#include <support/support.h> +#include <unistd.h> + +void +support_wait_for_thread_exit (void) +{ + DIR *proc_self_task = opendir ("/proc/self/task"); + if (proc_self_task == NULL) + /* Use a large timeout because we cannot verify that the thread + has exited. */ + usleep (5 * 1000 * 1000); + else + { + int count; + + wait_again: + count = 0; + rewinddir (proc_self_task); + while (true) + { + errno = 0; + struct dirent *e = readdir (proc_self_task); + if (e == NULL && errno != 0) + FAIL_EXIT1 ("readdir: %m"); + if (e == NULL) + { + /* Only the main thread remains. Testing may continue. */ + TEST_VERIFY (count >= 1); + closedir (proc_self_task); + return; + } + + if (strcmp (e->d_name, ".") == 0 || strcmp (e->d_name, "..") == 0) + continue; + + ++count; + if (count > 1) + { + /* Small timeout to give the thread a chance to exit. */ + usleep (50 * 1000); + goto wait_again; + } + } + } +} -- 2.31.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] support: Add support_wait_for_thread_exit 2021-08-17 13:50 ` [PATCH 1/3] support: Add support_wait_for_thread_exit Florian Weimer @ 2021-08-19 16:12 ` Adhemerval Zanella 2021-08-19 16:22 ` Florian Weimer 0 siblings, 1 reply; 10+ messages in thread From: Adhemerval Zanella @ 2021-08-19 16:12 UTC (permalink / raw) To: Florian Weimer, libc-alpha On 17/08/2021 10:50, Florian Weimer via Libc-alpha wrote: > --- > support/support.h | 4 ++ > support/support_wait_for_thread_exit.c | 67 ++++++++++++++++++++++++++ Missing the Makefile entry for support_wait_for_thread_exit > 2 files changed, 71 insertions(+) > create mode 100644 support/support_wait_for_thread_exit.c > > diff --git a/support/support.h b/support/support.h > index 834dba9097..a5978b939a 100644 > --- a/support/support.h > +++ b/support/support.h > @@ -174,6 +174,10 @@ timer_t support_create_timer (uint64_t sec, long int nsec, bool repeat, > /* Disable the timer TIMER. */ > void support_delete_timer (timer_t timer); > > +/* Wait until all threads except the current thread have exited (as > + far as the kernel is concerned). */ > +void support_wait_for_thread_exit (void); > + > struct support_stack > { > void *stack; Ok. > diff --git a/support/support_wait_for_thread_exit.c b/support/support_wait_for_thread_exit.c > new file mode 100644 > index 0000000000..808901f746 > --- /dev/null > +++ b/support/support_wait_for_thread_exit.c > @@ -0,0 +1,67 @@ > +/* Wait until all threads except the current thread has exited. > + Copyright (C) 2021 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <https://www.gnu.org/licenses/>. */ > + > +#include <dirent.h> > +#include <errno.h> > +#include <string.h> > +#include <support/check.h> > +#include <support/support.h> > +#include <unistd.h> > + > +void > +support_wait_for_thread_exit (void) > +{ > + DIR *proc_self_task = opendir ("/proc/self/task"); > + if (proc_self_task == NULL) > + /* Use a large timeout because we cannot verify that the thread > + has exited. */ > + usleep (5 * 1000 * 1000); > + else > + { > + int count; > + > + wait_again: > + count = 0; > + rewinddir (proc_self_task); > + while (true) > + { > + errno = 0; > + struct dirent *e = readdir (proc_self_task); > + if (e == NULL && errno != 0) > + FAIL_EXIT1 ("readdir: %m"); > + if (e == NULL) > + { > + /* Only the main thread remains. Testing may continue. */ > + TEST_VERIFY (count >= 1); > + closedir (proc_self_task); > + return; > + } > + > + if (strcmp (e->d_name, ".") == 0 || strcmp (e->d_name, "..") == 0) > + continue; > + > + ++count; > + if (count > 1) > + { > + /* Small timeout to give the thread a chance to exit. */ > + usleep (50 * 1000); > + goto wait_again; > + } > + } > + } > +} > I am not very found of interfaces that poll kernel resources using timeouts, but I don't think we have a better Linux interface to handle this. ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 1/3] support: Add support_wait_for_thread_exit 2021-08-19 16:12 ` Adhemerval Zanella @ 2021-08-19 16:22 ` Florian Weimer 0 siblings, 0 replies; 10+ messages in thread From: Florian Weimer @ 2021-08-19 16:22 UTC (permalink / raw) To: Adhemerval Zanella; +Cc: libc-alpha * Adhemerval Zanella: > On 17/08/2021 10:50, Florian Weimer via Libc-alpha wrote: >> --- >> support/support.h | 4 ++ >> support/support_wait_for_thread_exit.c | 67 ++++++++++++++++++++++++++ > > Missing the Makefile entry for support_wait_for_thread_exit Huh, right. I must have put that into one of the other patches. Will reshuffle. >> +void >> +support_wait_for_thread_exit (void) >> +{ >> + DIR *proc_self_task = opendir ("/proc/self/task"); >> + if (proc_self_task == NULL) >> + /* Use a large timeout because we cannot verify that the thread >> + has exited. */ >> + usleep (5 * 1000 * 1000); >> + else >> + { >> + int count; >> + >> + wait_again: >> + count = 0; >> + rewinddir (proc_self_task); >> + while (true) >> + { >> + errno = 0; >> + struct dirent *e = readdir (proc_self_task); >> + if (e == NULL && errno != 0) >> + FAIL_EXIT1 ("readdir: %m"); >> + if (e == NULL) >> + { >> + /* Only the main thread remains. Testing may continue. */ >> + TEST_VERIFY (count >= 1); >> + closedir (proc_self_task); >> + return; >> + } >> + >> + if (strcmp (e->d_name, ".") == 0 || strcmp (e->d_name, "..") == 0) >> + continue; >> + >> + ++count; >> + if (count > 1) >> + { >> + /* Small timeout to give the thread a chance to exit. */ >> + usleep (50 * 1000); >> + goto wait_again; >> + } >> + } >> + } >> +} >> > > I am not very found of interfaces that poll kernel resources using timeouts, > but I don't think we have a better Linux interface to handle this. I think we could use pidfd_open and poll once we have a wrapper for pidfd_open. Not sure if it's possible to get a pollable descriptor via /proc. Thanks, Florian ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 2/3] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) 2021-08-17 13:50 [PATCH 0/3] Fix ESRCH issues in pthread_cancel, pthread_kill Florian Weimer 2021-08-17 13:50 ` [PATCH 1/3] support: Add support_wait_for_thread_exit Florian Weimer @ 2021-08-17 13:51 ` Florian Weimer 2021-08-17 13:51 ` [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) Florian Weimer 2 siblings, 0 replies; 10+ messages in thread From: Florian Weimer @ 2021-08-17 13:51 UTC (permalink / raw) To: libc-alpha This closes one remaining race condition related to bug 12889: if the thread already exited on the kernel side, returning ESRCH is not correct because that error is reserved for the thread IDs (pthread_t values) whose lifetime has ended. In case of a kernel-side exit and a valid thread ID, no signal needs to be sent and cancellation does not have an effect, so just return 0. sysdeps/pthread/tst-kill4.c triggers undefined behavior and is removed with this commit. --- nptl/pthread_cancel.c | 9 ++-- nptl/pthread_kill.c | 7 +++- sysdeps/pthread/Makefile | 5 ++- sysdeps/pthread/tst-pthread_cancel-exited.c | 45 ++++++++++++++++++++ sysdeps/pthread/tst-pthread_kill-exited.c | 46 +++++++++++++++++++++ 5 files changed, 106 insertions(+), 6 deletions(-) create mode 100644 sysdeps/pthread/tst-pthread_cancel-exited.c create mode 100644 sysdeps/pthread/tst-pthread_kill-exited.c diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c index cc25ff21f3..9bac6e3b76 100644 --- a/nptl/pthread_cancel.c +++ b/nptl/pthread_cancel.c @@ -62,10 +62,11 @@ __pthread_cancel (pthread_t th) { volatile struct pthread *pd = (volatile struct pthread *) th; - /* Make sure the descriptor is valid. */ - if (INVALID_TD_P (pd)) - /* Not a valid thread handle. */ - return ESRCH; + if (pd->tid == 0) + /* The thread has already exited on the kernel side. Its outcome + (regular exit, other cancelation) has already been + determined. */ + return 0; static int init_sigcancel = 0; if (atomic_load_relaxed (&init_sigcancel) == 0) diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c index f79a2b26fc..5d4c86f920 100644 --- a/nptl/pthread_kill.c +++ b/nptl/pthread_kill.c @@ -46,7 +46,12 @@ __pthread_kill_internal (pthread_t threadid, int signo) ? INTERNAL_SYSCALL_ERRNO (val) : 0); } else - val = ESRCH; + /* The kernel reports that the thread has exited. POSIX specifies + the ESRCH error only for the case when the lifetime of a thread + ID has ended, but calling pthread_kill on such a thread ID is + undefined in glibc. Therefore, do not treat kernel thread exit + as an error. */ + val = 0; return val; } diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile index 42f9fc5072..dedfa0d290 100644 --- a/sysdeps/pthread/Makefile +++ b/sysdeps/pthread/Makefile @@ -89,7 +89,7 @@ tests += tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \ tst-join8 tst-join9 tst-join10 tst-join11 tst-join12 tst-join13 \ tst-join14 tst-join15 \ tst-key1 tst-key2 tst-key3 tst-key4 \ - tst-kill1 tst-kill2 tst-kill3 tst-kill4 tst-kill5 tst-kill6 \ + tst-kill1 tst-kill2 tst-kill3 tst-kill5 tst-kill6 \ tst-locale1 tst-locale2 \ tst-memstream \ tst-mutex-errorcheck tst-mutex1 tst-mutex2 tst-mutex3 tst-mutex4 \ @@ -118,6 +118,9 @@ tests += tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \ tst-unload \ tst-unwind-thread \ tst-pt-vfork1 tst-pt-vfork2 tst-vfork1x tst-vfork2x \ + tst-pthread_cancel-exited \ + tst-pthread_kill-exited \ + # tests tests-time64 := \ tst-abstime-time64 \ diff --git a/sysdeps/pthread/tst-pthread_cancel-exited.c b/sysdeps/pthread/tst-pthread_cancel-exited.c new file mode 100644 index 0000000000..811c9bee07 --- /dev/null +++ b/sysdeps/pthread/tst-pthread_cancel-exited.c @@ -0,0 +1,45 @@ +/* Test that pthread_kill succeeds for an exited thread. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +/* This test verifies that pthread_kill returns 0 (and not ESRCH) for + a thread that has exited on the kernel side. */ + +#include <stddef.h> +#include <support/support.h> +#include <support/xthread.h> + +static void * +noop_thread (void *closure) +{ + return NULL; +} + +static int +do_test (void) +{ + pthread_t thr = xpthread_create (NULL, noop_thread, NULL); + + support_wait_for_thread_exit (); + + xpthread_cancel (thr); + xpthread_join (thr); + + return 0; +} + +#include <support/test-driver.c> diff --git a/sysdeps/pthread/tst-pthread_kill-exited.c b/sysdeps/pthread/tst-pthread_kill-exited.c new file mode 100644 index 0000000000..7575fb6d58 --- /dev/null +++ b/sysdeps/pthread/tst-pthread_kill-exited.c @@ -0,0 +1,46 @@ +/* Test that pthread_kill succeeds for an exited thread. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +/* This test verifies that pthread_kill returns 0 (and not ESRCH) for + a thread that has exited on the kernel side. */ + +#include <signal.h> +#include <stddef.h> +#include <support/support.h> +#include <support/xthread.h> + +static void * +noop_thread (void *closure) +{ + return NULL; +} + +static int +do_test (void) +{ + pthread_t thr = xpthread_create (NULL, noop_thread, NULL); + + support_wait_for_thread_exit (); + + xpthread_kill (thr, SIGUSR1); + xpthread_join (thr); + + return 0; +} + +#include <support/test-driver.c> -- 2.31.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) 2021-08-17 13:50 [PATCH 0/3] Fix ESRCH issues in pthread_cancel, pthread_kill Florian Weimer 2021-08-17 13:50 ` [PATCH 1/3] support: Add support_wait_for_thread_exit Florian Weimer 2021-08-17 13:51 ` [PATCH 2/3] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) Florian Weimer @ 2021-08-17 13:51 ` Florian Weimer 2021-08-19 16:12 ` Adhemerval Zanella 2 siblings, 1 reply; 10+ messages in thread From: Florian Weimer @ 2021-08-17 13:51 UTC (permalink / raw) To: libc-alpha; +Cc: Marek Polacek Thread exit is delayed until all pending pthread_kill operations on the thread have completed. This avoids sending signals to the wrong thread or a spurious ESRCH error. The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived from a downstream test originally written by Marek Polacek. --- nptl/allocatestack.c | 1 + nptl/descr.h | 12 ++ nptl/pthread_create.c | 20 ++++ nptl/pthread_kill.c | 23 +++- support/Makefile | 3 +- sysdeps/pthread/Makefile | 2 + sysdeps/pthread/tst-kill4.c | 90 -------------- .../pthread/tst-pthread_cancel-select-loop.c | 87 ++++++++++++++ sysdeps/pthread/tst-pthread_kill-exiting.c | 110 ++++++++++++++++++ 9 files changed, 254 insertions(+), 94 deletions(-) delete mode 100644 sysdeps/pthread/tst-kill4.c create mode 100644 sysdeps/pthread/tst-pthread_cancel-select-loop.c create mode 100644 sysdeps/pthread/tst-pthread_kill-exiting.c diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c index cfe37a3443..344f7f8ec5 100644 --- a/nptl/allocatestack.c +++ b/nptl/allocatestack.c @@ -127,6 +127,7 @@ get_cached_stack (size_t *sizep, void **memp) /* No pending event. */ result->nextevent = NULL; + result->exit_signal_futex = 0; result->tls_state = (struct tls_internal_t) { 0 }; /* Clear the DTV. */ diff --git a/nptl/descr.h b/nptl/descr.h index c85778d449..7255d1dc77 100644 --- a/nptl/descr.h +++ b/nptl/descr.h @@ -396,6 +396,18 @@ struct pthread PTHREAD_CANCEL_ASYNCHRONOUS). */ unsigned char canceltype; + /* Futex to prevent thread exit during concurrent pthread_kill. The + least-significant bit is an exit flag. The remaining bits count + the number of pthread_kill operations in progress. + + Upon exit, the thread bit sets the LSB, and waits until the futex + variable is 1. + + pthread_kill atomically adds 2 to the futex variable before the + signal-sending operation, subtracts 2 after it, and skips sending + the signal if the least-significant bit is set. */ + unsigned int exit_signal_futex; + /* Used on strsignal. */ struct tls_internal_t tls_state; diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c index d8ec299cb1..7e1a3cfcdd 100644 --- a/nptl/pthread_create.c +++ b/nptl/pthread_create.c @@ -37,6 +37,7 @@ #include <sys/single_threaded.h> #include <version.h> #include <clone_internal.h> +#include <futex-internal.h> #include <shlib-compat.h> @@ -485,6 +486,25 @@ start_thread (void *arg) /* This was the last thread. */ exit (0); + /* Blocking signals is required to make pthread_kill + async-signal-safe. It prevents sending a signal from this thread + to itself during its final stages. This must come after the exit + call above because atexit handlers must not run with signals + blocked. */ + __libc_signal_block_all (NULL); + + /* Delay exiting this thread until there are no concurrent + pthread_kill operations in progress. */ + { + /* Indicate that no further signals should be sent. */ + atomic_fetch_or_release (&pd->exit_signal_futex, 1); + + /* Wait until there are no pthread_kill operations left. */ + unsigned int val; + while ((val = atomic_load_acquire (&pd->exit_signal_futex)) != 1) + futex_wait (&pd->exit_signal_futex, val, FUTEX_PRIVATE); + } + #ifndef __ASSUME_SET_ROBUST_LIST /* If this thread has any robust mutexes locked, handle them now. */ # if __PTHREAD_MUTEX_HAVE_PREV diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c index 5d4c86f920..b67bd4b769 100644 --- a/nptl/pthread_kill.c +++ b/nptl/pthread_kill.c @@ -18,6 +18,7 @@ #include <unistd.h> #include <pthreadP.h> +#include <futex-internal.h> #include <shlib-compat.h> int @@ -41,9 +42,25 @@ __pthread_kill_internal (pthread_t threadid, int signo) { pid_t pid = __getpid (); - val = INTERNAL_SYSCALL_CALL (tgkill, pid, tid, signo); - val = (INTERNAL_SYSCALL_ERROR_P (val) - ? INTERNAL_SYSCALL_ERRNO (val) : 0); + if (atomic_fetch_add_acq_rel (&pd->exit_signal_futex, 2) & 1) + /* If the thread is exiting, it has already blocked all + signals, so sending the signal does not have any effect. + Not sending the signal avoids a spurious ESRCH from tgkill + if the thread manages to complete its exit before the + signal can be sent. */ + val = 0; + else + { + val = INTERNAL_SYSCALL_CALL (tgkill, pid, tid, signo); + val = (INTERNAL_SYSCALL_ERROR_P (val) + ? INTERNAL_SYSCALL_ERRNO (val) : 0); + } + + /* A value 3 before the update indicates that the thread is + exiting, and this is the last remaining concurrent + pthread_kill operation. */ + if (atomic_fetch_add_acq_rel (&pd->exit_signal_futex, -2) == 3) + futex_wake (&pd->exit_signal_futex, 1, FUTEX_PRIVATE); } else /* The kernel reports that the thread has exited. POSIX specifies diff --git a/support/Makefile b/support/Makefile index a462781718..ef2b1a980a 100644 --- a/support/Makefile +++ b/support/Makefile @@ -82,9 +82,10 @@ libsupport-routines = \ support_test_compare_blob \ support_test_compare_failure \ support_test_compare_string \ - support_write_file_string \ support_test_main \ support_test_verify_impl \ + support_wait_for_thread_exit \ + support_write_file_string \ temp_file \ timespec \ timespec-time64 \ diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile index dedfa0d290..48dba717a1 100644 --- a/sysdeps/pthread/Makefile +++ b/sysdeps/pthread/Makefile @@ -119,7 +119,9 @@ tests += tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \ tst-unwind-thread \ tst-pt-vfork1 tst-pt-vfork2 tst-vfork1x tst-vfork2x \ tst-pthread_cancel-exited \ + tst-pthread_cancel-select-loop \ tst-pthread_kill-exited \ + tst-pthread_kill-exiting \ # tests tests-time64 := \ diff --git a/sysdeps/pthread/tst-kill4.c b/sysdeps/pthread/tst-kill4.c deleted file mode 100644 index 9563939792..0000000000 --- a/sysdeps/pthread/tst-kill4.c +++ /dev/null @@ -1,90 +0,0 @@ -/* Copyright (C) 2003-2021 Free Software Foundation, Inc. - This file is part of the GNU C Library. - Contributed by Ulrich Drepper <drepper@redhat.com>, 2003. - - The GNU C Library is free software; you can redistribute it and/or - modify it under the terms of the GNU Lesser General Public - License as published by the Free Software Foundation; either - version 2.1 of the License, or (at your option) any later version. - - The GNU C Library is distributed in the hope that it will be useful, - but WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public - License along with the GNU C Library; if not, see - <https://www.gnu.org/licenses/>. */ - -#include <errno.h> -#include <pthread.h> -#include <signal.h> -#include <stdio.h> -#include <stdlib.h> -#include <unistd.h> - - -static void * -tf (void *a) -{ - return NULL; -} - - -int -do_test (void) -{ - pthread_attr_t at; - if (pthread_attr_init (&at) != 0) - { - puts ("attr_create failed"); - exit (1); - } - - /* Limit thread stack size, because if it is too large, pthread_join - will free it immediately rather than put it into stack cache. */ - if (pthread_attr_setstacksize (&at, 2 * 1024 * 1024) != 0) - { - puts ("setstacksize failed"); - exit (1); - } - - pthread_t th; - if (pthread_create (&th, &at, tf, NULL) != 0) - { - puts ("create failed"); - exit (1); - } - - pthread_attr_destroy (&at); - - if (pthread_join (th, NULL) != 0) - { - puts ("join failed"); - exit (1); - } - - /* The following only works because we assume here something about - the implementation. Namely, that the memory allocated for the - thread descriptor is not going away, that the TID field is - cleared and therefore the signal is sent to process 0, and that - we can savely assume there is no other process with this ID at - that time. */ - int e = pthread_kill (th, 0); - if (e == 0) - { - puts ("pthread_kill succeeded"); - exit (1); - } - if (e != ESRCH) - { - puts ("pthread_kill didn't return ESRCH"); - exit (1); - } - - return 0; -} - - -#define TEST_FUNCTION do_test () -#include "../test-skeleton.c" diff --git a/sysdeps/pthread/tst-pthread_cancel-select-loop.c b/sysdeps/pthread/tst-pthread_cancel-select-loop.c new file mode 100644 index 0000000000..a62087589c --- /dev/null +++ b/sysdeps/pthread/tst-pthread_cancel-select-loop.c @@ -0,0 +1,87 @@ +/* Test that pthread_cancel succeeds during thread exit. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +/* This test tries to trigger an internal race condition in + pthread_cancel, where the cancellation signal is sent after the + thread has begun the cancellation process. This can result in a + spurious ESRCH error. For the original bug 12889, the window is + quite small, so the bug was not reproduced in every run. */ + +#include <stdbool.h> +#include <stddef.h> +#include <support/check.h> +#include <support/xthread.h> +#include <support/xunistd.h> +#include <sys/select.h> +#include <unistd.h> + +/* Set to true by timeout_thread_function when the test should + terminate. */ +static bool timeout; + +static void * +timeout_thread_function (void *unused) +{ + usleep (5 * 1000 * 1000); + __atomic_store_n (&timeout, true, __ATOMIC_RELAXED); + return NULL; +} + +/* Used for blocking the select function below. */ +static int pipe_fds[2]; + +static void * +canceled_thread_function (void *unused) +{ + while (true) + { + fd_set rfs; + fd_set wfs; + fd_set efs; + FD_ZERO (&rfs); + FD_ZERO (&wfs); + FD_ZERO (&efs); + FD_SET (pipe_fds[0], &rfs); + + /* If the cancellation request is recognized early, the thread + begins exiting while the cancellation signal arrives. */ + select (FD_SETSIZE, &rfs, &wfs, &efs, NULL); + } + return NULL; +} + +static int +do_test (void) +{ + xpipe (pipe_fds); + pthread_t thr_timeout = xpthread_create (NULL, timeout_thread_function, NULL); + + while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED)) + { + pthread_t thr = xpthread_create (NULL, canceled_thread_function, NULL); + xpthread_cancel (thr); + TEST_VERIFY (xpthread_join (thr) == PTHREAD_CANCELED); + } + + xpthread_join (thr_timeout); + xclose (pipe_fds[0]); + xclose (pipe_fds[1]); + return 0; +} + +#include <support/test-driver.c> diff --git a/sysdeps/pthread/tst-pthread_kill-exiting.c b/sysdeps/pthread/tst-pthread_kill-exiting.c new file mode 100644 index 0000000000..15550d2310 --- /dev/null +++ b/sysdeps/pthread/tst-pthread_kill-exiting.c @@ -0,0 +1,110 @@ +/* Test that pthread_kill succeeds during thread exit. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +/* This test verifies that pthread_kill for a thread that is exiting + succeeds (with or without actually delivering the signal). */ + +#include <array_length.h> +#include <stdbool.h> +#include <stddef.h> +#include <support/xsignal.h> +#include <support/xthread.h> +#include <unistd.h> + +/* Set to true by timeout_thread_function when the test should + terminate. */ +static bool timeout; + +static void * +timeout_thread_function (void *unused) +{ + usleep (1000 * 1000); + __atomic_store_n (&timeout, true, __ATOMIC_RELAXED); + return NULL; +} + +/* Used to synchronize the sending threads with the target thread and + main thread. */ +static pthread_barrier_t barrier_1; +static pthread_barrier_t barrier_2; + +/* The target thread to which signals are to be sent. */ +static pthread_t target_thread; + +static void * +sender_thread_function (void *unused) +{ + while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED)) + { + /* Wait until target_thread has been initialized. The target + thread and main thread participate in this barrier. */ + xpthread_barrier_wait (&barrier_1); + + xpthread_kill (target_thread, SIGUSR1); + + /* Communicate that the signal has been sent. The main thread + participates in this barrier. */ + xpthread_barrier_wait (&barrier_2); + } + return NULL; +} + +static void * +target_thread_function (void *unused) +{ + target_thread = pthread_self (); + xpthread_barrier_wait (&barrier_1); + return NULL; +} + +static int +do_test (void) +{ + xsignal (SIGUSR1, SIG_IGN); + + pthread_t thr_timeout = xpthread_create (NULL, timeout_thread_function, NULL); + + pthread_t threads[4]; + xpthread_barrier_init (&barrier_1, NULL, array_length (threads) + 2); + xpthread_barrier_init (&barrier_2, NULL, array_length (threads) + 1); + + for (int i = 0; i < array_length (threads); ++i) + threads[i] = xpthread_create (NULL, sender_thread_function, NULL); + + while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED)) + { + xpthread_create (NULL, target_thread_function, NULL); + + /* Wait for the target thread to be set up and signal sending to + start. */ + xpthread_barrier_wait (&barrier_1); + + /* Wait for signal sending to complete. */ + xpthread_barrier_wait (&barrier_2); + + xpthread_join (target_thread); + } + + for (int i = 0; i < array_length (threads); ++i) + xpthread_join (threads[i]); + xpthread_join (thr_timeout); + + return 0; +} + +#include <support/test-driver.c> -- 2.31.1 ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) 2021-08-17 13:51 ` [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) Florian Weimer @ 2021-08-19 16:12 ` Adhemerval Zanella 2021-08-19 16:37 ` Florian Weimer 0 siblings, 1 reply; 10+ messages in thread From: Adhemerval Zanella @ 2021-08-19 16:12 UTC (permalink / raw) To: libc-alpha On 17/08/2021 10:51, Florian Weimer via Libc-alpha wrote: > Thread exit is delayed until all pending pthread_kill operations > on the thread have completed. This avoids sending signals to the > wrong thread or a spurious ESRCH error. > > The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived > from a downstream test originally written by Marek Polacek. And I really hope we could first sort out the BZ#19951 and move away of using pthread::tid as the synchronization member to check thread termination. I sent a possible fix [1], but it does not fully handle the pthread::tid access, it would require to a proper lock to synchronize between thread exit and pthread_kill (as Rich has suggested on BZ#12889). I will need to send an updated version of my pthread fixes patchset, since clone3 changes broke it and INVALID_TD_P needs fixing as well. [1] https://patchwork.sourceware.org/project/glibc/patch/20210610193639.3650754-5-adhemerval.zanella@linaro.org/ > --- > nptl/allocatestack.c | 1 + > nptl/descr.h | 12 ++ > nptl/pthread_create.c | 20 ++++ > nptl/pthread_kill.c | 23 +++- > support/Makefile | 3 +- > sysdeps/pthread/Makefile | 2 + > sysdeps/pthread/tst-kill4.c | 90 -------------- > .../pthread/tst-pthread_cancel-select-loop.c | 87 ++++++++++++++ > sysdeps/pthread/tst-pthread_kill-exiting.c | 110 ++++++++++++++++++ > 9 files changed, 254 insertions(+), 94 deletions(-) > delete mode 100644 sysdeps/pthread/tst-kill4.c > create mode 100644 sysdeps/pthread/tst-pthread_cancel-select-loop.c > create mode 100644 sysdeps/pthread/tst-pthread_kill-exiting.c > > diff --git a/nptl/allocatestack.c b/nptl/allocatestack.c > index cfe37a3443..344f7f8ec5 100644 > --- a/nptl/allocatestack.c > +++ b/nptl/allocatestack.c > @@ -127,6 +127,7 @@ get_cached_stack (size_t *sizep, void **memp) > /* No pending event. */ > result->nextevent = NULL; > > + result->exit_signal_futex = 0; > result->tls_state = (struct tls_internal_t) { 0 }; > > /* Clear the DTV. */ > diff --git a/nptl/descr.h b/nptl/descr.h > index c85778d449..7255d1dc77 100644 > --- a/nptl/descr.h > +++ b/nptl/descr.h > @@ -396,6 +396,18 @@ struct pthread > PTHREAD_CANCEL_ASYNCHRONOUS). */ > unsigned char canceltype; > > + /* Futex to prevent thread exit during concurrent pthread_kill. The > + least-significant bit is an exit flag. The remaining bits count > + the number of pthread_kill operations in progress. > + > + Upon exit, the thread bit sets the LSB, and waits until the futex > + variable is 1. > + > + pthread_kill atomically adds 2 to the futex variable before the > + signal-sending operation, subtracts 2 after it, and skips sending > + the signal if the least-significant bit is set. */ > + unsigned int exit_signal_futex; > + > /* Used on strsignal. */ > struct tls_internal_t tls_state; > > diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c > index d8ec299cb1..7e1a3cfcdd 100644 > --- a/nptl/pthread_create.c > +++ b/nptl/pthread_create.c > @@ -37,6 +37,7 @@ > #include <sys/single_threaded.h> > #include <version.h> > #include <clone_internal.h> > +#include <futex-internal.h> > > #include <shlib-compat.h> > > @@ -485,6 +486,25 @@ start_thread (void *arg) > /* This was the last thread. */ > exit (0); > > + /* Blocking signals is required to make pthread_kill > + async-signal-safe. It prevents sending a signal from this thread > + to itself during its final stages. This must come after the exit > + call above because atexit handlers must not run with signals > + blocked. */ > + __libc_signal_block_all (NULL); > + > + /* Delay exiting this thread until there are no concurrent > + pthread_kill operations in progress. */ > + { > + /* Indicate that no further signals should be sent. */ > + atomic_fetch_or_release (&pd->exit_signal_futex, 1); > + > + /* Wait until there are no pthread_kill operations left. */ > + unsigned int val; > + while ((val = atomic_load_acquire (&pd->exit_signal_futex)) != 1) > + futex_wait (&pd->exit_signal_futex, val, FUTEX_PRIVATE); > + } > + > #ifndef __ASSUME_SET_ROBUST_LIST > /* If this thread has any robust mutexes locked, handle them now. */ > # if __PTHREAD_MUTEX_HAVE_PREV > diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c > index 5d4c86f920..b67bd4b769 100644 > --- a/nptl/pthread_kill.c > +++ b/nptl/pthread_kill.c > @@ -18,6 +18,7 @@ > > #include <unistd.h> > #include <pthreadP.h> > +#include <futex-internal.h> > #include <shlib-compat.h> > > int > @@ -41,9 +42,25 @@ __pthread_kill_internal (pthread_t threadid, int signo) > { > pid_t pid = __getpid (); > > - val = INTERNAL_SYSCALL_CALL (tgkill, pid, tid, signo); > - val = (INTERNAL_SYSCALL_ERROR_P (val) > - ? INTERNAL_SYSCALL_ERRNO (val) : 0); > + if (atomic_fetch_add_acq_rel (&pd->exit_signal_futex, 2) & 1) > + /* If the thread is exiting, it has already blocked all > + signals, so sending the signal does not have any effect. > + Not sending the signal avoids a spurious ESRCH from tgkill > + if the thread manages to complete its exit before the > + signal can be sent. */ > + val = 0; > + else > + { > + val = INTERNAL_SYSCALL_CALL (tgkill, pid, tid, signo); > + val = (INTERNAL_SYSCALL_ERROR_P (val) > + ? INTERNAL_SYSCALL_ERRNO (val) : 0); > + } > + > + /* A value 3 before the update indicates that the thread is > + exiting, and this is the last remaining concurrent > + pthread_kill operation. */ > + if (atomic_fetch_add_acq_rel (&pd->exit_signal_futex, -2) == 3) > + futex_wake (&pd->exit_signal_futex, 1, FUTEX_PRIVATE); > } > else > /* The kernel reports that the thread has exited. POSIX specifies > diff --git a/support/Makefile b/support/Makefile > index a462781718..ef2b1a980a 100644 > --- a/support/Makefile > +++ b/support/Makefile > @@ -82,9 +82,10 @@ libsupport-routines = \ > support_test_compare_blob \ > support_test_compare_failure \ > support_test_compare_string \ > - support_write_file_string \ > support_test_main \ > support_test_verify_impl \ > + support_wait_for_thread_exit \ > + support_write_file_string \ > temp_file \ > timespec \ > timespec-time64 \ > diff --git a/sysdeps/pthread/Makefile b/sysdeps/pthread/Makefile > index dedfa0d290..48dba717a1 100644 > --- a/sysdeps/pthread/Makefile > +++ b/sysdeps/pthread/Makefile > @@ -119,7 +119,9 @@ tests += tst-cnd-basic tst-mtx-trylock tst-cnd-broadcast \ > tst-unwind-thread \ > tst-pt-vfork1 tst-pt-vfork2 tst-vfork1x tst-vfork2x \ > tst-pthread_cancel-exited \ > + tst-pthread_cancel-select-loop \ > tst-pthread_kill-exited \ > + tst-pthread_kill-exiting \ > # tests > > tests-time64 := \ > diff --git a/sysdeps/pthread/tst-kill4.c b/sysdeps/pthread/tst-kill4.c > deleted file mode 100644 > index 9563939792..0000000000 > --- a/sysdeps/pthread/tst-kill4.c > +++ /dev/null > @@ -1,90 +0,0 @@ > -/* Copyright (C) 2003-2021 Free Software Foundation, Inc. > - This file is part of the GNU C Library. > - Contributed by Ulrich Drepper <drepper@redhat.com>, 2003. > - > - The GNU C Library is free software; you can redistribute it and/or > - modify it under the terms of the GNU Lesser General Public > - License as published by the Free Software Foundation; either > - version 2.1 of the License, or (at your option) any later version. > - > - The GNU C Library is distributed in the hope that it will be useful, > - but WITHOUT ANY WARRANTY; without even the implied warranty of > - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > - Lesser General Public License for more details. > - > - You should have received a copy of the GNU Lesser General Public > - License along with the GNU C Library; if not, see > - <https://www.gnu.org/licenses/>. */ > - > -#include <errno.h> > -#include <pthread.h> > -#include <signal.h> > -#include <stdio.h> > -#include <stdlib.h> > -#include <unistd.h> > - > - > -static void * > -tf (void *a) > -{ > - return NULL; > -} > - > - > -int > -do_test (void) > -{ > - pthread_attr_t at; > - if (pthread_attr_init (&at) != 0) > - { > - puts ("attr_create failed"); > - exit (1); > - } > - > - /* Limit thread stack size, because if it is too large, pthread_join > - will free it immediately rather than put it into stack cache. */ > - if (pthread_attr_setstacksize (&at, 2 * 1024 * 1024) != 0) > - { > - puts ("setstacksize failed"); > - exit (1); > - } > - > - pthread_t th; > - if (pthread_create (&th, &at, tf, NULL) != 0) > - { > - puts ("create failed"); > - exit (1); > - } > - > - pthread_attr_destroy (&at); > - > - if (pthread_join (th, NULL) != 0) > - { > - puts ("join failed"); > - exit (1); > - } > - > - /* The following only works because we assume here something about > - the implementation. Namely, that the memory allocated for the > - thread descriptor is not going away, that the TID field is > - cleared and therefore the signal is sent to process 0, and that > - we can savely assume there is no other process with this ID at > - that time. */ > - int e = pthread_kill (th, 0); > - if (e == 0) > - { > - puts ("pthread_kill succeeded"); > - exit (1); > - } > - if (e != ESRCH) > - { > - puts ("pthread_kill didn't return ESRCH"); > - exit (1); > - } > - > - return 0; > -} > - > - > -#define TEST_FUNCTION do_test () > -#include "../test-skeleton.c" > diff --git a/sysdeps/pthread/tst-pthread_cancel-select-loop.c b/sysdeps/pthread/tst-pthread_cancel-select-loop.c > new file mode 100644 > index 0000000000..a62087589c > --- /dev/null > +++ b/sysdeps/pthread/tst-pthread_cancel-select-loop.c > @@ -0,0 +1,87 @@ > +/* Test that pthread_cancel succeeds during thread exit. > + Copyright (C) 2021 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <https://www.gnu.org/licenses/>. */ > + > +/* This test tries to trigger an internal race condition in > + pthread_cancel, where the cancellation signal is sent after the > + thread has begun the cancellation process. This can result in a > + spurious ESRCH error. For the original bug 12889, the window is > + quite small, so the bug was not reproduced in every run. */ > + > +#include <stdbool.h> > +#include <stddef.h> > +#include <support/check.h> > +#include <support/xthread.h> > +#include <support/xunistd.h> > +#include <sys/select.h> > +#include <unistd.h> > + > +/* Set to true by timeout_thread_function when the test should > + terminate. */ > +static bool timeout; > + > +static void * > +timeout_thread_function (void *unused) > +{ > + usleep (5 * 1000 * 1000); > + __atomic_store_n (&timeout, true, __ATOMIC_RELAXED); > + return NULL; > +} > + > +/* Used for blocking the select function below. */ > +static int pipe_fds[2]; > + > +static void * > +canceled_thread_function (void *unused) > +{ > + while (true) > + { > + fd_set rfs; > + fd_set wfs; > + fd_set efs; > + FD_ZERO (&rfs); > + FD_ZERO (&wfs); > + FD_ZERO (&efs); > + FD_SET (pipe_fds[0], &rfs); > + > + /* If the cancellation request is recognized early, the thread > + begins exiting while the cancellation signal arrives. */ > + select (FD_SETSIZE, &rfs, &wfs, &efs, NULL); > + } > + return NULL; > +} > + > +static int > +do_test (void) > +{ > + xpipe (pipe_fds); > + pthread_t thr_timeout = xpthread_create (NULL, timeout_thread_function, NULL); > + > + while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED)) > + { > + pthread_t thr = xpthread_create (NULL, canceled_thread_function, NULL); > + xpthread_cancel (thr); > + TEST_VERIFY (xpthread_join (thr) == PTHREAD_CANCELED); > + } > + > + xpthread_join (thr_timeout); > + xclose (pipe_fds[0]); > + xclose (pipe_fds[1]); > + return 0; > +} > + > +#include <support/test-driver.c> > diff --git a/sysdeps/pthread/tst-pthread_kill-exiting.c b/sysdeps/pthread/tst-pthread_kill-exiting.c > new file mode 100644 > index 0000000000..15550d2310 > --- /dev/null > +++ b/sysdeps/pthread/tst-pthread_kill-exiting.c > @@ -0,0 +1,110 @@ > +/* Test that pthread_kill succeeds during thread exit. > + Copyright (C) 2021 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <https://www.gnu.org/licenses/>. */ > + > +/* This test verifies that pthread_kill for a thread that is exiting > + succeeds (with or without actually delivering the signal). */ > + > +#include <array_length.h> > +#include <stdbool.h> > +#include <stddef.h> > +#include <support/xsignal.h> > +#include <support/xthread.h> > +#include <unistd.h> > + > +/* Set to true by timeout_thread_function when the test should > + terminate. */ > +static bool timeout; > + > +static void * > +timeout_thread_function (void *unused) > +{ > + usleep (1000 * 1000); > + __atomic_store_n (&timeout, true, __ATOMIC_RELAXED); > + return NULL; > +} > + > +/* Used to synchronize the sending threads with the target thread and > + main thread. */ > +static pthread_barrier_t barrier_1; > +static pthread_barrier_t barrier_2; > + > +/* The target thread to which signals are to be sent. */ > +static pthread_t target_thread; > + > +static void * > +sender_thread_function (void *unused) > +{ > + while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED)) > + { > + /* Wait until target_thread has been initialized. The target > + thread and main thread participate in this barrier. */ > + xpthread_barrier_wait (&barrier_1); > + > + xpthread_kill (target_thread, SIGUSR1); > + > + /* Communicate that the signal has been sent. The main thread > + participates in this barrier. */ > + xpthread_barrier_wait (&barrier_2); > + } > + return NULL; > +} > + > +static void * > +target_thread_function (void *unused) > +{ > + target_thread = pthread_self (); > + xpthread_barrier_wait (&barrier_1); > + return NULL; > +} > + > +static int > +do_test (void) > +{ > + xsignal (SIGUSR1, SIG_IGN); > + > + pthread_t thr_timeout = xpthread_create (NULL, timeout_thread_function, NULL); > + > + pthread_t threads[4]; > + xpthread_barrier_init (&barrier_1, NULL, array_length (threads) + 2); > + xpthread_barrier_init (&barrier_2, NULL, array_length (threads) + 1); > + > + for (int i = 0; i < array_length (threads); ++i) > + threads[i] = xpthread_create (NULL, sender_thread_function, NULL); > + > + while (!__atomic_load_n (&timeout, __ATOMIC_RELAXED)) > + { > + xpthread_create (NULL, target_thread_function, NULL); > + > + /* Wait for the target thread to be set up and signal sending to > + start. */ > + xpthread_barrier_wait (&barrier_1); > + > + /* Wait for signal sending to complete. */ > + xpthread_barrier_wait (&barrier_2); > + > + xpthread_join (target_thread); > + } > + > + for (int i = 0; i < array_length (threads); ++i) > + xpthread_join (threads[i]); > + xpthread_join (thr_timeout); > + > + return 0; > +} > + > +#include <support/test-driver.c> > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) 2021-08-19 16:12 ` Adhemerval Zanella @ 2021-08-19 16:37 ` Florian Weimer 2021-08-19 16:49 ` Adhemerval Zanella 0 siblings, 1 reply; 10+ messages in thread From: Florian Weimer @ 2021-08-19 16:37 UTC (permalink / raw) To: Adhemerval Zanella via Libc-alpha * Adhemerval Zanella via Libc-alpha: > On 17/08/2021 10:51, Florian Weimer via Libc-alpha wrote: >> Thread exit is delayed until all pending pthread_kill operations >> on the thread have completed. This avoids sending signals to the >> wrong thread or a spurious ESRCH error. >> >> The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived >> from a downstream test originally written by Marek Polacek. > > And I really hope we could first sort out the BZ#19951 and move away > of using pthread::tid as the synchronization member to check thread > termination. I sent a possible fix [1], but it does not fully handle > the pthread::tid access, it would require to a proper lock to > synchronize between thread exit and pthread_kill (as Rich has suggested > on BZ#12889). > > I will need to send an updated version of my pthread fixes patchset, > since clone3 changes broke it and INVALID_TD_P needs fixing as well. > > [1] https://patchwork.sourceware.org/project/glibc/patch/20210610193639.3650754-5-adhemerval.zanella@linaro.org/ I think my patch is independent of the other fixes. Its advantage over other proposals is that it does not add waiting to pthread_kill, so there is no signal-safety issue. (But I probably should test if cancellation from a signal handler that is invoked synchronously via pthread_kill still works.) Thanks, Florian ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) 2021-08-19 16:37 ` Florian Weimer @ 2021-08-19 16:49 ` Adhemerval Zanella 2021-08-20 16:24 ` Florian Weimer 0 siblings, 1 reply; 10+ messages in thread From: Adhemerval Zanella @ 2021-08-19 16:49 UTC (permalink / raw) To: Florian Weimer, Adhemerval Zanella via Libc-alpha On 19/08/2021 13:37, Florian Weimer wrote: > * Adhemerval Zanella via Libc-alpha: > >> On 17/08/2021 10:51, Florian Weimer via Libc-alpha wrote: >>> Thread exit is delayed until all pending pthread_kill operations >>> on the thread have completed. This avoids sending signals to the >>> wrong thread or a spurious ESRCH error. >>> >>> The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived >>> from a downstream test originally written by Marek Polacek. >> >> And I really hope we could first sort out the BZ#19951 and move away >> of using pthread::tid as the synchronization member to check thread >> termination. I sent a possible fix [1], but it does not fully handle >> the pthread::tid access, it would require to a proper lock to >> synchronize between thread exit and pthread_kill (as Rich has suggested >> on BZ#12889). >> >> I will need to send an updated version of my pthread fixes patchset, >> since clone3 changes broke it and INVALID_TD_P needs fixing as well. >> >> [1] https://patchwork.sourceware.org/project/glibc/patch/20210610193639.3650754-5-adhemerval.zanella@linaro.org/ > > I think my patch is independent of the other fixes. Its advantage over > other proposals is that it does not add waiting to pthread_kill, so > there is no signal-safety issue. I am not sure, the way I think we should fix to also fix BZ#19951 would to add a lock to access 'tid' and uses not only on pthread_kill, but also on pthread_getcpuclockid, pthread_getschedparam, pthread_setschedparam, and pthread_setschedprio. The lock also simplifies the code, there is no need to special handling of futex internal state. I also think pthread_kill would also be simplified. The lock is just to simplify the code, if we really need to add some scalability we might use a more clever code as you are proposing. > > (But I probably should test if cancellation from a signal handler that > is invoked synchronously via pthread_kill still works.) > > Thanks, > Florian > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) 2021-08-19 16:49 ` Adhemerval Zanella @ 2021-08-20 16:24 ` Florian Weimer 0 siblings, 0 replies; 10+ messages in thread From: Florian Weimer @ 2021-08-20 16:24 UTC (permalink / raw) To: Adhemerval Zanella; +Cc: Adhemerval Zanella via Libc-alpha * Adhemerval Zanella: > On 19/08/2021 13:37, Florian Weimer wrote: >> * Adhemerval Zanella via Libc-alpha: >> >>> On 17/08/2021 10:51, Florian Weimer via Libc-alpha wrote: >>>> Thread exit is delayed until all pending pthread_kill operations >>>> on the thread have completed. This avoids sending signals to the >>>> wrong thread or a spurious ESRCH error. >>>> >>>> The test sysdeps/pthread/tst-pthread_cancel-select-loop.c is derived >>>> from a downstream test originally written by Marek Polacek. >>> >>> And I really hope we could first sort out the BZ#19951 and move away >>> of using pthread::tid as the synchronization member to check thread >>> termination. I sent a possible fix [1], but it does not fully handle >>> the pthread::tid access, it would require to a proper lock to >>> synchronize between thread exit and pthread_kill (as Rich has suggested >>> on BZ#12889). >>> >>> I will need to send an updated version of my pthread fixes patchset, >>> since clone3 changes broke it and INVALID_TD_P needs fixing as well. >>> >>> [1] https://patchwork.sourceware.org/project/glibc/patch/20210610193639.3650754-5-adhemerval.zanella@linaro.org/ >> >> I think my patch is independent of the other fixes. Its advantage over >> other proposals is that it does not add waiting to pthread_kill, so >> there is no signal-safety issue. > > I am not sure, the way I think we should fix to also fix BZ#19951 > would to add a lock to access 'tid' and uses not only on pthread_kill, > but also on pthread_getcpuclockid, pthread_getschedparam, > pthread_setschedparam, and pthread_setschedprio. The lock also > simplifies the code, there is no need to special handling of futex > internal state. I also think pthread_kill would also be simplified. > > The lock is just to simplify the code, if we really need to add > some scalability we might use a more clever code as you are proposing. I don't think we can wait for the fix for bug 19951 because glibc 2.34 has a real regression in pthread_cancel: it can fail spuriously with ESRCH. And it can happen with a single pthread_cancel call in the program, you don't even need concurrent pthread_cancel to trigger this bug now. I have moved the magic no-TID-reuse functionality into a separate function in the new version, so we can replace it once we have something better. Thanks, Florian ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2021-08-20 16:24 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-08-17 13:50 [PATCH 0/3] Fix ESRCH issues in pthread_cancel, pthread_kill Florian Weimer 2021-08-17 13:50 ` [PATCH 1/3] support: Add support_wait_for_thread_exit Florian Weimer 2021-08-19 16:12 ` Adhemerval Zanella 2021-08-19 16:22 ` Florian Weimer 2021-08-17 13:51 ` [PATCH 2/3] nptl: pthread_kill, pthread_cancel should fail after exit (bug 19193) Florian Weimer 2021-08-17 13:51 ` [PATCH 3/3] nptl: Fix race between pthread_kill and thread exit (bug 12889) Florian Weimer 2021-08-19 16:12 ` Adhemerval Zanella 2021-08-19 16:37 ` Florian Weimer 2021-08-19 16:49 ` Adhemerval Zanella 2021-08-20 16:24 ` Florian Weimer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).