public inbox for glibc-cvs@sourceware.org
* [glibc/azanella/bz12683] nptl: Fix Race conditions in pthread cancellation [BZ#12683]
@ 2023-04-07 17:07 Adhemerval Zanella
  0 siblings, 0 replies; 6+ messages in thread
From: Adhemerval Zanella @ 2023-04-07 17:07 UTC (permalink / raw)
  To: glibc-cvs

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=de8cdfc3a6a95aa7125c934ed81fdd0c34ef5eec

commit de8cdfc3a6a95aa7125c934ed81fdd0c34ef5eec
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Fri Mar 31 17:24:24 2023 -0300

    nptl: Fix Race conditions in pthread cancellation [BZ#12683]
    
    This patch is the initial fix for race conditions in NPTL cancellation
    code by redefining how cancellable syscalls are defined and handled.
    The current buggy approach is to enable asynchronous cancellation
    before making the syscall and restore the previous cancellation
    type once the syscall returns.
    
    As described in BZ#12683, this approach shows 2 important problems:
    
      1. Cancellation can act after the syscall has returned from the
         kernel, but before userspace saves the return value.  It might
         result in a resource leak if the syscall allocated a resource or
         had a side effect (a partial read/write), and there is no way for
         the program to handle it with cancellation handlers (see the
         sketch after this list).
    
      2. If a signal is handled while the thread is blocked at a cancellable
         syscall, the entire signal handler runs with asynchronous
         cancellation enabled.  This can lead to issues if the signal
         handler calls functions which are async-signal-safe but not
         async-cancel-safe.
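
    A minimal sketch of the racy window in the current scheme; the
    'cancellable_open' wrapper below is purely illustrative (real wrappers
    use the SYSCALL_CANCEL macro), but it shows where problem 1 bites:

      /* Old scheme, simplified: asynchronous cancellation stays enabled
         between the syscall returning and LIBC_CANCEL_RESET, so a
         cancellation acting in that window loses 'fd'.  */
      int
      cancellable_open (const char *file, int oflag)
      {
        int oldtype = LIBC_CANCEL_ASYNC ();
        int fd = INLINE_SYSCALL_CALL (open, file, oflag);
        /* <- a cancellation acting here leaks the just-opened 'fd'.  */
        LIBC_CANCEL_RESET (oldtype);
        return fd;
      }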
    
    For the cancellation to work correctly, there are 5 points at which the
    cancellation signal could arrive:
    
      1. Before the final "testcancel" and before the syscall is made.
      2. Between the "testcancel" and the syscall.
      3. While the syscall is blocked and no side effects have yet taken
         place.
      4. While the syscall is blocked but with some side effects already
         having taken place (e.g. a partial read or write).
      5. After the syscall has returned.
    
    And GLIBC wants to act on cancellation in cases 1, 2, and 3 but not
    in cases 4 or 5.  For cases 4 and 5, the cancellation will eventually
    happen in the next cancellable entry point without any further external
    event.
    
    The proposed solution follows for each case:
    
      1. Do a conditional branch based on whether the thread has received
         a cancellation request;
    
      2. It can be caught by the signal handler determining that the saved
         program counter (from the ucontext_t) is in some address range
         beginning just before the "testcancel" and ending with the
         syscall instruction (see the sketch after this list).
    
      3. In this case, except for certain syscalls that ALWAYS fail with
         EINTR even for non-interrupting signals, the kernel will reset
         the program counter to point at the syscall instruction during
         signal handling, so that the syscall is restarted when the signal
         handler returns.  So, from the signal handler's standpoint, this
         looks the same as case 2, and thus it's taken care of.
    
      4. For syscalls with side-effects, the kernel cannot restart the
         syscall; when it's interrupted by a signal, the kernel must cause
         the syscall to return with whatever partial result is obtained
         (e.g. partial read or write).
    
      5. In this case, the saved program counter points just after the
         syscall instruction, so the signal handler won't act on
         cancellation.  This is similar to 4. since the program counter
         is past the syscall instruction.
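
    For cases 2 and 3 the handler's check reduces to a program-counter
    range test; this mirrors cancellation_pc_check from
    sysdeps/nptl/cancellation-pc-check.h in the diff below:

      /* Act on cancellation only when the interrupted PC lies inside the
         [__syscall_cancel_arch_start, __syscall_cancel_arch_end) bridge,
         i.e. before the syscall could produce any side effect.  */
      static __always_inline bool
      cancellation_pc_check (void *ctx)
      {
        extern const char __syscall_cancel_arch_start[1];
        extern const char __syscall_cancel_arch_end[1];

        uintptr_t pc = sigcontext_get_pc (ctx);
        return pc >= (uintptr_t) __syscall_cancel_arch_start
               && pc < (uintptr_t) __syscall_cancel_arch_end;
      }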
    
    Another case that needs handling is syscalls that fail with EINTR even
    for non-interrupting signals.  In this case, the syscall wrapper code
    can just check the cancellation flag when the errno result is EINTR,
    and act on cancellation if it is set, as sketched below.
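
    In the common wrapper this amounts to a check along these lines
    (an excerpt condensed from __syscall_cancel in the diff below):

      /* The bridge returns -EINTR when interrupted without side effects;
         if a cancellation request is pending and enabled, act on it now.  */
      result = __syscall_cancel_arch (&pd->cancelhandling, nr,
                                      a1, a2, a3, a4, a5, a6);

      int ch = atomic_load_relaxed (&pd->cancelhandling);
      if (result == -EINTR && cancel_enabled_and_canceled (ch))
        __syscall_do_cancel ();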
    
    The proposed GLIBC adjustments are:
    
      1. Remove the enable_asynccancel/disable_asynccancel function usage in
         cancellable syscall definitions and instead make them call a common
         symbol that will check if cancellation is enabled (__syscall_cancel
         at nptl/cancellation.c), call the arch-specific cancellable
         entry-point (__syscall_cancel_arch), and cancel the thread when
         required.
    
      2. Provide an arch-specific generic system call wrapper function
         that contains global markers.  These markers are used by the
         SIGCANCEL handler to check whether the interruption occurred
         within a valid syscall and whether the syscall has completed.
    
         A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c
         is provided.  However, the markers may not be placed at the
         expected positions, depending on how INTERNAL_SYSCALL_NCS is
         implemented by the architecture, and it uses a compiler-specific
         construct (asm volatile) to place the required markers.
         It is expected that all architectures add an arch-specific
         implementation.
    
      3. Rewrite the SIGCANCEL asynchronous handler to check both the
         cancellation type and whether the current IP from the signal handler
         falls between the global markers, and act accordingly
         (sigcancel_handler at nptl/pthread_cancel.c).
    
      4. Adjust nptl/pthread_cancel.c to send a signal instead of acting
         directly. This avoids synchronization issues when updating the
         cancellation status and also focuses the logic on the signal
         handler and cancellation syscall code.
    
      5. Adjust pthread code to replace CANCEL_ASYNC/CANCEL_RESET calls with
         the appropriate cancellable futex syscalls.
    
      6. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET with
         the appropriate cancellable syscalls (see the read sketch below).
    
      7. Adjust 'lowlevellock-futex.h' arch-specific implementations to
         provide cancelable futex calls (used in libpthread code).
    
    This patch adds the proposed changes to NPTL common code and the
    following patches add the required arch-specific bits.  The builds for
    ia64-linux-gnu, mips-*, and x86_64-* are broken without the
    arch-specific patches.
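
    With these changes a cancellable libc wrapper keeps a single-expression
    shape; the read wrapper below is an illustrative sketch (not part of
    this patch) of how SYSCALL_CANCEL is meant to be used:

      /* Illustrative only: the wrapper no longer toggles the cancellation
         type; SYSCALL_CANCEL routes through __syscall_cancel and the
         arch-specific bridge.  */
      ssize_t
      __libc_read (int fd, void *buf, size_t nbytes)
      {
        return SYSCALL_CANCEL (read, fd, buf, nbytes);
      }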

Diff:
---
 elf/Makefile                             |   5 +-
 nptl/Makefile                            |  11 ++-
 nptl/cancellation.c                      |  97 +++++++++++--------------
 nptl/descr-const.sym                     |   5 ++
 nptl/descr.h                             |  18 +++++
 nptl/pthread_cancel.c                    |  70 ++++++++----------
 nptl/tst-cancel31.c                      | 100 ++++++++++++++++++++++++++
 sysdeps/generic/syscall_types.h          |  25 +++++++
 sysdeps/nptl/cancellation-pc-check.h     |  54 ++++++++++++++
 sysdeps/nptl/lowlevellock-futex.h        |  20 ++----
 sysdeps/nptl/pthreadP.h                  |   7 ++
 sysdeps/unix/sysdep.h                    | 118 +++++++++++++++++++++++--------
 sysdeps/unix/sysv/linux/socketcall.h     |  35 ++++++---
 sysdeps/unix/sysv/linux/syscall_cancel.c |  61 ++++++++++++++++
 sysdeps/unix/sysv/linux/sysdep-cancel.h  |  12 ----
 sysdeps/unix/sysv/linux/sysdep.h         |   9 +++
 16 files changed, 476 insertions(+), 171 deletions(-)

diff --git a/elf/Makefile b/elf/Makefile
index 396ec51424..b684e5d44c 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -1252,11 +1252,8 @@ $(objpfx)dl-allobjs.os: $(all-rtld-routines:%=$(objpfx)%.os)
 # discovery mechanism is not compatible with the libc implementation
 # when compiled for libc.
 rtld-stubbed-symbols = \
-  __GI___pthread_disable_asynccancel \
-  __GI___pthread_enable_asynccancel \
+  __syscall_cancel \
   __libc_assert_fail \
-  __pthread_disable_asynccancel \
-  __pthread_enable_asynccancel \
   calloc \
   free \
   malloc \
diff --git a/nptl/Makefile b/nptl/Makefile
index 8cec6faee3..bce033179e 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -204,6 +204,7 @@ routines = \
   sem_timedwait \
   sem_unlink \
   sem_wait \
+  syscall_cancel \
   tpp \
   unwind \
   vars \
@@ -234,7 +235,8 @@ CFLAGS-pthread_setcanceltype.c += -fexceptions -fasynchronous-unwind-tables
 
 # These are internal functions which similar functionality as setcancelstate
 # and setcanceltype.
-CFLAGS-cancellation.c += -fasynchronous-unwind-tables
+CFLAGS-cancellation.c += -fexceptions -fasynchronous-unwind-tables
+CFLAGS-syscall_cancel.c += -fexceptions -fasynchronous-unwind-tables
 
 # Calling pthread_exit() must cause the registered cancel handlers to
 # be executed.  Therefore exceptions have to be thrown through this
@@ -286,7 +288,7 @@ tests = tst-attr2 tst-attr3 tst-default-attr \
 	tst-sem17 \
 	tst-tsd3 tst-tsd4 \
 	tst-cancel4_1 tst-cancel4_2 \
-	tst-cancel7 tst-cancel17 tst-cancel24 \
+	tst-cancel7 tst-cancel17 tst-cancel24 tst-cancel31 \
 	tst-signal3 \
 	tst-exec4 tst-exec5 \
 	tst-stack2 tst-stack3 tst-stack4 \
@@ -339,7 +341,10 @@ xtests += tst-eintr1
 
 test-srcs = tst-oddstacklimit
 
-gen-as-const-headers = unwindbuf.sym
+gen-as-const-headers = \
+  descr-const.sym \
+  unwindbuf.sym \
+  # gen-as-const-headers
 
 gen-py-const-headers := nptl_lock_constants.pysym
 pretty-printers := nptl-printers.py
diff --git a/nptl/cancellation.c b/nptl/cancellation.c
index 765511d66d..3f991ab9b3 100644
--- a/nptl/cancellation.c
+++ b/nptl/cancellation.c
@@ -18,74 +18,61 @@
 #include <setjmp.h>
 #include <stdlib.h>
 #include "pthreadP.h"
-#include <futex-internal.h>
 
-
-/* The next two functions are similar to pthread_setcanceltype() but
-   more specialized for the use in the cancelable functions like write().
-   They do not need to check parameters etc.  These functions must be
-   AS-safe, with the exception of the actual cancellation, because they
-   are called by wrappers around AS-safe functions like write().*/
-int
-__pthread_enable_asynccancel (void)
+/* Cancellation function called by all cancellable syscalls.  */
+long int
+__syscall_cancel (__syscall_arg_t nr, __syscall_arg_t a1,
+		  __syscall_arg_t a2, __syscall_arg_t a3,
+		  __syscall_arg_t a4, __syscall_arg_t a5,
+		  __syscall_arg_t a6)
 {
-  struct pthread *self = THREAD_SELF;
-  int oldval = atomic_load_relaxed (&self->cancelhandling);
+  long int result;
+  struct pthread *pd = THREAD_SELF;
 
-  while (1)
+  /* If cancellation is not enabled, call the syscall directly.  The same
+     applies if the thread is terminating, to avoid issuing
+     __syscall_do_cancel while executing cleanup handlers.  */
+  int ch = atomic_load_relaxed (&pd->cancelhandling);
+  if (SINGLE_THREAD_P || !cancel_enabled (ch) || cancel_exiting (ch))
     {
-      int newval = oldval | CANCELTYPE_BITMASK;
-
-      if (newval == oldval)
-	break;
+      result = INTERNAL_SYSCALL_NCS_CALL (nr, a1, a2, a3, a4, a5, a6);
+      if (INTERNAL_SYSCALL_ERROR_P (result))
+	return -INTERNAL_SYSCALL_ERRNO (result);
+      return result;
+    }
 
-      if (atomic_compare_exchange_weak_acquire (&self->cancelhandling,
-						&oldval, newval))
-	{
-	  if (cancel_enabled_and_canceled_and_async (newval))
-	    {
-	      self->result = PTHREAD_CANCELED;
-	      __do_cancel ();
-	    }
+  /* Call the arch-specific entry point that contains the global markers
+     to be checked by the SIGCANCEL handler.  */
+  result = __syscall_cancel_arch (&pd->cancelhandling, nr, a1, a2, a3, a4, a5,
+			          a6);
 
-	  break;
-	}
-    }
+  ch = atomic_load_relaxed (&pd->cancelhandling);
+  if (result == -EINTR && cancel_enabled_and_canceled (ch))
+    __syscall_do_cancel ();
 
-  return oldval;
+  return result;
 }
-libc_hidden_def (__pthread_enable_asynccancel)
 
-/* See the comment for __pthread_enable_asynccancel regarding
-   the AS-safety of this function.  */
-void
-__pthread_disable_asynccancel (int oldtype)
+/* Since __do_cancel is an always-inline function, this creates a symbol that
+   the arch-specific code can call to cancel the thread.  */
+_Noreturn void
+__syscall_do_cancel (void)
 {
-  /* If asynchronous cancellation was enabled before we do not have
-     anything to do.  */
-  if (oldtype & CANCELTYPE_BITMASK)
-    return;
-
   struct pthread *self = THREAD_SELF;
-  int newval;
+  self->result = PTHREAD_CANCELED;
+
+  /* Disable thread cancellation to avoid cancellable entrypoints calling
+     __syscall_do_cancel recursively.  */
   int oldval = atomic_load_relaxed (&self->cancelhandling);
-  do
+  while (1)
     {
-      newval = oldval & ~CANCELTYPE_BITMASK;
+      int newval = oldval | CANCELSTATE_BITMASK;
+      if (oldval == newval)
+	break;
+      if (atomic_compare_exchange_weak_acquire (&self->cancelhandling,
+						&oldval, newval))
+	break;
     }
-  while (!atomic_compare_exchange_weak_acquire (&self->cancelhandling,
-						&oldval, newval));
 
-  /* We cannot return when we are being canceled.  Upon return the
-     thread might be things which would have to be undone.  The
-     following loop should loop until the cancellation signal is
-     delivered.  */
-  while (__glibc_unlikely ((newval & (CANCELING_BITMASK | CANCELED_BITMASK))
-			   == CANCELING_BITMASK))
-    {
-      futex_wait_simple ((unsigned int *) &self->cancelhandling, newval,
-			 FUTEX_PRIVATE);
-      newval = atomic_load_relaxed (&self->cancelhandling);
-    }
+  __do_cancel ();
 }
-libc_hidden_def (__pthread_disable_asynccancel)
diff --git a/nptl/descr-const.sym b/nptl/descr-const.sym
new file mode 100644
index 0000000000..895a8dada4
--- /dev/null
+++ b/nptl/descr-const.sym
@@ -0,0 +1,5 @@
+#include <tls.h>
+
+-- Not strictly offsets, but these values are also used in the TCB.
+TCB_CANCELED_BIT	 CANCELED_BIT
+TCB_CANCELED_BITMASK	 CANCELED_BITMASK
diff --git a/nptl/descr.h b/nptl/descr.h
index f8b5ac7c22..142470f3f3 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -415,6 +415,24 @@ struct pthread
   (sizeof (struct pthread) - offsetof (struct pthread, end_padding))
 } __attribute ((aligned (TCB_ALIGNMENT)));
 
+static inline bool
+cancel_enabled (int value)
+{
+  return (value & CANCELSTATE_BITMASK) == 0;
+}
+
+static inline bool
+cancel_async_enabled (int value)
+{
+  return (value & CANCELTYPE_BITMASK) != 0;
+}
+
+static inline bool
+cancel_exiting (int value)
+{
+  return (value & EXITING_BITMASK) != 0;
+}
+
 static inline bool
 cancel_enabled_and_canceled (int value)
 {
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index 87c9ef69ad..2bb9933f4b 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -23,6 +23,7 @@
 #include <sysdep.h>
 #include <unistd.h>
 #include <unwind-link.h>
+#include <cancellation-pc-check.h>
 #include <stdio.h>
 #include <gnu/lib-names.h>
 #include <sys/single_threaded.h>
@@ -43,28 +44,17 @@ sigcancel_handler (int sig, siginfo_t *si, void *ctx)
   struct pthread *self = THREAD_SELF;
 
   int oldval = atomic_load_relaxed (&self->cancelhandling);
-  while (1)
-    {
-      /* We are canceled now.  When canceled by another thread this flag
-	 is already set but if the signal is directly send (internally or
-	 from another process) is has to be done here.  */
-      int newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
-
-      if (oldval == newval || (oldval & EXITING_BITMASK) != 0)
-	/* Already canceled or exiting.  */
-	break;
-
-      if (atomic_compare_exchange_weak_acquire (&self->cancelhandling,
-						&oldval, newval))
-	{
-	  self->result = PTHREAD_CANCELED;
+  if (!cancel_enabled_and_canceled (oldval))
+    return;
 
-	  /* Make sure asynchronous cancellation is still enabled.  */
-	  if ((oldval & CANCELTYPE_BITMASK) != 0)
-	    /* Run the registered destructors and terminate the thread.  */
-	    __do_cancel ();
-	}
-    }
+  /* Check if asynchronous cancellation mode is set or if the interrupted
+     instruction pointer falls within the cancellable syscall bridge.  For
+     interruptible syscalls that might generate external side-effects
+     (partial reads or writes, for instance), the kernel will set the IP to
+     after '__syscall_cancel_arch_end', thus disabling the cancellation and
+     allowing the process to handle such conditions.  */
+  if (cancel_async_enabled (oldval) || cancellation_pc_check (ctx))
+    __syscall_do_cancel ();
 }
 
 int
@@ -106,15 +96,13 @@ __pthread_cancel (pthread_t th)
   /* Some syscalls are never restarted after being interrupted by a signal
      handler, regardless of the use of SA_RESTART (they always fail with
      EINTR).  So pthread_cancel cannot send SIGCANCEL unless the cancellation
-     is enabled and set as asynchronous (in this case the cancellation will
-     be acted in the cancellation handler instead by the syscall wrapper).
-     Otherwise the target thread is set as 'cancelling' (CANCELING_BITMASK)
+     is enabled.
+     In this case the target thread is set as 'cancelled' (CANCELED_BITMASK)
      by atomically setting 'cancelhandling' and the cancelation will be acted
      upon on next cancellation entrypoing in the target thread.
 
-     It also requires to atomically check if cancellation is enabled and
-     asynchronous, so both cancellation state and type are tracked on
-     'cancelhandling'.  */
+     It also requires atomically checking if cancellation is enabled, so the
+     state is also tracked on 'cancelhandling'.  */
 
   int result = 0;
   int oldval = atomic_load_relaxed (&pd->cancelhandling);
@@ -122,19 +110,17 @@ __pthread_cancel (pthread_t th)
   do
     {
     again:
-      newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
+      newval = oldval | CANCELED_BITMASK;
       if (oldval == newval)
 	break;
 
       /* If the cancellation is handled asynchronously just send a
 	 signal.  We avoid this if possible since it's more
 	 expensive.  */
-      if (cancel_enabled_and_canceled_and_async (newval))
+      if (cancel_enabled (newval))
 	{
-	  /* Mark the cancellation as "in progress".  */
-	  int newval2 = oldval | CANCELING_BITMASK;
 	  if (!atomic_compare_exchange_weak_acquire (&pd->cancelhandling,
-						     &oldval, newval2))
+						     &oldval, newval))
 	    goto again;
 
 	  if (pd == THREAD_SELF)
@@ -144,7 +130,7 @@ __pthread_cancel (pthread_t th)
 	       set up for a self-cancel.  */
 	    {
 	      pd->result = PTHREAD_CANCELED;
-	      if ((newval & CANCELTYPE_BITMASK) != 0)
+	      if (cancel_async_enabled (newval))
 		__do_cancel ();
 	    }
 	  else
@@ -154,19 +140,19 @@ __pthread_cancel (pthread_t th)
 
 	  break;
 	}
-
-	/* A single-threaded process should be able to kill itself, since
-	   there is nothing in the POSIX specification that says that it
-	   cannot.  So we set multiple_threads to true so that cancellation
-	   points get executed.  */
-	THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
-#ifndef TLS_MULTIPLE_THREADS_IN_TCB
-	__libc_single_threaded_internal = 0;
-#endif
     }
   while (!atomic_compare_exchange_weak_acquire (&pd->cancelhandling, &oldval,
 						newval));
 
+  /* A single-threaded process should be able to kill itself, since
+     there is nothing in the POSIX specification that says that it
+     cannot.  So we set multiple_threads to true so that cancellation
+     points get executed.  */
+  THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
+#ifndef TLS_MULTIPLE_THREADS_IN_TCB
+  __libc_single_threaded_internal = 0;
+#endif
+
   return result;
 }
 versioned_symbol (libc, __pthread_cancel, pthread_cancel, GLIBC_2_34);
diff --git a/nptl/tst-cancel31.c b/nptl/tst-cancel31.c
new file mode 100644
index 0000000000..4e93cc5ae1
--- /dev/null
+++ b/nptl/tst-cancel31.c
@@ -0,0 +1,100 @@
+/* Check side-effect act for cancellable syscalls (BZ #12683).
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* This testcase checks whether there is a resource leak if the syscall
+   has returned from kernelspace, but before userspace saves the return
+   value.  The 'leaker' thread should be able to close the file descriptor
+   if the resource is already allocated, meaning that if the cancellation
+   signal arrives *after* the open syscall returns from the kernel, the
+   side-effect should be visible to the application.  */
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+
+#include <support/xunistd.h>
+#include <support/xthread.h>
+#include <support/check.h>
+#include <support/temp_file.h>
+#include <support/support.h>
+#include <support/descriptors.h>
+
+static void *
+writeopener (void *arg)
+{
+  int fd;
+  for (;;)
+    {
+      fd = open (arg, O_WRONLY);
+      xclose (fd);
+    }
+  return NULL;
+}
+
+static void *
+leaker (void *arg)
+{
+  int fd = open (arg, O_RDONLY);
+  TEST_VERIFY_EXIT (fd > 0);
+  pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, 0);
+  xclose (fd);
+  return NULL;
+}
+
+static int
+do_test (void)
+{
+  enum {
+    iter_count = 1000
+  };
+
+  char *dir = support_create_temp_directory ("tst-cancel28");
+  char *name = xasprintf ("%s/fifo", dir);
+  TEST_COMPARE (mkfifo (name, 0600), 0);
+  add_temp_file (name);
+
+  struct support_descriptors *descrs = support_descriptors_list ();
+
+  srand (1);
+
+  xpthread_create (NULL, writeopener, name);
+  for (int i = 0; i < iter_count; i++)
+    {
+      pthread_t td = xpthread_create (NULL, leaker, name);
+      struct timespec ts =
+	{ .tv_nsec = rand () % 100000, .tv_sec = 0 };
+      nanosleep (&ts, NULL);
+      /* Ignore the pthread_cancel result because the thread might
+	 have already exited by the time pthread_cancel is
+	 called.  */
+      pthread_cancel (td);
+      xpthread_join (td);
+    }
+
+  support_descriptors_check (descrs);
+
+  support_descriptors_free (descrs);
+
+  free (name);
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/generic/syscall_types.h b/sysdeps/generic/syscall_types.h
new file mode 100644
index 0000000000..2ddeaa2b5f
--- /dev/null
+++ b/sysdeps/generic/syscall_types.h
@@ -0,0 +1,25 @@
+/* Types and macros used for syscall issuing.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _SYSCALL_TYPES_H
+#define _SYSCALL_TYPES_H
+
+typedef long int __syscall_arg_t;
+#define __SSC(__x) ((__syscall_arg_t) (__x))
+
+#endif
diff --git a/sysdeps/nptl/cancellation-pc-check.h b/sysdeps/nptl/cancellation-pc-check.h
new file mode 100644
index 0000000000..6f82b62e36
--- /dev/null
+++ b/sysdeps/nptl/cancellation-pc-check.h
@@ -0,0 +1,54 @@
+/* Architecture specific code for pthread cancellation handling.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _NPTL_CANCELLATION_PC_CHECK
+#define _NPTL_CANCELLATION_PC_CHECK
+
+#include <sigcontextinfo.h>
+
+/* For syscalls with side-effects, the kernel cannot restart the syscall; when
+   it is interrupted by a signal, the kernel must cause the syscall to return
+   with whatever partial result is obtained (e.g. partial read or write).  In
+   this case, the saved program counter points just after the syscall
+   instruction, so the SIGCANCEL handler should not act on cancellation.
+
+   The __syscall_cancel_arch function, used for all cancellable syscalls,
+   contains two extra markers, __syscall_cancel_arch_start and
+   __syscall_cancel_arch_end.  The former points to just before the initial
+   conditional branch that checks if the thread has received a cancellation
+   request, while the latter points to the instruction after the one
+   responsible for issuing the syscall.
+
+   The function checks whether the program counter (PC) from ucontext_t CTX
+   is within the start and end boundaries of the __syscall_cancel_arch
+   bridge.  Return TRUE if the PC is within the boundary, meaning the
+   syscall does not have any side effects; or FALSE otherwise.  */
+
+static __always_inline bool
+cancellation_pc_check (void *ctx)
+{
+  /* Both are defined in syscall_cancel.S.  */
+  extern const char __syscall_cancel_arch_start[1];
+  extern const char __syscall_cancel_arch_end[1];
+
+  uintptr_t pc = sigcontext_get_pc (ctx);
+  return pc >= (uintptr_t) __syscall_cancel_arch_start
+	 && pc < (uintptr_t) __syscall_cancel_arch_end;
+}
+
+#endif
diff --git a/sysdeps/nptl/lowlevellock-futex.h b/sysdeps/nptl/lowlevellock-futex.h
index 0392b5c04f..bd57913b6f 100644
--- a/sysdeps/nptl/lowlevellock-futex.h
+++ b/sysdeps/nptl/lowlevellock-futex.h
@@ -21,7 +21,6 @@
 
 #ifndef __ASSEMBLER__
 # include <sysdep.h>
-# include <sysdep-cancel.h>
 # include <kernel-features.h>
 #endif
 
@@ -120,21 +119,10 @@
 		     nr_wake, nr_move, mutex, val)
 
 /* Like lll_futex_wait, but acting as a cancellable entrypoint.  */
-# define lll_futex_wait_cancel(futexp, val, private) \
-  ({                                                                   \
-    int __oldtype = LIBC_CANCEL_ASYNC ();			       \
-    long int __err = lll_futex_wait (futexp, val, LLL_SHARED);	       \
-    LIBC_CANCEL_RESET (__oldtype);				       \
-    __err;							       \
-  })
-
-/* Like lll_futex_timed_wait, but acting as a cancellable entrypoint.  */
-# define lll_futex_timed_wait_cancel(futexp, val, timeout, private) \
-  ({									   \
-    int __oldtype = LIBC_CANCEL_ASYNC ();			       	   \
-    long int __err = lll_futex_timed_wait (futexp, val, timeout, private); \
-    LIBC_CANCEL_RESET (__oldtype);					   \
-    __err;								   \
+# define lll_futex_wait_cancel(futexp, val, private)			\
+  ({									\
+     int __op = __lll_private_flag (FUTEX_WAIT, private);		\
+     INTERNAL_SYSCALL_CANCEL (futex, futexp, __op, val, NULL);		\
   })
 
 #endif  /* !__ASSEMBLER__  */
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index 54f9198681..c893267c52 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -272,6 +272,13 @@ __do_cancel (void)
 		    THREAD_GETMEM (self, cleanup_jmp_buf));
 }
 
+extern long int __syscall_cancel_arch (volatile int *, __syscall_arg_t nr,
+     __syscall_arg_t arg1, __syscall_arg_t arg2, __syscall_arg_t arg3,
+     __syscall_arg_t arg4, __syscall_arg_t arg5, __syscall_arg_t arg6)
+  attribute_hidden;
+
+extern _Noreturn void __syscall_do_cancel (void) attribute_hidden;
+
 
 /* Internal prototypes.  */
 
diff --git a/sysdeps/unix/sysdep.h b/sysdeps/unix/sysdep.h
index 1ba4de99db..78d5ab6a41 100644
--- a/sysdeps/unix/sysdep.h
+++ b/sysdeps/unix/sysdep.h
@@ -24,6 +24,9 @@
 #define	SYSCALL__(name, args)	PSEUDO (__##name, name, args)
 #define	SYSCALL(name, args)	PSEUDO (name, name, args)
 
+#ifndef __ASSEMBLER__
+# include <errno.h>
+
 #define __SYSCALL_CONCAT_X(a,b)     a##b
 #define __SYSCALL_CONCAT(a,b)       __SYSCALL_CONCAT_X (a, b)
 
@@ -108,42 +111,95 @@
 #define INLINE_SYSCALL_CALL(...) \
   __INLINE_SYSCALL_DISP (__INLINE_SYSCALL, __VA_ARGS__)
 
+#define __INTERNAL_SYSCALL_NCS0(name) \
+  INTERNAL_SYSCALL_NCS (name, 0)
+#define __INTERNAL_SYSCALL_NCS1(name, a1) \
+  INTERNAL_SYSCALL_NCS (name, 1, a1)
+#define __INTERNAL_SYSCALL_NCS2(name, a1, a2) \
+  INTERNAL_SYSCALL_NCS (name, 2, a1, a2)
+#define __INTERNAL_SYSCALL_NCS3(name, a1, a2, a3) \
+  INTERNAL_SYSCALL_NCS (name, 3, a1, a2, a3)
+#define __INTERNAL_SYSCALL_NCS4(name, a1, a2, a3, a4) \
+  INTERNAL_SYSCALL_NCS (name, 4, a1, a2, a3, a4)
+#define __INTERNAL_SYSCALL_NCS5(name, a1, a2, a3, a4, a5) \
+  INTERNAL_SYSCALL_NCS (name, 5, a1, a2, a3, a4, a5)
+#define __INTERNAL_SYSCALL_NCS6(name, a1, a2, a3, a4, a5, a6) \
+  INTERNAL_SYSCALL_NCS (name, 6, a1, a2, a3, a4, a5, a6)
+#define __INTERNAL_SYSCALL_NCS7(name, a1, a2, a3, a4, a5, a6, a7) \
+  INTERNAL_SYSCALL_NCS (name, 7, a1, a2, a3, a4, a5, a6, a7)
+
+/* Issue a syscall defined by syscall number plus any other argument required.
+   It is similar to INTERNAL_SYSCALL_NCS macro, but without the need to pass
+   the expected argument number as third parameter.  */
+#define INTERNAL_SYSCALL_NCS_CALL(...) \
+  __INTERNAL_SYSCALL_DISP (__INTERNAL_SYSCALL_NCS, __VA_ARGS__)
+
+/* Cancellation macros.  */
+#include <syscall_types.h>
+
+long int __syscall_cancel (__syscall_arg_t nr, __syscall_arg_t arg1,
+			   __syscall_arg_t arg2, __syscall_arg_t arg3,
+			   __syscall_arg_t arg4, __syscall_arg_t arg5,
+			   __syscall_arg_t arg6)
+  attribute_hidden;
+
+#define __SYSCALL_CANCEL0(name) \
+  __syscall_cancel (__NR_##name, 0, 0, 0, 0, 0, 0)
+#define __SYSCALL_CANCEL1(name, a1) \
+  __syscall_cancel (__NR_##name, __SSC (a1), 0, 0, 0, 0, 0)
+#define __SYSCALL_CANCEL2(name, a1, a2) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), 0, 0, 0, 0)
+#define __SYSCALL_CANCEL3(name, a1, a2, a3) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     0, 0, 0)
+#define __SYSCALL_CANCEL4(name, a1, a2, a3, a4) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     __SSC(a4), 0, 0)
+#define __SYSCALL_CANCEL5(name, a1, a2, a3, a4, a5) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     __SSC(a4), __SSC (a5), 0)
+#define __SYSCALL_CANCEL6(name, a1, a2, a3, a4, a5, a6) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     __SSC (a4), __SSC (a5), __SSC (a6))
+
+#define __SYSCALL_CANCEL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
+#define __SYSCALL_CANCEL_NARGS(...) \
+  __SYSCALL_CANCEL_NARGS_X (__VA_ARGS__,7,6,5,4,3,2,1,0,)
+#define __SYSCALL_CANCEL_CONCAT_X(a,b)     a##b
+#define __SYSCALL_CANCEL_CONCAT(a,b)       __SYSCALL_CANCEL_CONCAT_X (a, b)
+#define __SYSCALL_CANCEL_DISP(b,...) \
+  __SYSCALL_CANCEL_CONCAT (b,__SYSCALL_CANCEL_NARGS(__VA_ARGS__))(__VA_ARGS__)
+
+#define __SYSCALL_CANCEL_CALL(...) \
+  __SYSCALL_CANCEL_DISP (__SYSCALL_CANCEL, __VA_ARGS__)
+
+/* Issue a cancellable syscall defined by syscall number NAME plus any other
+   argument required.  If an error occurs, its value is returned as a negative
+   number unmodified and errno is not set.  */
 #if IS_IN (rtld)
-/* All cancellation points are compiled out in the dynamic loader.  */
-# define NO_SYSCALL_CANCEL_CHECKING 1
+# define INTERNAL_SYSCALL_CANCEL(name, args...) \
+  INTERNAL_SYSCALL_CALL(name, args)
 #else
-# define NO_SYSCALL_CANCEL_CHECKING SINGLE_THREAD_P
+# define INTERNAL_SYSCALL_CANCEL(name, args...) \
+  __SYSCALL_CANCEL_CALL (name, args)
 #endif
 
-#define SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (NO_SYSCALL_CANCEL_CHECKING)					     \
-      sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
-  })
+/* Issue a cancellable syscall defined by the first argument plus any other
+   arguments required.  If an error occurs, the macro returns -1 and sets
+   errno accordingly.  */
+#if IS_IN (rtld)
+/* The loader does not need to handle thread cancellation, use direct
+   syscall instead.  */
+# define SYSCALL_CANCEL(...) INLINE_SYSCALL_CALL (__VA_ARGS__)
+#else
+# define SYSCALL_CANCEL(...) \
+  ({									\
+    long int sc_ret = __SYSCALL_CANCEL_CALL (__VA_ARGS__);		\
+    SYSCALL_CANCEL_RET ((sc_ret));					\
+   })
+#endif
 
-/* Issue a syscall defined by syscall number plus any other argument
-   required.  Any error will be returned unmodified (including errno).  */
-#define INTERNAL_SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (NO_SYSCALL_CANCEL_CHECKING) 					     \
-      sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
-  })
+#endif /* __ASSEMBLER__  */
 
 /* Machine-dependent sysdep.h files are expected to define the macro
    PSEUDO (function_name, syscall_name) to emit assembly code to define the
diff --git a/sysdeps/unix/sysv/linux/socketcall.h b/sysdeps/unix/sysv/linux/socketcall.h
index d1a173277e..19a6c17a86 100644
--- a/sysdeps/unix/sysv/linux/socketcall.h
+++ b/sysdeps/unix/sysv/linux/socketcall.h
@@ -88,14 +88,33 @@
     sc_ret;								\
   })
 
-
-#define SOCKETCALL_CANCEL(name, args...)				\
-  ({									\
-    int oldtype = LIBC_CANCEL_ASYNC ();					\
-    long int sc_ret = __SOCKETCALL (SOCKOP_##name, args);		\
-    LIBC_CANCEL_RESET (oldtype);					\
-    sc_ret;								\
-  })
+#define __SOCKETCALL_CANCEL1(__name, __a1) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [1]) { (long int) __a1 }))
+#define __SOCKETCALL_CANCEL2(__name, __a1, __a2) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [2]) { (long int) __a1, (long int) __a2 }))
+#define __SOCKETCALL_CANCEL3(__name, __a1, __a2, __a3) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [3]) { (long int) __a1, (long int) __a2, (long int) __a3 }))
+#define __SOCKETCALL_CANCEL4(__name, __a1, __a2, __a3, __a4) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [4]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4 }))
+#define __SOCKETCALL_CANCEL5(__name, __a1, __a2, __a3, __a4, __a5) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [5]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5 }))
+#define __SOCKETCALL_CANCEL6(__name, __a1, __a2, __a3, __a4, __a5, __a6) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [6]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5, (long int) __a6 }))
+
+#define __SOCKETCALL_CANCEL(...) __SOCKETCALL_DISP (__SOCKETCALL_CANCEL,\
+						    __VA_ARGS__)
+
+#define SOCKETCALL_CANCEL(name, args...) \
+   __SOCKETCALL_CANCEL (SOCKOP_##name, args)
 
 
 #endif /* sys/socketcall.h */
diff --git a/sysdeps/unix/sysv/linux/syscall_cancel.c b/sysdeps/unix/sysv/linux/syscall_cancel.c
new file mode 100644
index 0000000000..c4d9d4e8bb
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/syscall_cancel.c
@@ -0,0 +1,61 @@
+/* Default cancellation syscall bridge.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <pthreadP.h>
+
+#warning "This implementation should be used just as a reference or for bootstrapping"
+
+/* This is the generic version of the cancellable syscall code which adds
+   the label guards (__syscall_cancel_arch_{start,end}) used by the
+   SIGCANCEL handler (sigcancel_handler in pthread_cancel.c) to check if
+   the cancelled syscall has side-effects to be signaled to the program.
+
+   This implementation should be used as a reference to document the
+   implementation constraints: __syscall_cancel_arch_end should point to
+   the instruction immediately after the syscall one.  This is because the
+   kernel will signal an interrupted syscall with side effects by setting
+   the signal frame program counter (in the ucontext_t third argument of
+   an SA_SIGINFO signal handler) right after the syscall instruction.
+
+   If the INTERNAL_SYSCALL_NCS macro uses more instructions to get the
+   error condition from the kernel (as for powerpc and sparc), uses an
+   out-of-line helper (as for ARM thumb), or uses a kernel helper
+   gate (as for i686 or ia64), the architecture should adjust the
+   macro or provide a custom __syscall_cancel_arch implementation.  */
+long int
+__syscall_cancel_arch (volatile int *ch, __syscall_arg_t nr,
+		       __syscall_arg_t a1, __syscall_arg_t a2,
+		       __syscall_arg_t a3, __syscall_arg_t a4,
+		       __syscall_arg_t a5, __syscall_arg_t a6)
+{
+#define ADD_LABEL(__label)		\
+  asm volatile (			\
+    ".global " __label "\t\n"		\
+    __label ":\n");
+
+  ADD_LABEL ("__syscall_cancel_arch_start");
+  if (__glibc_unlikely (*ch & CANCELED_BITMASK))
+    __syscall_do_cancel();
+
+  long int result = INTERNAL_SYSCALL_NCS_CALL (nr, a1, a2, a3, a4, a5, a6);
+  ADD_LABEL ("__syscall_cancel_arch_end");
+  if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (result)))
+    return -INTERNAL_SYSCALL_ERRNO (result);
+  return result;
+}
diff --git a/sysdeps/unix/sysv/linux/sysdep-cancel.h b/sysdeps/unix/sysv/linux/sysdep-cancel.h
index 102682c5ee..1b686d53a9 100644
--- a/sysdeps/unix/sysv/linux/sysdep-cancel.h
+++ b/sysdeps/unix/sysv/linux/sysdep-cancel.h
@@ -21,17 +21,5 @@
 #define _SYSDEP_CANCEL_H
 
 #include <sysdep.h>
-#include <tls.h>
-#include <errno.h>
-
-/* Set cancellation mode to asynchronous.  */
-extern int __pthread_enable_asynccancel (void);
-libc_hidden_proto (__pthread_enable_asynccancel)
-#define LIBC_CANCEL_ASYNC() __pthread_enable_asynccancel ()
-
-/* Reset to previous cancellation mode.  */
-extern void __pthread_disable_asynccancel (int oldtype);
-libc_hidden_proto (__pthread_disable_asynccancel)
-#define LIBC_CANCEL_RESET(oldtype) __pthread_disable_asynccancel (oldtype)
 
 #endif
diff --git a/sysdeps/unix/sysv/linux/sysdep.h b/sysdeps/unix/sysv/linux/sysdep.h
index cc975e9f3e..16a6672074 100644
--- a/sysdeps/unix/sysv/linux/sysdep.h
+++ b/sysdeps/unix/sysv/linux/sysdep.h
@@ -59,6 +59,15 @@
     -1l;					\
   })
 
+/* The error return from a cancellable syscall has the same semantics as
+   non-cancellable ones.  */
+#define SYSCALL_CANCEL_RET(__ret)				\
+  ({								\
+    __glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (__ret))		\
+    ? SYSCALL_ERROR_LABEL (INTERNAL_SYSCALL_ERRNO (__ret))	\
+    : __ret;							\
+   })
+
 /* Provide a dummy argument that can be used to force register
    alignment for register pairs if required by the syscall ABI.  */
 #ifdef __ASSUME_ALIGNED_REGISTER_PAIRS


* [glibc/azanella/bz12683] nptl: Fix Race conditions in pthread cancellation [BZ#12683]
@ 2023-04-11 14:18 Adhemerval Zanella
  0 siblings, 0 replies; 6+ messages in thread
From: Adhemerval Zanella @ 2023-04-11 14:18 UTC (permalink / raw)
  To: glibc-cvs

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=9b73801e2abcfd3d08f5c79ebf7caf9daf27ff3d

commit 9b73801e2abcfd3d08f5c79ebf7caf9daf27ff3d
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Fri Mar 31 17:24:24 2023 -0300

    nptl: Fix Race conditions in pthread cancellation [BZ#12683]
    
    The current racy approach is to enable asynchronous cancellation
    before making the syscall and restore the previous cancellation
    type once the syscall returns, and check if cancellation has happened
    during the cancellation entrypoint.
    
    As described in BZ#12683, this approach shows 2 problems:
    
      1. Cancellation can act after the syscall has returned from the
         kernel, but before userspace saves the return value.  It might
         result in a resource leak if the syscall allocated a resource or
         had a side effect (a partial read/write), and there is no way for
         the program to handle it with cancellation handlers.
    
      2. If a signal is handled while the thread is blocked at a cancellable
         syscall, the entire signal handler runs with asynchronous
         cancellation enabled.  This can lead to issues if the signal
         handler calls functions which are async-signal-safe but not
         async-cancel-safe.
    
    For the cancellation to work correctly, there are 5 points at which the
    cancellation signal could arrive:
    
      1. Before the final "testcancel" and before the syscall is made.
      2. Between the "testcancel" and the syscall.
      3. While the syscall is blocked and no side effects have yet taken
         place.
      4. While the syscall is blocked but with some side effects already
         having taken place (e.g. a partial read or write).
      5. After the syscall has returned.
    
    And libc wants to act on cancellation in cases 1, 2, and 3 but not
    in cases 4 or 5.  For cases 4 and 5, the cancellation will eventually
    happen in the next cancellable entrypoint without any further external
    event.
    
    The proposed solution for each case is:
    
      1. Do a conditional branch based on whether the thread has received
         a cancellation request;
    
      2. It can be caught by the signal handler determining that the saved
         program counter (from the ucontext_t) is in some address range
         beginning just before the "testcancel" and ending with the
         syscall instruction.
    
      3. In this case, except for certain syscalls that ALWAYS fail with
         EINTR even for non-interrupting signals, the kernel will reset
         the program counter to point at the syscall instruction during
         signal handling, so that the syscall is restarted when the signal
         handler returns.  So, from the signal handler's standpoint, this
         looks the same as case 2, and thus it's taken care of.
    
      4. For syscalls with side-effects, the kernel cannot restart the
         syscall; when it's interrupted by a signal, the kernel must cause
         the syscall to return with whatever partial result is obtained
         (e.g. partial read or write).
    
      5. The saved program counter points just after the syscall
         instruction, so the signal handler won't act on cancellation.
         This is similar to 4. since the program counter is past the syscall
         instruction.
    
    The proposed fixes are:
    
      1. Remove the enable_asynccancel/disable_asynccancel function usage in
         cancellable syscall definitions and instead make them call a common
         symbol that will check if cancellation is enabled (__syscall_cancel
         at nptl/cancellation.c), call the arch-specific cancellable
         entry-point (__syscall_cancel_arch), and cancel the thread when
         required.
    
      2. Provide an arch-specific generic system call wrapper function
         that contains global markers.  These markers are used by the
         SIGCANCEL signal handler to check whether the interruption
         occurred within a valid syscall and whether the syscall has
         side-effects.
    
         A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c
         is provided.  However, the markers may not be placed at the
         expected positions, depending on how INTERNAL_SYSCALL_NCS is
         implemented by the architecture.  It is expected that all
         architectures add an arch-specific implementation.
    
      3. Rewrite the SIGCANCEL asynchronous handler to check both the
         cancellation type and whether the current IP from the signal handler
         falls between the global markers, and act accordingly.
    
      4. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET with
         the appropriate cancellable syscalls.
    
      5. Adjust 'lowlevellock-futex.h' arch-specific implementations to
         provide cancelable futex calls.
    
    This patch adds the proposed changes to NPTL common code and the
    following patches add the required arch-specific bits.  The builds for
    ia64-linux-gnu, mips-*, and x86_64-* are broken without the
    arch-specific patches.
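
    The two cancellable entry points in this revision differ only in their
    error convention; a condensed excerpt based on nptl/cancellation.c in
    the diff below:

      /* __internal_syscall_cancel (INTERNAL_SYSCALL_CANCEL) returns errors
         as a negative errno value and leaves errno untouched, while
         __syscall_cancel (SYSCALL_CANCEL) turns them into -1 plus errno.  */
      long int r = __internal_syscall_cancel (a1, a2, a3, a4, a5, a6, nr);
      return __glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (r))
             ? SYSCALL_ERROR_LABEL (INTERNAL_SYSCALL_ERRNO (r))
             : r;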

Diff:
---
 elf/Makefile                             |   5 +-
 nptl/Makefile                            |  11 ++-
 nptl/cancellation.c                      | 112 +++++++++++++------------
 nptl/cleanup_defer.c                     |   5 +-
 nptl/descr-const.sym                     |   6 ++
 nptl/descr.h                             |  18 ++++
 nptl/libc-cleanup.c                      |   5 +-
 nptl/pthread_cancel.c                    |  78 +++++++----------
 nptl/pthread_exit.c                      |   4 +-
 nptl/pthread_setcancelstate.c            |   2 +-
 nptl/pthread_setcanceltype.c             |   2 +-
 nptl/pthread_testcancel.c                |   5 +-
 nptl/tst-cancel31.c                      | 100 ++++++++++++++++++++++
 sysdeps/generic/syscall_types.h          |  25 ++++++
 sysdeps/nptl/cancellation-pc-check.h     |  54 ++++++++++++
 sysdeps/nptl/lowlevellock-futex.h        |  20 +----
 sysdeps/nptl/pthreadP.h                  |  11 ++-
 sysdeps/unix/sysdep.h                    | 140 ++++++++++++++++++++++++-------
 sysdeps/unix/sysv/linux/socketcall.h     |  35 ++++++--
 sysdeps/unix/sysv/linux/syscall_cancel.c |  71 ++++++++++++++++
 sysdeps/unix/sysv/linux/sysdep-cancel.h  |  12 ---
 21 files changed, 525 insertions(+), 196 deletions(-)

diff --git a/elf/Makefile b/elf/Makefile
index 396ec51424..b684e5d44c 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -1252,11 +1252,8 @@ $(objpfx)dl-allobjs.os: $(all-rtld-routines:%=$(objpfx)%.os)
 # discovery mechanism is not compatible with the libc implementation
 # when compiled for libc.
 rtld-stubbed-symbols = \
-  __GI___pthread_disable_asynccancel \
-  __GI___pthread_enable_asynccancel \
+  __syscall_cancel \
   __libc_assert_fail \
-  __pthread_disable_asynccancel \
-  __pthread_enable_asynccancel \
   calloc \
   free \
   malloc \
diff --git a/nptl/Makefile b/nptl/Makefile
index 8cec6faee3..bce033179e 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -204,6 +204,7 @@ routines = \
   sem_timedwait \
   sem_unlink \
   sem_wait \
+  syscall_cancel \
   tpp \
   unwind \
   vars \
@@ -234,7 +235,8 @@ CFLAGS-pthread_setcanceltype.c += -fexceptions -fasynchronous-unwind-tables
 
 # These are internal functions which similar functionality as setcancelstate
 # and setcanceltype.
-CFLAGS-cancellation.c += -fasynchronous-unwind-tables
+CFLAGS-cancellation.c += -fexceptions -fasynchronous-unwind-tables
+CFLAGS-syscall_cancel.c += -fexceptions -fasynchronous-unwind-tables
 
 # Calling pthread_exit() must cause the registered cancel handlers to
 # be executed.  Therefore exceptions have to be thrown through this
@@ -286,7 +288,7 @@ tests = tst-attr2 tst-attr3 tst-default-attr \
 	tst-sem17 \
 	tst-tsd3 tst-tsd4 \
 	tst-cancel4_1 tst-cancel4_2 \
-	tst-cancel7 tst-cancel17 tst-cancel24 \
+	tst-cancel7 tst-cancel17 tst-cancel24 tst-cancel31 \
 	tst-signal3 \
 	tst-exec4 tst-exec5 \
 	tst-stack2 tst-stack3 tst-stack4 \
@@ -339,7 +341,10 @@ xtests += tst-eintr1
 
 test-srcs = tst-oddstacklimit
 
-gen-as-const-headers = unwindbuf.sym
+gen-as-const-headers = \
+  descr-const.sym \
+  unwindbuf.sym \
+  # gen-as-const-headers
 
 gen-py-const-headers := nptl_lock_constants.pysym
 pretty-printers := nptl-printers.py
diff --git a/nptl/cancellation.c b/nptl/cancellation.c
index 765511d66d..eee5b6b758 100644
--- a/nptl/cancellation.c
+++ b/nptl/cancellation.c
@@ -18,74 +18,78 @@
 #include <setjmp.h>
 #include <stdlib.h>
 #include "pthreadP.h"
-#include <futex-internal.h>
 
-
-/* The next two functions are similar to pthread_setcanceltype() but
-   more specialized for the use in the cancelable functions like write().
-   They do not need to check parameters etc.  These functions must be
-   AS-safe, with the exception of the actual cancellation, because they
-   are called by wrappers around AS-safe functions like write().*/
-int
-__pthread_enable_asynccancel (void)
+/* Called by the INTERNAL_SYSCALL_CANCEL macro, check for cancellation and
+   returns the syscall value or its negative error code.  */
+long int
+__internal_syscall_cancel (__syscall_arg_t a1, __syscall_arg_t a2,
+			   __syscall_arg_t a3, __syscall_arg_t a4,
+			   __syscall_arg_t a5, __syscall_arg_t a6,
+			   __syscall_arg_t nr)
 {
-  struct pthread *self = THREAD_SELF;
-  int oldval = atomic_load_relaxed (&self->cancelhandling);
+  long int result;
+  struct pthread *pd = THREAD_SELF;
 
-  while (1)
+  /* If cancellation is not enabled, call the syscall directly, and do the
+     same for thread termination, to avoid calling __syscall_do_cancel
+     while executing cleanup handlers.  */
+  int ch = atomic_load_relaxed (&pd->cancelhandling);
+  if (SINGLE_THREAD_P || !cancel_enabled (ch) || cancel_exiting (ch))
     {
-      int newval = oldval | CANCELTYPE_BITMASK;
-
-      if (newval == oldval)
-	break;
+      result = INTERNAL_SYSCALL_NCS_CALL (nr, a1, a2, a3, a4, a5, a6);
+      if (INTERNAL_SYSCALL_ERROR_P (result))
+	return -INTERNAL_SYSCALL_ERRNO (result);
+      return result;
+    }
 
-      if (atomic_compare_exchange_weak_acquire (&self->cancelhandling,
-						&oldval, newval))
-	{
-	  if (cancel_enabled_and_canceled_and_async (newval))
-	    {
-	      self->result = PTHREAD_CANCELED;
-	      __do_cancel ();
-	    }
+  /* Call the arch-specific entry point that contains the global markers
+     to be checked by the SIGCANCEL handler.  */
+  result = __syscall_cancel_arch (&pd->cancelhandling, nr, a1, a2, a3, a4, a5,
+			          a6);
 
-	  break;
-	}
-    }
+  /* If the cancellable syscall was interrupted by SIGCANCEL and it has no
+     side-effects, cancel the thread if cancellation is enabled.  */
+  ch = atomic_load_relaxed (&pd->cancelhandling);
+  if (result == -EINTR && cancel_enabled_and_canceled (ch))
+    __syscall_do_cancel ();
 
-  return oldval;
+  return result;
 }
-libc_hidden_def (__pthread_enable_asynccancel)
 
-/* See the comment for __pthread_enable_asynccancel regarding
-   the AS-safety of this function.  */
-void
-__pthread_disable_asynccancel (int oldtype)
+/* Called by the SYSCALL_CANCEL macro, check for cancellation and return
+   the syscall expected success value (usually 0) or, in case of failure,
+   -1 with errno set to the syscall error value.  */
+long int
+__syscall_cancel (__syscall_arg_t a1, __syscall_arg_t a2,
+		  __syscall_arg_t a3, __syscall_arg_t a4,
+		  __syscall_arg_t a5, __syscall_arg_t a6,
+		  __syscall_arg_t nr)
 {
-  /* If asynchronous cancellation was enabled before we do not have
-     anything to do.  */
-  if (oldtype & CANCELTYPE_BITMASK)
-    return;
+  int r = __internal_syscall_cancel (a1, a2, a3, a4, a5, a6, nr);
+  return __glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (r))
+	 ? SYSCALL_ERROR_LABEL (INTERNAL_SYSCALL_ERRNO (r))
+	 : r;
+}
 
+/* Called by __syscall_cancel_arch or the functions above to start the
+   thread cancellation.  */
+_Noreturn void
+__syscall_do_cancel (void)
+{
   struct pthread *self = THREAD_SELF;
-  int newval;
+
+  /* Disable thread cancellation to avoid cancellable entrypoints calling
+     __syscall_do_cancel recursively.  */
   int oldval = atomic_load_relaxed (&self->cancelhandling);
-  do
+  while (1)
     {
-      newval = oldval & ~CANCELTYPE_BITMASK;
+      int newval = oldval | CANCELSTATE_BITMASK;
+      if (oldval == newval)
+	break;
+      if (atomic_compare_exchange_weak_acquire (&self->cancelhandling,
+						&oldval, newval))
+	break;
     }
-  while (!atomic_compare_exchange_weak_acquire (&self->cancelhandling,
-						&oldval, newval));
 
-  /* We cannot return when we are being canceled.  Upon return the
-     thread might be things which would have to be undone.  The
-     following loop should loop until the cancellation signal is
-     delivered.  */
-  while (__glibc_unlikely ((newval & (CANCELING_BITMASK | CANCELED_BITMASK))
-			   == CANCELING_BITMASK))
-    {
-      futex_wait_simple ((unsigned int *) &self->cancelhandling, newval,
-			 FUTEX_PRIVATE);
-      newval = atomic_load_relaxed (&self->cancelhandling);
-    }
+  __do_cancel (PTHREAD_CANCELED);
 }
-libc_hidden_def (__pthread_disable_asynccancel)
diff --git a/nptl/cleanup_defer.c b/nptl/cleanup_defer.c
index eef87f9a9c..d04227722b 100644
--- a/nptl/cleanup_defer.c
+++ b/nptl/cleanup_defer.c
@@ -82,10 +82,7 @@ ___pthread_unregister_cancel_restore (__pthread_unwind_buf_t *buf)
 						    &cancelhandling, newval));
 
       if (cancel_enabled_and_canceled (cancelhandling))
-	{
-	  self->result = PTHREAD_CANCELED;
-	  __do_cancel ();
-	}
+	__do_cancel (PTHREAD_CANCELED);
     }
 }
 versioned_symbol (libc, ___pthread_unregister_cancel_restore,
diff --git a/nptl/descr-const.sym b/nptl/descr-const.sym
new file mode 100644
index 0000000000..8608248354
--- /dev/null
+++ b/nptl/descr-const.sym
@@ -0,0 +1,6 @@
+#include <tls.h>
+
+-- Not strictly offsets, these values are used for thread cancellation by
+-- the arch-specific cancel entrypoint.
+TCB_CANCELED_BIT	 CANCELED_BIT
+TCB_CANCELED_BITMASK	 CANCELED_BITMASK
diff --git a/nptl/descr.h b/nptl/descr.h
index f8b5ac7c22..142470f3f3 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -415,6 +415,24 @@ struct pthread
   (sizeof (struct pthread) - offsetof (struct pthread, end_padding))
 } __attribute ((aligned (TCB_ALIGNMENT)));
 
+static inline bool
+cancel_enabled (int value)
+{
+  return (value & CANCELSTATE_BITMASK) == 0;
+}
+
+static inline bool
+cancel_async_enabled (int value)
+{
+  return (value & CANCELTYPE_BITMASK) != 0;
+}
+
+static inline bool
+cancel_exiting (int value)
+{
+  return (value & EXITING_BITMASK) != 0;
+}
+
 static inline bool
 cancel_enabled_and_canceled (int value)
 {
diff --git a/nptl/libc-cleanup.c b/nptl/libc-cleanup.c
index 4c7bcda302..252006060a 100644
--- a/nptl/libc-cleanup.c
+++ b/nptl/libc-cleanup.c
@@ -69,10 +69,7 @@ __libc_cleanup_pop_restore (struct _pthread_cleanup_buffer *buffer)
 						    &cancelhandling, newval));
 
       if (cancel_enabled_and_canceled (cancelhandling))
-	{
-	  self->result = PTHREAD_CANCELED;
-	  __do_cancel ();
-	}
+	__do_cancel (PTHREAD_CANCELED);
     }
 }
 libc_hidden_def (__libc_cleanup_pop_restore)
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index 87c9ef69ad..fc5ca8b3d4 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -23,6 +23,7 @@
 #include <sysdep.h>
 #include <unistd.h>
 #include <unwind-link.h>
+#include <cancellation-pc-check.h>
 #include <stdio.h>
 #include <gnu/lib-names.h>
 #include <sys/single_threaded.h>
@@ -40,31 +41,16 @@ sigcancel_handler (int sig, siginfo_t *si, void *ctx)
       || si->si_code != SI_TKILL)
     return;
 
+  /* Check if asynchronous cancellation mode is set or if the interrupted
+     instruction pointer falls within the cancellable syscall bridge.  For
+     interruptible syscalls with external side-effects (e.g. partial reads),
+     the kernel will set the IP to after __syscall_cancel_arch_end, thus
+     disabling the cancellation and allowing the process to handle such
+     conditions.  */
   struct pthread *self = THREAD_SELF;
-
   int oldval = atomic_load_relaxed (&self->cancelhandling);
-  while (1)
-    {
-      /* We are canceled now.  When canceled by another thread this flag
-	 is already set but if the signal is directly send (internally or
-	 from another process) is has to be done here.  */
-      int newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
-
-      if (oldval == newval || (oldval & EXITING_BITMASK) != 0)
-	/* Already canceled or exiting.  */
-	break;
-
-      if (atomic_compare_exchange_weak_acquire (&self->cancelhandling,
-						&oldval, newval))
-	{
-	  self->result = PTHREAD_CANCELED;
-
-	  /* Make sure asynchronous cancellation is still enabled.  */
-	  if ((oldval & CANCELTYPE_BITMASK) != 0)
-	    /* Run the registered destructors and terminate the thread.  */
-	    __do_cancel ();
-	}
-    }
+  if (cancel_async_enabled (oldval) || cancellation_pc_check (ctx))
+    __syscall_do_cancel ();
 }
 
 int
@@ -106,15 +92,13 @@ __pthread_cancel (pthread_t th)
   /* Some syscalls are never restarted after being interrupted by a signal
      handler, regardless of the use of SA_RESTART (they always fail with
      EINTR).  So pthread_cancel cannot send SIGCANCEL unless the cancellation
-     is enabled and set as asynchronous (in this case the cancellation will
-     be acted in the cancellation handler instead by the syscall wrapper).
-     Otherwise the target thread is set as 'cancelling' (CANCELING_BITMASK)
+     is enabled.  In this case the target thread is marked as 'cancelled'
+     (CANCELED_BITMASK)
      by atomically setting 'cancelhandling' and the cancelation will be acted
      upon on next cancellation entrypoing in the target thread.
 
-     It also requires to atomically check if cancellation is enabled and
-     asynchronous, so both cancellation state and type are tracked on
-     'cancelhandling'.  */
+     It also requires atomically checking whether cancellation is enabled, so
+     the cancellation state is also tracked in 'cancelhandling'.  */
 
   int result = 0;
   int oldval = atomic_load_relaxed (&pd->cancelhandling);
@@ -122,19 +106,17 @@ __pthread_cancel (pthread_t th)
   do
     {
     again:
-      newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
+      newval = oldval | CANCELED_BITMASK;
       if (oldval == newval)
 	break;
 
-      /* If the cancellation is handled asynchronously just send a
-	 signal.  We avoid this if possible since it's more
-	 expensive.  */
-      if (cancel_enabled_and_canceled_and_async (newval))
+      /* Only send the SIGCANCEL signal if cancellation is enabled, since some
+	 syscalls are never restarted even with SA_RESTART.  The signal
+	 will act iff asynchronous cancellation is enabled.  */
+      if (cancel_enabled (newval))
 	{
-	  /* Mark the cancellation as "in progress".  */
-	  int newval2 = oldval | CANCELING_BITMASK;
 	  if (!atomic_compare_exchange_weak_acquire (&pd->cancelhandling,
-						     &oldval, newval2))
+						     &oldval, newval))
 	    goto again;
 
 	  if (pd == THREAD_SELF)
@@ -143,9 +125,8 @@ __pthread_cancel (pthread_t th)
 	       pthread_create, so the signal handler may not have been
 	       set up for a self-cancel.  */
 	    {
-	      pd->result = PTHREAD_CANCELED;
-	      if ((newval & CANCELTYPE_BITMASK) != 0)
-		__do_cancel ();
+	      if (cancel_async_enabled (newval))
+		__do_cancel (PTHREAD_CANCELED);
 	    }
 	  else
 	    /* The cancellation handler will take care of marking the
@@ -154,19 +135,18 @@ __pthread_cancel (pthread_t th)
 
 	  break;
 	}
-
-	/* A single-threaded process should be able to kill itself, since
-	   there is nothing in the POSIX specification that says that it
-	   cannot.  So we set multiple_threads to true so that cancellation
-	   points get executed.  */
-	THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
-#ifndef TLS_MULTIPLE_THREADS_IN_TCB
-	__libc_single_threaded_internal = 0;
-#endif
     }
   while (!atomic_compare_exchange_weak_acquire (&pd->cancelhandling, &oldval,
 						newval));
 
+  /* A single-threaded process should be able to kill itself, since there is
+     nothing in the POSIX specification that says that it cannot.  So we set
+     multiple_threads to true so that cancellation points get executed.  */
+  THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
+#ifndef TLS_MULTIPLE_THREADS_IN_TCB
+  __libc_single_threaded_internal = 0;
+#endif
+
   return result;
 }
 versioned_symbol (libc, __pthread_cancel, pthread_cancel, GLIBC_2_34);
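
The retry logic above is the usual weak compare-and-swap pattern: reload the
expected value whenever the exchange fails (possibly spuriously) and retry
until the update lands or becomes unnecessary.  A minimal self-contained
sketch of that pattern, using C11 atomics instead of glibc's internal atomic
macros:

  #include <stdatomic.h>
  #include <stdbool.h>

  /* Set BIT in *FLAGS unless it is already set; return true if this call
     performed the update.  The weak CAS may fail spuriously, so the loop
     reloads the expected value and retries, as __pthread_cancel does.  */
  static bool
  set_flag_once (atomic_int *flags, int bit)
  {
    int oldval = atomic_load_explicit (flags, memory_order_relaxed);
    int newval;
    do
      {
        newval = oldval | bit;
        if (newval == oldval)
          return false;   /* Nothing to do.  */
      }
    while (!atomic_compare_exchange_weak_explicit (flags, &oldval, newval,
                                                   memory_order_acquire,
                                                   memory_order_relaxed));
    return true;
  }
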
diff --git a/nptl/pthread_exit.c b/nptl/pthread_exit.c
index 9f48dcc5d0..125f44b78a 100644
--- a/nptl/pthread_exit.c
+++ b/nptl/pthread_exit.c
@@ -31,9 +31,7 @@ __pthread_exit (void *value)
                     " must be installed for pthread_exit to work\n");
   }
 
-  THREAD_SETMEM (THREAD_SELF, result, value);
-
-  __do_cancel ();
+  __do_cancel (value);
 }
 libc_hidden_def (__pthread_exit)
 weak_alias (__pthread_exit, pthread_exit)
diff --git a/nptl/pthread_setcancelstate.c b/nptl/pthread_setcancelstate.c
index 7f81d812dd..ffb482a83d 100644
--- a/nptl/pthread_setcancelstate.c
+++ b/nptl/pthread_setcancelstate.c
@@ -48,7 +48,7 @@ __pthread_setcancelstate (int state, int *oldstate)
 						&oldval, newval))
 	{
 	  if (cancel_enabled_and_canceled_and_async (newval))
-	    __do_cancel ();
+	    __do_cancel (PTHREAD_CANCELED);
 
 	  break;
 	}
diff --git a/nptl/pthread_setcanceltype.c b/nptl/pthread_setcanceltype.c
index 7dfeee4364..9fe7c0029b 100644
--- a/nptl/pthread_setcanceltype.c
+++ b/nptl/pthread_setcanceltype.c
@@ -48,7 +48,7 @@ __pthread_setcanceltype (int type, int *oldtype)
 	  if (cancel_enabled_and_canceled_and_async (newval))
 	    {
 	      THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-	      __do_cancel ();
+	      __do_cancel (PTHREAD_CANCELED);
 	    }
 
 	  break;
diff --git a/nptl/pthread_testcancel.c b/nptl/pthread_testcancel.c
index 38b5a2d4bc..b574c0f001 100644
--- a/nptl/pthread_testcancel.c
+++ b/nptl/pthread_testcancel.c
@@ -25,10 +25,7 @@ ___pthread_testcancel (void)
   struct pthread *self = THREAD_SELF;
   int cancelhandling = atomic_load_relaxed (&self->cancelhandling);
   if (cancel_enabled_and_canceled (cancelhandling))
-    {
-      self->result = PTHREAD_CANCELED;
-      __do_cancel ();
-    }
+    __do_cancel (PTHREAD_CANCELED);
 }
 versioned_symbol (libc, ___pthread_testcancel, pthread_testcancel, GLIBC_2_34);
 libc_hidden_ver (___pthread_testcancel, __pthread_testcancel)
diff --git a/nptl/tst-cancel31.c b/nptl/tst-cancel31.c
new file mode 100644
index 0000000000..4e93cc5ae1
--- /dev/null
+++ b/nptl/tst-cancel31.c
@@ -0,0 +1,100 @@
+/* Check that side effects of cancellable syscalls are not lost (BZ #12683).
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* This testcase checks for resource leakage when the syscall has returned
+   from kernelspace but userspace has not yet saved the return value.  The
+   'leaker' thread should be able to close the file descriptor if the
+   resource is already allocated, meaning that if the cancellation signal
+   arrives *after* the open syscall returns from the kernel, the side effect
+   should be visible to the application.  */
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+
+#include <support/xunistd.h>
+#include <support/xthread.h>
+#include <support/check.h>
+#include <support/temp_file.h>
+#include <support/support.h>
+#include <support/descriptors.h>
+
+static void *
+writeopener (void *arg)
+{
+  int fd;
+  for (;;)
+    {
+      fd = open (arg, O_WRONLY);
+      xclose (fd);
+    }
+  return NULL;
+}
+
+static void *
+leaker (void *arg)
+{
+  int fd = open (arg, O_RDONLY);
+  TEST_VERIFY_EXIT (fd > 0);
+  pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, 0);
+  xclose (fd);
+  return NULL;
+}
+
+static int
+do_test (void)
+{
+  enum {
+    iter_count = 1000
+  };
+
+  char *dir = support_create_temp_directory ("tst-cancel31");
+  char *name = xasprintf ("%s/fifo", dir);
+  TEST_COMPARE (mkfifo (name, 0600), 0);
+  add_temp_file (name);
+
+  struct support_descriptors *descrs = support_descriptors_list ();
+
+  srand (1);
+
+  xpthread_create (NULL, writeopener, name);
+  for (int i = 0; i < iter_count; i++)
+    {
+      pthread_t td = xpthread_create (NULL, leaker, name);
+      struct timespec ts =
+	{ .tv_nsec = rand () % 100000, .tv_sec = 0 };
+      nanosleep (&ts, NULL);
+      /* Ignore the pthread_cancel result because it might be called
+	 after the thread has already exited.  */
+      pthread_cancel (td);
+      xpthread_join (td);
+    }
+
+  support_descriptors_check (descrs);
+
+  support_descriptors_free (descrs);
+
+  free (name);
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/sysdeps/generic/syscall_types.h b/sysdeps/generic/syscall_types.h
new file mode 100644
index 0000000000..2ddeaa2b5f
--- /dev/null
+++ b/sysdeps/generic/syscall_types.h
@@ -0,0 +1,25 @@
+/* Types and macros used for syscall issuing.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#ifndef _SYSCALL_TYPES_H
+#define _SYSCALL_TYPES_H
+
+typedef long int __syscall_arg_t;
+#define __SSC(__x) ((__syscall_arg_t) (__x))
+
+#endif
diff --git a/sysdeps/nptl/cancellation-pc-check.h b/sysdeps/nptl/cancellation-pc-check.h
new file mode 100644
index 0000000000..cb38ad6819
--- /dev/null
+++ b/sysdeps/nptl/cancellation-pc-check.h
@@ -0,0 +1,54 @@
+/* Architecture specific code for pthread cancellation handling.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _NPTL_CANCELLATION_PC_CHECK
+#define _NPTL_CANCELLATION_PC_CHECK
+
+#include <sigcontextinfo.h>
+
+/* For syscalls with side effects (e.g. a read that returns a partial result),
+   the kernel cannot restart the syscall when it is interrupted by a signal;
+   it must return from the call with whatever partial result was obtained.  In
+   this case, the saved program counter is set just after the syscall
+   instruction, so the SIGCANCEL handler should not act on cancellation.
+
+   The __syscall_cancel_arch function, used for all cancellable syscalls,
+   contains two extra markers, __syscall_cancel_arch_start and
+   __syscall_cancel_arch_end.  The former points just before the initial
+   conditional branch that checks whether the thread has received a
+   cancellation request, while the latter points to the instruction after the
+   one responsible for issuing the syscall.
+
+   The function checks whether the program counter (PC) from ucontext_t CTX
+   falls within the start and end boundaries of the __syscall_cancel_arch
+   bridge.  It returns true if the PC is within the boundaries, meaning the
+   syscall does not have any side effects, and false otherwise.  */
+
+static __always_inline bool
+cancellation_pc_check (void *ctx)
+{
+  /* Both are defined in syscall_cancel.S.  */
+  extern const char __syscall_cancel_arch_start[1];
+  extern const char __syscall_cancel_arch_end[1];
+
+  uintptr_t pc = sigcontext_get_pc (ctx);
+  return pc >= (uintptr_t) __syscall_cancel_arch_start
+	 && pc < (uintptr_t) __syscall_cancel_arch_end;
+}
+
+#endif
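
cancellation_pc_check relies on the per-architecture sigcontext_get_pc helper
from <sigcontextinfo.h>.  As a rough sketch (not part of this patch), on
x86_64 such a helper can be little more than reading the saved RIP out of the
machine context, assuming the Linux ucontext layout:

  #define _GNU_SOURCE 1
  #include <stdint.h>
  #include <ucontext.h>

  /* Sketch of an arch-specific sigcontext_get_pc: the interrupted program
     counter is the saved RIP register in the signal frame's mcontext.  */
  static inline uintptr_t
  sigcontext_get_pc (const void *ctx)
  {
    const ucontext_t *uc = ctx;
    return uc->uc_mcontext.gregs[REG_RIP];
  }
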
diff --git a/sysdeps/nptl/lowlevellock-futex.h b/sysdeps/nptl/lowlevellock-futex.h
index 0392b5c04f..bd57913b6f 100644
--- a/sysdeps/nptl/lowlevellock-futex.h
+++ b/sysdeps/nptl/lowlevellock-futex.h
@@ -21,7 +21,6 @@
 
 #ifndef __ASSEMBLER__
 # include <sysdep.h>
-# include <sysdep-cancel.h>
 # include <kernel-features.h>
 #endif
 
@@ -120,21 +119,10 @@
 		     nr_wake, nr_move, mutex, val)
 
 /* Like lll_futex_wait, but acting as a cancellable entrypoint.  */
-# define lll_futex_wait_cancel(futexp, val, private) \
-  ({                                                                   \
-    int __oldtype = LIBC_CANCEL_ASYNC ();			       \
-    long int __err = lll_futex_wait (futexp, val, LLL_SHARED);	       \
-    LIBC_CANCEL_RESET (__oldtype);				       \
-    __err;							       \
-  })
-
-/* Like lll_futex_timed_wait, but acting as a cancellable entrypoint.  */
-# define lll_futex_timed_wait_cancel(futexp, val, timeout, private) \
-  ({									   \
-    int __oldtype = LIBC_CANCEL_ASYNC ();			       	   \
-    long int __err = lll_futex_timed_wait (futexp, val, timeout, private); \
-    LIBC_CANCEL_RESET (__oldtype);					   \
-    __err;								   \
+# define lll_futex_wait_cancel(futexp, val, private)			\
+  ({									\
+     int __op = __lll_private_flag (FUTEX_WAIT, private);		\
+     INTERNAL_SYSCALL_CANCEL (futex, futexp, __op, val, NULL);		\
   })
 
 #endif  /* !__ASSEMBLER__  */
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index 54f9198681..15a7a063e5 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -261,10 +261,12 @@ libc_hidden_proto (__pthread_unregister_cancel)
 /* Called when a thread reacts on a cancellation request.  */
 static inline void
 __attribute ((noreturn, always_inline))
-__do_cancel (void)
+__do_cancel (void *result)
 {
   struct pthread *self = THREAD_SELF;
 
+  self->result = result;
+
   /* Make sure we get no more cancellations.  */
   atomic_fetch_or_relaxed (&self->cancelhandling, EXITING_BITMASK);
 
@@ -272,6 +274,13 @@ __do_cancel (void)
 		    THREAD_GETMEM (self, cleanup_jmp_buf));
 }
 
+extern long int __syscall_cancel_arch (volatile int *, __syscall_arg_t nr,
+     __syscall_arg_t arg1, __syscall_arg_t arg2, __syscall_arg_t arg3,
+     __syscall_arg_t arg4, __syscall_arg_t arg5, __syscall_arg_t arg6)
+  attribute_hidden;
+
+extern _Noreturn void __syscall_do_cancel (void) attribute_hidden;
+
 
 /* Internal prototypes.  */
 
diff --git a/sysdeps/unix/sysdep.h b/sysdeps/unix/sysdep.h
index 1ba4de99db..32bc85592e 100644
--- a/sysdeps/unix/sysdep.h
+++ b/sysdeps/unix/sysdep.h
@@ -24,6 +24,9 @@
 #define	SYSCALL__(name, args)	PSEUDO (__##name, name, args)
 #define	SYSCALL(name, args)	PSEUDO (name, name, args)
 
+#ifndef __ASSEMBLER__
+# include <errno.h>
+
 #define __SYSCALL_CONCAT_X(a,b)     a##b
 #define __SYSCALL_CONCAT(a,b)       __SYSCALL_CONCAT_X (a, b)
 
@@ -108,42 +111,115 @@
 #define INLINE_SYSCALL_CALL(...) \
   __INLINE_SYSCALL_DISP (__INLINE_SYSCALL, __VA_ARGS__)
 
+#define __INTERNAL_SYSCALL_NCS0(name) \
+  INTERNAL_SYSCALL_NCS (name, 0)
+#define __INTERNAL_SYSCALL_NCS1(name, a1) \
+  INTERNAL_SYSCALL_NCS (name, 1, a1)
+#define __INTERNAL_SYSCALL_NCS2(name, a1, a2) \
+  INTERNAL_SYSCALL_NCS (name, 2, a1, a2)
+#define __INTERNAL_SYSCALL_NCS3(name, a1, a2, a3) \
+  INTERNAL_SYSCALL_NCS (name, 3, a1, a2, a3)
+#define __INTERNAL_SYSCALL_NCS4(name, a1, a2, a3, a4) \
+  INTERNAL_SYSCALL_NCS (name, 4, a1, a2, a3, a4)
+#define __INTERNAL_SYSCALL_NCS5(name, a1, a2, a3, a4, a5) \
+  INTERNAL_SYSCALL_NCS (name, 5, a1, a2, a3, a4, a5)
+#define __INTERNAL_SYSCALL_NCS6(name, a1, a2, a3, a4, a5, a6) \
+  INTERNAL_SYSCALL_NCS (name, 6, a1, a2, a3, a4, a5, a6)
+#define __INTERNAL_SYSCALL_NCS7(name, a1, a2, a3, a4, a5, a6, a7) \
+  INTERNAL_SYSCALL_NCS (name, 7, a1, a2, a3, a4, a5, a6, a7)
+
+/* Issue a syscall defined by its syscall number plus any other arguments
+   required.  It is similar to the INTERNAL_SYSCALL_NCS macro, but without the
+   need to pass the expected argument count as the second parameter.  */
+#define INTERNAL_SYSCALL_NCS_CALL(...) \
+  __INTERNAL_SYSCALL_DISP (__INTERNAL_SYSCALL_NCS, __VA_ARGS__)
+
+/* Cancellation macros.  */
+#include <syscall_types.h>
+
+long int __internal_syscall_cancel (__syscall_arg_t a1, __syscall_arg_t a2,
+				    __syscall_arg_t a3, __syscall_arg_t a4,
+				    __syscall_arg_t a5, __syscall_arg_t a6,
+				    __syscall_arg_t nr) attribute_hidden;
+
+long int __syscall_cancel (__syscall_arg_t nr, __syscall_arg_t arg1,
+			   __syscall_arg_t arg2, __syscall_arg_t arg3,
+			   __syscall_arg_t arg4, __syscall_arg_t arg5,
+			   __syscall_arg_t arg6) attribute_hidden;
+
+#define __SYSCALL_CANCEL0(name)						\
+  __syscall_cancel (0, 0, 0, 0, 0, 0, __NR_##name)
+#define __SYSCALL_CANCEL1(name, a1)					\
+  __syscall_cancel (__SSC (a1), 0, 0, 0, 0, 0, __NR_##name)
+#define __SYSCALL_CANCEL2(name, a1, a2) \
+  __syscall_cancel (__SSC (a1), __SSC (a2), 0, 0, 0, 0, __NR_##name)
+#define __SYSCALL_CANCEL3(name, a1, a2, a3) \
+  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), 0, 0, 0,	\
+		    __NR_##name)
+#define __SYSCALL_CANCEL4(name, a1, a2, a3, a4) \
+  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3),			\
+		    __SSC(a4), 0, 0, __NR_##name)
+#define __SYSCALL_CANCEL5(name, a1, a2, a3, a4, a5) \
+  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), __SSC(a4),	\
+		    __SSC (a5), 0, __NR_##name)
+#define __SYSCALL_CANCEL6(name, a1, a2, a3, a4, a5, a6) \
+  __syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), __SSC (a4),	\
+		    __SSC (a5), __SSC (a6), __NR_##name)
+
+#define __SYSCALL_CANCEL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
+#define __SYSCALL_CANCEL_NARGS(...) \
+  __SYSCALL_CANCEL_NARGS_X (__VA_ARGS__,7,6,5,4,3,2,1,0,)
+#define __SYSCALL_CANCEL_CONCAT_X(a,b)     a##b
+#define __SYSCALL_CANCEL_CONCAT(a,b)       __SYSCALL_CANCEL_CONCAT_X (a, b)
+#define __SYSCALL_CANCEL_DISP(b,...) \
+  __SYSCALL_CANCEL_CONCAT (b,__SYSCALL_CANCEL_NARGS(__VA_ARGS__))(__VA_ARGS__)
+
+/* Issue a cancellable syscall defined by its first argument plus any other
+   arguments required.  If an error occurs, the macro returns -1 and sets
+   errno accordingly.  */
+#define __SYSCALL_CANCEL_CALL(...) \
+  __SYSCALL_CANCEL_DISP (__SYSCALL_CANCEL, __VA_ARGS__)
+
+#define __INTERNAL_SYSCALL_CANCEL0(name)				\
+  __internal_syscall_cancel (0, 0, 0, 0, 0, 0, __NR_##name)
+#define __INTERNAL_SYSCALL_CANCEL1(name, a1)				\
+  __internal_syscall_cancel (__SSC (a1), 0, 0, 0, 0, 0, __NR_##name)
+#define __INTERNAL_SYSCALL_CANCEL2(name, a1, a2)			\
+  __internal_syscall_cancel (__SSC (a1), __SSC (a2), 0, 0, 0, 0,	\
+			     __NR_##name)
+#define __INTERNAL_SYSCALL_CANCEL3(name, a1, a2, a3)			\
+  __internal_syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3), 0,	\
+			     0, 0, __NR_##name)
+#define __INTERNAL_SYSCALL_CANCEL4(name, a1, a2, a3, a4)		\
+  __internal_syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3),	\
+			     __SSC(a4), 0, 0, __NR_##name)
+#define __INTERNAL_SYSCALL_CANCEL5(name, a1, a2, a3, a4, a5)		\
+  __internal_syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3),	\
+			     __SSC(a4), __SSC (a5), 0, __NR_##name)
+#define __INTERNAL_SYSCALL_CANCEL6(name, a1, a2, a3, a4, a5, a6)	\
+  __internal_syscall_cancel (__SSC (a1), __SSC (a2), __SSC (a3),	\
+			     __SSC (a4), __SSC (a5), __SSC (a6),	\
+			     __NR_##name)
+
+/* Issue a cancellable syscall defined by syscall number NAME plus any other
+   arguments required.  If an error occurs, its value is returned as a
+   negative number unmodified and errno is not set.  */
+#define __INTERNAL_SYSCALL_CANCEL_CALL(...) \
+  __SYSCALL_CANCEL_DISP (__INTERNAL_SYSCALL_CANCEL, __VA_ARGS__)
+
 #if IS_IN (rtld)
-/* All cancellation points are compiled out in the dynamic loader.  */
-# define NO_SYSCALL_CANCEL_CHECKING 1
+/* The loader does not need to handle thread cancellation, so it issues the
+   syscall directly.  */
+# define INTERNAL_SYSCALL_CANCEL(...) INTERNAL_SYSCALL_CALL(__VA_ARGS__)
+# define SYSCALL_CANCEL(...)          INLINE_SYSCALL_CALL (__VA_ARGS__)
 #else
-# define NO_SYSCALL_CANCEL_CHECKING SINGLE_THREAD_P
+# define INTERNAL_SYSCALL_CANCEL(...) \
+  __INTERNAL_SYSCALL_CANCEL_CALL (__VA_ARGS__)
+# define SYSCALL_CANCEL(...) \
+  __SYSCALL_CANCEL_CALL (__VA_ARGS__)
 #endif
 
-#define SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (NO_SYSCALL_CANCEL_CHECKING)					     \
-      sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
-  })
-
-/* Issue a syscall defined by syscall number plus any other argument
-   required.  Any error will be returned unmodified (including errno).  */
-#define INTERNAL_SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (NO_SYSCALL_CANCEL_CHECKING) 					     \
-      sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
-  })
+#endif /* __ASSEMBLER__  */
 
 /* Machine-dependent sysdep.h files are expected to define the macro
    PSEUDO (function_name, syscall_name) to emit assembly code to define the
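
With these macros in place a cancellable syscall wrapper no longer needs the
enable/disable-asynccancel dance; it reduces to a single SYSCALL_CANCEL
invocation.  A sketch of such a wrapper, modelled on the Linux read wrapper
with the alias set trimmed for brevity:

  #include <unistd.h>
  #include <sysdep-cancel.h>

  /* SYSCALL_CANCEL counts the arguments, coerces each one with __SSC, and
     routes the call through __syscall_cancel, so the SIGCANCEL handler can
     tell whether the syscall instruction had already executed.  */
  ssize_t
  __libc_read (int fd, void *buf, size_t nbytes)
  {
    return SYSCALL_CANCEL (read, fd, buf, nbytes);
  }
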
diff --git a/sysdeps/unix/sysv/linux/socketcall.h b/sysdeps/unix/sysv/linux/socketcall.h
index d1a173277e..19a6c17a86 100644
--- a/sysdeps/unix/sysv/linux/socketcall.h
+++ b/sysdeps/unix/sysv/linux/socketcall.h
@@ -88,14 +88,33 @@
     sc_ret;								\
   })
 
-
-#define SOCKETCALL_CANCEL(name, args...)				\
-  ({									\
-    int oldtype = LIBC_CANCEL_ASYNC ();					\
-    long int sc_ret = __SOCKETCALL (SOCKOP_##name, args);		\
-    LIBC_CANCEL_RESET (oldtype);					\
-    sc_ret;								\
-  })
+#define __SOCKETCALL_CANCEL1(__name, __a1) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [1]) { (long int) __a1 }))
+#define __SOCKETCALL_CANCEL2(__name, __a1, __a2) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [2]) { (long int) __a1, (long int) __a2 }))
+#define __SOCKETCALL_CANCEL3(__name, __a1, __a2, __a3) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [3]) { (long int) __a1, (long int) __a2, (long int) __a3 }))
+#define __SOCKETCALL_CANCEL4(__name, __a1, __a2, __a3, __a4) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [4]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4 }))
+#define __SOCKETCALL_CANCEL5(__name, __a1, __a2, __a3, __a4, __a5) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [5]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5 }))
+#define __SOCKETCALL_CANCEL6(__name, __a1, __a2, __a3, __a4, __a5, __a6) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [6]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5, (long int) __a6 }))
+
+#define __SOCKETCALL_CANCEL(...) __SOCKETCALL_DISP (__SOCKETCALL_CANCEL,\
+						    __VA_ARGS__)
+
+#define SOCKETCALL_CANCEL(name, args...) \
+   __SOCKETCALL_CANCEL (SOCKOP_##name, args)
 
 
 #endif /* sys/socketcall.h */
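
On socketcall-based ports the per-operation macros above keep the socket
wrappers cancellable without LIBC_CANCEL_ASYNC.  A sketch of the fallback path
of a recv wrapper, assuming neither a separate recv nor recvfrom syscall is
available:

  #include <sys/socket.h>
  #include <socketcall.h>
  #include <sysdep-cancel.h>

  /* SOCKETCALL_CANCEL packs the arguments into the long int array expected
     by the socketcall multiplexer and issues it through SYSCALL_CANCEL, so
     the call remains a proper cancellation point.  */
  ssize_t
  __libc_recv (int fd, void *buf, size_t len, int flags)
  {
    return SOCKETCALL_CANCEL (recv, fd, buf, len, flags);
  }
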
diff --git a/sysdeps/unix/sysv/linux/syscall_cancel.c b/sysdeps/unix/sysv/linux/syscall_cancel.c
new file mode 100644
index 0000000000..260680c99f
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/syscall_cancel.c
@@ -0,0 +1,71 @@
+/* Pthread cancellation syscall bridge.  Default Linux version.
+   Copyright (C) 2023 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <pthreadP.h>
+
+#warning "This implementation should be used just as a reference or for bootstrapping"
+
+/* This is the generic version of the cancellable syscall code, which adds
+   the label guards (__syscall_cancel_arch_{start,end}) used by the SIGCANCEL
+   handler to check whether the cancelled syscall has side effects that need
+   to be returned to the caller.
+
+   This implementation should be used as a reference to document the
+   implementation constraints:
+
+     1. __syscall_cancel_arch_start should point just before the test of
+        whether the thread is already cancelled.
+     2. __syscall_cancel_arch_end should point to the instruction immediately
+        after the syscall one.
+     3. It should return the syscall value or a negative result if it has
+        failed, similar to INTERNAL_SYSCALL_CALL.
+
+   The __syscall_cancel_arch_end marker is required because the kernel
+   signals an interrupted syscall with side effects by setting the signal
+   frame program counter (in the ucontext_t third argument of the SA_SIGINFO
+   signal handler) right after the syscall instruction.
+
+   For some architectures, the INTERNAL_SYSCALL_NCS macro uses more
+   instructions to get the error condition from the kernel (such as powerpc
+   and sparc, which check the condition register), or uses an out-of-line
+   helper (ARM thumb), or uses a kernel helper gate (i686 or ia64).  In these
+   cases the architecture should either adjust the macro or provide a custom
+   __syscall_cancel_arch implementation.  */
+
+long int
+__syscall_cancel_arch (volatile int *ch, __syscall_arg_t nr,
+		       __syscall_arg_t a1, __syscall_arg_t a2,
+		       __syscall_arg_t a3, __syscall_arg_t a4,
+		       __syscall_arg_t a5, __syscall_arg_t a6)
+{
+#define ADD_LABEL(__label)		\
+  asm volatile (			\
+    ".global " __label "\t\n"		\
+    __label ":\n");
+
+  ADD_LABEL ("__syscall_cancel_arch_start");
+  if (__glibc_unlikely (*ch & CANCELED_BITMASK))
+    __syscall_do_cancel ();
+
+  long int result = INTERNAL_SYSCALL_NCS_CALL (nr, a1, a2, a3, a4, a5, a6);
+  ADD_LABEL ("__syscall_cancel_arch_end");
+  if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (result)))
+    return -INTERNAL_SYSCALL_ERRNO (result);
+  return result;
+}
diff --git a/sysdeps/unix/sysv/linux/sysdep-cancel.h b/sysdeps/unix/sysv/linux/sysdep-cancel.h
index 102682c5ee..1b686d53a9 100644
--- a/sysdeps/unix/sysv/linux/sysdep-cancel.h
+++ b/sysdeps/unix/sysv/linux/sysdep-cancel.h
@@ -21,17 +21,5 @@
 #define _SYSDEP_CANCEL_H
 
 #include <sysdep.h>
-#include <tls.h>
-#include <errno.h>
-
-/* Set cancellation mode to asynchronous.  */
-extern int __pthread_enable_asynccancel (void);
-libc_hidden_proto (__pthread_enable_asynccancel)
-#define LIBC_CANCEL_ASYNC() __pthread_enable_asynccancel ()
-
-/* Reset to previous cancellation mode.  */
-extern void __pthread_disable_asynccancel (int oldtype);
-libc_hidden_proto (__pthread_disable_asynccancel)
-#define LIBC_CANCEL_RESET(oldtype) __pthread_disable_asynccancel (oldtype)
 
 #endif

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [glibc/azanella/bz12683] nptl: Fix Race conditions in pthread cancellation [BZ#12683]
@ 2020-04-07 14:03 Adhemerval Zanella
  0 siblings, 0 replies; 6+ messages in thread
From: Adhemerval Zanella @ 2020-04-07 14:03 UTC (permalink / raw)
  To: glibc-cvs

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=899a84992a35a11b35f56aedda07f3da518cce05

commit 899a84992a35a11b35f56aedda07f3da518cce05
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Fri Sep 18 18:26:35 2015 -0300

    nptl: Fix Race conditions in pthread cancellation [BZ#12683]
    
    This patch is the initial fix for race conditions in NPTL cancellation
    code by redefining how cancellable syscalls are defined and handled.
    The current buggy approach is to enable asynchronous cancellation
    before making the syscall and restore the previous cancellation
    type once the syscall returns.
    
    As described in BZ#12683, this approach shows 2 important problems:
    
      1. Cancellation can act after the syscall has returned from the
         kernel, but before userspace saves the return value.  It might
         result in a resource leak if the syscall allocated a resource or a
         side effect (partial read/write), and there is no way to program
         handle it with cancellation handlers.
    
      2. If a signal is handled while the thread is blocked at a cancellable
         syscall, the entire signal handler runs with asynchronous
         cancellation enabled.  This can lead to issues if the signal
         handler call functions which are async-signal-safe but not
         async-cancel-safe.
    
    For the cancellation to work correctly, there are 5 points at which the
    cancellation signal could arrive:
    
      1. Before the final "testcancel" and before the syscall is made.
      2. Between the "testcancel" and the syscall.
      3. While the syscall is blocked and no side effects have yet taken
         place.
      4. While the syscall is blocked but with some side effects already
         having taken place (e.g. a partial read or write).
      5. After the syscall has returned.
    
    And GLIBC wants to act on cancellation in cases 1, 2, and 3 but not
    in cases 4 or 5.  For the 4 and 5 cases, the cancellation will eventually
    happen in the next cancellable entry point without any further external
    event.
    
    The proposed solution follows for each case:
    
      1. Do a conditional branch based on whether the thread has received
         a cancellation request;
    
      2. It can be caught by the signal handler determining that the saved
         program counter (from the ucontext_t) is in some address range
         beginning just before the "testcancel" and ending with the
         syscall instruction.
    
      3. In this case, except for certain syscalls that ALWAYS fail with
         EINTR even for non-interrupting signals, the kernel will reset
         the program counter to point at the syscall instruction during
         signal handling, so that the syscall is restarted when the signal
         handler returns.  So, from the signal handler's standpoint, this
         looks the same as case 2, and thus it's taken care of.
    
      4. For syscalls with side-effects, the kernel cannot restart the
         syscall; when it's interrupted by a signal, the kernel must cause
         the syscall to return with whatever partial result is obtained
         (e.g. partial read or write).
    
      5. In this case, the saved program counter points just after the
         syscall instruction, so the signal handler won't act on
         cancellation.  This is similar to case 4, since the program counter
         is past the syscall instruction.
    
    Another case that needs handling is syscalls that fail with EINTR even
    when the signal handler is non-interrupting. In this case, the syscall
    wrapper code can just check the cancellation flag when the errno result
    is EINTR, and act on cancellation if it's set.
    
    The proposed GLIBC adjustments are:
    
      1. Remove the enable_asynccancel/disable_asynccancel function usage in
         syscall definition and instead make them call a common symbol that
         will check if cancellation is enabled (__syscall_cancel at
         nptl/libc-cancellation.c), call the arch-specific cancellable
         entry-point (__syscall_cancel_arch) and cancel the thread when
         required.
    
      2. Provide an arch-specific generic system call wrapper function
         that contains global markers.  These markers will be used by the
         SIGCANCEL handler to check whether the interruption occurred within a
         valid syscall and whether the syscall has completed or not.
    
         A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c
         is provided.  However, the markers may not be placed at the
         expected locations, depending on how INTERNAL_SYSCALL_NCS is
         implemented by the architecture, and it relies on a compiler-specific
         construct (asm volatile) to place the required markers.
         It is expected that all architectures add an arch-specific
         implementation.
    
      3. Rewrite the SIGCANCEL asynchronous handler to check both the
         cancellation type and whether the current IP from the signal handler
         falls between the global markers, and act accordingly
         (sigcancel_handler at nptl/nptl-init.c).
    
      4. Adjust nptl/pthread_cancel.c to send a signal instead of acting
         directly. This avoids synchronization issues when updating the
         cancellation status and also focuses the logic on the signal
         handler and cancellation syscall code.
    
      5. Adjust pthread code to replace CANCEL_ASYNC/CANCEL_RESET calls with
         the appropriate cancellable futex syscalls.
    
      6. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET with
         the appropriate cancellable syscalls.
    
      7. Adjust 'lowlevellock-futex.h' arch-specific implementations to
         provide cancellable futex calls (used in libpthread code).
    
    This patch adds the proposed changes to NPTL common code and the
    following patches add the required arch-specific bits.  The builds for
    ia64-linux-gnu, mips-*, and x86_64-* are broken without the
    arch-specific patches.
    
    As a side note regarding SIGCANCEL and SIGTIMER being the same signal,
    it should not impact timer_create functionality.  It arranges for
    SIGCANCEL/SIGTIMER to be sent to the internal helper thread, which
    in turn checks whether si.si_code is SI_TIMER and calls pthread_exit
    otherwise (sysdeps/unix/sysv/linux/timer_routines.c:129).
    
    This suggests that the helper thread does NOT depend on EINTR
    being generated for SIGCANCEL/SIGTIMER, and it should be fine to use
    SA_RESTART for that signal as far as timer_create is concerned.

Diff:
---
 manual/llio.texi                                   |   4 +-
 nptl/Makefile                                      |  10 +-
 nptl/Versions                                      |   4 +
 nptl/cancellation.c                                |  78 --------------
 nptl/descr.h                                       |   3 -
 nptl/libc-cancellation.c                           |  43 +++++++-
 nptl/nptl-init.c                                   |  92 ++++++++--------
 nptl/pthreadP.h                                    |  57 +++++++---
 nptl/pthread_cancel.c                              |  66 +++---------
 nptl/pthread_create.c                              |   7 +-
 nptl/pthread_exit.c                                |   9 +-
 nptl/pthread_join_common.c                         |   7 +-
 nptl/pthread_kill.c                                |   7 +-
 .../pthread_kill.c => nptl/pthread_kill_internal.c |  21 +---
 nptl/pthread_setcanceltype.c                       |   2 +-
 nptl/pthread_testcancel.c                          |  12 +--
 nptl/sem_wait.c                                    |   2 +-
 nptl/tst-cancel29.c                                | 100 +++++++++++++++++
 rt/Makefile                                        |   2 +-
 .../syscall_types.h}                               |  13 +--
 sysdeps/generic/sysdep-cancel.h                    |   2 -
 sysdeps/nptl/Makefile                              |   3 +-
 sysdeps/nptl/cancellation-pc-check.h               |  53 +++++++++
 sysdeps/nptl/cancellation-sigmask.h                |  30 ++++++
 sysdeps/nptl/futex-internal.h                      |  19 +---
 sysdeps/nptl/lowlevellock-futex.h                  |  41 ++++---
 sysdeps/unix/sysdep.h                              | 118 ++++++++++++++++-----
 sysdeps/unix/sysv/linux/socketcall.h               |  40 ++++---
 sysdeps/unix/sysv/linux/syscall_cancel.c           |  62 +++++++++++
 sysdeps/unix/sysv/linux/sysdep-cancel.h            |  42 --------
 sysdeps/unix/sysv/linux/sysdep.h                   |   9 ++
 31 files changed, 591 insertions(+), 367 deletions(-)

diff --git a/manual/llio.texi b/manual/llio.texi
index fe59002915..c02ee83428 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -2534,13 +2534,13 @@ aiocb64}, since the LFS transparently replaces the old interface.
 @c     sigemptyset ok
 @c     sigaddset ok
 @c     setjmp ok
-@c     CANCEL_ASYNC -> pthread_enable_asynccancel ok
+@c     __pthread_setcanceltype ok
 @c      do_cancel ok
 @c       pthread_unwind ok
 @c        Unwind_ForcedUnwind or longjmp ok [@ascuheap @acsmem?]
 @c     lll_lock @asulock @aculock
 @c     lll_unlock @asulock @aculock
-@c     CANCEL_RESET -> pthread_disable_asynccancel ok
+@c     __pthread_setcanceltype ok
 @c      lll_futex_wait ok
 @c     ->start_routine ok -----
 @c     call_tls_dtors @asulock @ascuheap @aculock @acsmem
diff --git a/nptl/Makefile b/nptl/Makefile
index e554a3898d..0cb724eb76 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -60,6 +60,7 @@ routines = \
   pthread_self \
   pthread_setschedparam \
   register-atfork \
+  syscall_cancel
 
 shared-only-routines = forward
 static-only-routines = pthread_atfork
@@ -123,7 +124,8 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
 		      pthread_barrierattr_setpshared \
 		      pthread_key_create pthread_key_delete \
 		      pthread_getspecific pthread_setspecific \
-		      pthread_sigmask pthread_kill pthread_sigqueue \
+		      pthread_sigmask pthread_kill pthread_kill_internal \
+		      pthread_sigqueue \
 		      pthread_cancel pthread_testcancel \
 		      pthread_setcancelstate pthread_setcanceltype \
 		      pthread_once \
@@ -137,7 +139,6 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
 		      cleanup cleanup_defer cleanup_compat \
 		      cleanup_defer_compat unwind \
 		      pt-longjmp pt-cleanup\
-		      cancellation \
 		      lowlevellock \
 		      lll_timedlock_wait \
 		      pt-fork pt-fcntl \
@@ -187,8 +188,7 @@ CFLAGS-pthread_setcanceltype.c += -fexceptions -fasynchronous-unwind-tables
 
 # These are internal functions which similar functionality as setcancelstate
 # and setcanceltype.
-CFLAGS-cancellation.c += -fasynchronous-unwind-tables
-CFLAGS-libc-cancellation.c += -fasynchronous-unwind-tables
+CFLAGS-libc-cancellation.c += -fexceptions -fasynchronous-unwind-tables
 
 # Calling pthread_exit() must cause the registered cancel handlers to
 # be executed.  Therefore exceptions have to be thrown through this
@@ -286,7 +286,7 @@ tests = tst-attr2 tst-attr3 tst-default-attr \
 	tst-cancel11 tst-cancel12 tst-cancel13 tst-cancel14 tst-cancel15 \
 	tst-cancel16 tst-cancel17 tst-cancel18 tst-cancel19 tst-cancel20 \
 	tst-cancel21 tst-cancel22 tst-cancel23 tst-cancel24 \
-	tst-cancel26 tst-cancel27 tst-cancel28 \
+	tst-cancel26 tst-cancel27 tst-cancel28 tst-cancel29 \
 	tst-cancel-self tst-cancel-self-cancelstate \
 	tst-cancel-self-canceltype tst-cancel-self-testcancel \
 	tst-cleanup0 tst-cleanup1 tst-cleanup2 tst-cleanup3 tst-cleanup4 \
diff --git a/nptl/Versions b/nptl/Versions
index 543dddc4ee..41d732e1dd 100644
--- a/nptl/Versions
+++ b/nptl/Versions
@@ -41,6 +41,10 @@ libc {
     __libc_allocate_rtsig_private;
     # Used by the C11 threads implementation.
     __pthread_cond_destroy; __pthread_cond_init;
+    # Used by pthread cancellation.
+    __syscall_cancel;
+    __syscall_cancel_arch_start;
+    __syscall_cancel_arch_end;
   }
 }
 
diff --git a/nptl/cancellation.c b/nptl/cancellation.c
deleted file mode 100644
index 7127f9ae91..0000000000
--- a/nptl/cancellation.c
+++ /dev/null
@@ -1,78 +0,0 @@
-/* Copyright (C) 2002-2020 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <setjmp.h>
-#include <stdlib.h>
-#include "pthreadP.h"
-#include <futex-internal.h>
-
-
-/* The next two functions are similar to pthread_setcanceltype() but
-   more specialized for the use in the cancelable functions like write().
-   They do not need to check parameters etc.  These functions must be
-   AS-safe, with the exception of the actual cancellation, because they
-   are called by wrappers around AS-safe functions like write().*/
-int
-attribute_hidden
-__pthread_enable_asynccancel (void)
-{
-  struct pthread *self = THREAD_SELF;
-
-  int oldval = THREAD_GETMEM (self, canceltype);
-  THREAD_SETMEM (self, canceltype, PTHREAD_CANCEL_ASYNCHRONOUS);
-
-  int ch = THREAD_GETMEM (self, cancelhandling);
-
-  if (self->cancelstate == PTHREAD_CANCEL_ENABLE
-      && (ch & (CANCELED_BITMASK | EXITING_BITMASK | TERMINATED_BITMASK))
-	  == CANCELED_BITMASK)
-    {
-      THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-      __do_cancel ();
-    }
-
-  return oldval;
-}
-
-/* See the comment for __pthread_enable_asynccancel regarding
-   the AS-safety of this function.  */
-void
-attribute_hidden
-__pthread_disable_asynccancel (int oldtype)
-{
-  /* If asynchronous cancellation was enabled before we do not have
-     anything to do.  */
-  if (oldtype == PTHREAD_CANCEL_ASYNCHRONOUS)
-    return;
-
-  struct pthread *self = THREAD_SELF;
-  THREAD_SETMEM (self, canceltype, PTHREAD_CANCEL_DEFERRED);
-
-  /* We cannot return when we are being canceled.  Upon return the
-     thread might be things which would have to be undone.  The
-     following loop should loop until the cancellation signal is
-     delivered.  */
-  int ch = THREAD_GETMEM (self, cancelhandling);
-  while (__glibc_unlikely ((ch & (CANCELING_BITMASK | CANCELED_BITMASK))
-			    == CANCELING_BITMASK))
-    {
-      futex_wait_simple ((unsigned int *) &self->cancelhandling, ch,
-			 FUTEX_PRIVATE);
-      ch = THREAD_GETMEM (self, cancelhandling);
-    }
-}
diff --git a/nptl/descr.h b/nptl/descr.h
index bae9457e33..c0b8f6c40e 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -269,9 +269,6 @@ struct pthread
 
   /* Flags determining processing of cancellation.  */
   int cancelhandling;
-  /* Bit set if canceling has been initiated.  */
-#define CANCELING_BIT		2
-#define CANCELING_BITMASK	(0x01 << CANCELING_BIT)
   /* Bit set if canceled.  */
 #define CANCELED_BIT		3
 #define CANCELED_BITMASK	(0x01 << CANCELED_BIT)
diff --git a/nptl/libc-cancellation.c b/nptl/libc-cancellation.c
index eae81d504c..e695d67417 100644
--- a/nptl/libc-cancellation.c
+++ b/nptl/libc-cancellation.c
@@ -18,7 +18,44 @@
 
 #include "pthreadP.h"
 
+/* Cancellation function called by all cancellable syscalls.  */
+long int
+__syscall_cancel (__syscall_arg_t nr, __syscall_arg_t a1,
+		  __syscall_arg_t a2, __syscall_arg_t a3,
+		  __syscall_arg_t a4, __syscall_arg_t a5,
+		  __syscall_arg_t a6)
+{
+  struct pthread *pd = THREAD_SELF;
+  long int result;
 
-#define __pthread_enable_asynccancel __libc_enable_asynccancel
-#define __pthread_disable_asynccancel __libc_disable_asynccancel
-#include <nptl/cancellation.c>
+  /* If cancellation is not enabled, call the syscall directly.  */
+  if (pd->cancelstate == PTHREAD_CANCEL_DISABLE)
+    {
+      result = INTERNAL_SYSCALL_NCS_CALL (nr, a1, a2, a3, a4, a5, a6);
+      if (INTERNAL_SYSCALL_ERROR_P (result))
+	return -INTERNAL_SYSCALL_ERRNO (result);
+      return result;
+    }
+
+  /* Call the arch-specific entry point that contains the global markers
+     to be checked by the SIGCANCEL handler.  */
+  result = __syscall_cancel_arch (&pd->cancelhandling, nr, a1, a2, a3, a4, a5,
+			          a6);
+
+  if (result == -EINTR
+      && __pthread_self_cancelled ()
+      && pd->cancelstate == PTHREAD_CANCEL_ENABLE)
+    __do_cancel (PTHREAD_CANCELED);
+
+  return result;
+}
+libc_hidden_def (__syscall_cancel)
+
+/* Since __do_cancel is an always-inline function, this creates a symbol
+   that the arch-specific code can call to cancel the thread.  */
+_Noreturn void
+attribute_hidden
+__syscall_do_cancel (void)
+{
+  __do_cancel (PTHREAD_CANCELED);
+}
diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
index 89f58d2e5e..75225965c4 100644
--- a/nptl/nptl-init.c
+++ b/nptl/nptl-init.c
@@ -39,6 +39,9 @@
 #include <libc-pointer-arith.h>
 #include <pthread-pids.h>
 #include <pthread_mutex_conf.h>
+#include <sigcontextinfo.h>
+#include <cancellation-sigmask.h>
+#include <cancellation-pc-check.h>
 
 #ifndef TLS_MULTIPLE_THREADS_IN_TCB
 /* Pointer to the corresponding variable in libc.  */
@@ -137,35 +140,23 @@ sigcancel_handler (int sig, siginfo_t *si, void *ctx)
 
   struct pthread *self = THREAD_SELF;
 
-  int oldval = THREAD_GETMEM (self, cancelhandling);
-  while (1)
-    {
-      /* We are canceled now.  When canceled by another thread this flag
-	 is already set but if the signal is directly send (internally or
-	 from another process) is has to be done here.  */
-      int newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
-
-      if (oldval == newval || (oldval & EXITING_BITMASK) != 0)
-	/* Already canceled or exiting.  */
-	break;
-
-      int curval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling, newval,
-					      oldval);
-      if (curval == oldval)
-	{
-	  /* Set the return value.  */
-	  THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-
-	  /* Make sure asynchronous cancellation is still enabled.  */
-	  if (self->canceltype == PTHREAD_CANCEL_ASYNCHRONOUS)
-	    /* Run the registered destructors and terminate the thread.  */
-	    __do_cancel ();
-
-	  break;
-	}
-
-      oldval = curval;
-    }
+  if (!__pthread_self_cancelled ()
+      || self->cancelstate == PTHREAD_CANCEL_DISABLE)
+    return;
+
+  /* Add SIGCANCEL to the interrupted context's signal mask so the handler
+     is not called again.  */
+  ucontext_block_sigcancel (ctx);
+
+  /* Check if asynchronous cancellation mode is set or if the interrupted
+     instruction pointer falls within the cancellable syscall bridge.  For
+     interruptible syscalls that might generate external side effects (partial
+     reads or writes, for instance), the kernel will set the IP to after
+     '__syscall_cancel_arch_end', thus disabling the cancellation and allowing
+     the process to handle such conditions.  */
+  if (self->canceltype == PTHREAD_CANCEL_ASYNCHRONOUS
+      || cancellation_pc_check (ctx))
+    __do_cancel (PTHREAD_CANCELED);
 }
 
 
@@ -260,28 +251,39 @@ __pthread_initialize_minimal_internal (void)
      had to set __nptl_initial_report_events.  Propagate its setting.  */
   THREAD_SETMEM (pd, report_events, __nptl_initial_report_events);
 
-  struct sigaction sa;
-  __sigemptyset (&sa.sa_mask);
-
   /* Install the cancellation signal handler.  If for some reason we
      cannot install the handler we do not abort.  Maybe we should, but
      it is only asynchronous cancellation which is affected.  */
-  sa.sa_sigaction = sigcancel_handler;
-  sa.sa_flags = SA_SIGINFO;
-  (void) __libc_sigaction (SIGCANCEL, &sa, NULL);
+  {
+    struct sigaction sa;
+    sa.sa_sigaction = sigcancel_handler;
+    /* The signal handler should be non-interruptible to avoid the risk of a
+       spurious EINTR caused by SIGCANCEL being sent to the process or by
+       pthread_cancel being called while cancellation is disabled in the
+       target thread.  */
+    sa.sa_flags = SA_SIGINFO | SA_RESTART;
+    sa.sa_mask = sigall_set;
+    __libc_sigaction (SIGCANCEL, &sa, NULL);
+  }
 
-  /* Install the handle to change the threads' uid/gid.  */
-  sa.sa_sigaction = sighandler_setxid;
-  sa.sa_flags = SA_SIGINFO | SA_RESTART;
-  (void) __libc_sigaction (SIGSETXID, &sa, NULL);
+  {
+    /* Install the handle to change the threads' uid/gid.  */
+    struct sigaction sa;
+    __sigemptyset (&sa.sa_mask);
+    sa.sa_sigaction = sighandler_setxid;
+    sa.sa_flags = SA_SIGINFO | SA_RESTART;
+    __libc_sigaction (SIGSETXID, &sa, NULL);
+  }
 
   /* The parent process might have left the signals blocked.  Just in
-     case, unblock it.  We reuse the signal mask in the sigaction
-     structure.  It is already cleared.  */
-  __sigaddset (&sa.sa_mask, SIGCANCEL);
-  __sigaddset (&sa.sa_mask, SIGSETXID);
-  INTERNAL_SYSCALL_CALL (rt_sigprocmask, SIG_UNBLOCK, &sa.sa_mask,
-			 NULL, _NSIG / 8);
+     case, unblock it.  */
+  {
+    struct sigaction sa;
+    __sigemptyset (&sa.sa_mask);
+    __sigaddset (&sa.sa_mask, SIGCANCEL);
+    __sigaddset (&sa.sa_mask, SIGSETXID);
+    INTERNAL_SYSCALL_CALL (rt_sigprocmask, SIG_UNBLOCK, &sa.sa_mask,
+			   NULL, _NSIG / 8);
+  }
 
   /* Get the size of the static and alignment requirements for the TLS
      block.  */
diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
index d55c3b26a4..a20b136f14 100644
--- a/nptl/pthreadP.h
+++ b/nptl/pthreadP.h
@@ -286,20 +286,13 @@ extern void __nptl_unwind_freeres (void) attribute_hidden;
 #endif
 
 
-/* Called when a thread reacts on a cancellation request.  */
-static inline void
-__attribute ((noreturn, always_inline))
-__do_cancel (void)
-{
-  struct pthread *self = THREAD_SELF;
-
-  /* Make sure we get no more cancellations.  */
-  THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
-
-  __pthread_unwind ((__pthread_unwind_buf_t *)
-		    THREAD_GETMEM (self, cleanup_jmp_buf));
-}
+extern long int __syscall_cancel_arch (volatile int *, __syscall_arg_t nr,
+     __syscall_arg_t arg1, __syscall_arg_t arg2, __syscall_arg_t arg3,
+     __syscall_arg_t arg4, __syscall_arg_t arg5, __syscall_arg_t arg6);
+libc_hidden_proto (__syscall_cancel_arch);
 
+extern _Noreturn void __syscall_do_cancel (void)
+     attribute_hidden;
 
 /* Internal prototypes.  */
 
@@ -461,12 +454,13 @@ extern int __pthread_equal (pthread_t thread1, pthread_t thread2);
 extern int __pthread_detach (pthread_t th);
 extern int __pthread_cancel (pthread_t th);
 extern int __pthread_kill (pthread_t threadid, int signo);
-extern void __pthread_exit (void *value) __attribute__ ((__noreturn__));
+extern int __pthread_kill_internal (pthread_t threadid, int signo)
+  attribute_hidden;
+extern void _Noreturn __pthread_exit (void *value);
 extern int __pthread_join (pthread_t threadid, void **thread_return);
 extern int __pthread_setcanceltype (int type, int *oldtype);
-extern int __pthread_enable_asynccancel (void) attribute_hidden;
-extern void __pthread_disable_asynccancel (int oldtype) attribute_hidden;
 extern void __pthread_testcancel (void);
+extern void __pthread_exit (void *value);
 extern int __pthread_clockjoin_ex (pthread_t, void **, clockid_t,
 				   const struct timespec *, bool)
   attribute_hidden;
@@ -487,10 +481,41 @@ hidden_proto (__pthread_setspecific)
 hidden_proto (__pthread_once)
 hidden_proto (__pthread_setcancelstate)
 hidden_proto (__pthread_testcancel)
+hidden_proto (__pthread_exit)
 hidden_proto (__pthread_mutexattr_init)
 hidden_proto (__pthread_mutexattr_settype)
 #endif
 
+/* Called when a thread reacts on a cancellation request.  */
+_Noreturn static inline void
+__do_cancel (void *value)
+{
+  struct pthread *self = THREAD_SELF;
+
+  /* Make sure we get no more cancellations by clearing the cancel
+     state.  */
+  THREAD_SETMEM (self, cancelstate, PTHREAD_CANCEL_DISABLE);
+  THREAD_SETMEM (self, canceltype, PTHREAD_CANCEL_DEFERRED);
+
+  THREAD_SETMEM (self, result, value);
+
+  THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
+
+  __pthread_unwind ((__pthread_unwind_buf_t *)
+		    THREAD_GETMEM (self, cleanup_jmp_buf));
+}
+
+static inline bool
+__pthread_self_cancelled (void)
+{
+  struct pthread *self = THREAD_SELF;
+  int cancelhandling = THREAD_GETMEM (self, cancelhandling);
+  return self->cancelstate == PTHREAD_CANCEL_ENABLE
+	  && (cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
+			        | TERMINATED_BITMASK))
+	      == CANCELED_BITMASK;
+}
+
 extern int __pthread_cond_broadcast_2_0 (pthread_cond_2_0_t *cond);
 extern int __pthread_cond_destroy_2_0 (pthread_cond_2_0_t *cond);
 extern int __pthread_cond_init_2_0 (pthread_cond_2_0_t *cond,
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index 7518c0b8bf..3873a47af8 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -37,63 +37,29 @@ __pthread_cancel (pthread_t th)
 #ifdef SHARED
   pthread_cancel_init ();
 #endif
-  int result = 0;
-  int oldval;
-  int newval;
-  do
-    {
-    again:
-      oldval = pd->cancelhandling;
-      newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
-
-      /* Avoid doing unnecessary work.  The atomic operation can
-	 potentially be expensive if the bug has to be locked and
-	 remote cache lines have to be invalidated.  */
-      if (oldval == newval)
-	break;
-
-      /* If the cancellation is handled asynchronously just send a
-	 signal.  We avoid this if possible since it's more
-	 expensive.  */
-      if (pd->cancelstate == PTHREAD_CANCEL_ENABLE
-	  && pd->canceltype == PTHREAD_CANCEL_ASYNCHRONOUS
-	  && (newval & (CANCELED_BITMASK | EXITING_BITMASK
-			| TERMINATED_BITMASK))
-	      == CANCELED_BITMASK)
-	{
-	  /* Mark the cancellation as "in progress".  */
-	  if (atomic_compare_and_exchange_bool_acq (&pd->cancelhandling,
-						    oldval | CANCELING_BITMASK,
-						    oldval))
-	    goto again;
 
-	  /* The cancellation handler will take care of marking the
-	     thread as canceled.  */
-	  pid_t pid = __getpid ();
+  THREAD_ATOMIC_BIT_SET (pd, cancelhandling, CANCELED_BIT);
 
-	  int val = INTERNAL_SYSCALL_CALL (tgkill, pid, pd->tid,
-					   SIGCANCEL);
-	  if (INTERNAL_SYSCALL_ERROR_P (val))
-	    result = INTERNAL_SYSCALL_ERRNO (val);
+  /* A single-threaded process should be able to kill itself, since there is
+     nothing in the POSIX specification that says that it cannot.  So we set
+     multiple_threads to true so that cancellation points get executed.  */
+  THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
 
-	  break;
-	}
-
-	/* A single-threaded process should be able to kill itself, since
-	   there is nothing in the POSIX specification that says that it
-	   cannot.  So we set multiple_threads to true so that cancellation
-	   points get executed.  */
-	THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
 #ifndef TLS_MULTIPLE_THREADS_IN_TCB
-	__pthread_multiple_threads = *__libc_multiple_threads_ptr = 1;
+  __pthread_multiple_threads = *__libc_multiple_threads_ptr = 1;
 #endif
+
+  /* Avoid signaling when a thread attempts to cancel itself (pthread_kill
+     is expensive).  */
+  if (pd == THREAD_SELF)
+    {
+      if (pd->cancelstate == PTHREAD_CANCEL_ENABLE
+	  && pd->canceltype == PTHREAD_CANCEL_ASYNCHRONOUS)
+	__pthread_exit (PTHREAD_CANCELED);
+      return 0;
     }
-  /* Mark the thread as canceled.  This has to be done
-     atomically since other bits could be modified as well.  */
-  while (atomic_compare_and_exchange_bool_acq (&pd->cancelhandling, newval,
-					       oldval));
 
-  return result;
+  return __pthread_kill_internal (th, SIGCANCEL);
 }
 weak_alias (__pthread_cancel, pthread_cancel)
 
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 7c752d0f99..bc9dd2d965 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -402,7 +402,7 @@ START_THREAD_DEFN
   /* If the parent was running cancellation handlers while creating
      the thread the new thread inherited the signal mask.  Reset the
      cancellation signal mask.  */
-  if (__glibc_unlikely (pd->parent_cancelhandling & CANCELING_BITMASK))
+  if (__glibc_unlikely (pd->parent_cancelhandling & CANCELED_BITMASK))
     {
       sigset_t mask;
       __sigemptyset (&mask);
@@ -443,7 +443,8 @@ START_THREAD_DEFN
 	 have ownership (see CONCURRENCY NOTES above).  */
       if (__glibc_unlikely (pd->stopped_start))
 	{
-	  int oldtype = CANCEL_ASYNC ();
+	  int ct;
+	  __pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, &ct);
 
 	  /* Get the lock the parent locked to force synchronization.  */
 	  lll_lock (pd->lock, LLL_PRIVATE);
@@ -453,7 +454,7 @@ START_THREAD_DEFN
 	  /* And give it up right away.  */
 	  lll_unlock (pd->lock, LLL_PRIVATE);
 
-	  CANCEL_RESET (oldtype);
+	  __pthread_setcanceltype (ct, NULL);
 	}
 
       LIBC_PROBE (pthread_start, 3, (pthread_t) pd, pd->start_routine, pd->arg);
diff --git a/nptl/pthread_exit.c b/nptl/pthread_exit.c
index e9b5e62c86..059b30623d 100644
--- a/nptl/pthread_exit.c
+++ b/nptl/pthread_exit.c
@@ -16,18 +16,15 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <stdlib.h>
 #include "pthreadP.h"
 
-
-void
+_Noreturn void
 __pthread_exit (void *value)
 {
-  THREAD_SETMEM (THREAD_SELF, result, value);
-
-  __do_cancel ();
+  __do_cancel (value);
 }
 weak_alias (__pthread_exit, pthread_exit)
+hidden_def (__pthread_exit)
 
 /* After a thread terminates, __libc_start_main decrements
    __nptl_nthreads defined in pthread_create.c.  */
diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
index 03e202136f..4d1898c52b 100644
--- a/nptl/pthread_join_common.c
+++ b/nptl/pthread_join_common.c
@@ -70,7 +70,8 @@ clockwait_tid (pid_t *tidp, clockid_t clockid, const struct timespec *abstime)
       /* If *tidp == tid, wait until thread terminates or the wait times out.
          The kernel up to version 3.16.3 does not use the private futex
          operations for futex wake-up when the clone terminates.  */
-      if (lll_futex_timed_wait_cancel (tidp, tid, &rt, LLL_SHARED)
+      if (lll_futex_timed_wait_cancel ((unsigned int *) tidp, tid, &rt,
+				       LLL_SHARED)
 	  == -ETIMEDOUT)
         return ETIMEDOUT;
     }
@@ -103,7 +104,7 @@ __pthread_clockjoin_ex (pthread_t threadid, void **thread_return,
   if ((pd == self
        || (self->joinid == pd
 	   && (pd->cancelhandling
-	       & (CANCELING_BITMASK | CANCELED_BITMASK | EXITING_BITMASK
+	       & (CANCELED_BITMASK | EXITING_BITMASK
 		  | TERMINATED_BITMASK)) == 0))
       && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
 	   && (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
@@ -145,7 +146,7 @@ __pthread_clockjoin_ex (pthread_t threadid, void **thread_return,
 	  /* We need acquire MO here so that we synchronize with the
 	     kernel's store to 0 when the clone terminates. (see above)  */
 	  while ((tid = atomic_load_acquire (&pd->tid)) != 0)
-	    lll_futex_wait_cancel (&pd->tid, tid, LLL_SHARED);
+	    lll_futex_wait_cancel ((unsigned int *) &pd->tid, tid, LLL_SHARED);
 	}
 
       pthread_cleanup_pop (0);
diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c
index 73144a07ec..bf9e9bb81f 100644
--- a/nptl/pthread_kill.c
+++ b/nptl/pthread_kill.c
@@ -31,8 +31,9 @@ __pthread_kill (pthread_t threadid, int signo)
     /* Not a valid thread handle.  */
     return ESRCH;
 
-  return ENOSYS;
+  if (__is_internal_signal (signo))
+    return EINVAL;
+
+  return __pthread_kill_internal (threadid, signo);
 }
 strong_alias (__pthread_kill, pthread_kill)
-
-stub_warning (pthread_kill)
diff --git a/sysdeps/unix/sysv/linux/pthread_kill.c b/nptl/pthread_kill_internal.c
similarity index 75%
rename from sysdeps/unix/sysv/linux/pthread_kill.c
rename to nptl/pthread_kill_internal.c
index 4dfe08ffcd..f428a46f57 100644
--- a/sysdeps/unix/sysv/linux/pthread_kill.c
+++ b/nptl/pthread_kill_internal.c
@@ -16,24 +16,15 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <errno.h>
-#include <signal.h>
-#include <pthreadP.h>
-#include <tls.h>
-#include <sysdep.h>
 #include <unistd.h>
+#include <pthreadP.h>
 
-
+/* Used internally by pthread_cancel, so we can't filter SIGCANCEL.  */
 int
-__pthread_kill (pthread_t threadid, int signo)
+__pthread_kill_internal (pthread_t threadid, int signo)
 {
   struct pthread *pd = (struct pthread *) threadid;
 
-  /* Make sure the descriptor is valid.  */
-  if (DEBUGGING_P && INVALID_TD_P (pd))
-    /* Not a valid thread handle.  */
-    return ESRCH;
-
   /* Force load of pd->tid into local variable or register.  Otherwise
      if a thread exits between ESRCH test and tgkill, we might return
      EINVAL, because pd->tid would be cleared by the kernel.  */
@@ -42,11 +33,6 @@ __pthread_kill (pthread_t threadid, int signo)
     /* Not a valid thread handle.  */
     return ESRCH;
 
-  /* Disallow sending the signal we use for cancellation, timers,
-     for the setxid implementation.  */
-  if (signo == SIGCANCEL || signo == SIGTIMER || signo == SIGSETXID)
-    return EINVAL;
-
   /* We have a special syscall to do the work.  */
   pid_t pid = __getpid ();
 
@@ -54,4 +40,3 @@ __pthread_kill (pthread_t threadid, int signo)
   return (INTERNAL_SYSCALL_ERROR_P (val)
 	  ? INTERNAL_SYSCALL_ERRNO (val) : 0);
 }
-strong_alias (__pthread_kill, pthread_kill)
diff --git a/nptl/pthread_setcanceltype.c b/nptl/pthread_setcanceltype.c
index d8cb54736d..77e4adf537 100644
--- a/nptl/pthread_setcanceltype.c
+++ b/nptl/pthread_setcanceltype.c
@@ -37,4 +37,4 @@ __pthread_setcanceltype (int type, int *oldtype)
 
   return 0;
 }
-strong_alias (__pthread_setcanceltype, pthread_setcanceltype)
+weak_alias (__pthread_setcanceltype, pthread_setcanceltype)
diff --git a/nptl/pthread_testcancel.c b/nptl/pthread_testcancel.c
index 026c20f82e..584ff242be 100644
--- a/nptl/pthread_testcancel.c
+++ b/nptl/pthread_testcancel.c
@@ -23,16 +23,8 @@
 void
 __pthread_testcancel (void)
 {
-  struct pthread *self = THREAD_SELF;
-  int cancelhandling = THREAD_GETMEM (self, cancelhandling);
-  if (self->cancelstate == PTHREAD_CANCEL_ENABLE
-      && (cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
-			    | TERMINATED_BITMASK))
-	  == CANCELED_BITMASK)
-    {
-      THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-      __do_cancel ();
-    }
+  if (__pthread_self_cancelled ())
+    __do_cancel (PTHREAD_CANCELED);
 }
 weak_alias (__pthread_testcancel, pthread_testcancel)
 hidden_def (__pthread_testcancel)
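
An aside on the simplified __pthread_testcancel above and on the new
self-cancellation fast path in pthread_cancel.c (illustrative only, not
part of the patch): with deferred cancellation, a thread cancelling itself
merely records the request, and the next cancellation point such as
pthread_testcancel acts on it.  A minimal standalone sketch, built with
-pthread:

  #include <pthread.h>

  static int after_cancel;

  static void *
  worker (void *arg)
  {
    /* Deferred cancellation (the default): cancelling ourselves only
       records the request and pthread_cancel returns 0.  */
    pthread_cancel (pthread_self ());
    after_cancel = 1;      /* Still reached: no cancellation point yet.  */
    pthread_testcancel (); /* The pending request is acted on here.  */
    return NULL;           /* Never reached.  */
  }

  int
  main (void)
  {
    pthread_t t;
    if (pthread_create (&t, NULL, worker, NULL) != 0)
      return 2;
    void *res;
    if (pthread_join (t, &res) != 0)
      return 2;
    return (res == PTHREAD_CANCELED && after_cancel == 1) ? 0 : 1;
  }
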
diff --git a/nptl/sem_wait.c b/nptl/sem_wait.c
index 171716fdbc..1ba9926c0e 100644
--- a/nptl/sem_wait.c
+++ b/nptl/sem_wait.c
@@ -58,7 +58,7 @@ __old_sem_wait (sem_t *sem)
 	return 0;
 
       /* Always assume the semaphore is shared.  */
-      err = lll_futex_wait_cancel (futex, 0, LLL_SHARED);
+      err = lll_futex_wait_cancel ((unsigned int *) futex, 0, LLL_SHARED);
     }
   while (err == 0 || err == -EWOULDBLOCK);
 
diff --git a/nptl/tst-cancel29.c b/nptl/tst-cancel29.c
new file mode 100644
index 0000000000..1671cfe04f
--- /dev/null
+++ b/nptl/tst-cancel29.c
@@ -0,0 +1,100 @@
+/* Check that side effects of cancellable syscalls are visible (BZ #12683).
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* This testcase checks for resource leakage when the syscall has already
+   returned from kernelspace but userspace has not yet saved the return
+   value.  The 'leaker' thread should be able to close the file descriptor
+   if the resource is already allocated, meaning that if the cancellation
+   signal arrives *after* the open syscall returns from the kernel, the
+   side effect should be visible to the application.  */
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+
+#include <support/xunistd.h>
+#include <support/xthread.h>
+#include <support/check.h>
+#include <support/temp_file.h>
+#include <support/support.h>
+#include <support/descriptors.h>
+
+static void *
+writeopener (void *arg)
+{
+  int fd;
+  for (;;)
+    {
+      fd = open (arg, O_WRONLY);
+      xclose (fd);
+    }
+  return NULL;
+}
+
+static void *
+leaker (void *arg)
+{
+  int fd = open (arg, O_RDONLY);
+  TEST_VERIFY_EXIT (fd > 0);
+  pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, 0);
+  xclose (fd);
+  return NULL;
+}
+
+static int
+do_test (void)
+{
+  enum {
+    iter_count = 1000
+  };
+
+  char *dir = support_create_temp_directory ("tst-cancel29");
+  char *name = xasprintf ("%s/fifo", dir);
+  TEST_COMPARE (mkfifo (name, 0600), 0);
+  add_temp_file (name);
+
+  struct support_descriptors *descrs = support_descriptors_list ();
+
+  srand (1);
+
+  xpthread_create (NULL, writeopener, name);
+  for (int i = 0; i < iter_count; i++)
+    {
+      pthread_t td = xpthread_create (NULL, leaker, name);
+      struct timespec ts =
+	{ .tv_nsec = rand () % 100000, .tv_sec = 0 };
+      nanosleep (&ts, NULL);
+      /* Ignore the pthread_cancel result because the thread might
+	 have already exited by the time pthread_cancel is
+	 called.  */
+      pthread_cancel (td);
+      xpthread_join (td);
+    }
+
+  support_descriptors_check (descrs);
+
+  support_descriptors_free (descrs);
+
+  free (name);
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/rt/Makefile b/rt/Makefile
index dab5d62a57..565b76c9c4 100644
--- a/rt/Makefile
+++ b/rt/Makefile
@@ -57,7 +57,7 @@ include ../Rules
 CFLAGS-aio_suspend.c += -fexceptions
 CFLAGS-mq_timedreceive.c += -fexceptions -fasynchronous-unwind-tables
 CFLAGS-mq_timedsend.c += -fexceptions -fasynchronous-unwind-tables
-CFLAGS-librt-cancellation.c += -fasynchronous-unwind-tables
+CFLAGS-clock_nanosleep.c += -fexceptions -fasynchronous-unwind-tables
 
 LDFLAGS-rt.so = -Wl,--enable-new-dtags,-z,nodelete
 
diff --git a/sysdeps/nptl/librt-cancellation.c b/sysdeps/generic/syscall_types.h
similarity index 70%
rename from sysdeps/nptl/librt-cancellation.c
rename to sysdeps/generic/syscall_types.h
index af1d11b2e6..cf11956fd7 100644
--- a/sysdeps/nptl/librt-cancellation.c
+++ b/sysdeps/generic/syscall_types.h
@@ -1,6 +1,6 @@
-/* Copyright (C) 2002-2020 Free Software Foundation, Inc.
+/* Types and macros used for syscall issuing.
+   Copyright (C) 2020 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -16,9 +16,10 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <nptl/pthreadP.h>
+#ifndef _SYSCALL_TYPES_H
+#define _SYSCALL_TYPES_H
 
+typedef long int __syscall_arg_t;
+#define __SSC(__x) ((__syscall_arg_t) (__x))
 
-#define __pthread_enable_asynccancel __librt_enable_asynccancel
-#define __pthread_disable_asynccancel __librt_disable_asynccancel
-#include <nptl/cancellation.c>
+#endif
diff --git a/sysdeps/generic/sysdep-cancel.h b/sysdeps/generic/sysdep-cancel.h
index d22a786536..5c84b4499a 100644
--- a/sysdeps/generic/sysdep-cancel.h
+++ b/sysdeps/generic/sysdep-cancel.h
@@ -3,5 +3,3 @@
 /* No multi-thread handling enabled.  */
 #define SINGLE_THREAD_P (1)
 #define RTLD_SINGLE_THREAD_P (1)
-#define LIBC_CANCEL_ASYNC()	0 /* Just a dummy value.  */
-#define LIBC_CANCEL_RESET(val)	((void)(val)) /* Nothing, but evaluate it.  */
diff --git a/sysdeps/nptl/Makefile b/sysdeps/nptl/Makefile
index 0631a870c8..30f9c8e91e 100644
--- a/sysdeps/nptl/Makefile
+++ b/sysdeps/nptl/Makefile
@@ -21,8 +21,7 @@ libpthread-sysdep_routines += errno-loc
 endif
 
 ifeq ($(subdir),rt)
-librt-sysdep_routines += timer_routines librt-cancellation
-CFLAGS-librt-cancellation.c += -fexceptions -fasynchronous-unwind-tables
+librt-sysdep_routines += timer_routines
 
 tests += tst-mqueue8x
 CFLAGS-tst-mqueue8x.c += -fexceptions
diff --git a/sysdeps/nptl/cancellation-pc-check.h b/sysdeps/nptl/cancellation-pc-check.h
new file mode 100644
index 0000000000..ae124fcced
--- /dev/null
+++ b/sysdeps/nptl/cancellation-pc-check.h
@@ -0,0 +1,53 @@
+/* Architecture specific code for pthread cancellation handling.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _NPTL_CANCELLATION_PC_CHECK
+#define _NPTL_CANCELLATION_PC_CHECK
+
+#include <sigcontextinfo.h>
+
+/* For syscalls with side-effects, the kernel cannot restart the syscall; when
+   it is interrupted by a signal, the kernel must cause the syscall to return
+   with whatever partial result is obtained (e.g. partial read or write).  In
+   this case, the saved program counter points just after the syscall
+   instruction, so the SIGCANCEL handler should not act on cancellation.
+
+   The __syscall_cancel_arch function, used for all cancellable syscalls,
+   contains two extra markers, __syscall_cancel_arch_start and
+   __syscall_cancel_arch_end.  The former points to just before the initial
+   conditional branch that checks whether the thread has received a
+   cancellation request, while the latter points to the instruction just
+   after the one responsible for issuing the syscall.
+
+   The function checks whether the program counter (PC) from the ucontext_t
+   CTX lies within the start and end boundaries of the __syscall_cancel_arch
+   bridge.  It returns TRUE if the PC is within the boundaries, meaning the
+   syscall has not produced any side effects; otherwise it returns FALSE.  */
+static bool
+cancellation_pc_check (void *ctx)
+{
+  /* Both are defined in syscall_cancel.S.  */
+  extern const char __syscall_cancel_arch_start[1];
+  extern const char __syscall_cancel_arch_end[1];
+
+  uintptr_t pc = sigcontext_get_pc (ctx);
+  return pc >= (uintptr_t) __syscall_cancel_arch_start
+	 && pc < (uintptr_t) __syscall_cancel_arch_end;
+}
+
+#endif
diff --git a/sysdeps/nptl/cancellation-sigmask.h b/sysdeps/nptl/cancellation-sigmask.h
new file mode 100644
index 0000000000..80c2a2d2f2
--- /dev/null
+++ b/sysdeps/nptl/cancellation-sigmask.h
@@ -0,0 +1,30 @@
+/* Architecture specific code for pthread cancellation handling.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _NPTL_CANCELLATION_SIGMASK_H
+#define _NPTL_CANCELLATION_SIGMASK_H
+
+/* Add SIGCANCEL to the blocked signal set in the ucontext_t CTX obtained
+   from the sigaction handler.  */
+static void
+ucontext_block_sigcancel (void *ctx)
+{
+  __sigaddset (&((ucontext_t*) ctx)->uc_sigmask, SIGCANCEL);
+}
+
+#endif
diff --git a/sysdeps/nptl/futex-internal.h b/sysdeps/nptl/futex-internal.h
index d622122ddc..9bce768afc 100644
--- a/sysdeps/nptl/futex-internal.h
+++ b/sysdeps/nptl/futex-internal.h
@@ -178,10 +178,7 @@ static __always_inline int
 futex_wait_cancelable (unsigned int *futex_word, unsigned int expected,
 		       int private)
 {
-  int oldtype;
-  oldtype = __pthread_enable_asynccancel ();
-  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
-  __pthread_disable_asynccancel (oldtype);
+  int err = lll_futex_timed_wait_cancel (futex_word, expected, NULL, private);
   switch (err)
     {
     case 0:
@@ -239,10 +236,8 @@ futex_reltimed_wait_cancelable (unsigned int* futex_word,
 				unsigned int expected,
 			        const struct timespec* reltime, int private)
 {
-  int oldtype;
-  oldtype = LIBC_CANCEL_ASYNC ();
-  int err = lll_futex_timed_wait (futex_word, expected, reltime, private);
-  LIBC_CANCEL_RESET (oldtype);
+  int err = lll_futex_timed_wait_cancel (futex_word, expected, reltime,
+					 private);
   switch (err)
     {
     case 0:
@@ -315,12 +310,8 @@ futex_abstimed_wait_cancelable (unsigned int* futex_word,
      despite them being valid.  */
   if (__glibc_unlikely ((abstime != NULL) && (abstime->tv_sec < 0)))
     return ETIMEDOUT;
-  int oldtype;
-  oldtype = __pthread_enable_asynccancel ();
-  int err = lll_futex_clock_wait_bitset (futex_word, expected,
-					clockid, abstime,
-					private);
-  __pthread_disable_asynccancel (oldtype);
+  int err = lll_futex_clock_wait_bitset_cancel (futex_word, expected, clockid,
+						abstime, private);
   switch (err)
     {
     case 0:
diff --git a/sysdeps/nptl/lowlevellock-futex.h b/sysdeps/nptl/lowlevellock-futex.h
index 2209ca76a1..4b72deda95 100644
--- a/sysdeps/nptl/lowlevellock-futex.h
+++ b/sysdeps/nptl/lowlevellock-futex.h
@@ -174,21 +174,36 @@
 		     nr_wake, nr_move, mutex, val)
 
 /* Like lll_futex_wait, but acting as a cancellable entrypoint.  */
-# define lll_futex_wait_cancel(futexp, val, private) \
-  ({                                                                   \
-    int __oldtype = CANCEL_ASYNC ();				       \
-    long int __err = lll_futex_wait (futexp, val, LLL_SHARED);	       \
-    CANCEL_RESET (__oldtype);					       \
-    __err;							       \
-  })
+# define lll_futex_wait_cancel(futexp, val, private)			\
+  lll_futex_timed_wait_cancel (futexp, val, NULL, private)
 
 /* Like lll_futex_timed_wait, but acting as a cancellable entrypoint.  */
-# define lll_futex_timed_wait_cancel(futexp, val, timeout, private) \
-  ({									   \
-    int __oldtype = CANCEL_ASYNC ();				       	   \
-    long int __err = lll_futex_timed_wait (futexp, val, timeout, private); \
-    CANCEL_RESET (__oldtype);						   \
-    __err;								   \
+# define lll_futex_timed_wait_cancel(futexp, val, timeout, private) 	\
+  ({									\
+     int __op = __lll_private_flag (FUTEX_WAIT, private);		\
+     INTERNAL_SYSCALL_CANCEL (futex, futexp, __op, val, timeout);	\
+  })
+
+/* Like lll_futex_clock_wait_bitset, but acting as a cancellable
+   entrypoint.  */
+# define lll_futex_clock_wait_bitset_cancel(futexp, val, clockid, timeout, \
+					    private)			   \
+  ({									\
+    long int __ret;							\
+    if (lll_futex_supported_clockid (clockid))			  	\
+      {								 	\
+	const unsigned int __clockbit =				 	\
+	  (clockid == CLOCK_REALTIME) ? FUTEX_CLOCK_REALTIME : 0;       \
+	const int __op =						\
+	  __lll_private_flag (FUTEX_WAIT_BITSET | __clockbit, private); \
+									\
+	__ret = INTERNAL_SYSCALL_CANCEL (futex, futexp, __op, val,	\
+					 timeout, NULL,			\
+					 FUTEX_BITSET_MATCH_ANY);	\
+      }								 	\
+    else								\
+      __ret = -EINVAL;							\
+    __ret;								\
   })
 
 #endif  /* !__ASSEMBLER__  */
diff --git a/sysdeps/unix/sysdep.h b/sysdeps/unix/sysdep.h
index 3c687a717a..c19cffabee 100644
--- a/sysdeps/unix/sysdep.h
+++ b/sysdeps/unix/sysdep.h
@@ -15,6 +15,9 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
+#ifndef _SYSDEP_UNIX_H
+#define _SYSDEP_UNIX_H 1
+
 #include <sysdeps/generic/sysdep.h>
 #include <single-thread.h>
 #include <sys/syscall.h>
@@ -24,6 +27,9 @@
 #define	SYSCALL__(name, args)	PSEUDO (__##name, name, args)
 #define	SYSCALL(name, args)	PSEUDO (name, name, args)
 
+#ifndef __ASSEMBLER__
+# include <errno.h>
+
 #define __SYSCALL_CONCAT_X(a,b)     a##b
 #define __SYSCALL_CONCAT(a,b)       __SYSCALL_CONCAT_X (a, b)
 
@@ -57,6 +63,29 @@
 #define INTERNAL_SYSCALL_CALL(...) \
   __INTERNAL_SYSCALL_DISP (__INTERNAL_SYSCALL, __VA_ARGS__)
 
+#define __INTERNAL_SYSCALL_NCS0(name) \
+  INTERNAL_SYSCALL_NCS (name, 0)
+#define __INTERNAL_SYSCALL_NCS1(name, a1) \
+  INTERNAL_SYSCALL_NCS (name, 1, a1)
+#define __INTERNAL_SYSCALL_NCS2(name, a1, a2) \
+  INTERNAL_SYSCALL_NCS (name, 2, a1, a2)
+#define __INTERNAL_SYSCALL_NCS3(name, a1, a2, a3) \
+  INTERNAL_SYSCALL_NCS (name, 3, a1, a2, a3)
+#define __INTERNAL_SYSCALL_NCS4(name, a1, a2, a3, a4) \
+  INTERNAL_SYSCALL_NCS (name, 4, a1, a2, a3, a4)
+#define __INTERNAL_SYSCALL_NCS5(name, a1, a2, a3, a4, a5) \
+  INTERNAL_SYSCALL_NCS (name, 5, a1, a2, a3, a4, a5)
+#define __INTERNAL_SYSCALL_NCS6(name, a1, a2, a3, a4, a5, a6) \
+  INTERNAL_SYSCALL_NCS (name, 6, a1, a2, a3, a4, a5, a6)
+#define __INTERNAL_SYSCALL_NCS7(name, a1, a2, a3, a4, a5, a6, a7) \
+  INTERNAL_SYSCALL_NCS (name, 7, a1, a2, a3, a4, a5, a6, a7)
+
+/* Issue a syscall defined by its syscall number plus any other required
+   arguments.  It is similar to the INTERNAL_SYSCALL_NCS macro, but without
+   the need to pass the expected argument count explicitly.  */
+#define INTERNAL_SYSCALL_NCS_CALL(...) \
+  __INTERNAL_SYSCALL_DISP (__INTERNAL_SYSCALL_NCS, __VA_ARGS__)
+
 #define __INLINE_SYSCALL0(name) \
   INLINE_SYSCALL (name, 0)
 #define __INLINE_SYSCALL1(name, a1) \
@@ -88,35 +117,68 @@
 #define INLINE_SYSCALL_CALL(...) \
   __INLINE_SYSCALL_DISP (__INLINE_SYSCALL, __VA_ARGS__)
 
-#define SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (SINGLE_THREAD_P) 						     \
-      sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
-  })
 
-/* Issue a syscall defined by syscall number plus any other argument
-   required.  Any error will be returned unmodified (including errno).  */
-#define INTERNAL_SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (SINGLE_THREAD_P) 						     \
-      sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
+/* Cancellation macros.  */
+#include <syscall_types.h>
+
+long int __syscall_cancel (__syscall_arg_t nr, __syscall_arg_t arg1,
+			   __syscall_arg_t arg2, __syscall_arg_t arg3,
+			   __syscall_arg_t arg4, __syscall_arg_t arg5,
+			   __syscall_arg_t arg6);
+libc_hidden_proto (__syscall_cancel);
+
+#define __SYSCALL_CANCEL0(name) \
+  __syscall_cancel (__NR_##name, 0, 0, 0, 0, 0, 0)
+#define __SYSCALL_CANCEL1(name, a1) \
+  __syscall_cancel (__NR_##name, __SSC (a1), 0, 0, 0, 0, 0)
+#define __SYSCALL_CANCEL2(name, a1, a2) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), 0, 0, 0, 0)
+#define __SYSCALL_CANCEL3(name, a1, a2, a3) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     0, 0, 0)
+#define __SYSCALL_CANCEL4(name, a1, a2, a3, a4) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     __SSC(a4), 0, 0)
+#define __SYSCALL_CANCEL5(name, a1, a2, a3, a4, a5) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     __SSC(a4), __SSC (a5), 0)
+#define __SYSCALL_CANCEL6(name, a1, a2, a3, a4, a5, a6) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     __SSC (a4), __SSC (a5), __SSC (a6))
+
+#define __SYSCALL_CANCEL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
+#define __SYSCALL_CANCEL_NARGS(...) \
+  __SYSCALL_CANCEL_NARGS_X (__VA_ARGS__,7,6,5,4,3,2,1,0,)
+#define __SYSCALL_CANCEL_CONCAT_X(a,b)     a##b
+#define __SYSCALL_CANCEL_CONCAT(a,b)       __SYSCALL_CANCEL_CONCAT_X (a, b)
+#define __SYSCALL_CANCEL_DISP(b,...) \
+  __SYSCALL_CANCEL_CONCAT (b,__SYSCALL_CANCEL_NARGS(__VA_ARGS__))(__VA_ARGS__)
+
+#define __SYSCALL_CANCEL_CALL(...) \
+  __SYSCALL_CANCEL_DISP (__SYSCALL_CANCEL, __VA_ARGS__)
+
+/* Issue a cancellable syscall defined by syscall number NAME plus any other
+   required arguments.  If an error occurs, its value is returned as a
+   negative number and errno is not set.  */
+#define INTERNAL_SYSCALL_CANCEL(name, args...) \
+  __SYSCALL_CANCEL_CALL (name, args)
+
+/* Issue a cancellable syscall defined by the first argument plus any other
+   required arguments.  If an error occurs, the macro returns -1 and sets
+   errno accordingly.  */
+#if IS_IN (rtld)
+/* The loader does not need to handle thread cancellation; use a direct
+   syscall instead.  */
+# define SYSCALL_CANCEL(...) INLINE_SYSCALL_CALL (__VA_ARGS__)
+#else
+# define SYSCALL_CANCEL(...) \
+  ({									\
+    long int sc_ret = __SYSCALL_CANCEL_CALL (__VA_ARGS__);		\
+    SYSCALL_CANCEL_RET ((sc_ret));					\
   })
+#endif
+
+#endif /* __ASSEMBLER__  */
 
 /* Machine-dependent sysdep.h files are expected to define the macro
    PSEUDO (function_name, syscall_name) to emit assembly code to define the
@@ -146,3 +208,5 @@
 #ifndef INLINE_SYSCALL
 #define INLINE_SYSCALL(name, nr, args...) __syscall_##name (args)
 #endif
+
+#endif /* _SYSDEP_UNIX_H  */
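
As an illustration of the dispatch machinery above (not part of the patch):
a call such as SYSCALL_CANCEL (openat, fd, file, oflag, mode) has its
arguments counted by __SYSCALL_CANCEL_NARGS, is routed to
__SYSCALL_CANCEL4, and ends up as a plain call into the cancellation
bridge, roughly:

  /* Conceptual expansion of SYSCALL_CANCEL (openat, fd, file, oflag, mode).  */
  long int sc_ret = __syscall_cancel (__NR_openat, __SSC (fd), __SSC (file),
                                      __SSC (oflag), __SSC (mode), 0, 0);
  /* SYSCALL_CANCEL_RET then maps a negative errno result back to the usual
     -1/errno convention, whereas INTERNAL_SYSCALL_CANCEL hands the negative
     errno value back unchanged.  */
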
diff --git a/sysdeps/unix/sysv/linux/socketcall.h b/sysdeps/unix/sysv/linux/socketcall.h
index 75c2a6404d..64af566e18 100644
--- a/sysdeps/unix/sysv/linux/socketcall.h
+++ b/sysdeps/unix/sysv/linux/socketcall.h
@@ -87,18 +87,32 @@
   })
 
 
-#if IS_IN (libc)
-# define __pthread_enable_asynccancel  __libc_enable_asynccancel
-# define __pthread_disable_asynccancel __libc_disable_asynccancel
-#endif
-
-#define SOCKETCALL_CANCEL(name, args...)				\
-  ({									\
-    int oldtype = LIBC_CANCEL_ASYNC ();					\
-    long int sc_ret = __SOCKETCALL (SOCKOP_##name, args);		\
-    LIBC_CANCEL_RESET (oldtype);					\
-    sc_ret;								\
-  })
-
+#define __SOCKETCALL_CANCEL1(__name, __a1) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [1]) { (long int) __a1 }))
+#define __SOCKETCALL_CANCEL2(__name, __a1, __a2) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [2]) { (long int) __a1, (long int) __a2 }))
+#define __SOCKETCALL_CANCEL3(__name, __a1, __a2, __a3) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [3]) { (long int) __a1, (long int) __a2, (long int) __a3 }))
+#define __SOCKETCALL_CANCEL4(__name, __a1, __a2, __a3, __a4) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [4]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4 }))
+#define __SOCKETCALL_CANCEL5(__name, __a1, __a2, __a3, __a4, __a5) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [5]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5 }))
+#define __SOCKETCALL_CANCEL6(__name, __a1, __a2, __a3, __a4, __a5, __a6) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [6]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5, (long int) __a6 }))
+
+#define __SOCKETCALL_CANCEL(...) __SOCKETCALL_DISP (__SOCKETCALL_CANCEL,\
+						    __VA_ARGS__)
+
+#define SOCKETCALL_CANCEL(name, args...) \
+   __SOCKETCALL_CANCEL (SOCKOP_##name, args)
 
 #endif /* sys/socketcall.h */
diff --git a/sysdeps/unix/sysv/linux/syscall_cancel.c b/sysdeps/unix/sysv/linux/syscall_cancel.c
new file mode 100644
index 0000000000..003e485b5c
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/syscall_cancel.c
@@ -0,0 +1,62 @@
+/* Default cancellation syscall bridge.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <pthreadP.h>
+
+#warning "This implementation should be use just as reference or for bootstrapping"
+
+/* This is the generic version of the cancellable syscall code, which
+   adds the label guards (__syscall_cancel_arch_{start,end}) used by the
+   SIGCANCEL handler (sigcancel_handler in nptl-init.c) to check whether
+   the cancelled syscall has side effects that need to be reported to the
+   program.
+
+   This implementation should be used as a reference to document the
+   implementation constraints: __syscall_cancel_arch_end must point to the
+   instruction immediately after the syscall one.  This is because the
+   kernel signals an interrupted syscall with side effects by setting the
+   signal frame program counter (in the ucontext_t third argument of an
+   SA_SIGINFO signal handler) to just after the syscall instruction.
+
+   If the INTERNAL_SYSCALL_NCS macro uses more instructions to get the
+   error condition from the kernel (as on powerpc and sparc), uses an
+   out-of-line helper (as on ARM Thumb), or uses a kernel helper gate
+   (as on i686 or ia64), the architecture should adjust the macro or
+   provide a custom __syscall_cancel_arch implementation.  */
+long int
+__syscall_cancel_arch (volatile int *ch, __syscall_arg_t nr,
+		       __syscall_arg_t a1, __syscall_arg_t a2,
+		       __syscall_arg_t a3, __syscall_arg_t a4,
+		       __syscall_arg_t a5, __syscall_arg_t a6)
+{
+#define ADD_LABEL(__label)		\
+  asm volatile (			\
+    ".global " __label "\t\n"		\
+    __label ":\n");
+
+  ADD_LABEL ("__syscall_cancel_arch_start");
+  if (__glibc_unlikely (*ch & CANCELED_BITMASK))
+    __syscall_do_cancel();
+
+  long int result = INTERNAL_SYSCALL_NCS_CALL (nr, a1, a2, a3, a4, a5, a6);
+  ADD_LABEL ("__syscall_cancel_arch_end");
+  if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (result)))
+    return -INTERNAL_SYSCALL_ERRNO (result);
+  return result;
+}
+libc_hidden_def (__syscall_cancel_arch)
diff --git a/sysdeps/unix/sysv/linux/sysdep-cancel.h b/sysdeps/unix/sysv/linux/sysdep-cancel.h
index 61d3348768..20824bc096 100644
--- a/sysdeps/unix/sysv/linux/sysdep-cancel.h
+++ b/sysdeps/unix/sysv/linux/sysdep-cancel.h
@@ -21,47 +21,5 @@
 #define _SYSDEP_CANCEL_H
 
 #include <sysdep.h>
-#include <tls.h>
-#include <errno.h>
-
-/* The two functions are in libc.so and not exported.  */
-extern int __libc_enable_asynccancel (void) attribute_hidden;
-extern void __libc_disable_asynccancel (int oldtype) attribute_hidden;
-
-/* The two functions are in librt.so and not exported.  */
-extern int __librt_enable_asynccancel (void) attribute_hidden;
-extern void __librt_disable_asynccancel (int oldtype) attribute_hidden;
-
-/* The two functions are in libpthread.so and not exported.  */
-extern int __pthread_enable_asynccancel (void) attribute_hidden;
-extern void __pthread_disable_asynccancel (int oldtype) attribute_hidden;
-
-/* Set cancellation mode to asynchronous.  */
-#define CANCEL_ASYNC() \
-  __pthread_enable_asynccancel ()
-/* Reset to previous cancellation mode.  */
-#define CANCEL_RESET(oldtype) \
-  __pthread_disable_asynccancel (oldtype)
-
-#if IS_IN (libc)
-/* Same as CANCEL_ASYNC, but for use in libc.so.  */
-# define LIBC_CANCEL_ASYNC() \
-  __libc_enable_asynccancel ()
-/* Same as CANCEL_RESET, but for use in libc.so.  */
-# define LIBC_CANCEL_RESET(oldtype) \
-  __libc_disable_asynccancel (oldtype)
-#elif IS_IN (libpthread)
-# define LIBC_CANCEL_ASYNC() CANCEL_ASYNC ()
-# define LIBC_CANCEL_RESET(val) CANCEL_RESET (val)
-#elif IS_IN (librt)
-# define LIBC_CANCEL_ASYNC() \
-  __librt_enable_asynccancel ()
-# define LIBC_CANCEL_RESET(val) \
-  __librt_disable_asynccancel (val)
-#else
-# define LIBC_CANCEL_ASYNC()	0 /* Just a dummy value.  */
-# define LIBC_CANCEL_RESET(val)	((void)(val)) /* Nothing, but evaluate it.  */
-#endif
-
 
 #endif
diff --git a/sysdeps/unix/sysv/linux/sysdep.h b/sysdeps/unix/sysv/linux/sysdep.h
index 5e7b6c5765..8442de9b13 100644
--- a/sysdeps/unix/sysv/linux/sysdep.h
+++ b/sysdeps/unix/sysv/linux/sysdep.h
@@ -58,6 +58,15 @@
     -1l;					\
   })
 
+/* The error return from a cancellable syscall has the same semantics as a
+   non-cancellable one.  */
+#define SYSCALL_CANCEL_RET(__ret)				\
+  ({								\
+    __glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (__ret))		\
+    ? SYSCALL_ERROR_LABEL (INTERNAL_SYSCALL_ERRNO (__ret))	\
+    : __ret;							\
+   })
+
 /* Provide a dummy argument that can be used to force register
    alignment for register pairs if required by the syscall ABI.  */
 #ifdef __ASSUME_ALIGNED_REGISTER_PAIRS
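
To summarize the conventions wired together by SYSCALL_CANCEL_RET (a
sketch, not part of the patch; the helper name below is hypothetical):
INTERNAL_SYSCALL_CANCEL returns a negative errno value and leaves errno
untouched, while SYSCALL_CANCEL converts that into the familiar -1 with
errno set.

  #include <sysdep-cancel.h>
  #include <errno.h>
  #include <time.h>

  /* Hypothetical internal helper using the negative-errno convention.  */
  int
  __example_nanosleep_cancel (const struct timespec *req, struct timespec *rem)
  {
    long int r = INTERNAL_SYSCALL_CANCEL (nanosleep, req, rem);
    /* R is either 0 or a negative errno value (e.g. -EINTR for an
       interrupted, partially completed sleep); errno is not touched.  */
    return r < 0 ? -r : 0;
  }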


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [glibc/azanella/bz12683] nptl: Fix Race conditions in pthread cancellation [BZ#12683]
@ 2020-04-03 20:23 Adhemerval Zanella
  0 siblings, 0 replies; 6+ messages in thread
From: Adhemerval Zanella @ 2020-04-03 20:23 UTC (permalink / raw)
  To: glibc-cvs

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=09cfa7663adeebc51da120b56270ebfdf67682ed

commit 09cfa7663adeebc51da120b56270ebfdf67682ed
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Fri Sep 18 18:26:35 2015 -0300

    nptl: Fix Race conditions in pthread cancellation [BZ#12683]
    
    This patch is the initial fix for race conditions in NPTL cancellation
    code by redefining how cancellable syscalls are defined and handled.
    The current buggy approach is to enable asynchronous cancellation
    before making the syscall and restore the previous cancellation
    type once the syscall returns.
    
    As described in BZ#12683, this approach shows 2 important problems:
    
      1. Cancellation can act after the syscall has returned from the
         kernel, but before userspace saves the return value.  It might
         result in a resource leak if the syscall allocated a resource or a
         side effect (partial read/write), and there is no way to program
         handle it with cancellation handlers.
    
      2. If a signal is handled while the thread is blocked at a cancellable
         syscall, the entire signal handler runs with asynchronous
         cancellation enabled.  This can lead to issues if the signal
         handler call functions which are async-signal-safe but not
         async-cancel-safe.
    
    For the cancellation to work correctly, there are 5 points at which the
    cancellation signal could arrive:
    
      1. Before the final "testcancel" and before the syscall is made.
      2. Between the "testcancel" and the syscall.
      3. While the syscall is blocked and no side effects have yet taken
         place.
      4. While the syscall is blocked but with some side effects already
         having taken place (e.g. a partial read or write).
      5. After the syscall has returned.
    
    And GLIBC wants to act on cancellation in cases 1, 2, and 3 but not
    in cases 4 or 5.  For the 4 and 5 cases, the cancellation will eventually
    happen in the next cancellable entry point without any further external
    event.
    
    The proposed solution follows for each case:
    
      1. Do a conditional branch based on whether the thread has received
         a cancellation request;
    
      2. It can be caught by the signal handler determining that the saved
         program counter (from the ucontext_t) is in some address range
         beginning just before the "testcancel" and ending with the
         syscall instruction.
    
      3. In this case, except for certain syscalls that ALWAYS fail with
         EINTR even for non-interrupting signals, the kernel will reset
         the program counter to point at the syscall instruction during
         signal handling, so that the syscall is restarted when the signal
         handler returns.  So, from the signal handler's standpoint, this
         looks the same as case 2, and thus it's taken care of.
    
      4. For syscalls with side-effects, the kernel cannot restart the
         syscall; when it's interrupted by a signal, the kernel must cause
         the syscall to return with whatever partial result is obtained
         (e.g. partial read or write).
    
      5. In this case, the saved program counter points just after the
         syscall instruction, so the signal handler won't act on
         cancellation.  This is similar to case 4, since the program
         counter is past the syscall instruction.
    
    Another case that needs handling is syscalls that fail with EINTR even
    when the signal handler is non-interrupting. In this case, the syscall
    wrapper code can just check the cancellation flag when the errno result
    is EINTR, and act on cancellation if it's set.
    
    The proposed GLIBC adjustments are:
    
      1. Remove the enable_asynccancel/disable_asynccancel function usage in
         syscall definition and instead make them call a common symbol that
         will check if cancellation is enabled (__syscall_cancel at
         nptl/libc-cancellation.c), call the arch-specific cancellable
         entry-point (__syscall_cancel_arch) and cancel the thread when
         required.
    
      2. Provide an arch-specific generic system call wrapper function
         that contains global markers.  These markers will be used in
         SIGCANCEL handler to check if the interruption has been called in a
         valid syscall and if the syscalls have been completed or not.
    
         A reference implementation sysdeps/unix/sysv/linux/syscall_cancel.c
         is provided.  However, the markers may not be set on correct
         expected places depending on how INTERNAL_SYSCALL_NCS is
         implemented by the architecture and it uses compiler-specific
         construct (asm volatile) to place the required markers.
         It is expected that all architectures add an arch-specific
         implementation.
    
      3. Rewrite the SIGCANCEL handler to check both the cancellation type
         and whether the current IP from the signal context falls between
         the global markers, and act accordingly (sigcancel_handler at
         nptl/nptl-init.c).
    
      4. Adjust nptl/pthread_cancel.c to send a signal instead of acting
         directly. This avoids synchronization issues when updating the
         cancellation status and also focuses the logic on the signal
         handler and cancellation syscall code.
    
      5. Adjust pthread code to replace CANCEL_ASYNC/CANCEL_RESET calls
         with appropriate cancellable futex syscalls.

      6. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET
         with appropriate cancellable syscalls.
    
      7. Adjust 'lowlevellock-futex.h' arch-specific implementations to
         provide cancelable futex calls (used in libpthread code).
    
    This patch adds the proposed changes to NPTL common code and the
    following patches add the required arch-specific bits.  The builds for
    ia64-linux-gnu, mips-*, and x86_64-* are broken without the
    arch-specific patches.
    
    As a side note regarding SIGCANCEL and SIGTIMER being the same signal,
    this should not impact timer_create functionality.  The implementation
    arranges for SIGCANCEL/SIGTIMER to be sent to the internal helper
    thread, which in turn checks whether si.si_code is SI_TIMER and calls
    pthread_exit otherwise (sysdeps/unix/sysv/linux/timer_routines.c:129).
    
    This suggests that the helper thread does NOT depend on EINTR
    being generated for SIGCANCEL/SIGTIMER, and it should be fine to use
    SA_RESTART for that signal as far as timer_create is concerned.
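    
    Concretely, for items 1 and 6 above (illustrative, not part of this
    commit; the function name is just an example): wrappers that are
    already written in terms of SYSCALL_CANCEL keep the same source, and
    only the macro expansion changes, from the enable/disable asynccancel
    bracketing to a call into the __syscall_cancel bridge:
    
      #include <sysdep-cancel.h>
      #include <unistd.h>
    
      /* Unchanged wrapper body: SYSCALL_CANCEL now performs the
         "testcancel", issues the syscall through __syscall_cancel_arch,
         and cancels the thread itself when required.  */
      ssize_t
      __example_read (int fd, void *buf, size_t nbytes)
      {
        return SYSCALL_CANCEL (read, fd, buf, nbytes);
      }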

Diff:
---
 manual/llio.texi                                   |   4 +-
 nptl/Makefile                                      |  10 +-
 nptl/Versions                                      |   4 +
 nptl/cancellation.c                                |  78 --------------
 nptl/descr.h                                       |   3 -
 nptl/libc-cancellation.c                           |  43 +++++++-
 nptl/nptl-init.c                                   |  92 ++++++++--------
 nptl/pthreadP.h                                    |  57 +++++++---
 nptl/pthread_cancel.c                              |  66 +++---------
 nptl/pthread_create.c                              |   7 +-
 nptl/pthread_exit.c                                |   9 +-
 nptl/pthread_join_common.c                         |   7 +-
 nptl/pthread_kill.c                                |   7 +-
 .../pthread_kill.c => nptl/pthread_kill_internal.c |  21 +---
 nptl/pthread_setcanceltype.c                       |   2 +-
 nptl/pthread_testcancel.c                          |  12 +--
 nptl/sem_wait.c                                    |   2 +-
 nptl/tst-cancel29.c                                | 100 +++++++++++++++++
 rt/Makefile                                        |   2 +-
 .../syscall_types.h}                               |  13 +--
 sysdeps/generic/sysdep-cancel.h                    |   2 -
 sysdeps/nptl/Makefile                              |   3 +-
 sysdeps/nptl/cancellation-pc-check.h               |  53 +++++++++
 sysdeps/nptl/cancellation-sigmask.h                |  30 ++++++
 sysdeps/nptl/futex-internal.h                      |  19 +---
 sysdeps/nptl/lowlevellock-futex.h                  |  41 ++++---
 sysdeps/unix/sysdep.h                              | 118 ++++++++++++++++-----
 sysdeps/unix/sysv/linux/socketcall.h               |  40 ++++---
 sysdeps/unix/sysv/linux/syscall_cancel.c           |  62 +++++++++++
 sysdeps/unix/sysv/linux/sysdep-cancel.h            |  42 --------
 sysdeps/unix/sysv/linux/sysdep.h                   |   9 ++
 31 files changed, 591 insertions(+), 367 deletions(-)

diff --git a/manual/llio.texi b/manual/llio.texi
index fe59002915..c02ee83428 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -2534,13 +2534,13 @@ aiocb64}, since the LFS transparently replaces the old interface.
 @c     sigemptyset ok
 @c     sigaddset ok
 @c     setjmp ok
-@c     CANCEL_ASYNC -> pthread_enable_asynccancel ok
+@c     __pthread_setcanceltype ok
 @c      do_cancel ok
 @c       pthread_unwind ok
 @c        Unwind_ForcedUnwind or longjmp ok [@ascuheap @acsmem?]
 @c     lll_lock @asulock @aculock
 @c     lll_unlock @asulock @aculock
-@c     CANCEL_RESET -> pthread_disable_asynccancel ok
+@c     __pthread_setcanceltype ok
 @c      lll_futex_wait ok
 @c     ->start_routine ok -----
 @c     call_tls_dtors @asulock @ascuheap @aculock @acsmem
diff --git a/nptl/Makefile b/nptl/Makefile
index e554a3898d..0cb724eb76 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -60,6 +60,7 @@ routines = \
   pthread_self \
   pthread_setschedparam \
   register-atfork \
+  syscall_cancel
 
 shared-only-routines = forward
 static-only-routines = pthread_atfork
@@ -123,7 +124,8 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
 		      pthread_barrierattr_setpshared \
 		      pthread_key_create pthread_key_delete \
 		      pthread_getspecific pthread_setspecific \
-		      pthread_sigmask pthread_kill pthread_sigqueue \
+		      pthread_sigmask pthread_kill pthread_kill_internal \
+		      pthread_sigqueue \
 		      pthread_cancel pthread_testcancel \
 		      pthread_setcancelstate pthread_setcanceltype \
 		      pthread_once \
@@ -137,7 +139,6 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
 		      cleanup cleanup_defer cleanup_compat \
 		      cleanup_defer_compat unwind \
 		      pt-longjmp pt-cleanup\
-		      cancellation \
 		      lowlevellock \
 		      lll_timedlock_wait \
 		      pt-fork pt-fcntl \
@@ -187,8 +188,7 @@ CFLAGS-pthread_setcanceltype.c += -fexceptions -fasynchronous-unwind-tables
 
 # These are internal functions which similar functionality as setcancelstate
 # and setcanceltype.
-CFLAGS-cancellation.c += -fasynchronous-unwind-tables
-CFLAGS-libc-cancellation.c += -fasynchronous-unwind-tables
+CFLAGS-libc-cancellation.c += -fexceptions -fasynchronous-unwind-tables
 
 # Calling pthread_exit() must cause the registered cancel handlers to
 # be executed.  Therefore exceptions have to be thrown through this
@@ -286,7 +286,7 @@ tests = tst-attr2 tst-attr3 tst-default-attr \
 	tst-cancel11 tst-cancel12 tst-cancel13 tst-cancel14 tst-cancel15 \
 	tst-cancel16 tst-cancel17 tst-cancel18 tst-cancel19 tst-cancel20 \
 	tst-cancel21 tst-cancel22 tst-cancel23 tst-cancel24 \
-	tst-cancel26 tst-cancel27 tst-cancel28 \
+	tst-cancel26 tst-cancel27 tst-cancel28 tst-cancel29 \
 	tst-cancel-self tst-cancel-self-cancelstate \
 	tst-cancel-self-canceltype tst-cancel-self-testcancel \
 	tst-cleanup0 tst-cleanup1 tst-cleanup2 tst-cleanup3 tst-cleanup4 \
diff --git a/nptl/Versions b/nptl/Versions
index 543dddc4ee..41d732e1dd 100644
--- a/nptl/Versions
+++ b/nptl/Versions
@@ -41,6 +41,10 @@ libc {
     __libc_allocate_rtsig_private;
     # Used by the C11 threads implementation.
     __pthread_cond_destroy; __pthread_cond_init;
+    # Used by pthread cancellation.
+    __syscall_cancel;
+    __syscall_cancel_arch_start;
+    __syscall_cancel_arch_end;
   }
 }
 
diff --git a/nptl/cancellation.c b/nptl/cancellation.c
deleted file mode 100644
index 7127f9ae91..0000000000
--- a/nptl/cancellation.c
+++ /dev/null
@@ -1,78 +0,0 @@
-/* Copyright (C) 2002-2020 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <setjmp.h>
-#include <stdlib.h>
-#include "pthreadP.h"
-#include <futex-internal.h>
-
-
-/* The next two functions are similar to pthread_setcanceltype() but
-   more specialized for the use in the cancelable functions like write().
-   They do not need to check parameters etc.  These functions must be
-   AS-safe, with the exception of the actual cancellation, because they
-   are called by wrappers around AS-safe functions like write().*/
-int
-attribute_hidden
-__pthread_enable_asynccancel (void)
-{
-  struct pthread *self = THREAD_SELF;
-
-  int oldval = THREAD_GETMEM (self, canceltype);
-  THREAD_SETMEM (self, canceltype, PTHREAD_CANCEL_ASYNCHRONOUS);
-
-  int ch = THREAD_GETMEM (self, cancelhandling);
-
-  if (self->cancelstate == PTHREAD_CANCEL_ENABLE
-      && (ch & (CANCELED_BITMASK | EXITING_BITMASK | TERMINATED_BITMASK))
-	  == CANCELED_BITMASK)
-    {
-      THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-      __do_cancel ();
-    }
-
-  return oldval;
-}
-
-/* See the comment for __pthread_enable_asynccancel regarding
-   the AS-safety of this function.  */
-void
-attribute_hidden
-__pthread_disable_asynccancel (int oldtype)
-{
-  /* If asynchronous cancellation was enabled before we do not have
-     anything to do.  */
-  if (oldtype == PTHREAD_CANCEL_ASYNCHRONOUS)
-    return;
-
-  struct pthread *self = THREAD_SELF;
-  THREAD_SETMEM (self, canceltype, PTHREAD_CANCEL_DEFERRED);
-
-  /* We cannot return when we are being canceled.  Upon return the
-     thread might be things which would have to be undone.  The
-     following loop should loop until the cancellation signal is
-     delivered.  */
-  int ch = THREAD_GETMEM (self, cancelhandling);
-  while (__glibc_unlikely ((ch & (CANCELING_BITMASK | CANCELED_BITMASK))
-			    == CANCELING_BITMASK))
-    {
-      futex_wait_simple ((unsigned int *) &self->cancelhandling, ch,
-			 FUTEX_PRIVATE);
-      ch = THREAD_GETMEM (self, cancelhandling);
-    }
-}
diff --git a/nptl/descr.h b/nptl/descr.h
index bae9457e33..c0b8f6c40e 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -269,9 +269,6 @@ struct pthread
 
   /* Flags determining processing of cancellation.  */
   int cancelhandling;
-  /* Bit set if canceling has been initiated.  */
-#define CANCELING_BIT		2
-#define CANCELING_BITMASK	(0x01 << CANCELING_BIT)
   /* Bit set if canceled.  */
 #define CANCELED_BIT		3
 #define CANCELED_BITMASK	(0x01 << CANCELED_BIT)
diff --git a/nptl/libc-cancellation.c b/nptl/libc-cancellation.c
index eae81d504c..e695d67417 100644
--- a/nptl/libc-cancellation.c
+++ b/nptl/libc-cancellation.c
@@ -18,7 +18,44 @@
 
 #include "pthreadP.h"
 
+/* Cancellation function called by all cancellable syscalls.  */
+long int
+__syscall_cancel (__syscall_arg_t nr, __syscall_arg_t a1,
+		  __syscall_arg_t a2, __syscall_arg_t a3,
+		  __syscall_arg_t a4, __syscall_arg_t a5,
+		  __syscall_arg_t a6)
+{
+  struct pthread *pd = THREAD_SELF;
+  long int result;
 
-#define __pthread_enable_asynccancel __libc_enable_asynccancel
-#define __pthread_disable_asynccancel __libc_disable_asynccancel
-#include <nptl/cancellation.c>
+  /* If cancellation is not enabled, call the syscall directly.  */
+  if (pd->cancelstate == PTHREAD_CANCEL_DISABLE)
+    {
+      result = INTERNAL_SYSCALL_NCS_CALL (nr, a1, a2, a3, a4, a5, a6);
+      if (INTERNAL_SYSCALL_ERROR_P (result))
+	return -INTERNAL_SYSCALL_ERRNO (result);
+      return result;
+    }
+
+  /* Call the arch-specific entry point that contains the global markers
+     to be checked by the SIGCANCEL handler.  */
+  result = __syscall_cancel_arch (&pd->cancelhandling, nr, a1, a2, a3, a4, a5,
+			          a6);
+
+  if (result == -EINTR
+      && __pthread_self_cancelled ()
+      && pd->cancelstate == PTHREAD_CANCEL_ENABLE)
+    __do_cancel (PTHREAD_CANCELED);
+
+  return result;
+}
+libc_hidden_def (__syscall_cancel)
+
+/* Since __do_cancel is an always-inline function, this creates a symbol
+   that the arch-specific code can call to cancel the thread.  */
+_Noreturn void
+attribute_hidden
+__syscall_do_cancel (void)
+{
+  __do_cancel (PTHREAD_CANCELED);
+}
diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
index 89f58d2e5e..75225965c4 100644
--- a/nptl/nptl-init.c
+++ b/nptl/nptl-init.c
@@ -39,6 +39,9 @@
 #include <libc-pointer-arith.h>
 #include <pthread-pids.h>
 #include <pthread_mutex_conf.h>
+#include <sigcontextinfo.h>
+#include <cancellation-sigmask.h>
+#include <cancellation-pc-check.h>
 
 #ifndef TLS_MULTIPLE_THREADS_IN_TCB
 /* Pointer to the corresponding variable in libc.  */
@@ -137,35 +140,23 @@ sigcancel_handler (int sig, siginfo_t *si, void *ctx)
 
   struct pthread *self = THREAD_SELF;
 
-  int oldval = THREAD_GETMEM (self, cancelhandling);
-  while (1)
-    {
-      /* We are canceled now.  When canceled by another thread this flag
-	 is already set but if the signal is directly send (internally or
-	 from another process) is has to be done here.  */
-      int newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
-
-      if (oldval == newval || (oldval & EXITING_BITMASK) != 0)
-	/* Already canceled or exiting.  */
-	break;
-
-      int curval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling, newval,
-					      oldval);
-      if (curval == oldval)
-	{
-	  /* Set the return value.  */
-	  THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-
-	  /* Make sure asynchronous cancellation is still enabled.  */
-	  if (self->canceltype == PTHREAD_CANCEL_ASYNCHRONOUS)
-	    /* Run the registered destructors and terminate the thread.  */
-	    __do_cancel ();
-
-	  break;
-	}
-
-      oldval = curval;
-    }
+  if (!__pthread_self_cancelled ()
+      || self->cancelstate == PTHREAD_CANCEL_DISABLE)
+    return;
+
+  /* Add SIGCANCEL to the blocked signal mask to avoid the handler being
+     called again.  */
+  ucontext_block_sigcancel (ctx);
+
+  /* Check if asynchronous cancellation mode is set or if the interrupted
+     instruction pointer falls within the cancellable syscall bridge.  For
+     interruptible syscalls that might generate external side effects
+     (partial reads or writes, for instance), the kernel will set the IP to
+     after '__syscall_cancel_arch_end', thus disabling the cancellation and
+     allowing the process to handle such conditions.  */
+  if (self->canceltype == PTHREAD_CANCEL_ASYNCHRONOUS
+      || cancellation_pc_check (ctx))
+    __do_cancel (PTHREAD_CANCELED);
 }
 
 
@@ -260,28 +251,39 @@ __pthread_initialize_minimal_internal (void)
      had to set __nptl_initial_report_events.  Propagate its setting.  */
   THREAD_SETMEM (pd, report_events, __nptl_initial_report_events);
 
-  struct sigaction sa;
-  __sigemptyset (&sa.sa_mask);
-
   /* Install the cancellation signal handler.  If for some reason we
      cannot install the handler we do not abort.  Maybe we should, but
      it is only asynchronous cancellation which is affected.  */
-  sa.sa_sigaction = sigcancel_handler;
-  sa.sa_flags = SA_SIGINFO;
-  (void) __libc_sigaction (SIGCANCEL, &sa, NULL);
+  {
+    struct sigaction sa;
+    sa.sa_sigaction = sigcancel_handler;
+    /* The signal handler should be non-interrupting (SA_RESTART) to avoid
+       the risk of spurious EINTR when SIGCANCEL is sent to the process or
+       when pthread_cancel is called while cancellation is disabled in the
+       target thread.  */
+    sa.sa_flags = SA_SIGINFO | SA_RESTART;
+    sa.sa_mask = sigall_set;
+    __libc_sigaction (SIGCANCEL, &sa, NULL);
+  }
 
-  /* Install the handle to change the threads' uid/gid.  */
-  sa.sa_sigaction = sighandler_setxid;
-  sa.sa_flags = SA_SIGINFO | SA_RESTART;
-  (void) __libc_sigaction (SIGSETXID, &sa, NULL);
+  {
+    /* Install the handle to change the threads' uid/gid.  */
+    struct sigaction sa;
+    __sigemptyset (&sa.sa_mask);
+    sa.sa_sigaction = sighandler_setxid;
+    sa.sa_flags = SA_SIGINFO | SA_RESTART;
+    __libc_sigaction (SIGSETXID, &sa, NULL);
+  }
 
   /* The parent process might have left the signals blocked.  Just in
-     case, unblock it.  We reuse the signal mask in the sigaction
-     structure.  It is already cleared.  */
-  __sigaddset (&sa.sa_mask, SIGCANCEL);
-  __sigaddset (&sa.sa_mask, SIGSETXID);
-  INTERNAL_SYSCALL_CALL (rt_sigprocmask, SIG_UNBLOCK, &sa.sa_mask,
-			 NULL, _NSIG / 8);
+     case, unblock it.  */
+  {
+    struct sigaction sa;
+    __sigemptyset (&sa.sa_mask);
+    __sigaddset (&sa.sa_mask, SIGCANCEL);
+    __sigaddset (&sa.sa_mask, SIGSETXID);
+    INTERNAL_SYSCALL_CALL (rt_sigprocmask, SIG_UNBLOCK, &sa.sa_mask,
+			   NULL, _NSIG / 8);
+  }
 
   /* Get the size of the static and alignment requirements for the TLS
      block.  */
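
To make the program-counter check in the handler above concrete, the
cancellable-syscall bridge added later in this patch has roughly the following
shape (a condensed C sketch of sysdeps/unix/sysv/linux/syscall_cancel.c further
down; in the real file the two markers are emitted as global asm labels and the
error-number conversion is kept):

  long int
  __syscall_cancel_arch (volatile int *ch, __syscall_arg_t nr,
                         __syscall_arg_t a1, __syscall_arg_t a2,
                         __syscall_arg_t a3, __syscall_arg_t a4,
                         __syscall_arg_t a5, __syscall_arg_t a6)
  {
    /* __syscall_cancel_arch_start: a request already pending (cases 1-3)
       is acted on before entering the kernel.  */
    if (*ch & CANCELED_BITMASK)
      __syscall_do_cancel ();

    long int result = INTERNAL_SYSCALL_NCS_CALL (nr, a1, a2, a3, a4, a5, a6);

    /* __syscall_cancel_arch_end: a syscall interrupted only after producing
       side effects (case 4) resumes here, so the SIGCANCEL handler sees a PC
       outside the [start, end) window and does not act.  */
    return result;
  }
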
diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
index d55c3b26a4..a20b136f14 100644
--- a/nptl/pthreadP.h
+++ b/nptl/pthreadP.h
@@ -286,20 +286,13 @@ extern void __nptl_unwind_freeres (void) attribute_hidden;
 #endif
 
 
-/* Called when a thread reacts on a cancellation request.  */
-static inline void
-__attribute ((noreturn, always_inline))
-__do_cancel (void)
-{
-  struct pthread *self = THREAD_SELF;
-
-  /* Make sure we get no more cancellations.  */
-  THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
-
-  __pthread_unwind ((__pthread_unwind_buf_t *)
-		    THREAD_GETMEM (self, cleanup_jmp_buf));
-}
+extern long int __syscall_cancel_arch (volatile int *, __syscall_arg_t nr,
+     __syscall_arg_t arg1, __syscall_arg_t arg2, __syscall_arg_t arg3,
+     __syscall_arg_t arg4, __syscall_arg_t arg5, __syscall_arg_t arg6);
+libc_hidden_proto (__syscall_cancel_arch);
 
+extern _Noreturn void __syscall_do_cancel (void)
+     attribute_hidden;
 
 /* Internal prototypes.  */
 
@@ -461,12 +454,13 @@ extern int __pthread_equal (pthread_t thread1, pthread_t thread2);
 extern int __pthread_detach (pthread_t th);
 extern int __pthread_cancel (pthread_t th);
 extern int __pthread_kill (pthread_t threadid, int signo);
-extern void __pthread_exit (void *value) __attribute__ ((__noreturn__));
+extern int __pthread_kill_internal (pthread_t threadid, int signo)
+  attribute_hidden;
+extern void _Noreturn __pthread_exit (void *value);
 extern int __pthread_join (pthread_t threadid, void **thread_return);
 extern int __pthread_setcanceltype (int type, int *oldtype);
-extern int __pthread_enable_asynccancel (void) attribute_hidden;
-extern void __pthread_disable_asynccancel (int oldtype) attribute_hidden;
 extern void __pthread_testcancel (void);
+extern void __pthread_exit (void *value);
 extern int __pthread_clockjoin_ex (pthread_t, void **, clockid_t,
 				   const struct timespec *, bool)
   attribute_hidden;
@@ -487,10 +481,41 @@ hidden_proto (__pthread_setspecific)
 hidden_proto (__pthread_once)
 hidden_proto (__pthread_setcancelstate)
 hidden_proto (__pthread_testcancel)
+hidden_proto (__pthread_exit)
 hidden_proto (__pthread_mutexattr_init)
 hidden_proto (__pthread_mutexattr_settype)
 #endif
 
+/* Called when a thread reacts on a cancellation request.  */
+_Noreturn static inline void
+__do_cancel (void *value)
+{
+  struct pthread *self = THREAD_SELF;
+
+  /* Make sure we get no more cancellations by clearing the cancel
+     state.  */
+  THREAD_SETMEM (self, cancelstate, PTHREAD_CANCEL_DISABLE);
+  THREAD_SETMEM (self, canceltype, PTHREAD_CANCEL_DEFERRED);
+
+  THREAD_SETMEM (self, result, value);
+
+  THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
+
+  __pthread_unwind ((__pthread_unwind_buf_t *)
+		    THREAD_GETMEM (self, cleanup_jmp_buf));
+}
+
+static inline bool
+__pthread_self_cancelled (void)
+{
+  struct pthread *self = THREAD_SELF;
+  int cancelhandling = THREAD_GETMEM (self, cancelhandling);
+  return self->cancelstate == PTHREAD_CANCEL_ENABLE
+	  && (cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
+			        | TERMINATED_BITMASK))
+	      == CANCELED_BITMASK;
+}
+
 extern int __pthread_cond_broadcast_2_0 (pthread_cond_2_0_t *cond);
 extern int __pthread_cond_destroy_2_0 (pthread_cond_2_0_t *cond);
 extern int __pthread_cond_init_2_0 (pthread_cond_2_0_t *cond,
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index 7518c0b8bf..3873a47af8 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -37,63 +37,29 @@ __pthread_cancel (pthread_t th)
 #ifdef SHARED
   pthread_cancel_init ();
 #endif
-  int result = 0;
-  int oldval;
-  int newval;
-  do
-    {
-    again:
-      oldval = pd->cancelhandling;
-      newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
-
-      /* Avoid doing unnecessary work.  The atomic operation can
-	 potentially be expensive if the bug has to be locked and
-	 remote cache lines have to be invalidated.  */
-      if (oldval == newval)
-	break;
-
-      /* If the cancellation is handled asynchronously just send a
-	 signal.  We avoid this if possible since it's more
-	 expensive.  */
-      if (pd->cancelstate == PTHREAD_CANCEL_ENABLE
-	  && pd->canceltype == PTHREAD_CANCEL_ASYNCHRONOUS
-	  && (newval & (CANCELED_BITMASK | EXITING_BITMASK
-			| TERMINATED_BITMASK))
-	      == CANCELED_BITMASK)
-	{
-	  /* Mark the cancellation as "in progress".  */
-	  if (atomic_compare_and_exchange_bool_acq (&pd->cancelhandling,
-						    oldval | CANCELING_BITMASK,
-						    oldval))
-	    goto again;
 
-	  /* The cancellation handler will take care of marking the
-	     thread as canceled.  */
-	  pid_t pid = __getpid ();
+  THREAD_ATOMIC_BIT_SET (pd, cancelhandling, CANCELED_BIT);
 
-	  int val = INTERNAL_SYSCALL_CALL (tgkill, pid, pd->tid,
-					   SIGCANCEL);
-	  if (INTERNAL_SYSCALL_ERROR_P (val))
-	    result = INTERNAL_SYSCALL_ERRNO (val);
+  /* A single-threaded process should be able to kill itself, since there is
+     nothing in the POSIX specification that says that it cannot.  So we set
+     multiple_threads to true so that cancellation points get executed.  */
+  THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
 
-	  break;
-	}
-
-	/* A single-threaded process should be able to kill itself, since
-	   there is nothing in the POSIX specification that says that it
-	   cannot.  So we set multiple_threads to true so that cancellation
-	   points get executed.  */
-	THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
 #ifndef TLS_MULTIPLE_THREADS_IN_TCB
-	__pthread_multiple_threads = *__libc_multiple_threads_ptr = 1;
+  __pthread_multiple_threads = *__libc_multiple_threads_ptr = 1;
 #endif
+
+  /* Avoid signaling when the thread attempts to cancel itself (pthread_kill
+     is expensive).  */
+  if (pd == THREAD_SELF)
+    {
+      if (pd->cancelstate == PTHREAD_CANCEL_ENABLE
+	  && pd->canceltype == PTHREAD_CANCEL_ASYNCHRONOUS)
+	__pthread_exit (PTHREAD_CANCELED);
+      return 0;
     }
-  /* Mark the thread as canceled.  This has to be done
-     atomically since other bits could be modified as well.  */
-  while (atomic_compare_and_exchange_bool_acq (&pd->cancelhandling, newval,
-					       oldval));
 
-  return result;
+  return __pthread_kill_internal (th, SIGCANCEL);
 }
 weak_alias (__pthread_cancel, pthread_cancel)
 
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 7c752d0f99..bc9dd2d965 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -402,7 +402,7 @@ START_THREAD_DEFN
   /* If the parent was running cancellation handlers while creating
      the thread the new thread inherited the signal mask.  Reset the
      cancellation signal mask.  */
-  if (__glibc_unlikely (pd->parent_cancelhandling & CANCELING_BITMASK))
+  if (__glibc_unlikely (pd->parent_cancelhandling & CANCELED_BITMASK))
     {
       sigset_t mask;
       __sigemptyset (&mask);
@@ -443,7 +443,8 @@ START_THREAD_DEFN
 	 have ownership (see CONCURRENCY NOTES above).  */
       if (__glibc_unlikely (pd->stopped_start))
 	{
-	  int oldtype = CANCEL_ASYNC ();
+	  int ct;
+	  __pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, &ct);
 
 	  /* Get the lock the parent locked to force synchronization.  */
 	  lll_lock (pd->lock, LLL_PRIVATE);
@@ -453,7 +454,7 @@ START_THREAD_DEFN
 	  /* And give it up right away.  */
 	  lll_unlock (pd->lock, LLL_PRIVATE);
 
-	  CANCEL_RESET (oldtype);
+	  __pthread_setcanceltype (ct, NULL);
 	}
 
       LIBC_PROBE (pthread_start, 3, (pthread_t) pd, pd->start_routine, pd->arg);
diff --git a/nptl/pthread_exit.c b/nptl/pthread_exit.c
index e9b5e62c86..059b30623d 100644
--- a/nptl/pthread_exit.c
+++ b/nptl/pthread_exit.c
@@ -16,18 +16,15 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <stdlib.h>
 #include "pthreadP.h"
 
-
-void
+_Noreturn void
 __pthread_exit (void *value)
 {
-  THREAD_SETMEM (THREAD_SELF, result, value);
-
-  __do_cancel ();
+  __do_cancel (value);
 }
 weak_alias (__pthread_exit, pthread_exit)
+hidden_def (__pthread_exit)
 
 /* After a thread terminates, __libc_start_main decrements
    __nptl_nthreads defined in pthread_create.c.  */
diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
index 03e202136f..4d1898c52b 100644
--- a/nptl/pthread_join_common.c
+++ b/nptl/pthread_join_common.c
@@ -70,7 +70,8 @@ clockwait_tid (pid_t *tidp, clockid_t clockid, const struct timespec *abstime)
       /* If *tidp == tid, wait until thread terminates or the wait times out.
          The kernel up to version 3.16.3 does not use the private futex
          operations for futex wake-up when the clone terminates.  */
-      if (lll_futex_timed_wait_cancel (tidp, tid, &rt, LLL_SHARED)
+      if (lll_futex_timed_wait_cancel ((unsigned int *) tidp, tid, &rt,
+				       LLL_SHARED)
 	  == -ETIMEDOUT)
         return ETIMEDOUT;
     }
@@ -103,7 +104,7 @@ __pthread_clockjoin_ex (pthread_t threadid, void **thread_return,
   if ((pd == self
        || (self->joinid == pd
 	   && (pd->cancelhandling
-	       & (CANCELING_BITMASK | CANCELED_BITMASK | EXITING_BITMASK
+	       & (CANCELED_BITMASK | EXITING_BITMASK
 		  | TERMINATED_BITMASK)) == 0))
       && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
 	   && (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
@@ -145,7 +146,7 @@ __pthread_clockjoin_ex (pthread_t threadid, void **thread_return,
 	  /* We need acquire MO here so that we synchronize with the
 	     kernel's store to 0 when the clone terminates. (see above)  */
 	  while ((tid = atomic_load_acquire (&pd->tid)) != 0)
-	    lll_futex_wait_cancel (&pd->tid, tid, LLL_SHARED);
+	    lll_futex_wait_cancel ((unsigned int *) &pd->tid, tid, LLL_SHARED);
 	}
 
       pthread_cleanup_pop (0);
diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c
index 73144a07ec..bf9e9bb81f 100644
--- a/nptl/pthread_kill.c
+++ b/nptl/pthread_kill.c
@@ -31,8 +31,9 @@ __pthread_kill (pthread_t threadid, int signo)
     /* Not a valid thread handle.  */
     return ESRCH;
 
-  return ENOSYS;
+  if (__is_internal_signal (signo))
+    return EINVAL;
+
+  return __pthread_kill_internal (threadid, signo);
 }
 strong_alias (__pthread_kill, pthread_kill)
-
-stub_warning (pthread_kill)
diff --git a/sysdeps/unix/sysv/linux/pthread_kill.c b/nptl/pthread_kill_internal.c
similarity index 75%
rename from sysdeps/unix/sysv/linux/pthread_kill.c
rename to nptl/pthread_kill_internal.c
index 4dfe08ffcd..f428a46f57 100644
--- a/sysdeps/unix/sysv/linux/pthread_kill.c
+++ b/nptl/pthread_kill_internal.c
@@ -16,24 +16,15 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <errno.h>
-#include <signal.h>
-#include <pthreadP.h>
-#include <tls.h>
-#include <sysdep.h>
 #include <unistd.h>
+#include <pthreadP.h>
 
-
+/* Used internally by pthread_cancel, so we can't filter SIGCANCEL.  */
 int
-__pthread_kill (pthread_t threadid, int signo)
+__pthread_kill_internal (pthread_t threadid, int signo)
 {
   struct pthread *pd = (struct pthread *) threadid;
 
-  /* Make sure the descriptor is valid.  */
-  if (DEBUGGING_P && INVALID_TD_P (pd))
-    /* Not a valid thread handle.  */
-    return ESRCH;
-
   /* Force load of pd->tid into local variable or register.  Otherwise
      if a thread exits between ESRCH test and tgkill, we might return
      EINVAL, because pd->tid would be cleared by the kernel.  */
@@ -42,11 +33,6 @@ __pthread_kill (pthread_t threadid, int signo)
     /* Not a valid thread handle.  */
     return ESRCH;
 
-  /* Disallow sending the signal we use for cancellation, timers,
-     for the setxid implementation.  */
-  if (signo == SIGCANCEL || signo == SIGTIMER || signo == SIGSETXID)
-    return EINVAL;
-
   /* We have a special syscall to do the work.  */
   pid_t pid = __getpid ();
 
@@ -54,4 +40,3 @@ __pthread_kill (pthread_t threadid, int signo)
   return (INTERNAL_SYSCALL_ERROR_P (val)
 	  ? INTERNAL_SYSCALL_ERRNO (val) : 0);
 }
-strong_alias (__pthread_kill, pthread_kill)
diff --git a/nptl/pthread_setcanceltype.c b/nptl/pthread_setcanceltype.c
index d8cb54736d..77e4adf537 100644
--- a/nptl/pthread_setcanceltype.c
+++ b/nptl/pthread_setcanceltype.c
@@ -37,4 +37,4 @@ __pthread_setcanceltype (int type, int *oldtype)
 
   return 0;
 }
-strong_alias (__pthread_setcanceltype, pthread_setcanceltype)
+weak_alias (__pthread_setcanceltype, pthread_setcanceltype)
diff --git a/nptl/pthread_testcancel.c b/nptl/pthread_testcancel.c
index 026c20f82e..584ff242be 100644
--- a/nptl/pthread_testcancel.c
+++ b/nptl/pthread_testcancel.c
@@ -23,16 +23,8 @@
 void
 __pthread_testcancel (void)
 {
-  struct pthread *self = THREAD_SELF;
-  int cancelhandling = THREAD_GETMEM (self, cancelhandling);
-  if (self->cancelstate == PTHREAD_CANCEL_ENABLE
-      && (cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
-			    | TERMINATED_BITMASK))
-	  == CANCELED_BITMASK)
-    {
-      THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-      __do_cancel ();
-    }
+  if (__pthread_self_cancelled ())
+    __do_cancel (PTHREAD_CANCELED);
 }
 weak_alias (__pthread_testcancel, pthread_testcancel)
 hidden_def (__pthread_testcancel)
diff --git a/nptl/sem_wait.c b/nptl/sem_wait.c
index 171716fdbc..1ba9926c0e 100644
--- a/nptl/sem_wait.c
+++ b/nptl/sem_wait.c
@@ -58,7 +58,7 @@ __old_sem_wait (sem_t *sem)
 	return 0;
 
       /* Always assume the semaphore is shared.  */
-      err = lll_futex_wait_cancel (futex, 0, LLL_SHARED);
+      err = lll_futex_wait_cancel ((unsigned int *) futex, 0, LLL_SHARED);
     }
   while (err == 0 || err == -EWOULDBLOCK);
 
diff --git a/nptl/tst-cancel29.c b/nptl/tst-cancel29.c
new file mode 100644
index 0000000000..1671cfe04f
--- /dev/null
+++ b/nptl/tst-cancel29.c
@@ -0,0 +1,100 @@
+/* Check that side effects of cancellable syscalls are preserved (BZ #12683).
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/* This testcase checks for resource leakage when the syscall has returned
+   from kernelspace but userspace has not yet saved the return value.  The
+   'leaker' thread should be able to close the file descriptor if the
+   resource is already allocated, meaning that if the cancellation signal
+   arrives *after* the open syscall returns from the kernel, the side effect
+   should be visible to the application.  */
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+
+#include <support/xunistd.h>
+#include <support/xthread.h>
+#include <support/check.h>
+#include <support/temp_file.h>
+#include <support/support.h>
+#include <support/descriptors.h>
+
+static void *
+writeopener (void *arg)
+{
+  int fd;
+  for (;;)
+    {
+      fd = open (arg, O_WRONLY);
+      xclose (fd);
+    }
+  return NULL;
+}
+
+static void *
+leaker (void *arg)
+{
+  int fd = open (arg, O_RDONLY);
+  TEST_VERIFY_EXIT (fd > 0);
+  pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, 0);
+  xclose (fd);
+  return NULL;
+}
+
+static int
+do_test (void)
+{
+  enum {
+    iter_count = 1000
+  };
+
+  char *dir = support_create_temp_directory ("tst-cancel29");
+  char *name = xasprintf ("%s/fifo", dir);
+  TEST_COMPARE (mkfifo (name, 0600), 0);
+  add_temp_file (name);
+
+  struct support_descriptors *descrs = support_descriptors_list ();
+
+  srand (1);
+
+  xpthread_create (NULL, writeopener, name);
+  for (int i = 0; i < iter_count; i++)
+    {
+      pthread_t td = xpthread_create (NULL, leaker, name);
+      struct timespec ts =
+	{ .tv_nsec = rand () % 100000, .tv_sec = 0 };
+      nanosleep (&ts, NULL);
+      /* Ignore the pthread_cancel result because the thread might have
+	 already exited by the time pthread_cancel is called.  */
+      pthread_cancel (td);
+      xpthread_join (td);
+    }
+
+  support_descriptors_check (descrs);
+
+  support_descriptors_free (descrs);
+
+  free (name);
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/rt/Makefile b/rt/Makefile
index dab5d62a57..565b76c9c4 100644
--- a/rt/Makefile
+++ b/rt/Makefile
@@ -57,7 +57,7 @@ include ../Rules
 CFLAGS-aio_suspend.c += -fexceptions
 CFLAGS-mq_timedreceive.c += -fexceptions -fasynchronous-unwind-tables
 CFLAGS-mq_timedsend.c += -fexceptions -fasynchronous-unwind-tables
-CFLAGS-librt-cancellation.c += -fasynchronous-unwind-tables
+CFLAGS-clock_nanosleep.c += -fexceptions -fasynchronous-unwind-tables
 
 LDFLAGS-rt.so = -Wl,--enable-new-dtags,-z,nodelete
 
diff --git a/sysdeps/nptl/librt-cancellation.c b/sysdeps/generic/syscall_types.h
similarity index 70%
rename from sysdeps/nptl/librt-cancellation.c
rename to sysdeps/generic/syscall_types.h
index af1d11b2e6..cf11956fd7 100644
--- a/sysdeps/nptl/librt-cancellation.c
+++ b/sysdeps/generic/syscall_types.h
@@ -1,6 +1,6 @@
-/* Copyright (C) 2002-2020 Free Software Foundation, Inc.
+/* Types and macros used for syscall issuing.
+   Copyright (C) 2020 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -16,9 +16,10 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <nptl/pthreadP.h>
+#ifndef _SYSCALL_TYPES_H
+#define _SYSCALL_TYPES_H
 
+typedef long int __syscall_arg_t;
+#define __SSC(__x) ((__syscall_arg_t) (__x))
 
-#define __pthread_enable_asynccancel __librt_enable_asynccancel
-#define __pthread_disable_asynccancel __librt_disable_asynccancel
-#include <nptl/cancellation.c>
+#endif
diff --git a/sysdeps/generic/sysdep-cancel.h b/sysdeps/generic/sysdep-cancel.h
index d22a786536..5c84b4499a 100644
--- a/sysdeps/generic/sysdep-cancel.h
+++ b/sysdeps/generic/sysdep-cancel.h
@@ -3,5 +3,3 @@
 /* No multi-thread handling enabled.  */
 #define SINGLE_THREAD_P (1)
 #define RTLD_SINGLE_THREAD_P (1)
-#define LIBC_CANCEL_ASYNC()	0 /* Just a dummy value.  */
-#define LIBC_CANCEL_RESET(val)	((void)(val)) /* Nothing, but evaluate it.  */
diff --git a/sysdeps/nptl/Makefile b/sysdeps/nptl/Makefile
index 0631a870c8..30f9c8e91e 100644
--- a/sysdeps/nptl/Makefile
+++ b/sysdeps/nptl/Makefile
@@ -21,8 +21,7 @@ libpthread-sysdep_routines += errno-loc
 endif
 
 ifeq ($(subdir),rt)
-librt-sysdep_routines += timer_routines librt-cancellation
-CFLAGS-librt-cancellation.c += -fexceptions -fasynchronous-unwind-tables
+librt-sysdep_routines += timer_routines
 
 tests += tst-mqueue8x
 CFLAGS-tst-mqueue8x.c += -fexceptions
diff --git a/sysdeps/nptl/cancellation-pc-check.h b/sysdeps/nptl/cancellation-pc-check.h
new file mode 100644
index 0000000000..ae124fcced
--- /dev/null
+++ b/sysdeps/nptl/cancellation-pc-check.h
@@ -0,0 +1,53 @@
+/* Architecture specific code for pthread cancellation handling.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _NPTL_CANCELLATION_PC_CHECK
+#define _NPTL_CANCELLATION_PC_CHECK
+
+#include <sigcontextinfo.h>
+
+/* For syscalls with side-effects, the kernel cannot restart the syscall; when
+   it is interrupted by a signal, the kernel must cause the syscall to return
+   with whatever partial result is obtained (e.g. partial read or write).  In
+   this case, the saved program counter points just after the syscall
+   instruction, so the SIGCANCEL handler should not act on cancellation.
+
+   The __syscall_cancel_arch function, used for all cancellable syscalls,
+   contains two extra markers, __syscall_cancel_arch_start and
+   __syscall_cancel_arch_end.  The former points to just before the initial
+   conditional branch that checks whether the thread has received a
+   cancellation request, while the latter points to the instruction after the
+   one responsible for issuing the syscall.
+
+   The function checks whether the program counter (PC) from the ucontext_t
+   CTX is within the start and end boundaries of the __syscall_cancel_arch
+   bridge.  It returns TRUE if the PC is within the boundaries, meaning the
+   syscall has not produced any side effects, or FALSE otherwise.  */
+static bool
+cancellation_pc_check (void *ctx)
+{
+  /* Both are defined in syscall_cancel.S.  */
+  extern const char __syscall_cancel_arch_start[1];
+  extern const char __syscall_cancel_arch_end[1];
+
+  uintptr_t pc = sigcontext_get_pc (ctx);
+  return pc >= (uintptr_t) __syscall_cancel_arch_start
+	 && pc < (uintptr_t) __syscall_cancel_arch_end;
+}
+
+#endif
diff --git a/sysdeps/nptl/cancellation-sigmask.h b/sysdeps/nptl/cancellation-sigmask.h
new file mode 100644
index 0000000000..80c2a2d2f2
--- /dev/null
+++ b/sysdeps/nptl/cancellation-sigmask.h
@@ -0,0 +1,30 @@
+/* Architecture specific code for pthread cancellation handling.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _NPTL_CANCELLATION_SIGMASK_H
+#define _NPTL_CANCELLATION_SIGMASK_H
+
+/* Add SIGCANCEL to the signal mask of the ucontext_t CTX obtained from
+   the sigaction handler.  */
+static void
+ucontext_block_sigcancel (void *ctx)
+{
+  __sigaddset (&((ucontext_t*) ctx)->uc_sigmask, SIGCANCEL);
+}
+
+#endif
diff --git a/sysdeps/nptl/futex-internal.h b/sysdeps/nptl/futex-internal.h
index d622122ddc..9bce768afc 100644
--- a/sysdeps/nptl/futex-internal.h
+++ b/sysdeps/nptl/futex-internal.h
@@ -178,10 +178,7 @@ static __always_inline int
 futex_wait_cancelable (unsigned int *futex_word, unsigned int expected,
 		       int private)
 {
-  int oldtype;
-  oldtype = __pthread_enable_asynccancel ();
-  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
-  __pthread_disable_asynccancel (oldtype);
+  int err = lll_futex_timed_wait_cancel (futex_word, expected, NULL, private);
   switch (err)
     {
     case 0:
@@ -239,10 +236,8 @@ futex_reltimed_wait_cancelable (unsigned int* futex_word,
 				unsigned int expected,
 			        const struct timespec* reltime, int private)
 {
-  int oldtype;
-  oldtype = LIBC_CANCEL_ASYNC ();
-  int err = lll_futex_timed_wait (futex_word, expected, reltime, private);
-  LIBC_CANCEL_RESET (oldtype);
+  int err = lll_futex_timed_wait_cancel (futex_word, expected, reltime,
+					 private);
   switch (err)
     {
     case 0:
@@ -315,12 +310,8 @@ futex_abstimed_wait_cancelable (unsigned int* futex_word,
      despite them being valid.  */
   if (__glibc_unlikely ((abstime != NULL) && (abstime->tv_sec < 0)))
     return ETIMEDOUT;
-  int oldtype;
-  oldtype = __pthread_enable_asynccancel ();
-  int err = lll_futex_clock_wait_bitset (futex_word, expected,
-					clockid, abstime,
-					private);
-  __pthread_disable_asynccancel (oldtype);
+  int err = lll_futex_clock_wait_bitset_cancel (futex_word, expected, clockid,
+						abstime, private);
   switch (err)
     {
     case 0:
diff --git a/sysdeps/nptl/lowlevellock-futex.h b/sysdeps/nptl/lowlevellock-futex.h
index 2209ca76a1..4b72deda95 100644
--- a/sysdeps/nptl/lowlevellock-futex.h
+++ b/sysdeps/nptl/lowlevellock-futex.h
@@ -174,21 +174,36 @@
 		     nr_wake, nr_move, mutex, val)
 
 /* Like lll_futex_wait, but acting as a cancellable entrypoint.  */
-# define lll_futex_wait_cancel(futexp, val, private) \
-  ({                                                                   \
-    int __oldtype = CANCEL_ASYNC ();				       \
-    long int __err = lll_futex_wait (futexp, val, LLL_SHARED);	       \
-    CANCEL_RESET (__oldtype);					       \
-    __err;							       \
-  })
+# define lll_futex_wait_cancel(futexp, val, private)			\
+  lll_futex_timed_wait_cancel (futexp, val, NULL, private)
 
 /* Like lll_futex_timed_wait, but acting as a cancellable entrypoint.  */
-# define lll_futex_timed_wait_cancel(futexp, val, timeout, private) \
-  ({									   \
-    int __oldtype = CANCEL_ASYNC ();				       	   \
-    long int __err = lll_futex_timed_wait (futexp, val, timeout, private); \
-    CANCEL_RESET (__oldtype);						   \
-    __err;								   \
+# define lll_futex_timed_wait_cancel(futexp, val, timeout, private) 	\
+  ({									\
+     int __op = __lll_private_flag (FUTEX_WAIT, private);		\
+     INTERNAL_SYSCALL_CANCEL (futex, futexp, __op, val, timeout);	\
+  })
+
+/* Like lll_futex_clock_wait_bitset, but acting as a cancellable
+   entrypoint.  */
+# define lll_futex_clock_wait_bitset_cancel(futexp, val, clockid, timeout, \
+					    private)			   \
+  ({									\
+    long int __ret;							\
+    if (lll_futex_supported_clockid (clockid))			  	\
+      {								 	\
+	const unsigned int __clockbit =				 	\
+	  (clockid == CLOCK_REALTIME) ? FUTEX_CLOCK_REALTIME : 0;       \
+	const int __op =						\
+	  __lll_private_flag (FUTEX_WAIT_BITSET | __clockbit, private); \
+									\
+	__ret = INTERNAL_SYSCALL_CANCEL (futex, futexp, __op, val,	\
+					 timeout, NULL,			\
+					 FUTEX_BITSET_MATCH_ANY);	\
+      }								 	\
+    else								\
+      __ret = -EINVAL;							\
+    __ret;								\
   })
 
 #endif  /* !__ASSEMBLER__  */
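
To see where the new macros lead, here is an illustrative (not literal)
expansion of a cancellable futex wait, following the definitions above and the
sysdep.h changes below:

  /* lll_futex_wait_cancel (futexp, val, private)
       => lll_futex_timed_wait_cancel (futexp, val, NULL, private)
       => INTERNAL_SYSCALL_CANCEL (futex, futexp,
                                   __lll_private_flag (FUTEX_WAIT, private),
                                   val, NULL)
       => __syscall_cancel (__NR_futex, __SSC (futexp), __SSC (op),
                            __SSC (val), __SSC (NULL), 0, 0)
     so every cancellable futex wait funnels through the single
     __syscall_cancel bridge instead of toggling asynchronous cancellation
     around the raw futex syscall.  */
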
diff --git a/sysdeps/unix/sysdep.h b/sysdeps/unix/sysdep.h
index 3c687a717a..c19cffabee 100644
--- a/sysdeps/unix/sysdep.h
+++ b/sysdeps/unix/sysdep.h
@@ -15,6 +15,9 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
+#ifndef _SYSDEP_UNIX_H
+#define _SYSDEP_UNIX_H 1
+
 #include <sysdeps/generic/sysdep.h>
 #include <single-thread.h>
 #include <sys/syscall.h>
@@ -24,6 +27,9 @@
 #define	SYSCALL__(name, args)	PSEUDO (__##name, name, args)
 #define	SYSCALL(name, args)	PSEUDO (name, name, args)
 
+#ifndef __ASSEMBLER__
+# include <errno.h>
+
 #define __SYSCALL_CONCAT_X(a,b)     a##b
 #define __SYSCALL_CONCAT(a,b)       __SYSCALL_CONCAT_X (a, b)
 
@@ -57,6 +63,29 @@
 #define INTERNAL_SYSCALL_CALL(...) \
   __INTERNAL_SYSCALL_DISP (__INTERNAL_SYSCALL, __VA_ARGS__)
 
+#define __INTERNAL_SYSCALL_NCS0(name) \
+  INTERNAL_SYSCALL_NCS (name, 0)
+#define __INTERNAL_SYSCALL_NCS1(name, a1) \
+  INTERNAL_SYSCALL_NCS (name, 1, a1)
+#define __INTERNAL_SYSCALL_NCS2(name, a1, a2) \
+  INTERNAL_SYSCALL_NCS (name, 2, a1, a2)
+#define __INTERNAL_SYSCALL_NCS3(name, a1, a2, a3) \
+  INTERNAL_SYSCALL_NCS (name, 3, a1, a2, a3)
+#define __INTERNAL_SYSCALL_NCS4(name, a1, a2, a3, a4) \
+  INTERNAL_SYSCALL_NCS (name, 4, a1, a2, a3, a4)
+#define __INTERNAL_SYSCALL_NCS5(name, a1, a2, a3, a4, a5) \
+  INTERNAL_SYSCALL_NCS (name, 5, a1, a2, a3, a4, a5)
+#define __INTERNAL_SYSCALL_NCS6(name, a1, a2, a3, a4, a5, a6) \
+  INTERNAL_SYSCALL_NCS (name, 6, a1, a2, a3, a4, a5, a6)
+#define __INTERNAL_SYSCALL_NCS7(name, a1, a2, a3, a4, a5, a6, a7) \
+  INTERNAL_SYSCALL_NCS (name, 7, a1, a2, a3, a4, a5, a6, a7)
+
+/* Issue a syscall defined by the syscall number plus any other arguments
+   required.  It is similar to the INTERNAL_SYSCALL_NCS macro, but without
+   the need to pass the expected argument count as a separate parameter.  */
+#define INTERNAL_SYSCALL_NCS_CALL(...) \
+  __INTERNAL_SYSCALL_DISP (__INTERNAL_SYSCALL_NCS, __VA_ARGS__)
+
 #define __INLINE_SYSCALL0(name) \
   INLINE_SYSCALL (name, 0)
 #define __INLINE_SYSCALL1(name, a1) \
@@ -88,35 +117,68 @@
 #define INLINE_SYSCALL_CALL(...) \
   __INLINE_SYSCALL_DISP (__INLINE_SYSCALL, __VA_ARGS__)
 
-#define SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (SINGLE_THREAD_P) 						     \
-      sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
-  })
 
-/* Issue a syscall defined by syscall number plus any other argument
-   required.  Any error will be returned unmodified (including errno).  */
-#define INTERNAL_SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (SINGLE_THREAD_P) 						     \
-      sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
+/* Cancellation macros.  */
+#include <syscall_types.h>
+
+long int __syscall_cancel (__syscall_arg_t nr, __syscall_arg_t arg1,
+			   __syscall_arg_t arg2, __syscall_arg_t arg3,
+			   __syscall_arg_t arg4, __syscall_arg_t arg5,
+			   __syscall_arg_t arg6);
+libc_hidden_proto (__syscall_cancel);
+
+#define __SYSCALL_CANCEL0(name) \
+  __syscall_cancel (__NR_##name, 0, 0, 0, 0, 0, 0)
+#define __SYSCALL_CANCEL1(name, a1) \
+  __syscall_cancel (__NR_##name, __SSC (a1), 0, 0, 0, 0, 0)
+#define __SYSCALL_CANCEL2(name, a1, a2) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), 0, 0, 0, 0)
+#define __SYSCALL_CANCEL3(name, a1, a2, a3) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     0, 0, 0)
+#define __SYSCALL_CANCEL4(name, a1, a2, a3, a4) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     __SSC(a4), 0, 0)
+#define __SYSCALL_CANCEL5(name, a1, a2, a3, a4, a5) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     __SSC(a4), __SSC (a5), 0)
+#define __SYSCALL_CANCEL6(name, a1, a2, a3, a4, a5, a6) \
+  __syscall_cancel (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		     __SSC (a4), __SSC (a5), __SSC (a6))
+
+#define __SYSCALL_CANCEL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
+#define __SYSCALL_CANCEL_NARGS(...) \
+  __SYSCALL_CANCEL_NARGS_X (__VA_ARGS__,7,6,5,4,3,2,1,0,)
+#define __SYSCALL_CANCEL_CONCAT_X(a,b)     a##b
+#define __SYSCALL_CANCEL_CONCAT(a,b)       __SYSCALL_CANCEL_CONCAT_X (a, b)
+#define __SYSCALL_CANCEL_DISP(b,...) \
+  __SYSCALL_CANCEL_CONCAT (b,__SYSCALL_CANCEL_NARGS(__VA_ARGS__))(__VA_ARGS__)
+
+#define __SYSCALL_CANCEL_CALL(...) \
+  __SYSCALL_CANCEL_DISP (__SYSCALL_CANCEL, __VA_ARGS__)
+
+/* Issue a cancellable syscall defined by the syscall number NAME plus any
+   other arguments required.  If an error occurs, its value is returned as a
+   negative number unmodified and errno is not set.  */
+#define INTERNAL_SYSCALL_CANCEL(name, args...) \
+  __SYSCALL_CANCEL_CALL (name, args)
+
+/* Issue a cancellable syscall defined by the first argument plus any other
+   arguments required.  If an error occurs, the macro returns -1 and sets
+   errno accordingly.  */
+#if IS_IN (rtld)
+/* The loader does not need to handle thread cancellation, use direct
+   syscall instead.  */
+# define SYSCALL_CANCEL(...) INLINE_SYSCALL_CALL (__VA_ARGS__)
+#else
+# define SYSCALL_CANCEL(...) \
+  ({									\
+    long int sc_ret = __SYSCALL_CANCEL_CALL (__VA_ARGS__);		\
+    SYSCALL_CANCEL_RET ((sc_ret));					\
   })
+#endif
+
+#endif /* __ASSEMBLER__  */
 
 /* Machine-dependent sysdep.h files are expected to define the macro
    PSEUDO (function_name, syscall_name) to emit assembly code to define the
@@ -146,3 +208,5 @@
 #ifndef INLINE_SYSCALL
 #define INLINE_SYSCALL(name, nr, args...) __syscall_##name (args)
 #endif
+
+#endif /* _SYSDEP_UNIX_H  */
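
For context, the individual cancellable wrappers keep their current shape; only
the macro machinery behind them changes.  A minimal sketch modelled on
sysdeps/unix/sysv/linux/read.c (not part of this patch):

  #include <unistd.h>
  #include <sysdep-cancel.h>

  ssize_t
  __libc_read (int fd, void *buf, size_t nbytes)
  {
    return SYSCALL_CANCEL (read, fd, buf, nbytes);
  }
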
diff --git a/sysdeps/unix/sysv/linux/socketcall.h b/sysdeps/unix/sysv/linux/socketcall.h
index 75c2a6404d..64af566e18 100644
--- a/sysdeps/unix/sysv/linux/socketcall.h
+++ b/sysdeps/unix/sysv/linux/socketcall.h
@@ -87,18 +87,32 @@
   })
 
 
-#if IS_IN (libc)
-# define __pthread_enable_asynccancel  __libc_enable_asynccancel
-# define __pthread_disable_asynccancel __libc_disable_asynccancel
-#endif
-
-#define SOCKETCALL_CANCEL(name, args...)				\
-  ({									\
-    int oldtype = LIBC_CANCEL_ASYNC ();					\
-    long int sc_ret = __SOCKETCALL (SOCKOP_##name, args);		\
-    LIBC_CANCEL_RESET (oldtype);					\
-    sc_ret;								\
-  })
-
+#define __SOCKETCALL_CANCEL1(__name, __a1) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [1]) { (long int) __a1 }))
+#define __SOCKETCALL_CANCEL2(__name, __a1, __a2) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [2]) { (long int) __a1, (long int) __a2 }))
+#define __SOCKETCALL_CANCEL3(__name, __a1, __a2, __a3) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [3]) { (long int) __a1, (long int) __a2, (long int) __a3 }))
+#define __SOCKETCALL_CANCEL4(__name, __a1, __a2, __a3, __a4) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [4]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4 }))
+#define __SOCKETCALL_CANCEL5(__name, __a1, __a2, __a3, __a4, __a5) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [5]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5 }))
+#define __SOCKETCALL_CANCEL6(__name, __a1, __a2, __a3, __a4, __a5, __a6) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [6]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5, (long int) __a6 }))
+
+#define __SOCKETCALL_CANCEL(...) __SOCKETCALL_DISP (__SOCKETCALL_CANCEL,\
+						    __VA_ARGS__)
+
+#define SOCKETCALL_CANCEL(name, args...) \
+   __SOCKETCALL_CANCEL (SOCKOP_##name, args)
 
 #endif /* sys/socketcall.h */
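
As an illustration for socketcall architectures (e.g. i386), following the
macros above (not literal preprocessor output):

  /* SOCKETCALL_CANCEL (recv, fd, buf, len, flags)
       => SYSCALL_CANCEL (socketcall, SOCKOP_recv,
                          ((long int [4]) { (long int) fd, (long int) buf,
                                            (long int) len,
                                            (long int) flags }))
     which then reaches __syscall_cancel just like a direct syscall.  */
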
diff --git a/sysdeps/unix/sysv/linux/syscall_cancel.c b/sysdeps/unix/sysv/linux/syscall_cancel.c
new file mode 100644
index 0000000000..003e485b5c
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/syscall_cancel.c
@@ -0,0 +1,62 @@
+/* Default cancellation syscall bridge.
+   Copyright (C) 2020 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <pthreadP.h>
+
+#warning "This implementation should be used just as a reference or for bootstrapping"
+
+/* This is the generic version of the cancellable syscall code which
+   adds the label guards (__syscall_cancel_arch_{start,end}) used by the
+   SIGCANCEL handler (sigcancel_handler in nptl-init.c) to check whether the
+   cancelled syscall has side effects that need to be signaled to the program.
+
+   This implementation should be used as a reference to document the
+   implementation constraints: __syscall_cancel_arch_end should point to the
+   instruction immediately after the syscall one.  This is because the kernel
+   signals an interrupted syscall with side effects by setting the signal
+   frame program counter (in the ucontext_t third argument of the SA_SIGINFO
+   signal handler) right after the syscall instruction.
+
+   If the INTERNAL_SYSCALL_NCS macro uses more instructions to get the
+   error condition from the kernel (as for powerpc and sparc), uses an
+   out-of-line helper (as for ARM thumb), or uses a kernel helper gate
+   (as for i686 or ia64), the architecture should adjust the macro or
+   provide a custom __syscall_cancel_arch implementation.  */
+long int
+__syscall_cancel_arch (volatile int *ch, __syscall_arg_t nr,
+		       __syscall_arg_t a1, __syscall_arg_t a2,
+		       __syscall_arg_t a3, __syscall_arg_t a4,
+		       __syscall_arg_t a5, __syscall_arg_t a6)
+{
+#define ADD_LABEL(__label)		\
+  asm volatile (			\
+    ".global " __label "\t\n"		\
+    __label ":\n");
+
+  ADD_LABEL ("__syscall_cancel_arch_start");
+  if (__glibc_unlikely (*ch & CANCELED_BITMASK))
+    __syscall_do_cancel();
+
+  long int result = INTERNAL_SYSCALL_NCS_CALL (nr, a1, a2, a3, a4, a5, a6);
+  ADD_LABEL ("__syscall_cancel_arch_end");
+  if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (result)))
+    return -INTERNAL_SYSCALL_ERRNO (result);
+  return result;
+}
+libc_hidden_def (__syscall_cancel_arch)
diff --git a/sysdeps/unix/sysv/linux/sysdep-cancel.h b/sysdeps/unix/sysv/linux/sysdep-cancel.h
index 61d3348768..20824bc096 100644
--- a/sysdeps/unix/sysv/linux/sysdep-cancel.h
+++ b/sysdeps/unix/sysv/linux/sysdep-cancel.h
@@ -21,47 +21,5 @@
 #define _SYSDEP_CANCEL_H
 
 #include <sysdep.h>
-#include <tls.h>
-#include <errno.h>
-
-/* The two functions are in libc.so and not exported.  */
-extern int __libc_enable_asynccancel (void) attribute_hidden;
-extern void __libc_disable_asynccancel (int oldtype) attribute_hidden;
-
-/* The two functions are in librt.so and not exported.  */
-extern int __librt_enable_asynccancel (void) attribute_hidden;
-extern void __librt_disable_asynccancel (int oldtype) attribute_hidden;
-
-/* The two functions are in libpthread.so and not exported.  */
-extern int __pthread_enable_asynccancel (void) attribute_hidden;
-extern void __pthread_disable_asynccancel (int oldtype) attribute_hidden;
-
-/* Set cancellation mode to asynchronous.  */
-#define CANCEL_ASYNC() \
-  __pthread_enable_asynccancel ()
-/* Reset to previous cancellation mode.  */
-#define CANCEL_RESET(oldtype) \
-  __pthread_disable_asynccancel (oldtype)
-
-#if IS_IN (libc)
-/* Same as CANCEL_ASYNC, but for use in libc.so.  */
-# define LIBC_CANCEL_ASYNC() \
-  __libc_enable_asynccancel ()
-/* Same as CANCEL_RESET, but for use in libc.so.  */
-# define LIBC_CANCEL_RESET(oldtype) \
-  __libc_disable_asynccancel (oldtype)
-#elif IS_IN (libpthread)
-# define LIBC_CANCEL_ASYNC() CANCEL_ASYNC ()
-# define LIBC_CANCEL_RESET(val) CANCEL_RESET (val)
-#elif IS_IN (librt)
-# define LIBC_CANCEL_ASYNC() \
-  __librt_enable_asynccancel ()
-# define LIBC_CANCEL_RESET(val) \
-  __librt_disable_asynccancel (val)
-#else
-# define LIBC_CANCEL_ASYNC()	0 /* Just a dummy value.  */
-# define LIBC_CANCEL_RESET(val)	((void)(val)) /* Nothing, but evaluate it.  */
-#endif
-
 
 #endif
diff --git a/sysdeps/unix/sysv/linux/sysdep.h b/sysdeps/unix/sysv/linux/sysdep.h
index 5e7b6c5765..8442de9b13 100644
--- a/sysdeps/unix/sysv/linux/sysdep.h
+++ b/sysdeps/unix/sysv/linux/sysdep.h
@@ -58,6 +58,15 @@
     -1l;					\
   })
 
+/* The error return from a cancellable syscall has the same semantics as
+   from non-cancellable ones.  */
+#define SYSCALL_CANCEL_RET(__ret)				\
+  ({								\
+    __glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (__ret))		\
+    ? SYSCALL_ERROR_LABEL (INTERNAL_SYSCALL_ERRNO (__ret))	\
+    : __ret;							\
+   })
+
 /* Provide a dummy argument that can be used to force register
    alignment for register pairs if required by the syscall ABI.  */
 #ifdef __ASSUME_ALIGNED_REGISTER_PAIRS


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [glibc/azanella/bz12683] nptl: Fix Race conditions in pthread cancellation (BZ#12683)
@ 2019-10-17 13:56 Adhemerval Zanella
  0 siblings, 0 replies; 6+ messages in thread
From: Adhemerval Zanella @ 2019-10-17 13:56 UTC (permalink / raw)
  To: glibc-cvs

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=e5263a744f7554c07444f49f06ff346b869b052c

commit e5263a744f7554c07444f49f06ff346b869b052c
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Fri Sep 18 18:26:35 2015 -0300

    nptl: Fix Race conditions in pthread cancellation (BZ#12683)
    
    This patch is the initial fix for race conditions in NPTL cancellation
    code by redefining how cancellable syscalls are defined and handled.
    The current buggy approach is to enable asynchronous cancellation
    before making the syscall and restore the previous cancellation
    type once the syscall returns.
    
    As described in BZ#12683, this approach shows 2 important problems:
    
      1. Cancellation can act after the syscall has returned from the
         kernel, but before userspace saves the return value.  It might
         result in a resource leak if the syscall allocated a resource or a
         side effect (partial read/write), and there is no way to program
         handle it with cancellation handlers.
    
      2. If a signal is handled while the thread is blocked at a cancellable
         syscall, the entire signal handler runs with asynchronous
         cancellation enabled.  This can lead to issues if the signal
         handler call functions which are async-signal-safe but not
         async-cancel-safe.
    
    For the cancellation to work correctly, there are 5 points at which the
    cancellation signal could arrive:
    
      1. Before the final "testcancel" and before the syscall is made.
      2. Between the "testcancel" and the syscall.
      3. While the syscall is blocked and no side effects have yet taken
         place.
      4. While the syscall is blocked but with some side effects already
         having taken place (e.g. a partial read or write).
      5. After the syscall has returned.
    
    And GLIBC wants to act on cancellation in cases 1, 2, and 3 but not
    in cases 4 or 5.  For the 4 and 5 cases, the cancellation will eventually
    happen in the next cancellable entrypoint without any further external
    event.
    
    The proposed solution follows for each case:
    
      1. Do a conditional branch based on whether the thread has received
         a cancellation request;
    
      2. It can be caught by the signal handler determining that the saved
         program counter (from the ucontext_t) is in some address range
         beginning just before the "testcancel" and ending with the
         syscall instruction.
    
      3. In this case, except for certain syscalls that ALWAYS fail with
         EINTR even for non-interrupting signals, the kernel will reset
         the program counter to point at the syscall instruction during
         signal handling, so that the syscall is restarted when the signal
         handler returns.  So, from the signal handler's standpoint, this
         looks the same as case 2, and thus it's taken care of.
    
      4. For syscalls with side-effects, the kernel cannot restart the
         syscall; when it's interrupted by a signal, the kernel must cause
         the syscall to return with whatever partial result is obtained
         (e.g. partial read or write).
    
      5. In this case, the saved program counter points just after the
         syscall instruction, so the signal handler won't act on
         cancellation.  This is similar to 4. since the program counter
         is past the syscall instruction.
    
    Another case that needs handling is syscalls that fail with EINTR even
    when the signal handler is non-interrupting. In this case, the syscall
    wrapper code can just check the cancellation flag when the errno result
    is EINTR, and act on cancellation if it's set.
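
    A condensed sketch of that check, as implemented by __syscall_cancel in
    nptl/libc-cancellation.c further down in this patch:

        result = __syscall_cancel_arch (&pd->cancelhandling, nr,
                                        a1, a2, a3, a4, a5, a6);
        if (result == -EINTR
            && (pd->cancelhandling & CANCELED_BITMASK)
            && !(pd->cancelhandling & CANCELSTATE_BITMASK))
          __do_cancel ();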
    
    The proposed GLIBC adjustments are:
    
      1. Remove the enable_asynccancel/disable_asynccancel function usage in
         syscall definitions and instead make them call a common symbol that
         will check if cancellation is enabled (__syscall_cancel at
         nptl/libc-cancellation.c), call the arch-specific cancellable
         entry-point (__syscall_cancel_arch) and cancel the thread when
         required.
    
      2. Provide an arch-specific generic system call wrapper function
         that contains global markers.  These markers will be used by the
         SIGCANCEL handler to check whether the interruption happened inside
         a valid syscall and whether the syscall has completed or not.

         A reference implementation, sysdeps/unix/sysv/linux/syscall_cancel.c,
         is provided.  However, the markers may not end up at the expected
         places depending on how INTERNAL_SYSCALL_NCS is implemented by the
         architecture, and it uses a compiler-specific construct (asm
         volatile) to place the required markers.
         It is expected that all architectures add an arch-specific
         implementation.
    
      3. Rewrite the SIGCANCEL asynchronous handler to check both the
         cancellation type and whether the current IP from the signal handler
         falls between the global markers, and act accordingly
         (sigcancel_handler at nptl/nptl-init.c).
    
      4. Adjust nptl/pthread_cancel.c to send a signal instead of acting
         directly. This avoids synchronization issues when updating the
         cancellation status and also focuses the logic on the signal
         handler and cancellation syscall code.
    
      5. Adjust pthread code to replace CANCEL_ASYNC/CANCEL_RESET calls with
         the appropriate cancellable futex syscalls.
    
      6. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET with
         the appropriate cancellable syscalls (see the sketch after this
         list).
    
      7. Adjust 'lowlevellock-futex.h' arch-specific implementations to
         provide cancelable futex calls (used in libpthread code).
    
    This patch adds the proposed changes to NPTL common code and the
    following patches add the required arch-specific bits.  The builds for
    ia64-linux-gnu, mips-*, and x86_64-* are broken without the
    arch-specific patches.
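
    As a rough before/after sketch of what adjustment 6 means for the common
    SYSCALL_CANCEL macro (condensed from the sysdeps/unix/sysdep.h changes in
    this series; the single-thread fast path and error handling are elided):

        /* Before: toggle asynchronous cancellation around the syscall.  */
        #define SYSCALL_CANCEL(...)                                    \
          ({                                                           \
            int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();              \
            long int sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__);       \
            LIBC_CANCEL_RESET (sc_cancel_oldtype);                     \
            sc_ret;                                                    \
          })

        /* After: funnel the call through the single cancellation bridge.  */
        #define SYSCALL_CANCEL(...)                                    \
          ({                                                           \
            long int sc_ret = __SYSCALL_CANCEL_CALL (__VA_ARGS__);     \
            SYSCALL_CANCEL_RET (sc_ret);                               \
          })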

Diff:
---
 manual/llio.texi                                   |   4 +-
 nptl/Makefile                                      |  12 +--
 nptl/Versions                                      |   3 +
 nptl/cancellation.c                                | 101 ------------------
 nptl/descr.h                                       |  15 ++-
 nptl/libc-cancellation.c                           |  45 +++++++-
 nptl/nptl-init.c                                   |  88 ++++++++--------
 nptl/pthreadP.h                                    |  38 +++++--
 nptl/pthread_cancel.c                              |  68 +++---------
 nptl/pthread_create.c                              |   7 +-
 nptl/pthread_exit.c                                |   5 +-
 nptl/pthread_join_common.c                         |   7 +-
 nptl/pthread_kill.c                                |   7 +-
 .../pthread_kill.c => nptl/pthread_kill_internal.c |  21 +---
 nptl/pthread_setcanceltype.c                       |   2 +-
 nptl/sem_wait.c                                    |   2 +-
 nptl/thrd_sleep.c                                  |   7 +-
 nptl/tst-cancel28.c                                |  99 ++++++++++++++++++
 rt/Makefile                                        |   2 +-
 sysdeps/generic/sysdep-cancel.h                    |   2 -
 sysdeps/nptl/Makefile                              |   3 +-
 sysdeps/nptl/cancellation-pc-check.h               |  53 ++++++++++
 ...librt-cancellation.c => cancellation-sigmask.h} |  20 ++--
 sysdeps/unix/sysdep.h                              | 116 ++++++++++++++++-----
 sysdeps/unix/sysv/linux/clock_nanosleep.c          |   8 +-
 sysdeps/unix/sysv/linux/futex-internal.h           |  18 +---
 sysdeps/unix/sysv/linux/lowlevellock-futex.h       |  51 ++++++---
 sysdeps/unix/sysv/linux/socketcall.h               |  40 ++++---
 sysdeps/unix/sysv/linux/syscall_cancel.c           |  63 +++++++++++
 sysdeps/unix/sysv/linux/sysdep.h                   |  23 ++++
 30 files changed, 581 insertions(+), 349 deletions(-)

diff --git a/manual/llio.texi b/manual/llio.texi
index 447126b..ecf1753 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -2534,13 +2534,13 @@ aiocb64}, since the LFS transparently replaces the old interface.
 @c     sigemptyset ok
 @c     sigaddset ok
 @c     setjmp ok
-@c     CANCEL_ASYNC -> pthread_enable_asynccancel ok
+@c     __pthread_setcanceltype ok
 @c      do_cancel ok
 @c       pthread_unwind ok
 @c        Unwind_ForcedUnwind or longjmp ok [@ascuheap @acsmem?]
 @c     lll_lock @asulock @aculock
 @c     lll_unlock @asulock @aculock
-@c     CANCEL_RESET -> pthread_disable_asynccancel ok
+@c     __pthread_setcanceltype ok
 @c      lll_futex_wait ok
 @c     ->start_routine ok -----
 @c     call_tls_dtors @asulock @ascuheap @aculock @acsmem
diff --git a/nptl/Makefile b/nptl/Makefile
index 1129fd4..02dd05f 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -34,7 +34,8 @@ routines = alloca_cutoff forward libc-lowlevellock libc-cancellation \
 	   pthread_attr_destroy pthread_attr_init pthread_attr_getdetachstate \
 	   pthread_attr_setdetachstate pthread_attr_getinheritsched \
 	   pthread_attr_setinheritsched pthread_attr_getschedparam \
-	   pthread_attr_setschedparam
+	   pthread_attr_setschedparam \
+	   syscall_cancel
 shared-only-routines = forward
 static-only-routines = pthread_atfork
 
@@ -103,7 +104,8 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
 		      pthread_barrierattr_setpshared \
 		      pthread_key_create pthread_key_delete \
 		      pthread_getspecific pthread_setspecific \
-		      pthread_sigmask pthread_kill pthread_sigqueue \
+		      pthread_sigmask pthread_kill pthread_kill_internal \
+		      pthread_sigqueue \
 		      pthread_cancel pthread_testcancel \
 		      pthread_setcancelstate pthread_setcanceltype \
 		      pthread_once \
@@ -117,7 +119,6 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
 		      cleanup cleanup_defer cleanup_compat \
 		      cleanup_defer_compat unwind \
 		      pt-longjmp pt-cleanup\
-		      cancellation \
 		      lowlevellock \
 		      lll_timedlock_wait \
 		      pt-fork pt-fcntl \
@@ -171,8 +172,7 @@ CFLAGS-pthread_setcanceltype.c += -fexceptions -fasynchronous-unwind-tables
 
 # These are internal functions which similar functionality as setcancelstate
 # and setcanceltype.
-CFLAGS-cancellation.c += -fasynchronous-unwind-tables
-CFLAGS-libc-cancellation.c += -fasynchronous-unwind-tables
+CFLAGS-libc-cancellation.c += -fexceptions -fasynchronous-unwind-tables
 
 # Calling pthread_exit() must cause the registered cancel handlers to
 # be executed.  Therefore exceptions have to be thrown through this
@@ -286,7 +286,7 @@ tests = tst-attr1 tst-attr2 tst-attr3 tst-default-attr \
 	tst-cancel11 tst-cancel12 tst-cancel13 tst-cancel14 tst-cancel15 \
 	tst-cancel16 tst-cancel17 tst-cancel18 tst-cancel19 tst-cancel20 \
 	tst-cancel21 tst-cancel22 tst-cancel23 tst-cancel24 tst-cancel25 \
-	tst-cancel26 tst-cancel27 \
+	tst-cancel26 tst-cancel27 tst-cancel28 \
 	tst-cancel-self tst-cancel-self-cancelstate \
 	tst-cancel-self-canceltype tst-cancel-self-testcancel \
 	tst-cleanup0 tst-cleanup1 tst-cleanup2 tst-cleanup3 tst-cleanup4 \
diff --git a/nptl/Versions b/nptl/Versions
index be7e810..afdb448 100644
--- a/nptl/Versions
+++ b/nptl/Versions
@@ -39,6 +39,9 @@ libc {
     __libc_pthread_init;
     __libc_current_sigrtmin_private; __libc_current_sigrtmax_private;
     __libc_allocate_rtsig_private;
+    __syscall_cancel;
+    __syscall_cancel_arch_start;
+    __syscall_cancel_arch_end;
   }
 }
 
diff --git a/nptl/cancellation.c b/nptl/cancellation.c
deleted file mode 100644
index 7712845..0000000
--- a/nptl/cancellation.c
+++ /dev/null
@@ -1,101 +0,0 @@
-/* Copyright (C) 2002-2019 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
-
-#include <setjmp.h>
-#include <stdlib.h>
-#include "pthreadP.h"
-#include <futex-internal.h>
-
-
-/* The next two functions are similar to pthread_setcanceltype() but
-   more specialized for the use in the cancelable functions like write().
-   They do not need to check parameters etc.  */
-int
-attribute_hidden
-__pthread_enable_asynccancel (void)
-{
-  struct pthread *self = THREAD_SELF;
-  int oldval = THREAD_GETMEM (self, cancelhandling);
-
-  while (1)
-    {
-      int newval = oldval | CANCELTYPE_BITMASK;
-
-      if (newval == oldval)
-	break;
-
-      int curval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling, newval,
-					      oldval);
-      if (__glibc_likely (curval == oldval))
-	{
-	  if (CANCEL_ENABLED_AND_CANCELED_AND_ASYNCHRONOUS (newval))
-	    {
-	      THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-	      __do_cancel ();
-	    }
-
-	  break;
-	}
-
-      /* Prepare the next round.  */
-      oldval = curval;
-    }
-
-  return oldval;
-}
-
-
-void
-attribute_hidden
-__pthread_disable_asynccancel (int oldtype)
-{
-  /* If asynchronous cancellation was enabled before we do not have
-     anything to do.  */
-  if (oldtype & CANCELTYPE_BITMASK)
-    return;
-
-  struct pthread *self = THREAD_SELF;
-  int newval;
-
-  int oldval = THREAD_GETMEM (self, cancelhandling);
-
-  while (1)
-    {
-      newval = oldval & ~CANCELTYPE_BITMASK;
-
-      int curval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling, newval,
-					      oldval);
-      if (__glibc_likely (curval == oldval))
-	break;
-
-      /* Prepare the next round.  */
-      oldval = curval;
-    }
-
-  /* We cannot return when we are being canceled.  Upon return the
-     thread might be things which would have to be undone.  The
-     following loop should loop until the cancellation signal is
-     delivered.  */
-  while (__builtin_expect ((newval & (CANCELING_BITMASK | CANCELED_BITMASK))
-			   == CANCELING_BITMASK, 0))
-    {
-      futex_wait_simple ((unsigned int *) &self->cancelhandling, newval,
-			 FUTEX_PRIVATE);
-      newval = THREAD_GETMEM (self, cancelhandling);
-    }
-}
diff --git a/nptl/descr.h b/nptl/descr.h
index d3f863a..a53f332 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -275,23 +275,20 @@ struct pthread
   /* Bit set if asynchronous cancellation mode is selected.  */
 #define CANCELTYPE_BIT		1
 #define CANCELTYPE_BITMASK	(0x01 << CANCELTYPE_BIT)
-  /* Bit set if canceling has been initiated.  */
-#define CANCELING_BIT		2
-#define CANCELING_BITMASK	(0x01 << CANCELING_BIT)
-  /* Bit set if canceled.  */
-#define CANCELED_BIT		3
+  /* Bit set if thread is canceled.  */
+#define CANCELED_BIT		2
 #define CANCELED_BITMASK	(0x01 << CANCELED_BIT)
   /* Bit set if thread is exiting.  */
-#define EXITING_BIT		4
+#define EXITING_BIT		3
 #define EXITING_BITMASK		(0x01 << EXITING_BIT)
   /* Bit set if thread terminated and TCB is freed.  */
-#define TERMINATED_BIT		5
+#define TERMINATED_BIT		4
 #define TERMINATED_BITMASK	(0x01 << TERMINATED_BIT)
   /* Bit set if thread is supposed to change XID.  */
-#define SETXID_BIT		6
+#define SETXID_BIT		5
 #define SETXID_BITMASK		(0x01 << SETXID_BIT)
   /* Mask for the rest.  Helps the compiler to optimize.  */
-#define CANCEL_RESTMASK		0xffffff80
+#define CANCEL_RESTMASK		0xffffffc0
 
 #define CANCEL_ENABLED_AND_CANCELED(value) \
   (((value) & (CANCELSTATE_BITMASK | CANCELED_BITMASK | EXITING_BITMASK	      \
diff --git a/nptl/libc-cancellation.c b/nptl/libc-cancellation.c
index 37654cf..430e0b9 100644
--- a/nptl/libc-cancellation.c
+++ b/nptl/libc-cancellation.c
@@ -18,7 +18,46 @@
 
 #include "pthreadP.h"
 
+/* Cancellation function called by all cancellable syscalls.  */
+long int
+__syscall_cancel (__syscall_arg_t nr, __syscall_arg_t a1,
+		  __syscall_arg_t a2, __syscall_arg_t a3,
+		  __syscall_arg_t a4, __syscall_arg_t a5,
+		  __syscall_arg_t a6)
+{
+  pthread_t self = (pthread_t) THREAD_SELF;
+  struct pthread *pd = (struct pthread *) self;
+  long int result;
 
-#define __pthread_enable_asynccancel __libc_enable_asynccancel
-#define __pthread_disable_asynccancel __libc_disable_asynccancel
-#include <nptl/cancellation.c>
+  /* If cancellation is not enabled, call the syscall directly.  */
+  if (pd->cancelhandling & CANCELSTATE_BITMASK)
+    {
+      INTERNAL_SYSCALL_DECL (err);
+      result = INTERNAL_SYSCALL_NCS_CALL (nr, err, a1, a2, a3, a4, a5, a6);
+      if (INTERNAL_SYSCALL_ERROR_P (result, err))
+	return -INTERNAL_SYSCALL_ERRNO (result, err);
+      return result;
+    }
+
+  /* Call the arch-specific entry point that contains the global markers
+     to be checked by the SIGCANCEL handler.  */
+  result = __syscall_cancel_arch (&pd->cancelhandling, nr, a1, a2, a3, a4, a5,
+			          a6);
+
+  if ((result == -EINTR)
+      && (pd->cancelhandling & CANCELED_BITMASK)
+      && !(pd->cancelhandling & CANCELSTATE_BITMASK))
+    __do_cancel ();
+
+  return result;
+}
+libc_hidden_def (__syscall_cancel)
+
+/* Since __do_cancel is an always-inline function, this creates a symbol
+   that the arch-specific code can call to cancel the thread.  */
+_Noreturn void
+attribute_hidden
+__syscall_do_cancel (void)
+{
+  __do_cancel ();
+}
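
For context, a cancellable wrapper in libc is expected to reach this bridge through the SYSCALL_CANCEL macro added later in sysdeps/unix/sysdep.h.  A minimal sketch of such a wrapper (illustrative only, not part of this patch; the wrapper shape is an assumption about how callers use the macro):

  /* Illustrative wrapper: SYSCALL_CANCEL expands to a call to
     __syscall_cancel (__NR_read, fd, buf, nbytes, 0, 0, 0) followed by
     the errno handling from SYSCALL_CANCEL_RET.  */
  ssize_t
  __libc_read (int fd, void *buf, size_t nbytes)
  {
    return SYSCALL_CANCEL (read, fd, buf, nbytes);
  }
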
diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
index ea91b9e..b224925 100644
--- a/nptl/nptl-init.c
+++ b/nptl/nptl-init.c
@@ -39,6 +39,9 @@
 #include <libc-pointer-arith.h>
 #include <pthread-pids.h>
 #include <pthread_mutex_conf.h>
+#include <sigcontextinfo.h>
+#include <cancellation-sigmask.h>
+#include <cancellation-pc-check.h>
 
 #ifndef TLS_MULTIPLE_THREADS_IN_TCB
 /* Pointer to the corresponding variable in libc.  */
@@ -155,35 +158,23 @@ sigcancel_handler (int sig, siginfo_t *si, void *ctx)
 
   struct pthread *self = THREAD_SELF;
 
-  int oldval = THREAD_GETMEM (self, cancelhandling);
-  while (1)
-    {
-      /* We are canceled now.  When canceled by another thread this flag
-	 is already set but if the signal is directly send (internally or
-	 from another process) is has to be done here.  */
-      int newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
-
-      if (oldval == newval || (oldval & EXITING_BITMASK) != 0)
-	/* Already canceled or exiting.  */
-	break;
-
-      int curval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling, newval,
-					      oldval);
-      if (curval == oldval)
-	{
-	  /* Set the return value.  */
-	  THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-
-	  /* Make sure asynchronous cancellation is still enabled.  */
-	  if ((newval & CANCELTYPE_BITMASK) != 0)
-	    /* Run the registered destructors and terminate the thread.  */
-	    __do_cancel ();
-
-	  break;
-	}
-
-      oldval = curval;
-    }
+  if (((self->cancelhandling & (CANCELSTATE_BITMASK)) != 0)
+      || ((self->cancelhandling & CANCELED_BITMASK) == 0))
+    return;
+
+  /* Add SIGCANCEL to the ucontext_t blocked signal mask so the handler is
+     not called again.  */
+  ucontext_block_sigcancel (ctx);
+
+  /* Check if asynchronous cancellation mode is set or if the interrupted
+     instruction pointer falls within the cancellable syscall bridge.  For
+     interruptible syscalls that might generate external side effects (partial
+     reads or writes, for instance), the kernel will set the IP to after
+     '__syscall_cancel_arch_end', thus disabling the cancellation and allowing
+     the process to handle such conditions.  */
+  if (self->cancelhandling & CANCELTYPE_BITMASK
+      || cancellation_pc_check (ctx))
+    __do_cancel ();
 }
 #endif
 
@@ -286,38 +277,49 @@ __pthread_initialize_minimal_internal (void)
   THREAD_SETMEM (pd, report_events, __nptl_initial_report_events);
 
 #if defined SIGCANCEL || defined SIGSETXID
-  struct sigaction sa;
-  __sigemptyset (&sa.sa_mask);
 
 # ifdef SIGCANCEL
   /* Install the cancellation signal handler.  If for some reason we
      cannot install the handler we do not abort.  Maybe we should, but
      it is only asynchronous cancellation which is affected.  */
-  sa.sa_sigaction = sigcancel_handler;
-  sa.sa_flags = SA_SIGINFO;
-  (void) __libc_sigaction (SIGCANCEL, &sa, NULL);
+  {
+    struct sigaction sa;
+    sa.sa_sigaction = sigcancel_handler;
+    /* The signal handler should be non-interruptible to avoid the risk of a
+       spurious EINTR caused by SIGCANCEL being sent to the process or by
+       pthread_cancel being called while cancellation is disabled in the
+       target thread.  */
+    sa.sa_flags = SA_SIGINFO | SA_RESTART;
+    sa.sa_mask = SIGALL_SET;
+    __libc_sigaction (SIGCANCEL, &sa, NULL);
+  }
 # endif
 
 # ifdef SIGSETXID
-  /* Install the handle to change the threads' uid/gid.  */
-  sa.sa_sigaction = sighandler_setxid;
-  sa.sa_flags = SA_SIGINFO | SA_RESTART;
-  (void) __libc_sigaction (SIGSETXID, &sa, NULL);
+  {
+    /* Install the handler to change the threads' uid/gid.  */
+    struct sigaction sa;
+    __sigemptyset (&sa.sa_mask);
+    sa.sa_sigaction = sighandler_setxid;
+    sa.sa_flags = SA_SIGINFO | SA_RESTART;
+    __libc_sigaction (SIGSETXID, &sa, NULL);
+  }
 # endif
 
   /* The parent process might have left the signals blocked.  Just in
      case, unblock it.  We reuse the signal mask in the sigaction
      structure.  It is already cleared.  */
+  {
+    struct sigaction sa;
+    __sigemptyset (&sa.sa_mask);
 # ifdef SIGCANCEL
-  __sigaddset (&sa.sa_mask, SIGCANCEL);
+    __sigaddset (&sa.sa_mask, SIGCANCEL);
 # endif
 # ifdef SIGSETXID
-  __sigaddset (&sa.sa_mask, SIGSETXID);
+    __sigaddset (&sa.sa_mask, SIGSETXID);
 # endif
-  {
     INTERNAL_SYSCALL_DECL (err);
-    (void) INTERNAL_SYSCALL (rt_sigprocmask, err, 4, SIG_UNBLOCK, &sa.sa_mask,
-			     NULL, _NSIG / 8);
+    INTERNAL_SYSCALL_CALL (rt_sigprocmask, err, SIG_UNBLOCK, &sa.sa_mask,
+			   NULL, _NSIG / 8);
   }
 #endif
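
Outside of glibc internals, the handler installation above corresponds to a plain sigaction call with SA_SIGINFO and SA_RESTART and a fully blocked handler mask.  A minimal standalone sketch (SIGRTMIN stands in for the internal SIGCANCEL; the helper name is made up for illustration):

  #include <signal.h>
  #include <string.h>

  static void
  handler (int sig, siginfo_t *si, void *ctx)
  {
    /* Decision logic would live here (compare sigcancel_handler above).  */
  }

  static int
  install_handler (void)
  {
    struct sigaction sa;
    memset (&sa, 0, sizeof sa);
    sa.sa_sigaction = handler;
    /* SA_RESTART keeps unrelated blocked syscalls from reporting a spurious
       EINTR when the signal is delivered but cancellation is not acted on.  */
    sa.sa_flags = SA_SIGINFO | SA_RESTART;
    sigfillset (&sa.sa_mask);   /* analogous to SIGALL_SET */
    return sigaction (SIGRTMIN, &sa, NULL);
  }
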
 
diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
index 070b3af..42c0857 100644
--- a/nptl/pthreadP.h
+++ b/nptl/pthreadP.h
@@ -296,20 +296,46 @@ extern void __nptl_unwind_freeres (void) attribute_hidden;
 #endif
 
 
-/* Called when a thread reacts on a cancellation request.  */
 static inline void
 __attribute ((noreturn, always_inline))
-__do_cancel (void)
+__do_cancel_with_result (void *result)
 {
   struct pthread *self = THREAD_SELF;
 
-  /* Make sure we get no more cancellations.  */
-  THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
+  /* Make sure we get no more cancellations by clearing the cancel
+     state.  */
+  int oldval = THREAD_GETMEM (self, cancelhandling);
+  while (1)
+    {
+      int newval = oldval | CANCELSTATE_BITMASK | EXITING_BITMASK;
+      if (oldval == newval)
+	break;
+
+      oldval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling, newval,
+					  oldval);
+    }
+
+  THREAD_SETMEM (self, result, result);
 
   __pthread_unwind ((__pthread_unwind_buf_t *)
 		    THREAD_GETMEM (self, cleanup_jmp_buf));
 }
 
+/* Called when a thread reacts on a cancellation request.  */
+static inline void
+__attribute ((noreturn, always_inline))
+__do_cancel (void)
+{
+  __do_cancel_with_result (PTHREAD_CANCELED);
+}
+
+extern long int __syscall_cancel_arch (volatile int *, __syscall_arg_t nr,
+     __syscall_arg_t arg1, __syscall_arg_t arg2, __syscall_arg_t arg3,
+     __syscall_arg_t arg4, __syscall_arg_t arg5, __syscall_arg_t arg6);
+libc_hidden_proto (__syscall_cancel_arch);
+
+extern _Noreturn void __syscall_do_cancel (void)
+     attribute_hidden;
 
 /* Internal prototypes.  */
 
@@ -469,11 +495,11 @@ extern int __pthread_equal (pthread_t thread1, pthread_t thread2);
 extern int __pthread_detach (pthread_t th);
 extern int __pthread_cancel (pthread_t th);
 extern int __pthread_kill (pthread_t threadid, int signo);
+extern int __pthread_kill_internal (pthread_t threadid, int signo)
+  attribute_hidden;
 extern void __pthread_exit (void *value) __attribute__ ((__noreturn__));
 extern int __pthread_join (pthread_t threadid, void **thread_return);
 extern int __pthread_setcanceltype (int type, int *oldtype);
-extern int __pthread_enable_asynccancel (void) attribute_hidden;
-extern void __pthread_disable_asynccancel (int oldtype) attribute_hidden;
 extern void __pthread_testcancel (void);
 extern int __pthread_timedjoin_ex (pthread_t, void **, const struct timespec *,
 				   bool);
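
The CAS loop in __do_cancel_with_result above follows the usual compare-and-exchange retry pattern.  A standalone analogue with C11 atomics (the bit values here are illustrative stand-ins, not the glibc ones):

  #include <stdatomic.h>

  #define STATE_BITMASK   0x01   /* illustrative values only */
  #define EXITING_BITMASK 0x08

  static void
  set_exiting_bits (atomic_int *word)
  {
    int oldval = atomic_load_explicit (word, memory_order_relaxed);
    while (1)
      {
        int newval = oldval | STATE_BITMASK | EXITING_BITMASK;
        if (newval == oldval)
          break;   /* Bits already set, nothing to do.  */
        /* On failure, oldval is reloaded with the current value and the
           loop retries.  */
        if (atomic_compare_exchange_weak (word, &oldval, newval))
          break;
      }
  }
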
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index 64ac12e..5275a28 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -37,67 +37,23 @@ __pthread_cancel (pthread_t th)
 #ifdef SHARED
   pthread_cancel_init ();
 #endif
-  int result = 0;
-  int oldval;
-  int newval;
-  do
-    {
-    again:
-      oldval = pd->cancelhandling;
-      newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
 
-      /* Avoid doing unnecessary work.  The atomic operation can
-	 potentially be expensive if the bug has to be locked and
-	 remote cache lines have to be invalidated.  */
-      if (oldval == newval)
-	break;
+  THREAD_ATOMIC_BIT_SET (pd, cancelhandling, CANCELED_BIT);
 
-      /* If the cancellation is handled asynchronously just send a
-	 signal.  We avoid this if possible since it's more
-	 expensive.  */
-      if (CANCEL_ENABLED_AND_CANCELED_AND_ASYNCHRONOUS (newval))
-	{
-	  /* Mark the cancellation as "in progress".  */
-	  if (atomic_compare_and_exchange_bool_acq (&pd->cancelhandling,
-						    oldval | CANCELING_BITMASK,
-						    oldval))
-	    goto again;
-
-#ifdef SIGCANCEL
-	  /* The cancellation handler will take care of marking the
-	     thread as canceled.  */
-	  pid_t pid = __getpid ();
-
-	  INTERNAL_SYSCALL_DECL (err);
-	  int val = INTERNAL_SYSCALL_CALL (tgkill, err, pid, pd->tid,
-					   SIGCANCEL);
-	  if (INTERNAL_SYSCALL_ERROR_P (val, err))
-	    result = INTERNAL_SYSCALL_ERRNO (val, err);
-#else
-          /* It should be impossible to get here at all, since
-             pthread_setcanceltype should never have allowed
-             PTHREAD_CANCEL_ASYNCHRONOUS to be set.  */
-          abort ();
-#endif
-
-	  break;
-	}
-
-	/* A single-threaded process should be able to kill itself, since
-	   there is nothing in the POSIX specification that says that it
-	   cannot.  So we set multiple_threads to true so that cancellation
-	   points get executed.  */
-	THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
+  /* A single-threaded process should be able to kill itself, since there is
+     nothing in the POSIX specification that says that it cannot.  So we set
+     multiple_threads to true so that cancellation points get executed.  */
+  THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
 #ifndef TLS_MULTIPLE_THREADS_IN_TCB
-	__pthread_multiple_threads = *__libc_multiple_threads_ptr = 1;
+  __pthread_multiple_threads = *__libc_multiple_threads_ptr = 1;
 #endif
-    }
-  /* Mark the thread as canceled.  This has to be done
-     atomically since other bits could be modified as well.  */
-  while (atomic_compare_and_exchange_bool_acq (&pd->cancelhandling, newval,
-					       oldval));
 
-  return result;
+  /* Avoid signaling when the thread attempts to cancel itself
+     (pthread_kill is expensive).  */
+  if (pd == THREAD_SELF && !(pd->cancelhandling & CANCELTYPE_BITMASK))
+    return 0;
+
+  return __pthread_kill_internal (th, SIGCANCEL);
 }
 weak_alias (__pthread_cancel, pthread_cancel)
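
From the application side the observable semantics stay the same: cancellation is acted upon at a cancellation point and cleanup handlers still run.  A small usage sketch using only the public API:

  #include <pthread.h>
  #include <stdio.h>
  #include <unistd.h>

  static void
  cleanup (void *arg)
  {
    close (*(int *) arg);   /* Runs when the blocked read is cancelled.  */
  }

  static void *
  worker (void *arg)
  {
    int fd = *(int *) arg;
    char buf[64];
    pthread_cleanup_push (cleanup, &fd);
    read (fd, buf, sizeof buf);   /* Cancellation point; blocks forever.  */
    pthread_cleanup_pop (0);
    return NULL;
  }

  int
  main (void)
  {
    int fds[2];
    if (pipe (fds) != 0)
      return 1;
    pthread_t td;
    pthread_create (&td, NULL, worker, &fds[0]);
    sleep (1);
    pthread_cancel (td);
    void *res;
    pthread_join (td, &res);
    puts (res == PTHREAD_CANCELED ? "cancelled" : "finished");
    return 0;
  }
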
 
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 130937c..8cab7f9 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -406,7 +406,7 @@ START_THREAD_DEFN
   /* If the parent was running cancellation handlers while creating
      the thread the new thread inherited the signal mask.  Reset the
      cancellation signal mask.  */
-  if (__glibc_unlikely (pd->parent_cancelhandling & CANCELING_BITMASK))
+  if (__glibc_unlikely (pd->parent_cancelhandling & CANCELED_BITMASK))
     {
       INTERNAL_SYSCALL_DECL (err);
       sigset_t mask;
@@ -449,7 +449,8 @@ START_THREAD_DEFN
 	 have ownership (see CONCURRENCY NOTES above).  */
       if (__glibc_unlikely (pd->stopped_start))
 	{
-	  int oldtype = CANCEL_ASYNC ();
+	  int ct;
+	  __pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, &ct);
 
 	  /* Get the lock the parent locked to force synchronization.  */
 	  lll_lock (pd->lock, LLL_PRIVATE);
@@ -459,7 +460,7 @@ START_THREAD_DEFN
 	  /* And give it up right away.  */
 	  lll_unlock (pd->lock, LLL_PRIVATE);
 
-	  CANCEL_RESET (oldtype);
+	  __pthread_setcanceltype (ct, NULL);
 	}
 
       LIBC_PROBE (pthread_start, 3, (pthread_t) pd, pd->start_routine, pd->arg);
diff --git a/nptl/pthread_exit.c b/nptl/pthread_exit.c
index 643c85b..6c59cdf 100644
--- a/nptl/pthread_exit.c
+++ b/nptl/pthread_exit.c
@@ -16,16 +16,13 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <stdlib.h>
 #include "pthreadP.h"
 
 
 void
 __pthread_exit (void *value)
 {
-  THREAD_SETMEM (THREAD_SELF, result, value);
-
-  __do_cancel ();
+  __do_cancel_with_result (value);
 }
 weak_alias (__pthread_exit, pthread_exit)
 
diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
index 9545ae4..974540c 100644
--- a/nptl/pthread_join_common.c
+++ b/nptl/pthread_join_common.c
@@ -68,7 +68,8 @@ timedwait_tid (pid_t *tidp, const struct timespec *abstime)
       /* If *tidp == tid, wait until thread terminates or the wait times out.
          The kernel up to version 3.16.3 does not use the private futex
          operations for futex wake-up when the clone terminates.  */
-      if (lll_futex_timed_wait_cancel (tidp, tid, &rt, LLL_SHARED)
+      if (lll_futex_timed_wait_cancel ((unsigned int *) tidp, tid, &rt,
+				       LLL_SHARED)
 	  == -ETIMEDOUT)
         return ETIMEDOUT;
     }
@@ -100,7 +101,7 @@ __pthread_timedjoin_ex (pthread_t threadid, void **thread_return,
   if ((pd == self
        || (self->joinid == pd
 	   && (pd->cancelhandling
-	       & (CANCELING_BITMASK | CANCELED_BITMASK | EXITING_BITMASK
+	       & (CANCELED_BITMASK | EXITING_BITMASK
 		  | TERMINATED_BITMASK)) == 0))
       && !CANCEL_ENABLED_AND_CANCELED (self->cancelhandling))
     /* This is a deadlock situation.  The threads are waiting for each
@@ -139,7 +140,7 @@ __pthread_timedjoin_ex (pthread_t threadid, void **thread_return,
 	  /* We need acquire MO here so that we synchronize with the
 	     kernel's store to 0 when the clone terminates. (see above)  */
 	  while ((tid = atomic_load_acquire (&pd->tid)) != 0)
-	    lll_futex_wait_cancel (&pd->tid, tid, LLL_SHARED);
+	    lll_futex_wait_cancel ((unsigned int *) &pd->tid, tid, LLL_SHARED);
 	}
 
       pthread_cleanup_pop (0);
diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c
index 2805c72..4c2672a 100644
--- a/nptl/pthread_kill.c
+++ b/nptl/pthread_kill.c
@@ -31,8 +31,9 @@ __pthread_kill (pthread_t threadid, int signo)
     /* Not a valid thread handle.  */
     return ESRCH;
 
-  return ENOSYS;
+  if (__is_internal_signal (signo))
+    return EINVAL;
+
+  return __pthread_kill_internal (threadid, signo);
 }
 strong_alias (__pthread_kill, pthread_kill)
-
-stub_warning (pthread_kill)
diff --git a/sysdeps/unix/sysv/linux/pthread_kill.c b/nptl/pthread_kill_internal.c
similarity index 75%
rename from sysdeps/unix/sysv/linux/pthread_kill.c
rename to nptl/pthread_kill_internal.c
index 71305b9..714b523 100644
--- a/sysdeps/unix/sysv/linux/pthread_kill.c
+++ b/nptl/pthread_kill_internal.c
@@ -16,24 +16,15 @@
    License along with the GNU C Library; if not, see
    <https://www.gnu.org/licenses/>.  */
 
-#include <errno.h>
-#include <signal.h>
-#include <pthreadP.h>
-#include <tls.h>
-#include <sysdep.h>
 #include <unistd.h>
+#include <pthreadP.h>
 
-
+/* Used internally by pthread_cancel, so we can't filter SIGCANCEL.  */
 int
-__pthread_kill (pthread_t threadid, int signo)
+__pthread_kill_internal (pthread_t threadid, int signo)
 {
   struct pthread *pd = (struct pthread *) threadid;
 
-  /* Make sure the descriptor is valid.  */
-  if (DEBUGGING_P && INVALID_TD_P (pd))
-    /* Not a valid thread handle.  */
-    return ESRCH;
-
   /* Force load of pd->tid into local variable or register.  Otherwise
      if a thread exits between ESRCH test and tgkill, we might return
      EINVAL, because pd->tid would be cleared by the kernel.  */
@@ -42,11 +33,6 @@ __pthread_kill (pthread_t threadid, int signo)
     /* Not a valid thread handle.  */
     return ESRCH;
 
-  /* Disallow sending the signal we use for cancellation, timers,
-     for the setxid implementation.  */
-  if (signo == SIGCANCEL || signo == SIGTIMER || signo == SIGSETXID)
-    return EINVAL;
-
   /* We have a special syscall to do the work.  */
   INTERNAL_SYSCALL_DECL (err);
 
@@ -56,4 +42,3 @@ __pthread_kill (pthread_t threadid, int signo)
   return (INTERNAL_SYSCALL_ERROR_P (val, err)
 	  ? INTERNAL_SYSCALL_ERRNO (val, err) : 0);
 }
-strong_alias (__pthread_kill, pthread_kill)
diff --git a/nptl/pthread_setcanceltype.c b/nptl/pthread_setcanceltype.c
index d771c31..84cc657 100644
--- a/nptl/pthread_setcanceltype.c
+++ b/nptl/pthread_setcanceltype.c
@@ -73,4 +73,4 @@ __pthread_setcanceltype (int type, int *oldtype)
 
   return 0;
 }
-strong_alias (__pthread_setcanceltype, pthread_setcanceltype)
+weak_alias (__pthread_setcanceltype, pthread_setcanceltype)
diff --git a/nptl/sem_wait.c b/nptl/sem_wait.c
index 8420719..fc3d353 100644
--- a/nptl/sem_wait.c
+++ b/nptl/sem_wait.c
@@ -58,7 +58,7 @@ __old_sem_wait (sem_t *sem)
 	return 0;
 
       /* Always assume the semaphore is shared.  */
-      err = lll_futex_wait_cancel (futex, 0, LLL_SHARED);
+      err = lll_futex_wait_cancel ((unsigned int *) futex, 0, LLL_SHARED);
     }
   while (err == 0 || err == -EWOULDBLOCK);
 
diff --git a/nptl/thrd_sleep.c b/nptl/thrd_sleep.c
index 2e185dd..75c0d53 100644
--- a/nptl/thrd_sleep.c
+++ b/nptl/thrd_sleep.c
@@ -24,13 +24,12 @@
 int
 thrd_sleep (const struct timespec* time_point, struct timespec* remaining)
 {
-  INTERNAL_SYSCALL_DECL (err);
-  int ret = INTERNAL_SYSCALL_CANCEL (nanosleep, err, time_point, remaining);
-  if (INTERNAL_SYSCALL_ERROR_P (ret, err))
+  long int ret = INTERNAL_SYSCALL_CANCEL (nanosleep, time_point, remaining);
+  if (SYSCALL_CANCEL_ERROR (ret))
     {
       /* C11 states thrd_sleep function returns -1 if it has been interrupted
 	 by a signal, or a negative value if it fails.  */
-      ret = INTERNAL_SYSCALL_ERRNO (ret, err);
+      ret = -ret;
       if (ret == EINTR)
 	return -1;
       return -2;
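
For reference, the C11 return-value contract the comment above refers to, in a short usage sketch:

  #include <stdio.h>
  #include <threads.h>
  #include <time.h>

  int
  main (void)
  {
    struct timespec dur = { .tv_sec = 0, .tv_nsec = 100000000 };
    int r = thrd_sleep (&dur, NULL);
    if (r == 0)
      puts ("slept the full duration");
    else if (r == -1)
      puts ("interrupted by a signal");
    else
      puts ("other failure");   /* some negative value other than -1 */
    return 0;
  }
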
diff --git a/nptl/tst-cancel28.c b/nptl/tst-cancel28.c
new file mode 100644
index 0000000..b8f77be
--- /dev/null
+++ b/nptl/tst-cancel28.c
@@ -0,0 +1,99 @@
+/* Check side-effect handling for cancellable syscalls (BZ #12683).
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This testcase checks for resource leakage when the syscall has returned
+   from kernel space but userspace has not yet saved the return value.  The
+   'leaker' thread should be able to close the file descriptor if the
+   resource is already allocated, meaning that if the cancellation signal
+   arrives *after* the open syscall returns from the kernel, the side effect
+   should be visible to the application.  */
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+
+#include <support/xunistd.h>
+#include <support/xthread.h>
+#include <support/check.h>
+#include <support/temp_file.h>
+#include <support/support.h>
+#include <support/descriptors.h>
+
+static void *
+writeopener (void *arg)
+{
+  int fd;
+  for (;;)
+    {
+      fd = open (arg, O_WRONLY);
+      xclose (fd);
+    }
+  return NULL;
+}
+
+static void *
+leaker (void *arg)
+{
+  int fd = open (arg, O_RDONLY);
+  TEST_VERIFY_EXIT (fd > 0);
+  pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, 0);
+  xclose (fd);
+  return NULL;
+}
+
+
+#define ITER_COUNT 1000
+
+static int
+do_test (void)
+{
+  char *dir = support_create_temp_directory ("tst-cancel28");
+  char *name = xasprintf ("%s/fifo", dir);
+  TEST_COMPARE (mkfifo (name, 0600), 0);
+  add_temp_file (name);
+
+  struct support_descriptors *descrs = support_descriptors_list ();
+
+  srand (1);
+
+  xpthread_create (NULL, writeopener, name);
+  for (int i = 0; i < ITER_COUNT; i++)
+    {
+      pthread_t td = xpthread_create (NULL, leaker, name);
+      struct timespec ts =
+	{ .tv_nsec = rand () % 100000, .tv_sec = 0 };
+      nanosleep (&ts, NULL);
+      /* Ignore the pthread_cancel result because the thread might
+	 already have exited by the time pthread_cancel is called.  */
+      pthread_cancel (td);
+      xpthread_join (td);
+    }
+
+  support_descriptors_check (descrs);
+
+  support_descriptors_free (descrs);
+
+  free (name);
+
+  return 0;
+}
+
+#include <support/test-driver.c>
diff --git a/rt/Makefile b/rt/Makefile
index 6c8365e..a7a9ec4 100644
--- a/rt/Makefile
+++ b/rt/Makefile
@@ -56,7 +56,7 @@ include ../Rules
 CFLAGS-aio_suspend.c += -fexceptions
 CFLAGS-mq_timedreceive.c += -fexceptions -fasynchronous-unwind-tables
 CFLAGS-mq_timedsend.c += -fexceptions -fasynchronous-unwind-tables
-CFLAGS-librt-cancellation.c += -fasynchronous-unwind-tables
+CFLAGS-clock_nanosleep.c += -fexceptions -fasynchronous-unwind-tables
 
 LDFLAGS-rt.so = -Wl,--enable-new-dtags,-z,nodelete
 
diff --git a/sysdeps/generic/sysdep-cancel.h b/sysdeps/generic/sysdep-cancel.h
index d22a786..5c84b44 100644
--- a/sysdeps/generic/sysdep-cancel.h
+++ b/sysdeps/generic/sysdep-cancel.h
@@ -3,5 +3,3 @@
 /* No multi-thread handling enabled.  */
 #define SINGLE_THREAD_P (1)
 #define RTLD_SINGLE_THREAD_P (1)
-#define LIBC_CANCEL_ASYNC()	0 /* Just a dummy value.  */
-#define LIBC_CANCEL_RESET(val)	((void)(val)) /* Nothing, but evaluate it.  */
diff --git a/sysdeps/nptl/Makefile b/sysdeps/nptl/Makefile
index fbb9800..19293a7 100644
--- a/sysdeps/nptl/Makefile
+++ b/sysdeps/nptl/Makefile
@@ -21,8 +21,7 @@ libpthread-sysdep_routines += errno-loc
 endif
 
 ifeq ($(subdir),rt)
-librt-sysdep_routines += timer_routines librt-cancellation
-CFLAGS-librt-cancellation.c += -fexceptions -fasynchronous-unwind-tables
+librt-sysdep_routines += timer_routines
 
 tests += tst-mqueue8x
 CFLAGS-tst-mqueue8x.c += -fexceptions
diff --git a/sysdeps/nptl/cancellation-pc-check.h b/sysdeps/nptl/cancellation-pc-check.h
new file mode 100644
index 0000000..903a866
--- /dev/null
+++ b/sysdeps/nptl/cancellation-pc-check.h
@@ -0,0 +1,53 @@
+/* Architecture specific code for pthread cancellation handling.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _NPTL_CANCELLATION_PC_CHECK
+#define _NPTL_CANCELLATION_PC_CHECK
+
+#include <sigcontextinfo.h>
+
+/* For syscalls with side-effects, the kernel cannot restart the syscall; when
+   it is interrupted by a signal, the kernel must cause the syscall to return
+   with whatever partial result is obtained (e.g. partial read or write).  In
+   this case, the saved program counter points just after the syscall
+   instruction, so the SIGCANCEL handler should not act on cancellation.
+
+   The __syscall_cancel_arch function, used for all cancellable syscalls,
+   contains two extra markers, __syscall_cancel_arch_start and
+   __syscall_cancel_arch_end.  The former points to just before the initial
+   conditional branch that checks if the thread has received a cancellation
+   request, while the latter points to the instruction just after the one
+   responsible for issuing the syscall.
+
+   The function checks whether the program counter (PC) from the ucontext_t
+   CTX is within the start and end boundaries of the __syscall_cancel_arch
+   bridge.  It returns TRUE if the PC is within the boundary, meaning the
+   syscall does not have any side effects, or FALSE otherwise.  */
+static bool
+cancellation_pc_check (void *ctx)
+{
+  /* Both are defined in syscall_cancel.S.  */
+  extern const char __syscall_cancel_arch_start[1];
+  extern const char __syscall_cancel_arch_end[1];
+
+  uintptr_t pc = sigcontext_get_pc (ctx);
+  return pc >= (uintptr_t) __syscall_cancel_arch_start
+	 && pc < (uintptr_t) __syscall_cancel_arch_end;
+}
+
+#endif
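
The same range test can be written against a concrete architecture.  A sketch assuming x86-64, where the interrupted program counter is uc_mcontext.gregs[REG_RIP] (this per-arch detail is exactly what the sigcontext_get_pc wrapper hides); the extern markers only resolve when linked into the patched libc:

  #define _GNU_SOURCE
  #include <stdbool.h>
  #include <stdint.h>
  #include <ucontext.h>

  extern const char __syscall_cancel_arch_start[1];
  extern const char __syscall_cancel_arch_end[1];

  static bool
  pc_in_cancellable_bridge (void *ctx)
  {
    ucontext_t *uc = ctx;
    uintptr_t pc = (uintptr_t) uc->uc_mcontext.gregs[REG_RIP];
    /* Asymmetric bounds: a PC equal to _end means the syscall instruction
       has already retired, so side effects may exist and the handler must
       not act on cancellation.  */
    return pc >= (uintptr_t) __syscall_cancel_arch_start
           && pc < (uintptr_t) __syscall_cancel_arch_end;
  }
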
diff --git a/sysdeps/nptl/librt-cancellation.c b/sysdeps/nptl/cancellation-sigmask.h
similarity index 60%
rename from sysdeps/nptl/librt-cancellation.c
rename to sysdeps/nptl/cancellation-sigmask.h
index 93ebe4a..ad95145 100644
--- a/sysdeps/nptl/librt-cancellation.c
+++ b/sysdeps/nptl/cancellation-sigmask.h
@@ -1,6 +1,6 @@
-/* Copyright (C) 2002-2019 Free Software Foundation, Inc.
+/* Architecture specific code for pthread cancellation handling.
+   Copyright (C) 2019 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -14,11 +14,17 @@
 
    You should have received a copy of the GNU Lesser General Public
    License along with the GNU C Library; if not, see
-   <https://www.gnu.org/licenses/>.  */
+   <http://www.gnu.org/licenses/>.  */
 
-#include <nptl/pthreadP.h>
+#ifndef _NPTL_CANCELLATION_SIGMASK_H
+#define _NPTL_CANCELLATION_SIGMASK_H
 
+/* Add SIGCANCEL to the signal mask of the ucontext_t CTX obtained from
+   the sigaction handler.  */
+static void
+ucontext_block_sigcancel (void *ctx)
+{
+  __sigaddset (&((ucontext_t*) ctx)->uc_sigmask, SIGCANCEL);
+}
 
-#define __pthread_enable_asynccancel __librt_enable_asynccancel
-#define __pthread_disable_asynccancel __librt_disable_asynccancel
-#include <nptl/cancellation.c>
+#endif
diff --git a/sysdeps/unix/sysdep.h b/sysdeps/unix/sysdep.h
index 10468c7..7fe6bd8 100644
--- a/sysdeps/unix/sysdep.h
+++ b/sysdeps/unix/sysdep.h
@@ -24,6 +24,9 @@
 #define	SYSCALL__(name, args)	PSEUDO (__##name, name, args)
 #define	SYSCALL(name, args)	PSEUDO (name, name, args)
 
+#ifndef __ASSEMBLER__
+# include <errno.h>
+
 #define __SYSCALL_CONCAT_X(a,b)     a##b
 #define __SYSCALL_CONCAT(a,b)       __SYSCALL_CONCAT_X (a, b)
 
@@ -57,6 +60,29 @@
 #define INTERNAL_SYSCALL_CALL(...) \
   __INTERNAL_SYSCALL_DISP (__INTERNAL_SYSCALL, __VA_ARGS__)
 
+#define __INTERNAL_SYSCALL_NCS0(name, err) \
+  INTERNAL_SYSCALL_NCS (name, err, 0)
+#define __INTERNAL_SYSCALL_NCS1(name, err, a1) \
+  INTERNAL_SYSCALL_NCS (name, err, 1, a1)
+#define __INTERNAL_SYSCALL_NCS2(name, err, a1, a2) \
+  INTERNAL_SYSCALL_NCS (name, err, 2, a1, a2)
+#define __INTERNAL_SYSCALL_NCS3(name, err, a1, a2, a3) \
+  INTERNAL_SYSCALL_NCS (name, err, 3, a1, a2, a3)
+#define __INTERNAL_SYSCALL_NCS4(name, err, a1, a2, a3, a4) \
+  INTERNAL_SYSCALL_NCS (name, err, 4, a1, a2, a3, a4)
+#define __INTERNAL_SYSCALL_NCS5(name, err, a1, a2, a3, a4, a5) \
+  INTERNAL_SYSCALL_NCS (name, err, 5, a1, a2, a3, a4, a5)
+#define __INTERNAL_SYSCALL_NCS6(name, err, a1, a2, a3, a4, a5, a6) \
+  INTERNAL_SYSCALL_NCS (name, err, 6, a1, a2, a3, a4, a5, a6)
+#define __INTERNAL_SYSCALL_NCS7(name, err, a1, a2, a3, a4, a5, a6, a7) \
+  INTERNAL_SYSCALL_NCS (name, err, 7, a1, a2, a3, a4, a5, a6, a7)
+
+/* Issue a syscall defined by syscall number plus any other argument
+   required.  It is similar to the INTERNAL_SYSCALL_NCS macro, but without
+   the need to pass the expected argument count as the third parameter.  */
+#define INTERNAL_SYSCALL_NCS_CALL(...) \
+  __INTERNAL_SYSCALL_DISP (__INTERNAL_SYSCALL_NCS, __VA_ARGS__)
+
 #define __INLINE_SYSCALL0(name) \
   INLINE_SYSCALL (name, 0)
 #define __INLINE_SYSCALL1(name, a1) \
@@ -88,35 +114,71 @@
 #define INLINE_SYSCALL_CALL(...) \
   __INLINE_SYSCALL_DISP (__INLINE_SYSCALL, __VA_ARGS__)
 
-#define SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (SINGLE_THREAD_P) 						     \
-      sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
-  })
 
-/* Issue a syscall defined by syscall number plus any other argument
-   required.  Any error will be returned unmodified (including errno).  */
-#define INTERNAL_SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (SINGLE_THREAD_P) 						     \
-      sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
+/* Cancellation macros.  */
+#ifndef __SSC
+typedef long int __syscall_arg_t;
+# define __SSC(__x) ((__syscall_arg_t) (__x))
+#endif
+
+long int __syscall_cancel (__syscall_arg_t nr, __syscall_arg_t arg1,
+			   __syscall_arg_t arg2, __syscall_arg_t arg3,
+			   __syscall_arg_t arg4, __syscall_arg_t arg5,
+			   __syscall_arg_t arg6);
+libc_hidden_proto (__syscall_cancel);
+
+#define __SYSCALL_CANCEL0(name) \
+  (__syscall_cancel) (__NR_##name, 0, 0, 0, 0, 0, 0)
+#define __SYSCALL_CANCEL1(name, a1) \
+  (__syscall_cancel) (__NR_##name, __SSC (a1), 0, 0, 0, 0, 0)
+#define __SYSCALL_CANCEL2(name, a1, a2) \
+  (__syscall_cancel) (__NR_##name, __SSC (a1), __SSC (a2), 0, 0, 0, 0)
+#define __SYSCALL_CANCEL3(name, a1, a2, a3) \
+  (__syscall_cancel) (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		      0, 0, 0)
+#define __SYSCALL_CANCEL4(name, a1, a2, a3, a4) \
+  (__syscall_cancel) (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		      __SSC(a4), 0, 0)
+#define __SYSCALL_CANCEL5(name, a1, a2, a3, a4, a5) \
+  (__syscall_cancel) (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		      __SSC(a4), __SSC (a5), 0)
+#define __SYSCALL_CANCEL6(name, a1, a2, a3, a4, a5, a6) \
+  (__syscall_cancel) (__NR_##name, __SSC (a1), __SSC (a2), __SSC (a3), \
+		      __SSC (a4), __SSC (a5), __SSC (a6))
+
+#define __SYSCALL_CANCEL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
+#define __SYSCALL_CANCEL_NARGS(...) \
+  __SYSCALL_CANCEL_NARGS_X (__VA_ARGS__,7,6,5,4,3,2,1,0,)
+#define __SYSCALL_CANCEL_CONCAT_X(a,b)     a##b
+#define __SYSCALL_CANCEL_CONCAT(a,b)       __SYSCALL_CANCEL_CONCAT_X (a, b)
+#define __SYSCALL_CANCEL_DISP(b,...) \
+  __SYSCALL_CANCEL_CONCAT (b,__SYSCALL_CANCEL_NARGS(__VA_ARGS__))(__VA_ARGS__)
+
+#define __SYSCALL_CANCEL_CALL(...) \
+  __SYSCALL_CANCEL_DISP (__SYSCALL_CANCEL, __VA_ARGS__)
+
+/* Issue a cancellable syscall defined by syscall number NAME plus any other
+   argument required.  If an error occurs, its value is returned as a
+   negative number unmodified and errno is not set.  */
+#define INTERNAL_SYSCALL_CANCEL(name, args...) \
+  __SYSCALL_CANCEL_CALL (name, args)
+
+/* Issue a cancellable syscall defined by the first argument plus any other
+   arguments required.  If an error occurs, the macro returns -1 and sets
+   errno accordingly.  */
+#if IS_IN (rtld)
+/* The loader does not need to handle thread cancellation, use direct
+   syscall instead.  */
+# define SYSCALL_CANCEL(...) INLINE_SYSCALL_CALL (__VA_ARGS__)
+#else
+# define SYSCALL_CANCEL(...) \
+  ({									\
+    long int sc_ret = __SYSCALL_CANCEL_CALL (__VA_ARGS__);		\
+    SYSCALL_CANCEL_RET ((sc_ret));					\
   })
+#endif
+
+#endif /* __ASSEMBLER__  */
 
 /* Machine-dependent sysdep.h files are expected to define the macro
    PSEUDO (function_name, syscall_name) to emit assembly code to define the
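
The argument dispatch above relies on the usual variadic argument-counting trick: count the arguments, paste the count onto a base name, and expand the resulting macro.  A stripped-down sketch of the same mechanism (names here are illustrative):

  #define NARGS_X(a,b,c,d,e,f,g,h,n,...) n
  #define NARGS(...)       NARGS_X (__VA_ARGS__,7,6,5,4,3,2,1,0,)
  #define CONCAT_X(a,b)    a##b
  #define CONCAT(a,b)      CONCAT_X (a, b)
  #define DISPATCH(b,...)  CONCAT (b, NARGS (__VA_ARGS__)) (__VA_ARGS__)

  /* DISPATCH (FOO, name, x, y) expands to FOO2 (name, x, y): the syscall
     name itself is not counted, so a two-argument cancellable syscall
     selects __SYSCALL_CANCEL2 in the real macros above.  */
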
diff --git a/sysdeps/unix/sysv/linux/clock_nanosleep.c b/sysdeps/unix/sysv/linux/clock_nanosleep.c
index 1f240b8..6cd2f20 100644
--- a/sysdeps/unix/sysv/linux/clock_nanosleep.c
+++ b/sysdeps/unix/sysv/linux/clock_nanosleep.c
@@ -36,11 +36,9 @@ __clock_nanosleep (clockid_t clock_id, int flags, const struct timespec *req,
 
   /* If the call is interrupted by a signal handler or encounters an error,
      it returns a positive value similar to errno.  */
-  INTERNAL_SYSCALL_DECL (err);
-  int r = INTERNAL_SYSCALL_CANCEL (clock_nanosleep, err, clock_id, flags,
-				   req, rem);
-  return (INTERNAL_SYSCALL_ERROR_P (r, err)
-	  ? INTERNAL_SYSCALL_ERRNO (r, err) : 0);
+  long int r = INTERNAL_SYSCALL_CANCEL (clock_nanosleep, clock_id, flags,
+					req, rem);
+  return SYSCALL_CANCEL_ERROR (r) ? -r : 0;
 }
 
 versioned_symbol (libc, __clock_nanosleep, clock_nanosleep, GLIBC_2_17);
diff --git a/sysdeps/unix/sysv/linux/futex-internal.h b/sysdeps/unix/sysv/linux/futex-internal.h
index 5a4f4ff..b41a4d9 100644
--- a/sysdeps/unix/sysv/linux/futex-internal.h
+++ b/sysdeps/unix/sysv/linux/futex-internal.h
@@ -75,10 +75,7 @@ static __always_inline int
 futex_wait_cancelable (unsigned int *futex_word, unsigned int expected,
 		       int private)
 {
-  int oldtype;
-  oldtype = __pthread_enable_asynccancel ();
-  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
-  __pthread_disable_asynccancel (oldtype);
+  int err = lll_futex_timed_wait_cancel (futex_word, expected, NULL, private);
   switch (err)
     {
     case 0:
@@ -129,10 +126,7 @@ futex_reltimed_wait_cancelable (unsigned int *futex_word,
 				unsigned int expected,
 			        const struct timespec *reltime, int private)
 {
-  int oldtype;
-  oldtype = LIBC_CANCEL_ASYNC ();
-  int err = lll_futex_timed_wait (futex_word, expected, reltime, private);
-  LIBC_CANCEL_RESET (oldtype);
+  int err = lll_futex_timed_wait_cancel (futex_word, expected, reltime, private);
   switch (err)
     {
     case 0:
@@ -203,12 +197,8 @@ futex_abstimed_wait_cancelable (unsigned int *futex_word,
      despite them being valid.  */
   if (__glibc_unlikely ((abstime != NULL) && (abstime->tv_sec < 0)))
     return ETIMEDOUT;
-  int oldtype;
-  oldtype = __pthread_enable_asynccancel ();
-  int err = lll_futex_clock_wait_bitset (futex_word, expected,
-					clockid, abstime,
-					private);
-  __pthread_disable_asynccancel (oldtype);
+  int err = lll_futex_clock_wait_bitset_cancel (futex_word, expected, clockid,
+						abstime, private);
   switch (err)
     {
     case 0:
diff --git a/sysdeps/unix/sysv/linux/lowlevellock-futex.h b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
index b423673..0645e4e 100644
--- a/sysdeps/unix/sysv/linux/lowlevellock-futex.h
+++ b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
@@ -22,6 +22,8 @@
 #ifndef __ASSEMBLER__
 #include <sysdep.h>
 #include <sysdep-cancel.h>
+#include <time.h>
+#include <sysdeps/unix/sysdep.h>
 #include <kernel-features.h>
 #endif
 
@@ -74,6 +76,12 @@
      ? -INTERNAL_SYSCALL_ERRNO (__ret, __err) : 0);                     \
   })
 
+#define lll_futex_syscall_cp(...)					\
+  ({                                                                    \
+    long int __ret = INTERNAL_SYSCALL_CANCEL (futex, __VA_ARGS__);	\
+    __ret;								\
+  })
+
 #define lll_futex_wait(futexp, val, private) \
   lll_futex_timed_wait (futexp, val, NULL, private)
 
@@ -147,21 +155,34 @@
 
 
 /* Cancellable futex macros.  */
-#define lll_futex_wait_cancel(futexp, val, private) \
-  ({                                                                   \
-    int __oldtype = CANCEL_ASYNC ();				       \
-    long int __err = lll_futex_wait (futexp, val, LLL_SHARED);	       \
-    CANCEL_RESET (__oldtype);					       \
-    __err;							       \
-  })
-
-#define lll_futex_timed_wait_cancel(futexp, val, timeout, private)	   \
-  ({									   \
-    int __oldtype = CANCEL_ASYNC ();				       	   \
-    long int __err = lll_futex_timed_wait (futexp, val, timeout, private); \
-    CANCEL_RESET (__oldtype);						   \
-    __err;								   \
-  })
+static __always_inline int
+lll_futex_timed_wait_cancel (unsigned int *futexp, int val,
+			     const struct timespec *timeout, int priv)
+{
+  int op = __lll_private_flag (FUTEX_WAIT, priv);
+  return INTERNAL_SYSCALL_CANCEL (futex, futexp, op, val, timeout);
+}
+
+static __always_inline int
+lll_futex_wait_cancel (unsigned int *futexp, int val, int priv)
+{
+  return lll_futex_timed_wait_cancel (futexp, val, NULL, priv);
+}
+
+static __always_inline int
+lll_futex_clock_wait_bitset_cancel (unsigned int *futexp, int val,
+				    clockid_t clockid,
+				    const struct timespec *timeout, int priv)
+{
+  if (!lll_futex_supported_clockid (clockid))
+    return -EINVAL;
+
+  const unsigned int clockbit = clockid == CLOCK_REALTIME
+				? FUTEX_CLOCK_REALTIME : 0;
+  const int op = __lll_private_flag (FUTEX_WAIT_BITSET | clockbit, priv);
+  return INTERNAL_SYSCALL_CANCEL (futex, futexp, op, val, timeout, NULL,
+				  FUTEX_BITSET_MATCH_ANY);
+}
 
 #endif  /* !__ASSEMBLER__  */
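
Outside the lll_* layer, the operation issued by lll_futex_clock_wait_bitset_cancel corresponds to FUTEX_WAIT_BITSET with the FUTEX_CLOCK_REALTIME flag.  A standalone sketch as a plain (non-cancellable) syscall(2) call:

  #define _GNU_SOURCE
  #include <linux/futex.h>
  #include <stdint.h>
  #include <sys/syscall.h>
  #include <time.h>
  #include <unistd.h>

  static long
  futex_abswait_realtime (uint32_t *futex_word, uint32_t expected,
                          const struct timespec *abstime)
  {
    /* FUTEX_CLOCK_REALTIME makes the kernel interpret ABSTIME against
       CLOCK_REALTIME instead of the default CLOCK_MONOTONIC.  */
    int op = FUTEX_WAIT_BITSET_PRIVATE | FUTEX_CLOCK_REALTIME;
    return syscall (SYS_futex, futex_word, op, expected, abstime,
                    NULL, FUTEX_BITSET_MATCH_ANY);
  }
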
 
diff --git a/sysdeps/unix/sysv/linux/socketcall.h b/sysdeps/unix/sysv/linux/socketcall.h
index 1e6387c..38af55f 100644
--- a/sysdeps/unix/sysv/linux/socketcall.h
+++ b/sysdeps/unix/sysv/linux/socketcall.h
@@ -87,18 +87,32 @@
   })
 
 
-#if IS_IN (libc)
-# define __pthread_enable_asynccancel  __libc_enable_asynccancel
-# define __pthread_disable_asynccancel __libc_disable_asynccancel
-#endif
-
-#define SOCKETCALL_CANCEL(name, args...)				\
-  ({									\
-    int oldtype = LIBC_CANCEL_ASYNC ();					\
-    long int sc_ret = __SOCKETCALL (SOCKOP_##name, args);		\
-    LIBC_CANCEL_RESET (oldtype);					\
-    sc_ret;								\
-  })
-
+#define __SOCKETCALL_CANCEL1(__name, __a1) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [1]) { (long int) __a1 }))
+#define __SOCKETCALL_CANCEL2(__name, __a1, __a2) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [2]) { (long int) __a1, (long int) __a2 }))
+#define __SOCKETCALL_CANCEL3(__name, __a1, __a2, __a3) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [3]) { (long int) __a1, (long int) __a2, (long int) __a3 }))
+#define __SOCKETCALL_CANCEL4(__name, __a1, __a2, __a3, __a4) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [4]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4 }))
+#define __SOCKETCALL_CANCEL5(__name, __a1, __a2, __a3, __a4, __a5) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [5]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5 }))
+#define __SOCKETCALL_CANCEL6(__name, __a1, __a2, __a3, __a4, __a5, __a6) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [6]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5, (long int) __a6 }))
+
+#define __SOCKETCALL_CANCEL(...) __SOCKETCALL_DISP (__SOCKETCALL_CANCEL,\
+						    __VA_ARGS__)
+
+#define SOCKETCALL_CANCEL(name, args...) \
+   __SOCKETCALL_CANCEL (SOCKOP_##name, args)
 
 #endif /* sys/socketcall.h */
diff --git a/sysdeps/unix/sysv/linux/syscall_cancel.c b/sysdeps/unix/sysv/linux/syscall_cancel.c
new file mode 100644
index 0000000..8adb357
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/syscall_cancel.c
@@ -0,0 +1,63 @@
+/* Default cancellation syscall bridge.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <pthreadP.h>
+
+#warning "This implementation should be used just as a reference or for bootstrapping"
+
+/* This is the generic version of the cancellable syscall code, which
+   adds the label guards (__syscall_cancel_arch_{start,end}) used by the
+   SIGCANCEL handler (sigcancel_handler in nptl-init.c) to check whether the
+   cancelled syscall has side effects that need to be signaled to the program.
+
+   This implementation should be used as a reference to document the
+   implementation constraints: __syscall_cancel_arch_end should point to the
+   instruction immediately after the syscall instruction.  This is because
+   the kernel signals an interrupted syscall with side effects by setting
+   the signal frame program counter (in the ucontext_t third argument of an
+   SA_SIGINFO signal handler) to just after the syscall instruction.
+
+   If the INTERNAL_SYSCALL_NCS macro uses more instructions to get the
+   error condition from the kernel (as on powerpc and sparc), uses an
+   out-of-line helper (as on ARM thumb), or uses a kernel helper
+   gate (as on i686 or ia64), the architecture should adjust the
+   macro or provide a custom __syscall_cancel_arch implementation.  */
+long int
+__syscall_cancel_arch (volatile int *ch, __syscall_arg_t nr,
+		       __syscall_arg_t a1, __syscall_arg_t a2,
+		       __syscall_arg_t a3, __syscall_arg_t a4,
+		       __syscall_arg_t a5, __syscall_arg_t a6)
+{
+#define ADD_LABEL(__label)		\
+  asm volatile (			\
+    ".global " __label "\t\n"		\
+    __label ":\n");
+
+  ADD_LABEL ("__syscall_cancel_arch_start");
+  if (__glibc_unlikely (*ch & CANCELED_BITMASK))
+    __syscall_do_cancel();
+
+  INTERNAL_SYSCALL_DECL(err);
+  long int result = INTERNAL_SYSCALL_NCS (nr, err, 6, a1, a2, a3, a4, a5, a6);
+  ADD_LABEL ("__syscall_cancel_arch_end");
+  if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (result, err)))
+    return -INTERNAL_SYSCALL_ERRNO (result, err);
+  return result;
+}
+libc_hidden_def (__syscall_cancel_arch)
diff --git a/sysdeps/unix/sysv/linux/sysdep.h b/sysdeps/unix/sysv/linux/sysdep.h
index fc9af51..ef21239 100644
--- a/sysdeps/unix/sysv/linux/sysdep.h
+++ b/sysdeps/unix/sysv/linux/sysdep.h
@@ -17,6 +17,9 @@
 
 #include <bits/wordsize.h>
 #include <kernel-features.h>
+#ifndef __ASSEMBLER__
+#include <errno.h>
+#endif
 
 /* Set error number and return -1.  A target may choose to return the
    internal function, __syscall_error, which sets errno and returns -1.
@@ -27,6 +30,26 @@
     -1l;					\
   })
 
+/* Check error from cancellable syscall and set errno accordingly.
+   Linux uses a negative return value to indicate syscall errors
+   and since version 2.1 the return value of a system call might be
+   negative even if the call succeeded (e.g., the `lseek' system call
+   might return a large offset).
+   The current contract is that the kernel makes sure no syscall returns a
+   value in -1 .. -4095 as a valid result, so we can safely test against
+   -4095.  */
+#define SYSCALL_CANCEL_ERROR(__ret)		\
+  ((__ret) > -4096UL)
+
+#define SYSCALL_CANCEL_RET(__ret)		\
+  ({						\
+    if (SYSCALL_CANCEL_ERROR ((__ret)))		\
+      {						\
+	__set_errno (-(__ret));			\
+	__ret = -1;				\
+      }						\
+    __ret;					\
+   })
+
 /* Provide a dummy argument that can be used to force register
    alignment for register pairs if required by the syscall ABI.  */
 #ifdef __ASSUME_ALIGNED_REGISTER_PAIRS
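
The -4095UL test above encodes the Linux convention that a raw syscall return value in the range -4095 .. -1 is -errno, while anything else (including other negative values reinterpreted as unsigned) is a valid result.  A standalone sketch of the same mapping:

  #include <errno.h>
  #include <stdio.h>

  static long int
  raw_to_errno (long int sc_ret)
  {
    /* Same test as SYSCALL_CANCEL_ERROR above.  */
    if ((unsigned long int) sc_ret > -4096UL)
      {
        errno = -sc_ret;
        return -1;
      }
    return sc_ret;
  }

  int
  main (void)
  {
    long int r = raw_to_errno (-EINTR);
    printf ("%ld (errno=%d)\n", r, errno);   /* -1 (errno=EINTR) */
    printf ("%ld\n", raw_to_errno (42));     /* 42 */
    return 0;
  }
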


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [glibc/azanella/bz12683] nptl: Fix Race conditions in pthread cancellation (BZ#12683)
@ 2019-08-19 20:35 Adhemerval Zanella
  0 siblings, 0 replies; 6+ messages in thread
From: Adhemerval Zanella @ 2019-08-19 20:35 UTC (permalink / raw)
  To: glibc-cvs

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=5cd632301aefb391d3773ec4655f5cf5ab3abd3a

commit 5cd632301aefb391d3773ec4655f5cf5ab3abd3a
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Fri Sep 18 18:26:35 2015 -0300

    nptl: Fix Race conditions in pthread cancellation (BZ#12683)
    
    This patch is the initial fix for race conditions in NPTL cancellation code
    by redefining how cancellable syscalls are defined and handled.  The current
    buggy approach is to enable asynchronous cancellation prior to making the
    syscall and restore the previous cancellation type once the syscall returns.
    
    As described in BZ#12683, this approach shows 2 important problems:
    
      1. Cancellation can act after the syscall has returned from the kernel,
         but before userspace saves the return value.  It might result in a
         resource leak if the syscall allocated a resource or had a side effect
         (partial read/write), and there is no way for the program to handle it
         with cancellation handlers.
    
      2. If a signal is handled while the thread is blocked at a cancellable
         syscall, the entire signal handler runs with asynchronous cancellation
         enabled.  This can lead to issues if the signal handler call functions
         which are async-signal-safe but not async-cancel-safe.
    
    For cancellation to work correctly, there are 5 points at which the
    cancellation signal could arrive:
    
      1. Before the final "testcancel" and before the syscall is made.
      2. Between the "testcancel" and the syscall.
      3. While the syscall is blocked and no side effects have yet taken place.
      4. While the syscall is blocked but with some side effects already having
         taken place (e.g. a partial read or write).
      5. After the syscall has returned.
    
    And GLIBC wants to act on cancellation in cases 1, 2, and 3 but not in case
    4 or 5.  The proposed solution follows:
    
      * Handling case 1 is trivial: do a conditional branch based on whether the
        thread has received a cancellation request;
    
      * Case 2 can be caught by the signal handler determining that the saved
        program counter (from the ucontext_t) is in some address range beginning
        just before the "testcancel" and ending with the syscall instruction.
    
      * For case 3, except for certain syscalls that ALWAYS fail with EINTR
        even for non-interrupting signals, the kernel will reset the program
        counter to point at the syscall instruction during signal handling, so
        that the syscall is restarted when the signal handler returns.  So, from
        the signal handler's standpoint, this looks the same as case 2, and thus
        it's taken care of.
    
      * For case 4, the kernel cannot restart the syscall; when it's
        interrupted by a signal, the kernel must cause the syscall to return
        with whatever partial result it obtained (e.g. partial read or write).
    
      * For case 5, the saved program counter points just after the syscall
        instruction, so the signal handler won't act on cancellation.  This
        case is handled like case 4, since the program counter is already past
        the syscall instruction.
    
    Another case that needs handling is syscalls that fail with EINTR even
    when the signal handler is non-interrupting. In this case, the syscall
    wrapper code can just check the cancellation flag when the errno result
    is EINTR, and act on cancellation if it's set.
    
    The proposed GLIBC adjustments are:
    
      1. Remove the enable_asynccancel/disable_asynccancel function usage in
         syscall definitions and instead make them call a common symbol that
         will check if cancellation is enabled (__syscall_cancel at
         nptl/libc-cancellation.c), call the arch-specific cancellable
         entry-point (__syscall_cancel_arch), and cancel the thread when
         required.
    
      2. Provide an arch-specific symbol that contains global markers.  These
         markers will be used in the SIGCANCEL handler to check whether the
         interruption occurred inside a valid syscall and whether the syscall
         has completed or not.
    
         A reference implementation, sysdeps/unix/sysv/linux/syscall_cancel.c,
         is provided.  However, the markers may not be set in the expected
         places depending on how INTERNAL_SYSCALL_NCS is implemented by the
         underlying architecture, and it uses a compiler-specific construct
         (asm volatile) to place the required markers.  It is expected that all
         architectures implement an arch-specific version.
    
      3. Rewrite the SIGCANCEL asynchronous handler to check both the cancelling
         type and whether the current IP from the signal context falls between
         the global markers, and act accordingly (sigcancel_handler at
         nptl/nptl-init.c).
    
      4. Adjust nptl/pthread_cancel.c to send a signal instead of acting
         directly.  This avoids synchronization issues when updating the
         cancellation status and also focuses the logic on the signal handler
         and cancellation syscall code.
    
      5. Adjust pthread code to replace CANCEL_ASYNC/CANCEL_RESET calls with
         the appropriate cancellable futex syscalls.
    
      6. Adjust libc code to replace LIBC_CANCEL_ASYNC/LIBC_CANCEL_RESET with
         the appropriate cancellable syscalls.
    
      7. Adjust 'lowlevellock-futex.h' arch-specific implementations to provide
         cancelable futex calls (used in libpthread code).
    
    This patch adds the proposed changes to NPTL common code, and the following
    patches add the required arch-specific bits.  The builds for ia64-linux-gnu,
    mips-*, and x86_64-* are broken without the arch-specific patches.
    
    	[BZ #12683]
    	* manual/llio.texi: Adjust comments about
    	pthread_enable_asynccancel and pthread_disable_asynccancel.
    	* nptl/Makefile (routines): Add syscall_cancel object.
    	(libpthread-routines): Add pthread_kill_internal and remove
    	cancellation object.
    	(CFLAGS-cancellation.c): Remove rule.
    	(tests): Add tst-cancel28.
    	* nptl/Versions [GLIBC_PRIVATE] (libc): Add __syscall_cancel,
    	__syscall_cancel_arch_start, and __syscall_cancel_arch_end.
    	* nptl/cancellation.c: Remove file.
    	* sysdeps/nptl/librt-cancellation.c: Likewise.
    	* nptl/descr.h (CANCELING_BIT, CANCELING_BITMASK): Remove define.
    	(CANCELED_BIT, EXITING_BIT, TERMINATED_BIT, SETXID_BIT,
    	CANCEL_RESTMASK): Adjust values after CANCELING_BIT removal.
    	* nptl/libc-cancellation.c (__syscall_cancel): New symbol: symbol
    	bridge for cancellable syscalls.
    	(__syscall_do_cancel): New symbol.
    	* nptl/lll_timedwait_tid.c (__lll_timedwait_tid): Use cancellable
    	futex operation.
    	* nptl/nptl-init.c (sigcancel_handler): Rewrite function to avoid race
    	conditions.
    	(__pthread_initialize_minimal_internal): Add SA_RESTART to SIGCANCEL
    	handler.
    	* nptl/pthreadP.h (__do_cancel): Rewrite to both disable asynchronous
    	cancellation and setting the thread as cancelled.
    	(__do_cancel_with_result): New function.
    	(CANCEL_ASYNC, CANCEL_RESET, LIBC_CANCEL_ASYNC, LIBC_CANCEL_RESET,
    	LIBC_CANCEL_HANDLED): Remove macros.
    	(__syscall_cancel_arch, __syscall_do_cancel, __pthread_kill_internal):
    	New prototypes.
    	(__pthread_enable_asynccancel, __pthread_disable_asynccancel,
    	__libc_enable_asynccancel, __libc_disable_asynccancel,
    	__librt_enable_asynccancel, __librt_disable_asynccancel): Remove
    	prototypes.
    	* nptl/pthread_cancel.c (pthread_cancel): Rewrite to just set
    	CANCELED_BIT and call __pthread_kill.
    	* nptl/pthread_create.c (START_THREAD_DEFN): Likewise.
    	* nptl/pthread_exit.c (__pthread_exit): Call __do_cancel_with_result.
    	* nptl/pthread_join_common.c (__pthread_timedjoin_ex): Likewise.
    	* nptl/pthread_kill.c (__pthread_kill): Check internal signals with
    	__is_internal_signal, tail call __pthread_kill_internal, and remove
    	stub_warning.
    	* nptl/pthread_kill_internal.c: New file.
    	* nptl/tst-cancel28.c: Likewise.
    	* sysdeps/unix/sysv/linux/pthread_kill_internal.c: Likewise.
    	* sysdeps/unix/sysv/linux/syscall_cancel.c: Likewise.
    	* rt/Makefile [CFLAGS-librt-cancellation.c]: Remove rule.
    	* sysdeps/generic/sysdep-cancel.h (LIBC_CANCEL_ASYNC,
    	LIBC_CANCEL_RESET): Remove define.
    	* sysdeps/htl/pthreadP.h (__pthread_kill_internal): New prototype.
    	* sysdeps/nptl/Makefile [$(subdir) = rt] (librt-sysdep_routines):
    	Remove librt-cancellation object.
    	[$(subdir) = rt] (CFLAGS-librt-cancellation.c): Remove rule.
    	* sysdeps/unix/sysdep.h (SYSCALL_CANCEL): Rewrite to call
    	__syscall_cancel.
    	(INTERNAL_SYSCALL_NCS_CALL, __INTERNAL_SYSCALL_NCS*,
    	__SYSCALL_CANCEL*): New macros.
    	* nptl/thrd_sleep.c (thrd_sleep): Adjust to the semantics of
    	INTERNAL_SYSCALL_CANCEL.
    	* sysdeps/nptl/cancellation-pc-check.h: New file.
    	* sysdeps/nptl/cancellation-sigmask.h: Likewise.
    	* sysdeps/unix/sysv/linux/clock_nanosleep.c (__clock_nanosleep):
    	Likewise.
    	* sysdeps/unix/sysv/linux/futex-internal.h (futex_wait_cancelable,
    	futex_reltimed_wait_cancelable, futex_abstimed_wait_cancelable): Use
    	cancelable futex wrapper.
    	* sysdeps/unix/sysv/linux/lowlevellock-futex.h (lll_futex_syscall_cp,
    	lll_futex_wait_cancel, lll_futex_timed_wait_cancel,
    	lll_futex_clock_wait_bitset_cancel): New macros.
    	* sysdeps/unix/sysv/linux/socketcall.h (SOCKETCALL): Use __SSC macros.
    	(SOCKETCALL_CANCEL): Use SYSCALL_CANCEL macros.
    	(__SOCKETCALL_CANCEL*): New macros.
    	* sysdeps/unix/sysv/linux/sysdep.h (SYSCALL_CANCEL_RET): New macro.

Diff:
---
 manual/llio.texi                                   |   4 +-
 nptl/Makefile                                      |  11 +-
 nptl/Versions                                      |   3 +
 nptl/cancellation.c                                | 101 ------------------
 nptl/descr.h                                       |  15 ++-
 nptl/libc-cancellation.c                           |  45 +++++++-
 nptl/nptl-init.c                                   |  88 ++++++++--------
 nptl/pthreadP.h                                    |  30 +++++-
 nptl/pthread_cancel.c                              |  68 +++---------
 nptl/pthread_create.c                              |   7 +-
 nptl/pthread_exit.c                                |   5 +-
 nptl/pthread_join_common.c                         |   2 +-
 nptl/pthread_kill.c                                |   7 +-
 .../pthread_kill_internal.c                        |  16 +--
 nptl/pthread_setcanceltype.c                       |   2 +-
 nptl/thrd_sleep.c                                  |   7 +-
 nptl/tst-cancel28.c                                |  99 ++++++++++++++++++
 rt/Makefile                                        |   1 -
 sysdeps/generic/sysdep-cancel.h                    |   2 -
 sysdeps/htl/pthreadP.h                             |   1 +
 sysdeps/nptl/Makefile                              |   3 +-
 sysdeps/nptl/cancellation-pc-check.h               |  40 +++++++
 sysdeps/nptl/cancellation-sigmask.h                |  30 ++++++
 sysdeps/unix/sysdep.h                              | 115 ++++++++++++++++-----
 sysdeps/unix/sysv/linux/clock_nanosleep.c          |   6 +-
 sysdeps/unix/sysv/linux/futex-internal.h           |  18 +---
 sysdeps/unix/sysv/linux/lowlevellock-futex.h       |  45 ++++++--
 .../{pthread_kill.c => pthread_kill_internal.c}    |  22 +---
 sysdeps/unix/sysv/linux/socketcall.h               |  40 ++++---
 sysdeps/unix/sysv/linux/syscall_cancel.c           |  62 +++++++++++
 sysdeps/unix/sysv/linux/sysdep.h                   |  20 ++++
 31 files changed, 575 insertions(+), 340 deletions(-)

diff --git a/manual/llio.texi b/manual/llio.texi
index 447126b..ecf1753 100644
--- a/manual/llio.texi
+++ b/manual/llio.texi
@@ -2534,13 +2534,13 @@ aiocb64}, since the LFS transparently replaces the old interface.
 @c     sigemptyset ok
 @c     sigaddset ok
 @c     setjmp ok
-@c     CANCEL_ASYNC -> pthread_enable_asynccancel ok
+@c     __pthread_setcanceltype ok
 @c      do_cancel ok
 @c       pthread_unwind ok
 @c        Unwind_ForcedUnwind or longjmp ok [@ascuheap @acsmem?]
 @c     lll_lock @asulock @aculock
 @c     lll_unlock @asulock @aculock
-@c     CANCEL_RESET -> pthread_disable_asynccancel ok
+@c     __pthread_setcanceltype ok
 @c      lll_futex_wait ok
 @c     ->start_routine ok -----
 @c     call_tls_dtors @asulock @ascuheap @aculock @acsmem
diff --git a/nptl/Makefile b/nptl/Makefile
index a643306..06a6215 100644
--- a/nptl/Makefile
+++ b/nptl/Makefile
@@ -31,7 +31,7 @@ routines = alloca_cutoff forward libc-lowlevellock libc-cancellation \
 	   libc-cleanup libc_pthread_init libc_multiple_threads \
 	   register-atfork pthread_atfork pthread_self thrd_current \
 	   thrd_equal thrd_sleep thrd_yield pthread_equal \
-	   pthread_attr_destroy
+	   pthread_attr_destroy syscall_cancel
 shared-only-routines = forward
 static-only-routines = pthread_atfork
 
@@ -106,7 +106,8 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
 		      pthread_barrierattr_setpshared \
 		      pthread_key_create pthread_key_delete \
 		      pthread_getspecific pthread_setspecific \
-		      pthread_sigmask pthread_kill pthread_sigqueue \
+		      pthread_sigmask pthread_kill pthread_kill_internal \
+		      pthread_sigqueue \
 		      pthread_cancel pthread_testcancel \
 		      pthread_setcancelstate pthread_setcanceltype \
 		      pthread_once \
@@ -120,7 +121,6 @@ libpthread-routines = nptl-init nptlfreeres vars events version pt-interp \
 		      cleanup cleanup_defer cleanup_compat \
 		      cleanup_defer_compat unwind \
 		      pt-longjmp pt-cleanup\
-		      cancellation \
 		      lowlevellock \
 		      lll_timedlock_wait \
 		      pt-fork pt-fcntl \
@@ -174,8 +174,7 @@ CFLAGS-pthread_setcanceltype.c += -fexceptions -fasynchronous-unwind-tables
 
 # These are internal functions which similar functionality as setcancelstate
 # and setcanceltype.
-CFLAGS-cancellation.c += -fasynchronous-unwind-tables
-CFLAGS-libc-cancellation.c += -fasynchronous-unwind-tables
+CFLAGS-libc-cancellation.c += -fexceptions -fasynchronous-unwind-tables
 
 # Calling pthread_exit() must cause the registered cancel handlers to
 # be executed.  Therefore exceptions have to be thrown through this
@@ -289,7 +288,7 @@ tests = tst-attr1 tst-attr2 tst-attr3 tst-default-attr \
 	tst-cancel11 tst-cancel12 tst-cancel13 tst-cancel14 tst-cancel15 \
 	tst-cancel16 tst-cancel17 tst-cancel18 tst-cancel19 tst-cancel20 \
 	tst-cancel21 tst-cancel22 tst-cancel23 tst-cancel24 tst-cancel25 \
-	tst-cancel26 tst-cancel27 \
+	tst-cancel26 tst-cancel27 tst-cancel28 \
 	tst-cancel-self tst-cancel-self-cancelstate \
 	tst-cancel-self-canceltype tst-cancel-self-testcancel \
 	tst-cleanup0 tst-cleanup1 tst-cleanup2 tst-cleanup3 tst-cleanup4 \
diff --git a/nptl/Versions b/nptl/Versions
index 50d6717..cc9779e 100644
--- a/nptl/Versions
+++ b/nptl/Versions
@@ -39,6 +39,9 @@ libc {
     __libc_pthread_init;
     __libc_current_sigrtmin_private; __libc_current_sigrtmax_private;
     __libc_allocate_rtsig_private;
+    __syscall_cancel;
+    __syscall_cancel_arch_start;
+    __syscall_cancel_arch_end;
   }
 }
 
diff --git a/nptl/cancellation.c b/nptl/cancellation.c
deleted file mode 100644
index 9c3704d..0000000
--- a/nptl/cancellation.c
+++ /dev/null
@@ -1,101 +0,0 @@
-/* Copyright (C) 2002-2019 Free Software Foundation, Inc.
-   This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
-
-   The GNU C Library is free software; you can redistribute it and/or
-   modify it under the terms of the GNU Lesser General Public
-   License as published by the Free Software Foundation; either
-   version 2.1 of the License, or (at your option) any later version.
-
-   The GNU C Library is distributed in the hope that it will be useful,
-   but WITHOUT ANY WARRANTY; without even the implied warranty of
-   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
-   Lesser General Public License for more details.
-
-   You should have received a copy of the GNU Lesser General Public
-   License along with the GNU C Library; if not, see
-   <http://www.gnu.org/licenses/>.  */
-
-#include <setjmp.h>
-#include <stdlib.h>
-#include "pthreadP.h"
-#include <futex-internal.h>
-
-
-/* The next two functions are similar to pthread_setcanceltype() but
-   more specialized for the use in the cancelable functions like write().
-   They do not need to check parameters etc.  */
-int
-attribute_hidden
-__pthread_enable_asynccancel (void)
-{
-  struct pthread *self = THREAD_SELF;
-  int oldval = THREAD_GETMEM (self, cancelhandling);
-
-  while (1)
-    {
-      int newval = oldval | CANCELTYPE_BITMASK;
-
-      if (newval == oldval)
-	break;
-
-      int curval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling, newval,
-					      oldval);
-      if (__glibc_likely (curval == oldval))
-	{
-	  if (CANCEL_ENABLED_AND_CANCELED_AND_ASYNCHRONOUS (newval))
-	    {
-	      THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-	      __do_cancel ();
-	    }
-
-	  break;
-	}
-
-      /* Prepare the next round.  */
-      oldval = curval;
-    }
-
-  return oldval;
-}
-
-
-void
-attribute_hidden
-__pthread_disable_asynccancel (int oldtype)
-{
-  /* If asynchronous cancellation was enabled before we do not have
-     anything to do.  */
-  if (oldtype & CANCELTYPE_BITMASK)
-    return;
-
-  struct pthread *self = THREAD_SELF;
-  int newval;
-
-  int oldval = THREAD_GETMEM (self, cancelhandling);
-
-  while (1)
-    {
-      newval = oldval & ~CANCELTYPE_BITMASK;
-
-      int curval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling, newval,
-					      oldval);
-      if (__glibc_likely (curval == oldval))
-	break;
-
-      /* Prepare the next round.  */
-      oldval = curval;
-    }
-
-  /* We cannot return when we are being canceled.  Upon return the
-     thread might be things which would have to be undone.  The
-     following loop should loop until the cancellation signal is
-     delivered.  */
-  while (__builtin_expect ((newval & (CANCELING_BITMASK | CANCELED_BITMASK))
-			   == CANCELING_BITMASK, 0))
-    {
-      futex_wait_simple ((unsigned int *) &self->cancelhandling, newval,
-			 FUTEX_PRIVATE);
-      newval = THREAD_GETMEM (self, cancelhandling);
-    }
-}
diff --git a/nptl/descr.h b/nptl/descr.h
index b4db99f..01b46a8 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -275,23 +275,20 @@ struct pthread
   /* Bit set if asynchronous cancellation mode is selected.  */
 #define CANCELTYPE_BIT		1
 #define CANCELTYPE_BITMASK	(0x01 << CANCELTYPE_BIT)
-  /* Bit set if canceling has been initiated.  */
-#define CANCELING_BIT		2
-#define CANCELING_BITMASK	(0x01 << CANCELING_BIT)
-  /* Bit set if canceled.  */
-#define CANCELED_BIT		3
+  /* Bit set if thread is canceled.  */
+#define CANCELED_BIT		2
 #define CANCELED_BITMASK	(0x01 << CANCELED_BIT)
   /* Bit set if thread is exiting.  */
-#define EXITING_BIT		4
+#define EXITING_BIT		3
 #define EXITING_BITMASK		(0x01 << EXITING_BIT)
   /* Bit set if thread terminated and TCB is freed.  */
-#define TERMINATED_BIT		5
+#define TERMINATED_BIT		4
 #define TERMINATED_BITMASK	(0x01 << TERMINATED_BIT)
   /* Bit set if thread is supposed to change XID.  */
-#define SETXID_BIT		6
+#define SETXID_BIT		5
 #define SETXID_BITMASK		(0x01 << SETXID_BIT)
   /* Mask for the rest.  Helps the compiler to optimize.  */
-#define CANCEL_RESTMASK		0xffffff80
+#define CANCEL_RESTMASK		0xffffffc0
 
 #define CANCEL_ENABLED_AND_CANCELED(value) \
   (((value) & (CANCELSTATE_BITMASK | CANCELED_BITMASK | EXITING_BITMASK	      \
diff --git a/nptl/libc-cancellation.c b/nptl/libc-cancellation.c
index 3baf4fe..b3fc600 100644
--- a/nptl/libc-cancellation.c
+++ b/nptl/libc-cancellation.c
@@ -18,7 +18,46 @@
 
 #include "pthreadP.h"
 
+/* Cancellation function called by all cancellable syscalls.  */
+long int
+__syscall_cancel (__syscall_arg_t nr, __syscall_arg_t a1,
+		  __syscall_arg_t a2, __syscall_arg_t a3,
+		  __syscall_arg_t a4, __syscall_arg_t a5,
+		  __syscall_arg_t a6)
+{
+  pthread_t self = (pthread_t) THREAD_SELF;
+  struct pthread *pd = (struct pthread *) self;
+  long int result;
 
-#define __pthread_enable_asynccancel __libc_enable_asynccancel
-#define __pthread_disable_asynccancel __libc_disable_asynccancel
-#include <nptl/cancellation.c>
+  /* If cancellation is not enabled, call the syscall directly.  */
+  if (pd->cancelhandling & CANCELSTATE_BITMASK)
+    {
+      INTERNAL_SYSCALL_DECL (err);
+      result = INTERNAL_SYSCALL_NCS_CALL (nr, err, a1, a2, a3, a4, a5, a6);
+      if (INTERNAL_SYSCALL_ERROR_P (result, err))
+	return -INTERNAL_SYSCALL_ERRNO (result, err);
+      return result;
+    }
+
+  /* Call the arch-specific entry point that contains the global markers
+     to be checked by the SIGCANCEL handler.  */
+  result = __syscall_cancel_arch (&pd->cancelhandling, nr, a1, a2, a3, a4, a5,
+			          a6);
+
+  if ((result == -EINTR)
+      && (pd->cancelhandling & CANCELED_BITMASK)
+      && !(pd->cancelhandling & CANCELSTATE_BITMASK))
+    __do_cancel ();
+
+  return result;
+}
+libc_hidden_def (__syscall_cancel)
+
+/* Since __do_cancel is an always-inline function, this creates a symbol
+   that the arch-specific code can call to cancel the thread.  */
+_Noreturn void
+attribute_hidden
+__syscall_do_cancel (void)
+{
+  __do_cancel ();
+}
diff --git a/nptl/nptl-init.c b/nptl/nptl-init.c
index 8fc4f46..6e9975e 100644
--- a/nptl/nptl-init.c
+++ b/nptl/nptl-init.c
@@ -39,6 +39,9 @@
 #include <libc-pointer-arith.h>
 #include <pthread-pids.h>
 #include <pthread_mutex_conf.h>
+#include <sigcontextinfo.h>
+#include <cancellation-sigmask.h>
+#include <cancellation-pc-check.h>
 
 #ifndef TLS_MULTIPLE_THREADS_IN_TCB
 /* Pointer to the corresponding variable in libc.  */
@@ -165,35 +168,23 @@ sigcancel_handler (int sig, siginfo_t *si, void *ctx)
 
   struct pthread *self = THREAD_SELF;
 
-  int oldval = THREAD_GETMEM (self, cancelhandling);
-  while (1)
-    {
-      /* We are canceled now.  When canceled by another thread this flag
-	 is already set but if the signal is directly send (internally or
-	 from another process) is has to be done here.  */
-      int newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
-
-      if (oldval == newval || (oldval & EXITING_BITMASK) != 0)
-	/* Already canceled or exiting.  */
-	break;
-
-      int curval = THREAD_ATOMIC_CMPXCHG_VAL (self, cancelhandling, newval,
-					      oldval);
-      if (curval == oldval)
-	{
-	  /* Set the return value.  */
-	  THREAD_SETMEM (self, result, PTHREAD_CANCELED);
-
-	  /* Make sure asynchronous cancellation is still enabled.  */
-	  if ((newval & CANCELTYPE_BITMASK) != 0)
-	    /* Run the registered destructors and terminate the thread.  */
-	    __do_cancel ();
-
-	  break;
-	}
-
-      oldval = curval;
-    }
+  if (((self->cancelhandling & (CANCELSTATE_BITMASK)) != 0)
+      || ((self->cancelhandling & CANCELED_BITMASK) == 0))
+    return;
+
+  /* Add SIGCANCEL to the ucontext blocked signal mask so the handler is
+     not called again when the context is restored.  */
+  ucontext_add_cancel (ctx);
+
+  /* Check if asynchronous cancellation mode is set or if the interrupted
+     instruction pointer falls within the cancellable syscall bridge.  For
+     interruptible syscalls that might generate external side effects
+     (partial reads or writes, for instance), the kernel will set the IP to
+     after '__syscall_cancel_arch_end', thus disabling cancellation and
+     allowing the process to handle such conditions.  */
+  if (self->cancelhandling & CANCELTYPE_BITMASK
+      || ucontext_check_pc_boundary (ctx))
+    __do_cancel ();
 }
 #endif
 
@@ -296,38 +287,49 @@ __pthread_initialize_minimal_internal (void)
   THREAD_SETMEM (pd, report_events, __nptl_initial_report_events);
 
 #if defined SIGCANCEL || defined SIGSETXID
-  struct sigaction sa;
-  __sigemptyset (&sa.sa_mask);
 
 # ifdef SIGCANCEL
   /* Install the cancellation signal handler.  If for some reason we
      cannot install the handler we do not abort.  Maybe we should, but
      it is only asynchronous cancellation which is affected.  */
-  sa.sa_sigaction = sigcancel_handler;
-  sa.sa_flags = SA_SIGINFO;
-  (void) __libc_sigaction (SIGCANCEL, &sa, NULL);
+  {
+    struct sigaction sa;
+    sa.sa_sigaction = sigcancel_handler;
+    /* The signal handler should be non-interruptible to avoid spurious
+       EINTR caused by SIGCANCEL sent to the process or by pthread_cancel
+       called while cancellation is disabled in the target thread.  */
+    sa.sa_flags = SA_SIGINFO | SA_RESTART;
+    sa.sa_mask = SIGALL_SET;
+    __libc_sigaction (SIGCANCEL, &sa, NULL);
+  }
 # endif
 
 # ifdef SIGSETXID
-  /* Install the handle to change the threads' uid/gid.  */
-  sa.sa_sigaction = sighandler_setxid;
-  sa.sa_flags = SA_SIGINFO | SA_RESTART;
-  (void) __libc_sigaction (SIGSETXID, &sa, NULL);
+  {
+    /* Install the handle to change the threads' uid/gid.  */
+    struct sigaction sa;
+    __sigemptyset (&sa.sa_mask);
+    sa.sa_sigaction = sighandler_setxid;
+    sa.sa_flags = SA_SIGINFO | SA_RESTART;
+    __libc_sigaction (SIGSETXID, &sa, NULL);
+  }
 # endif
 
   /* The parent process might have left the signals blocked.  Just in
      case, unblock it.  We reuse the signal mask in the sigaction
      structure.  It is already cleared.  */
+  {
+    struct sigaction sa;
+    __sigemptyset (&sa.sa_mask);
 # ifdef SIGCANCEL
-  __sigaddset (&sa.sa_mask, SIGCANCEL);
+    __sigaddset (&sa.sa_mask, SIGCANCEL);
 # endif
 # ifdef SIGSETXID
-  __sigaddset (&sa.sa_mask, SIGSETXID);
+    __sigaddset (&sa.sa_mask, SIGSETXID);
 # endif
-  {
     INTERNAL_SYSCALL_DECL (err);
-    (void) INTERNAL_SYSCALL (rt_sigprocmask, err, 4, SIG_UNBLOCK, &sa.sa_mask,
-			     NULL, _NSIG / 8);
+    INTERNAL_SYSCALL_CALL (rt_sigprocmask, err, SIG_UNBLOCK, &sa.sa_mask,
+			   NULL, _NSIG / 8);
   }
 #endif
 
diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
index d80662a..446faae 100644
--- a/nptl/pthreadP.h
+++ b/nptl/pthreadP.h
@@ -296,20 +296,39 @@ extern void __nptl_unwind_freeres (void) attribute_hidden;
 #endif
 
 
-/* Called when a thread reacts on a cancellation request.  */
 static inline void
 __attribute ((noreturn, always_inline))
-__do_cancel (void)
+__do_cancel_with_result (void *result)
 {
   struct pthread *self = THREAD_SELF;
 
-  /* Make sure we get no more cancellations.  */
+  /* Make sure we get no more cancellations by clearing the cancel
+     state.  */
+  THREAD_ATOMIC_BIT_SET (self, cancelhandling, CANCELSTATE_BIT);
+
   THREAD_ATOMIC_BIT_SET (self, cancelhandling, EXITING_BIT);
 
+  THREAD_SETMEM (self, result, result);
+
   __pthread_unwind ((__pthread_unwind_buf_t *)
 		    THREAD_GETMEM (self, cleanup_jmp_buf));
 }
 
+/* Called when a thread reacts on a cancellation request.  */
+static inline void
+__attribute ((noreturn, always_inline))
+__do_cancel (void)
+{
+  __do_cancel_with_result (PTHREAD_CANCELED);
+}
+
+extern long int __syscall_cancel_arch (volatile int *, __syscall_arg_t nr,
+     __syscall_arg_t arg1, __syscall_arg_t arg2, __syscall_arg_t arg3,
+     __syscall_arg_t arg4, __syscall_arg_t arg5, __syscall_arg_t arg6);
+libc_hidden_proto (__syscall_cancel_arch);
+
+extern _Noreturn void __syscall_do_cancel (void)
+     attribute_hidden;
 
 /* Internal prototypes.  */
 
@@ -469,11 +488,11 @@ extern int __pthread_equal (pthread_t thread1, pthread_t thread2);
 extern int __pthread_detach (pthread_t th);
 extern int __pthread_cancel (pthread_t th);
 extern int __pthread_kill (pthread_t threadid, int signo);
+extern int __pthread_kill_internal (pthread_t threadid, int signo)
+  attribute_hidden;
 extern void __pthread_exit (void *value) __attribute__ ((__noreturn__));
 extern int __pthread_join (pthread_t threadid, void **thread_return);
 extern int __pthread_setcanceltype (int type, int *oldtype);
-extern int __pthread_enable_asynccancel (void) attribute_hidden;
-extern void __pthread_disable_asynccancel (int oldtype) attribute_hidden;
 extern void __pthread_testcancel (void);
 extern int __pthread_timedjoin_ex (pthread_t, void **, const struct timespec *,
 				   bool);
@@ -496,6 +515,7 @@ hidden_proto (__pthread_testcancel)
 hidden_proto (__pthread_mutexattr_init)
 hidden_proto (__pthread_mutexattr_settype)
 hidden_proto (__pthread_timedjoin_ex)
+hidden_proto (__pthread_kill_internal)
 #endif
 
 extern int __pthread_cond_broadcast_2_0 (pthread_cond_2_0_t *cond);
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index ce6a283..0f1ec49 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -37,67 +37,23 @@ __pthread_cancel (pthread_t th)
 #ifdef SHARED
   pthread_cancel_init ();
 #endif
-  int result = 0;
-  int oldval;
-  int newval;
-  do
-    {
-    again:
-      oldval = pd->cancelhandling;
-      newval = oldval | CANCELING_BITMASK | CANCELED_BITMASK;
 
-      /* Avoid doing unnecessary work.  The atomic operation can
-	 potentially be expensive if the bug has to be locked and
-	 remote cache lines have to be invalidated.  */
-      if (oldval == newval)
-	break;
+  THREAD_ATOMIC_BIT_SET (pd, cancelhandling, CANCELED_BIT);
 
-      /* If the cancellation is handled asynchronously just send a
-	 signal.  We avoid this if possible since it's more
-	 expensive.  */
-      if (CANCEL_ENABLED_AND_CANCELED_AND_ASYNCHRONOUS (newval))
-	{
-	  /* Mark the cancellation as "in progress".  */
-	  if (atomic_compare_and_exchange_bool_acq (&pd->cancelhandling,
-						    oldval | CANCELING_BITMASK,
-						    oldval))
-	    goto again;
-
-#ifdef SIGCANCEL
-	  /* The cancellation handler will take care of marking the
-	     thread as canceled.  */
-	  pid_t pid = __getpid ();
-
-	  INTERNAL_SYSCALL_DECL (err);
-	  int val = INTERNAL_SYSCALL_CALL (tgkill, err, pid, pd->tid,
-					   SIGCANCEL);
-	  if (INTERNAL_SYSCALL_ERROR_P (val, err))
-	    result = INTERNAL_SYSCALL_ERRNO (val, err);
-#else
-          /* It should be impossible to get here at all, since
-             pthread_setcanceltype should never have allowed
-             PTHREAD_CANCEL_ASYNCHRONOUS to be set.  */
-          abort ();
-#endif
-
-	  break;
-	}
-
-	/* A single-threaded process should be able to kill itself, since
-	   there is nothing in the POSIX specification that says that it
-	   cannot.  So we set multiple_threads to true so that cancellation
-	   points get executed.  */
-	THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
+  /* A single-threaded process should be able to kill itself, since there is
+     nothing in the POSIX specification that says that it cannot.  So we set
+     multiple_threads to true so that cancellation points get executed.  */
+  THREAD_SETMEM (THREAD_SELF, header.multiple_threads, 1);
 #ifndef TLS_MULTIPLE_THREADS_IN_TCB
-	__pthread_multiple_threads = *__libc_multiple_threads_ptr = 1;
+  __pthread_multiple_threads = *__libc_multiple_threads_ptr = 1;
 #endif
-    }
-  /* Mark the thread as canceled.  This has to be done
-     atomically since other bits could be modified as well.  */
-  while (atomic_compare_and_exchange_bool_acq (&pd->cancelhandling, newval,
-					       oldval));
 
-  return result;
+  /* Avoid signaling when the thread attempts to cancel itself
+     (pthread_kill is expensive).  */
+  if (pd == THREAD_SELF && !(pd->cancelhandling & CANCELTYPE_BITMASK))
+    return 0;
+
+  return __pthread_kill_internal (th, SIGCANCEL);
 }
 weak_alias (__pthread_cancel, pthread_cancel)
 
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 18b7bbe..22154d0 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -406,7 +406,7 @@ START_THREAD_DEFN
   /* If the parent was running cancellation handlers while creating
      the thread the new thread inherited the signal mask.  Reset the
      cancellation signal mask.  */
-  if (__glibc_unlikely (pd->parent_cancelhandling & CANCELING_BITMASK))
+  if (__glibc_unlikely (pd->parent_cancelhandling & CANCELED_BITMASK))
     {
       INTERNAL_SYSCALL_DECL (err);
       sigset_t mask;
@@ -449,7 +449,8 @@ START_THREAD_DEFN
 	 have ownership (see CONCURRENCY NOTES above).  */
       if (__glibc_unlikely (pd->stopped_start))
 	{
-	  int oldtype = CANCEL_ASYNC ();
+	  int ct;
+	  __pthread_setcanceltype (PTHREAD_CANCEL_ASYNCHRONOUS, &ct);
 
 	  /* Get the lock the parent locked to force synchronization.  */
 	  lll_lock (pd->lock, LLL_PRIVATE);
@@ -459,7 +460,7 @@ START_THREAD_DEFN
 	  /* And give it up right away.  */
 	  lll_unlock (pd->lock, LLL_PRIVATE);
 
-	  CANCEL_RESET (oldtype);
+	  __pthread_setcanceltype (ct, NULL);
 	}
 
       LIBC_PROBE (pthread_start, 3, (pthread_t) pd, pd->start_routine, pd->arg);
diff --git a/nptl/pthread_exit.c b/nptl/pthread_exit.c
index abc9019..912bdbf 100644
--- a/nptl/pthread_exit.c
+++ b/nptl/pthread_exit.c
@@ -16,16 +16,13 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <stdlib.h>
 #include "pthreadP.h"
 
 
 void
 __pthread_exit (void *value)
 {
-  THREAD_SETMEM (THREAD_SELF, result, value);
-
-  __do_cancel ();
+  __do_cancel_with_result (value);
 }
 weak_alias (__pthread_exit, pthread_exit)
 
diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
index 5224ee2..0c08335 100644
--- a/nptl/pthread_join_common.c
+++ b/nptl/pthread_join_common.c
@@ -100,7 +100,7 @@ __pthread_timedjoin_ex (pthread_t threadid, void **thread_return,
   if ((pd == self
        || (self->joinid == pd
 	   && (pd->cancelhandling
-	       & (CANCELING_BITMASK | CANCELED_BITMASK | EXITING_BITMASK
+	       & (CANCELED_BITMASK | EXITING_BITMASK
 		  | TERMINATED_BITMASK)) == 0))
       && !CANCEL_ENABLED_AND_CANCELED (self->cancelhandling))
     /* This is a deadlock situation.  The threads are waiting for each
diff --git a/nptl/pthread_kill.c b/nptl/pthread_kill.c
index 441527b..7f74901 100644
--- a/nptl/pthread_kill.c
+++ b/nptl/pthread_kill.c
@@ -31,8 +31,9 @@ __pthread_kill (pthread_t threadid, int signo)
     /* Not a valid thread handle.  */
     return ESRCH;
 
-  return ENOSYS;
+  if (__is_internal_signal (signo))
+    return EINVAL;
+
+  return __pthread_kill_internal (threadid, signo);
 }
 strong_alias (__pthread_kill, pthread_kill)
-
-stub_warning (pthread_kill)
diff --git a/sysdeps/nptl/librt-cancellation.c b/nptl/pthread_kill_internal.c
similarity index 70%
rename from sysdeps/nptl/librt-cancellation.c
rename to nptl/pthread_kill_internal.c
index fc46977..8fbc17d 100644
--- a/sysdeps/nptl/librt-cancellation.c
+++ b/nptl/pthread_kill_internal.c
@@ -1,6 +1,6 @@
-/* Copyright (C) 2002-2019 Free Software Foundation, Inc.
+/* Send a signal to a specific pthread.  Internal version.
+   Copyright (C) 2002-2019 Free Software Foundation, Inc.
    This file is part of the GNU C Library.
-   Contributed by Ulrich Drepper <drepper@redhat.com>, 2002.
 
    The GNU C Library is free software; you can redistribute it and/or
    modify it under the terms of the GNU Lesser General Public
@@ -16,9 +16,11 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <nptl/pthreadP.h>
+#include <pthreadP.h>
 
-
-#define __pthread_enable_asynccancel __librt_enable_asynccancel
-#define __pthread_disable_asynccancel __librt_disable_asynccancel
-#include <nptl/cancellation.c>
+int
+__pthread_kill_internal (pthread_t threadid, int signo)
+{
+  return ENOSYS;
+}
+hidden_def (__pthread_kill_internal)
diff --git a/nptl/pthread_setcanceltype.c b/nptl/pthread_setcanceltype.c
index 126bfd8..2531bb2 100644
--- a/nptl/pthread_setcanceltype.c
+++ b/nptl/pthread_setcanceltype.c
@@ -73,4 +73,4 @@ __pthread_setcanceltype (int type, int *oldtype)
 
   return 0;
 }
-strong_alias (__pthread_setcanceltype, pthread_setcanceltype)
+weak_alias (__pthread_setcanceltype, pthread_setcanceltype)
diff --git a/nptl/thrd_sleep.c b/nptl/thrd_sleep.c
index 07a5180..dbe182f 100644
--- a/nptl/thrd_sleep.c
+++ b/nptl/thrd_sleep.c
@@ -24,13 +24,12 @@
 int
 thrd_sleep (const struct timespec* time_point, struct timespec* remaining)
 {
-  INTERNAL_SYSCALL_DECL (err);
-  int ret = INTERNAL_SYSCALL_CANCEL (nanosleep, err, time_point, remaining);
-  if (INTERNAL_SYSCALL_ERROR_P (ret, err))
+  long int ret = INTERNAL_SYSCALL_CANCEL (nanosleep, time_point, remaining);
+  if (SYSCALL_CANCEL_ERROR (ret))
     {
       /* C11 states thrd_sleep function returns -1 if it has been interrupted
 	 by a signal, or a negative value if it fails.  */
-      ret = INTERNAL_SYSCALL_ERRNO (ret, err);
+      ret = -ret;
       if (ret == EINTR)
 	return -1;
       return -2;
diff --git a/nptl/tst-cancel28.c b/nptl/tst-cancel28.c
new file mode 100644
index 0000000..5ffbdbb
--- /dev/null
+++ b/nptl/tst-cancel28.c
@@ -0,0 +1,99 @@
+/* Check side-effect act for cancellable syscalls (BZ #12683).
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+/* This testcase checks for resource leakage when the syscall has returned
+   from kernelspace but userspace has not yet saved the return value.  The
+   'leaker' thread should be able to close the file descriptor if the
+   resource is already allocated, meaning that if the cancellation signal
+   arrives *after* the open syscall returns from the kernel, the side effect
+   must be visible to the application.  */
+
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+
+#include <support/xthread.h>
+#include <support/check.h>
+#include <support/temp_file.h>
+#include <support/support.h>
+#include <support/descriptors.h>
+
+static void *
+writeopener (void *arg)
+{
+  int fd;
+  for (;;)
+    {
+      fd = open (arg, O_WRONLY);
+      close (fd);
+    }
+  return NULL;
+}
+
+static void *
+leaker (void *arg)
+{
+  int fd = open (arg, O_RDONLY);
+  pthread_setcancelstate (PTHREAD_CANCEL_DISABLE, 0);
+  close (fd);
+  return NULL;
+}
+
+
+#define ITER_COUNT 1000
+#define MAX_FILENO 1024
+
+static int
+do_test (void)
+{
+  char *dir = support_create_temp_directory ("tst-cancel28");
+  char *name = xasprintf ("%s/fifo", dir);
+  TEST_COMPARE (mkfifo (name, 0600), 0);
+  add_temp_file (name);
+
+  struct support_descriptors *descrs = support_descriptors_list ();
+
+  srand (1);
+
+  xpthread_create (NULL, writeopener, name);
+  for (int i = 0; i < ITER_COUNT; i++)
+    {
+      pthread_t td = xpthread_create (NULL, leaker, name);
+      struct timespec ts =
+	{ .tv_nsec = rand () % 100000, .tv_sec = 0 };
+      nanosleep (&ts, NULL);
+      /* Ignore the pthread_cancel result because the thread might
+	 already have exited by the time pthread_cancel is
+	 called.  */
+      pthread_cancel (td);
+      xpthread_join (td);
+    }
+
+  support_descriptors_check (descrs);
+
+  support_descriptors_free (descrs);
+
+  free (name);
+
+  return 0;
+}
+
+#define TIMEOUT 10
+#include <support/test-driver.c>
diff --git a/rt/Makefile b/rt/Makefile
index 9ea8394..6652c0b 100644
--- a/rt/Makefile
+++ b/rt/Makefile
@@ -64,7 +64,6 @@ CFLAGS-aio_suspend.c += -fexceptions
 CFLAGS-mq_timedreceive.c += -fexceptions -fasynchronous-unwind-tables
 CFLAGS-mq_timedsend.c += -fexceptions -fasynchronous-unwind-tables
 CFLAGS-clock_nanosleep.c += -fexceptions -fasynchronous-unwind-tables
-CFLAGS-librt-cancellation.c += -fasynchronous-unwind-tables
 
 LDFLAGS-rt.so = -Wl,--enable-new-dtags,-z,nodelete
 
diff --git a/sysdeps/generic/sysdep-cancel.h b/sysdeps/generic/sysdep-cancel.h
index d22a786..5c84b44 100644
--- a/sysdeps/generic/sysdep-cancel.h
+++ b/sysdeps/generic/sysdep-cancel.h
@@ -3,5 +3,3 @@
 /* No multi-thread handling enabled.  */
 #define SINGLE_THREAD_P (1)
 #define RTLD_SINGLE_THREAD_P (1)
-#define LIBC_CANCEL_ASYNC()	0 /* Just a dummy value.  */
-#define LIBC_CANCEL_RESET(val)	((void)(val)) /* Nothing, but evaluate it.  */
diff --git a/sysdeps/htl/pthreadP.h b/sysdeps/htl/pthreadP.h
index c666fb9..afc49bd 100644
--- a/sysdeps/htl/pthreadP.h
+++ b/sysdeps/htl/pthreadP.h
@@ -25,6 +25,7 @@
 
 extern pthread_t __pthread_self (void);
 extern int __pthread_kill (pthread_t threadid, int signo);
+extern int __pthread_kill_internal (pthread_t threadid, int signo);
 extern struct __pthread **__pthread_threads;
 
 extern int _pthread_mutex_init (pthread_mutex_t *mutex, const pthread_mutexattr_t *attr);
diff --git a/sysdeps/nptl/Makefile b/sysdeps/nptl/Makefile
index 76b091d..f0353cb 100644
--- a/sysdeps/nptl/Makefile
+++ b/sysdeps/nptl/Makefile
@@ -21,8 +21,7 @@ libpthread-sysdep_routines += errno-loc
 endif
 
 ifeq ($(subdir),rt)
-librt-sysdep_routines += timer_routines librt-cancellation
-CFLAGS-librt-cancellation.c += -fexceptions -fasynchronous-unwind-tables
+librt-sysdep_routines += timer_routines
 
 tests += tst-mqueue8x
 CFLAGS-tst-mqueue8x.c += -fexceptions
diff --git a/sysdeps/nptl/cancellation-pc-check.h b/sysdeps/nptl/cancellation-pc-check.h
new file mode 100644
index 0000000..8b26c4e
--- /dev/null
+++ b/sysdeps/nptl/cancellation-pc-check.h
@@ -0,0 +1,40 @@
+/* Architecture specific code for pthread cancellation handling.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _NPTL_CANCELLATION_PC_CHECK
+#define _NPTL_CANCELLATION_PC_CHECK
+
+#include <sigcontextinfo.h>
+
+/* Check if the program counter (PC) from ucontext CTX is within the start
+   and end boundaries of the __syscall_cancel_arch bridge.  Return TRUE if
+   the PC is within the boundary, meaning the syscall does not have any side
+   effects; or FALSE otherwise.  */
+static bool
+ucontext_check_pc_boundary (void *ctx)
+{
+  /* Both are defined in syscall_cancel.S.  */
+  extern const char __syscall_cancel_arch_start[1];
+  extern const char __syscall_cancel_arch_end[1];
+
+  uintptr_t pc = sigcontext_get_pc (ctx);
+  return pc >= (uintptr_t) __syscall_cancel_arch_start
+	 && pc < (uintptr_t) __syscall_cancel_arch_end;
+}
+
+#endif
diff --git a/sysdeps/nptl/cancellation-sigmask.h b/sysdeps/nptl/cancellation-sigmask.h
new file mode 100644
index 0000000..77702c1
--- /dev/null
+++ b/sysdeps/nptl/cancellation-sigmask.h
@@ -0,0 +1,30 @@
+/* Architecture specific code for pthread cancellation handling.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef _NPTL_CANCELLATION_SIGMASK_H
+#define _NPTL_CANCELLATION_SIGMASK_H
+
+/* Add the SIGCANCEL signal to the signal mask of the ucontext CTX obtained
+   from the sigaction handler.  */
+static void
+ucontext_add_cancel (void *ctx)
+{
+  __sigaddset (&((ucontext_t*) ctx)->uc_sigmask, SIGCANCEL);
+}
+
+#endif
diff --git a/sysdeps/unix/sysdep.h b/sysdeps/unix/sysdep.h
index 6e503d7..a7f0155 100644
--- a/sysdeps/unix/sysdep.h
+++ b/sysdeps/unix/sysdep.h
@@ -24,6 +24,9 @@
 #define	SYSCALL__(name, args)	PSEUDO (__##name, name, args)
 #define	SYSCALL(name, args)	PSEUDO (name, name, args)
 
+#ifndef __ASSEMBLER__
+# include <errno.h>
+
 #define __SYSCALL_CONCAT_X(a,b)     a##b
 #define __SYSCALL_CONCAT(a,b)       __SYSCALL_CONCAT_X (a, b)
 
@@ -57,6 +60,29 @@
 #define INTERNAL_SYSCALL_CALL(...) \
   __INTERNAL_SYSCALL_DISP (__INTERNAL_SYSCALL, __VA_ARGS__)
 
+#define __INTERNAL_SYSCALL_NCS0(name, err) \
+  INTERNAL_SYSCALL_NCS (name, err, 0)
+#define __INTERNAL_SYSCALL_NCS1(name, err, a1) \
+  INTERNAL_SYSCALL_NCS (name, err, 1, a1)
+#define __INTERNAL_SYSCALL_NCS2(name, err, a1, a2) \
+  INTERNAL_SYSCALL_NCS (name, err, 2, a1, a2)
+#define __INTERNAL_SYSCALL_NCS3(name, err, a1, a2, a3) \
+  INTERNAL_SYSCALL_NCS (name, err, 3, a1, a2, a3)
+#define __INTERNAL_SYSCALL_NCS4(name, err, a1, a2, a3, a4) \
+  INTERNAL_SYSCALL_NCS (name, err, 4, a1, a2, a3, a4)
+#define __INTERNAL_SYSCALL_NCS5(name, err, a1, a2, a3, a4, a5) \
+  INTERNAL_SYSCALL_NCS (name, err, 5, a1, a2, a3, a4, a5)
+#define __INTERNAL_SYSCALL_NCS6(name, err, a1, a2, a3, a4, a5, a6) \
+  INTERNAL_SYSCALL_NCS (name, err, 6, a1, a2, a3, a4, a5, a6)
+#define __INTERNAL_SYSCALL_NCS7(name, err, a1, a2, a3, a4, a5, a6, a7) \
+  INTERNAL_SYSCALL_NCS (name, err, 7, a1, a2, a3, a4, a5, a6, a7)
+
+/* Issue a syscall defined by syscall number plus any other argument required.
+   It is similar to the INTERNAL_SYSCALL_NCS macro, but without the need to
+   pass the expected argument count as the third parameter.  */
+#define INTERNAL_SYSCALL_NCS_CALL(...) \
+  __INTERNAL_SYSCALL_DISP (__INTERNAL_SYSCALL_NCS, __VA_ARGS__)
+
 #define __INLINE_SYSCALL0(name) \
   INLINE_SYSCALL (name, 0)
 #define __INLINE_SYSCALL1(name, a1) \
@@ -88,35 +114,70 @@
 #define INLINE_SYSCALL_CALL(...) \
   __INLINE_SYSCALL_DISP (__INLINE_SYSCALL, __VA_ARGS__)
 
-#define SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (SINGLE_THREAD_P) 						     \
-      sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INLINE_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
-  })
 
-/* Issue a syscall defined by syscall number plus any other argument
-   required.  Any error will be returned unmodified (including errno).  */
-#define INTERNAL_SYSCALL_CANCEL(...) \
-  ({									     \
-    long int sc_ret;							     \
-    if (SINGLE_THREAD_P) 						     \
-      sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__); 			     \
-    else								     \
-      {									     \
-	int sc_cancel_oldtype = LIBC_CANCEL_ASYNC ();			     \
-	sc_ret = INTERNAL_SYSCALL_CALL (__VA_ARGS__);			     \
-        LIBC_CANCEL_RESET (sc_cancel_oldtype);				     \
-      }									     \
-    sc_ret;								     \
+/* Cancellation macros.  */
+#ifndef __SSC
+typedef long int __syscall_arg_t;
+# define __SSC(__x) ((__syscall_arg_t) (__x))
+#endif
+
+long int __syscall_cancel (__syscall_arg_t nr, __syscall_arg_t arg1,
+			   __syscall_arg_t arg2, __syscall_arg_t arg3,
+			   __syscall_arg_t arg4, __syscall_arg_t arg5,
+			   __syscall_arg_t arg6);
+libc_hidden_proto (__syscall_cancel);
+
+#define __SYSCALL_CANCEL0(name) \
+  (__syscall_cancel)(__NR_##name, 0, 0, 0, 0, 0, 0)
+#define __SYSCALL_CANCEL1(name, a1) \
+  (__syscall_cancel)(__NR_##name, __SSC(a1), 0, 0, 0, 0, 0)
+#define __SYSCALL_CANCEL2(name, a1, a2) \
+  (__syscall_cancel)(__NR_##name, __SSC(a1), __SSC(a2), 0, 0, 0, 0)
+#define __SYSCALL_CANCEL3(name, a1, a2, a3) \
+  (__syscall_cancel)(__NR_##name, __SSC(a1), __SSC(a2), __SSC(a3), 0, 0, 0)
+#define __SYSCALL_CANCEL4(name, a1, a2, a3, a4) \
+  (__syscall_cancel)(__NR_##name, __SSC(a1), __SSC(a2), __SSC(a3), \
+		     __SSC(a4), 0, 0)
+#define __SYSCALL_CANCEL5(name, a1, a2, a3, a4, a5) \
+  (__syscall_cancel)(__NR_##name, __SSC(a1), __SSC(a2), __SSC(a3), \
+		     __SSC(a4), __SSC(a5), 0)
+#define __SYSCALL_CANCEL6(name, a1, a2, a3, a4, a5, a6) \
+  (__syscall_cancel)(__NR_##name, __SSC(a1), __SSC(a2), __SSC(a3), \
+		     __SSC(a4), __SSC(a5), __SSC(a6))
+
+#define __SYSCALL_CANCEL_NARGS_X(a,b,c,d,e,f,g,h,n,...) n
+#define __SYSCALL_CANCEL_NARGS(...) \
+  __SYSCALL_CANCEL_NARGS_X (__VA_ARGS__,7,6,5,4,3,2,1,0,)
+#define __SYSCALL_CANCEL_CONCAT_X(a,b)     a##b
+#define __SYSCALL_CANCEL_CONCAT(a,b)       __SYSCALL_CANCEL_CONCAT_X (a, b)
+#define __SYSCALL_CANCEL_DISP(b,...) \
+  __SYSCALL_CANCEL_CONCAT (b,__SYSCALL_CANCEL_NARGS(__VA_ARGS__))(__VA_ARGS__)
+
+#define __SYSCALL_CANCEL_CALL(...) \
+  __SYSCALL_CANCEL_DISP (__SYSCALL_CANCEL, __VA_ARGS__)
+
+/* Issue a cancellable syscall defined by syscall number NAME plus any other
+   arguments required.  If an error occurs, its value is returned unmodified
+   as a negative number and errno is not set.  */
+#define INTERNAL_SYSCALL_CANCEL(name, args...) \
+  __SYSCALL_CANCEL_CALL (name, args)
+
+/* Issue a cancellable syscall defined by the first argument plus any other
+   arguments required.  If an error occurs, the macro returns -1 and sets
+   errno accordingly.  */
+#if IS_IN (rtld)
+/* The loader does not need to handle thread cancellation, use direct
+   syscall instead.  */
+# define SYSCALL_CANCEL(...) INLINE_SYSCALL_CALL (__VA_ARGS__)
+#else
+# define SYSCALL_CANCEL(...) \
+  ({									\
+    long int sc_ret = __SYSCALL_CANCEL_CALL (__VA_ARGS__);		\
+    SYSCALL_CANCEL_RET ((sc_ret));					\
   })
+#endif
+
+#endif /* __ASSEMBLER__  */
 
 /* Machine-dependent sysdep.h files are expected to define the macro
    PSEUDO (function_name, syscall_name) to emit assembly code to define the
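
For illustration only (not part of this patch): with the definitions above, a
cancellable syscall wrapper reduces to a single SYSCALL_CANCEL invocation, with
no asynccancel enable/disable bracketing.  A minimal sketch, assuming the usual
<sysdep-cancel.h> include and a hypothetical __read_sketch name:

    #include <unistd.h>
    #include <sysdep-cancel.h>

    /* Sketch: outside the loader SYSCALL_CANCEL routes through
       __syscall_cancel (__NR_read, ...) and maps a negative return to
       errno/-1 via SYSCALL_CANCEL_RET; inside rtld it is a plain
       INLINE_SYSCALL_CALL.  */
    ssize_t
    __read_sketch (int fd, void *buf, size_t nbytes)
    {
      return SYSCALL_CANCEL (read, fd, buf, nbytes);
    }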
diff --git a/sysdeps/unix/sysv/linux/clock_nanosleep.c b/sysdeps/unix/sysv/linux/clock_nanosleep.c
index 0cb6614..d9738c0 100644
--- a/sysdeps/unix/sysv/linux/clock_nanosleep.c
+++ b/sysdeps/unix/sysv/linux/clock_nanosleep.c
@@ -35,10 +35,8 @@ __clock_nanosleep (clockid_t clock_id, int flags, const struct timespec *req,
 
   /* If the call is interrupted by a signal handler or encounters an error,
      it returns a positive value similar to errno.  */
-  INTERNAL_SYSCALL_DECL (err);
-  int r = INTERNAL_SYSCALL_CANCEL (clock_nanosleep, err, clock_id, flags,
+  int r = INTERNAL_SYSCALL_CANCEL (clock_nanosleep, clock_id, flags,
 				   req, rem);
-  return (INTERNAL_SYSCALL_ERROR_P (r, err)
-	  ? INTERNAL_SYSCALL_ERRNO (r, err) : 0);
+  return SYSCALL_CANCEL_ERROR (r) ? -r : 0;
 }
 weak_alias (__clock_nanosleep, clock_nanosleep)
diff --git a/sysdeps/unix/sysv/linux/futex-internal.h b/sysdeps/unix/sysv/linux/futex-internal.h
index 980b798..b27dec1 100644
--- a/sysdeps/unix/sysv/linux/futex-internal.h
+++ b/sysdeps/unix/sysv/linux/futex-internal.h
@@ -75,10 +75,7 @@ static __always_inline int
 futex_wait_cancelable (unsigned int *futex_word, unsigned int expected,
 		       int private)
 {
-  int oldtype;
-  oldtype = __pthread_enable_asynccancel ();
-  int err = lll_futex_timed_wait (futex_word, expected, NULL, private);
-  __pthread_disable_asynccancel (oldtype);
+  int err = lll_futex_timed_wait_cancel (futex_word, expected, NULL, private);
   switch (err)
     {
     case 0:
@@ -129,10 +126,7 @@ futex_reltimed_wait_cancelable (unsigned int *futex_word,
 				unsigned int expected,
 			        const struct timespec *reltime, int private)
 {
-  int oldtype;
-  oldtype = LIBC_CANCEL_ASYNC ();
-  int err = lll_futex_timed_wait (futex_word, expected, reltime, private);
-  LIBC_CANCEL_RESET (oldtype);
+  int err = lll_futex_timed_wait_cancel (futex_word, expected, reltime, private);
   switch (err)
     {
     case 0:
@@ -203,12 +197,8 @@ futex_abstimed_wait_cancelable (unsigned int *futex_word,
      despite them being valid.  */
   if (__glibc_unlikely ((abstime != NULL) && (abstime->tv_sec < 0)))
     return ETIMEDOUT;
-  int oldtype;
-  oldtype = __pthread_enable_asynccancel ();
-  int err = lll_futex_clock_wait_bitset (futex_word, expected,
-					clockid, abstime,
-					private);
-  __pthread_disable_asynccancel (oldtype);
+  int err = lll_futex_clock_wait_bitset_cancel (futex_word, expected, clockid,
+						abstime, private);
   switch (err)
     {
     case 0:
diff --git a/sysdeps/unix/sysv/linux/lowlevellock-futex.h b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
index cfa796b..e48b5db 100644
--- a/sysdeps/unix/sysv/linux/lowlevellock-futex.h
+++ b/sysdeps/unix/sysv/linux/lowlevellock-futex.h
@@ -74,6 +74,12 @@
      ? -INTERNAL_SYSCALL_ERRNO (__ret, __err) : 0);                     \
   })
 
+#define lll_futex_syscall_cp(...)					\
+  ({                                                                    \
+    long int __ret = INTERNAL_SYSCALL_CANCEL (futex, __VA_ARGS__);	\
+    __ret;								\
+  })
+
 #define lll_futex_wait(futexp, val, private) \
   lll_futex_timed_wait (futexp, val, NULL, private)
 
@@ -148,19 +154,36 @@
 
 /* Cancellable futex macros.  */
 #define lll_futex_wait_cancel(futexp, val, private) \
-  ({                                                                   \
-    int __oldtype = CANCEL_ASYNC ();				       \
-    long int __err = lll_futex_wait (futexp, val, LLL_SHARED);	       \
-    CANCEL_RESET (__oldtype);					       \
-    __err;							       \
+  lll_futex_timed_wait_cancel (futexp, val, NULL, private)
+
+#define lll_futex_timed_wait_cancel(futexp, val, timeout, private)	\
+  ({									\
+    long int __ret;							\
+    int __op = FUTEX_WAIT;						\
+    __ret = lll_futex_syscall_cp (futexp,				\
+				  __lll_private_flag (__op, private),	\
+				  val, timeout);			\
+    __ret;								\
   })
 
-#define lll_futex_timed_wait_cancel(futexp, val, timeout, private)	   \
-  ({									   \
-    int __oldtype = CANCEL_ASYNC ();				       	   \
-    long int __err = lll_futex_timed_wait (futexp, val, timeout, private); \
-    CANCEL_RESET (__oldtype);						   \
-    __err;								   \
+#define lll_futex_clock_wait_bitset_cancel(futexp, val, clockid,	\
+					   timeout, private)		\
+  ({									\
+    long int __ret;							\
+    if (lll_futex_supported_clockid (clockid))                          \
+      {                                                                 \
+        const unsigned int clockbit =                                   \
+          (clockid == CLOCK_REALTIME) ? FUTEX_CLOCK_REALTIME : 0;       \
+        const int op =                                                  \
+          __lll_private_flag (FUTEX_WAIT_BITSET | clockbit, private);   \
+									\
+	__ret = lll_futex_syscall_cp (futexp, op, val,			\
+                                      timeout, NULL /* Unused.  */,	\
+                                      FUTEX_BITSET_MATCH_ANY);		\
+      }									\
+    else								\
+      __ret = -EINVAL;							\
+    __ret;								\
   })
 
 #endif  /* !__ASSEMBLER__  */
diff --git a/sysdeps/unix/sysv/linux/pthread_kill.c b/sysdeps/unix/sysv/linux/pthread_kill_internal.c
similarity index 75%
rename from sysdeps/unix/sysv/linux/pthread_kill.c
rename to sysdeps/unix/sysv/linux/pthread_kill_internal.c
index 9668aff..dc82821 100644
--- a/sysdeps/unix/sysv/linux/pthread_kill.c
+++ b/sysdeps/unix/sysv/linux/pthread_kill_internal.c
@@ -16,24 +16,15 @@
    License along with the GNU C Library; if not, see
    <http://www.gnu.org/licenses/>.  */
 
-#include <errno.h>
-#include <signal.h>
-#include <pthreadP.h>
-#include <tls.h>
-#include <sysdep.h>
 #include <unistd.h>
+#include <pthreadP.h>
 
-
+/* Used internally by pthread_cancel, so we can't filter SIGCANCEL.  */
 int
-__pthread_kill (pthread_t threadid, int signo)
+__pthread_kill_internal (pthread_t threadid, int signo)
 {
   struct pthread *pd = (struct pthread *) threadid;
 
-  /* Make sure the descriptor is valid.  */
-  if (DEBUGGING_P && INVALID_TD_P (pd))
-    /* Not a valid thread handle.  */
-    return ESRCH;
-
   /* Force load of pd->tid into local variable or register.  Otherwise
      if a thread exits between ESRCH test and tgkill, we might return
      EINVAL, because pd->tid would be cleared by the kernel.  */
@@ -42,11 +33,6 @@ __pthread_kill (pthread_t threadid, int signo)
     /* Not a valid thread handle.  */
     return ESRCH;
 
-  /* Disallow sending the signal we use for cancellation, timers,
-     for the setxid implementation.  */
-  if (signo == SIGCANCEL || signo == SIGTIMER || signo == SIGSETXID)
-    return EINVAL;
-
   /* We have a special syscall to do the work.  */
   INTERNAL_SYSCALL_DECL (err);
 
@@ -56,4 +42,4 @@ __pthread_kill (pthread_t threadid, int signo)
   return (INTERNAL_SYSCALL_ERROR_P (val, err)
 	  ? INTERNAL_SYSCALL_ERRNO (val, err) : 0);
 }
-strong_alias (__pthread_kill, pthread_kill)
+hidden_def (__pthread_kill_internal)
diff --git a/sysdeps/unix/sysv/linux/socketcall.h b/sysdeps/unix/sysv/linux/socketcall.h
index ed4840b..9b90df5 100644
--- a/sysdeps/unix/sysv/linux/socketcall.h
+++ b/sysdeps/unix/sysv/linux/socketcall.h
@@ -87,18 +87,32 @@
   })
 
 
-#if IS_IN (libc)
-# define __pthread_enable_asynccancel  __libc_enable_asynccancel
-# define __pthread_disable_asynccancel __libc_disable_asynccancel
-#endif
-
-#define SOCKETCALL_CANCEL(name, args...)				\
-  ({									\
-    int oldtype = LIBC_CANCEL_ASYNC ();					\
-    long int sc_ret = __SOCKETCALL (SOCKOP_##name, args);		\
-    LIBC_CANCEL_RESET (oldtype);					\
-    sc_ret;								\
-  })
-
+#define __SOCKETCALL_CANCEL1(__name, __a1) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [1]) { (long int) __a1 }))
+#define __SOCKETCALL_CANCEL2(__name, __a1, __a2) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [2]) { (long int) __a1, (long int) __a2 }))
+#define __SOCKETCALL_CANCEL3(__name, __a1, __a2, __a3) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [3]) { (long int) __a1, (long int) __a2, (long int) __a3 }))
+#define __SOCKETCALL_CANCEL4(__name, __a1, __a2, __a3, __a4) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [4]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4 }))
+#define __SOCKETCALL_CANCEL5(__name, __a1, __a2, __a3, __a4, __a5) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [5]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5 }))
+#define __SOCKETCALL_CANCEL6(__name, __a1, __a2, __a3, __a4, __a5, __a6) \
+  SYSCALL_CANCEL (socketcall, __name, \
+     ((long int [6]) { (long int) __a1, (long int) __a2, (long int) __a3, \
+                       (long int) __a4, (long int) __a5, (long int) __a6 }))
+
+#define __SOCKETCALL_CANCEL(...) __SOCKETCALL_DISP (__SOCKETCALL_CANCEL,\
+						    __VA_ARGS__)
+
+#define SOCKETCALL_CANCEL(name, args...) \
+   __SOCKETCALL_CANCEL (SOCKOP_##name, args)
 
 #endif /* sys/socketcall.h */
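
For reference (a sketch, not part of the patch): on socketcall architectures
the dispatch above turns a cancellable socket operation into a single
cancellable syscall.  Assuming SOCKOP_recv and hypothetical fd/buf/len/flags
variables, SOCKETCALL_CANCEL (recv, fd, buf, len, flags) expands roughly to:

    /* __SOCKETCALL_CANCEL4 packs the arguments into an on-stack array.  */
    SYSCALL_CANCEL (socketcall, SOCKOP_recv,
                    ((long int [4]) { (long int) fd, (long int) buf,
                                      (long int) len, (long int) flags }));
    /* ...which in turn calls
       __syscall_cancel (__NR_socketcall, SOCKOP_recv, (long int) args,
                         0, 0, 0, 0);
       so the whole socket operation is one cancellable syscall.  */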
diff --git a/sysdeps/unix/sysv/linux/syscall_cancel.c b/sysdeps/unix/sysv/linux/syscall_cancel.c
new file mode 100644
index 0000000..79d66ec
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/syscall_cancel.c
@@ -0,0 +1,62 @@
+/* Default cancellation syscall bridge.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include <pthreadP.h>
+
+/* This is the generic version of the cancellable syscall code, which
+   adds the label guards (__syscall_cancel_arch_{start,end}) used by the
+   SIGCANCEL handler (sigcancel_handler in nptl-init.c) to check whether
+   the cancelled syscall has side effects that must be reported to the
+   program.
+
+   This implementation should be used as a reference to document the
+   implementation constraints: __syscall_cancel_arch_end must point to the
+   instruction immediately after the syscall instruction.  This is because
+   the kernel signals an interrupted syscall with side effects by setting
+   the signal frame program counter (in the ucontext_t third argument of an
+   SA_SIGINFO signal handler) to right after the syscall instruction.
+
+   If the INTERNAL_SYSCALL_NCS macro uses more instructions to get the
+   error condition from the kernel (as on powerpc and sparc), uses an
+   out-of-line helper (as on ARM Thumb), or uses a kernel helper gate (as
+   on i686 or ia64), the architecture should adjust the macro or provide a
+   custom __syscall_cancel_arch implementation.  */
+long int
+__syscall_cancel_arch (volatile int *ch, __syscall_arg_t nr,
+		       __syscall_arg_t a1, __syscall_arg_t a2,
+		       __syscall_arg_t a3, __syscall_arg_t a4,
+		       __syscall_arg_t a5, __syscall_arg_t a6)
+{
+#define ADD_LABEL(__label)		\
+  asm volatile (			\
+    ".global " __label "\t\n"		\
+    ".type " __label ",\%function\t\n" 	\
+    __label ":\n");
+
+  ADD_LABEL ("__syscall_cancel_arch_start");
+  if (__glibc_unlikely (*ch & CANCELED_BITMASK))
+    __syscall_do_cancel();
+
+  INTERNAL_SYSCALL_DECL(err);
+  long int result = INTERNAL_SYSCALL_NCS (nr, err, 6, a1, a2, a3, a4, a5, a6);
+  ADD_LABEL ("__syscall_cancel_arch_end");
+  if (__glibc_unlikely (INTERNAL_SYSCALL_ERROR_P (result, err)))
+    return -INTERNAL_SYSCALL_ERRNO (result, err);
+  return result;
+}
+libc_hidden_def (__syscall_cancel_arch)
diff --git a/sysdeps/unix/sysv/linux/sysdep.h b/sysdeps/unix/sysv/linux/sysdep.h
index af1c9a2..9819652 100644
--- a/sysdeps/unix/sysv/linux/sysdep.h
+++ b/sysdeps/unix/sysv/linux/sysdep.h
@@ -27,6 +27,26 @@
     -1l;					\
   })
 
+/* Check for an error from a cancellable syscall and set errno accordingly.
+   Linux uses a negative return value to indicate syscall errors and,
+   since version 2.1, the return value of a system call might be
+   negative even if the call succeeded (e.g., the `lseek' system call
+   might return a large offset).
+   The current contract is that the kernel makes sure no syscall returns
+   a value in -1 .. -4095 as a valid result, so we can safely test with
+   -4095.  */
+#define SYSCALL_CANCEL_ERROR(__ret)		\
+  (__ret > -4096UL)
+
+#define SYSCALL_CANCEL_RET(__ret)		\
+  ({						\
+    if (SYSCALL_CANCEL_ERROR(__ret))		\
+      {						\
+	__set_errno (-__ret);			\
+	__ret = -1;				\
+      }						\
+    __ret;					\
+   })
+
 /* Provide a dummy argument that can be used to force register
    alignment for register pairs if required by the syscall ABI.  */
 #ifdef __ASSUME_ALIGNED_REGISTER_PAIRS
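
As a worked example (a sketch, not part of the patch, using hypothetical
fd/buf/nbytes variables): a cancellable read interrupted by a non-cancelling
signal makes __syscall_cancel return -EINTR; since -EINTR lies in the
-4095 .. -1 range, SYSCALL_CANCEL_ERROR is true and SYSCALL_CANCEL_RET turns
it into errno = EINTR with a -1 return, while a large successful result (e.g.
an lseek offset) passes through unchanged:

    long int sc_ret = __syscall_cancel (__NR_read, __SSC (fd), __SSC (buf),
                                        __SSC (nbytes), 0, 0, 0);
    if (SYSCALL_CANCEL_ERROR (sc_ret))   /* True for -4095 .. -1.  */
      {
        __set_errno (-sc_ret);           /* e.g. errno = EINTR.  */
        sc_ret = -1;
      }
    /* Otherwise sc_ret is the number of bytes read.  */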

