public inbox for glibc-cvs@sourceware.org
help / color / mirror / Atom feed
* [glibc/azanella/pthread-multiple-fixes] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
@ 2021-08-20 17:55 Adhemerval Zanella
  0 siblings, 0 replies; 3+ messages in thread
From: Adhemerval Zanella @ 2021-08-20 17:55 UTC (permalink / raw)
  To: glibc-cvs

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=ebae3185ca5529ad2147e6f75978a13e2d459b8e

commit ebae3185ca5529ad2147e6f75978a13e2d459b8e
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Jun 3 14:25:54 2021 -0300

    nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
    
    The use after free described in BZ#19951 is due the use of two different
    PD fields, 'joinid' and 'cancelhandling', to describe the thread state.
    And any state change potentially requires to check for both fields
    atomically to potentially handle partial state (such as pthread_join()
    with a cancellation handler and a joinstate rollback).
    
    This patch uses a different PD member with 4 possible states (JOINABLE,
    DETACHED, EXITING, and EXITED) to synchronize between pthread_create(),
    pthread_join(), pthread_detached(), and pthread_exit():
    
      1. On pthread_create() the inital state is set either to JOINABLE or
         DETACHED depending of the pthread attribute used.
    
      2. On pthread_detach() call, a CAS is issued on the state.  If the CAS
         fails it means that thread is already detached (DETACHED) or is
         being terminated (EXITING).  For former an EINVAL is returned,
         while for latter pthread_detach() should be reponsible to join the
         thread (and deallocate the resources).
    
      3. On pthread_create() exit phase (reached either if the thread
         function has returned, pthread_exit() has being called, or
         cancellation handled has been acted upon) we issue a CAS on state
         to set to EXITING Mode.  If the thread is previous on DETACHED
         mode it will be the thread itself responsible to deallocate any
         resource, otherwise the threads needs to be joined.
    
      4. The clear_tid_field on 'clone' call is changed to set the new
         'state' field on thread exit (EXITED).  This state ins only
         reached on thread termination.
    
      4. The pthread_join() implementation is now simpler: the futex wait
         is done directly on thread state and there is no need to reset it
          in case of timeout (since the state is now set either by
          pthread_detach() or by the kernel on process termination).
    
    The race condition on pthread_detach is avoided with only one atomic
    operation on PD state: once the mode is set to THREAD_STATE_DETACHED
    it is up to thread itself to deallocate its memory (done on the exit
    phase at pthread_create()).
    
    This change trigger an invalid C11 thread tests: it crates a thread,
    which detaches itself, and after a timeout the creating thread checks
    if the join fails.  The issue is once thrd_join() is called the thread
    lifetime has already expired.  The test is changed so the slee is done
    by the thread itself, so the creating thread will try to join a valid
    thread.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
    arm-linux-gnueabihf, and powerpc64-linux-gnu.

Diff:
---
 nptl/descr.h                        |  26 +++++---
 nptl/nptl-stack.h                   |   2 +-
 nptl/pthread_cancel.c               |   3 +-
 nptl/pthread_clockjoin.c            |   2 +-
 nptl/pthread_create.c               |  54 ++++++++++-----
 nptl/pthread_detach.c               |  36 ++++------
 nptl/pthread_getattr_np.c           |   2 +-
 nptl/pthread_getcpuclockid.c        |   3 +-
 nptl/pthread_getschedparam.c        |   3 +-
 nptl/pthread_join.c                 |   2 +-
 nptl/pthread_join_common.c          | 130 +++++++++++++-----------------------
 nptl/pthread_setschedparam.c        |   3 +-
 nptl/pthread_setschedprio.c         |   3 +-
 nptl/pthread_timedjoin.c            |   2 +-
 nptl/pthread_tryjoin.c              |  18 ++---
 sysdeps/nptl/dl-tls_init_tp.c       |   4 +-
 sysdeps/nptl/libc_start_call_main.h |   7 ++
 sysdeps/nptl/pthreadP.h             |   8 +--
 sysdeps/pthread/tst-thrd-detach.c   |  16 ++---
 19 files changed, 152 insertions(+), 172 deletions(-)

diff --git a/nptl/descr.h b/nptl/descr.h
index c85778d449..4b2db6edab 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -124,6 +124,18 @@ struct priority_protection_data
 };
 
 
+/* Define a possible thread state on 'joinstate' field.  The value will be
+   cleared by the kernel when the thread terminates (CLONE_CHILD_CLEARTID),
+   so THREAD_STATE_EXITED must be 0.  */
+enum
+  {
+    THREAD_STATE_EXITED = 0,
+    THREAD_STATE_EXITING,
+    THREAD_STATE_JOINABLE,
+    THREAD_STATE_DETACHED,
+  };
+
+
 /* Thread descriptor data structure.  */
 struct pthread
 {
@@ -166,8 +178,7 @@ struct pthread
      GL (dl_stack_user) list.  */
   list_t list;
 
-  /* Thread ID - which is also a 'is this thread descriptor (and
-     therefore stack) used' flag.  */
+  /* Thread ID set by the kernel with CLONE_PARENT_SETTID.  */
   pid_t tid;
 
   /* Ununsed.  */
@@ -335,15 +346,8 @@ struct pthread
   hp_timing_t cpuclock_offset_ununsed;
 #endif
 
-  /* If the thread waits to join another one the ID of the latter is
-     stored here.
-
-     In case a thread is detached this field contains a pointer of the
-     TCB if the thread itself.  This is something which cannot happen
-     in normal operation.  */
-  struct pthread *joinid;
-  /* Check whether a thread is detached.  */
-#define IS_DETACHED(pd) ((pd)->joinid == (pd))
+  /* The current thread state defined by the THREAD_STATE_* enumeration.  */
+  unsigned int joinstate;
 
   /* The result of the thread function.  */
   void *result;
diff --git a/nptl/nptl-stack.h b/nptl/nptl-stack.h
index 8bcfde7ec5..c6b2b3078f 100644
--- a/nptl/nptl-stack.h
+++ b/nptl/nptl-stack.h
@@ -32,7 +32,7 @@ extern size_t __nptl_stack_cache_maxsize attribute_hidden;
 static inline bool
 __nptl_stack_in_use (struct pthread *pd)
 {
-  return pd->tid <= 0;
+  return atomic_load_relaxed (&pd->joinstate) == THREAD_STATE_EXITED;
 }
 
 /* Remove the stack ELEM from its list.  */
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index 38f637c5cf..67e00ef007 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -63,7 +63,8 @@ __pthread_cancel (pthread_t th)
   volatile struct pthread *pd = (volatile struct pthread *) th;
 
   /* Make sure the descriptor is valid.  */
-  if (INVALID_TD_P (pd))
+  int state = atomic_load_acquire (&pd->joinstate);
+  if (state == THREAD_STATE_EXITED || state == THREAD_STATE_EXITING)
     /* Not a valid thread handle.  */
     return ESRCH;
 
diff --git a/nptl/pthread_clockjoin.c b/nptl/pthread_clockjoin.c
index 2d01ba03a2..76598ff0fa 100644
--- a/nptl/pthread_clockjoin.c
+++ b/nptl/pthread_clockjoin.c
@@ -30,7 +30,7 @@ ___pthread_clockjoin_np64 (pthread_t threadid, void **thread_return,
     return EINVAL;
 
   return __pthread_clockjoin_ex (threadid, thread_return,
-                                 clockid, abstime, true);
+                                 clockid, abstime);
 }
 
 #if __TIMESIZE == 64
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 08e5189ad6..763e32bc3e 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -286,7 +286,7 @@ static int create_thread (struct pthread *pd, const struct pthread_attr *attr,
       .flags = clone_flags,
       .pidfd = (uintptr_t) &pd->tid,
       .parent_tid = (uintptr_t) &pd->tid,
-      .child_tid = (uintptr_t) &pd->tid,
+      .child_tid = (uintptr_t) &pd->joinstate,
       .stack = (uintptr_t) stackaddr,
       .stack_size = stacksize,
       .tls = (uintptr_t) tp,
@@ -351,13 +351,16 @@ start_thread (void *arg)
          and free any resource prior return to the pthread_create caller.  */
       setup_failed = pd->setup_failed == 1;
       if (setup_failed)
-	pd->joinid = NULL;
+	pd->joinstate = THREAD_STATE_JOINABLE;
 
       /* And give it up right away.  */
       lll_unlock (pd->lock, LLL_PRIVATE);
 
       if (setup_failed)
-	goto out;
+	{
+	  pd->tid = 0;
+	  goto out;
+	}
     }
 
   /* Initialize resolver state pointer.  */
@@ -481,9 +484,20 @@ start_thread (void *arg)
      the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE.  */
   atomic_bit_set (&pd->cancelhandling, EXITING_BIT);
 
-  if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
-    /* This was the last thread.  */
-    exit (0);
+
+  /* CONCURRENCY NOTES:
+
+     Concurrent pthread_detach() will either set state to
+     THREAD_STATE_DETACHED or wait the thread to terminate.  The exiting state
+     set here is set so a pthread_join() wait until all the required cleanup
+     steps are done.
+
+     The 'joinstate' field will be used to determine who is responsible to
+     call __nptl_free_tcb below.  */
+
+  unsigned int joinstate = THREAD_STATE_JOINABLE;
+  atomic_compare_exchange_weak_acquire (&pd->joinstate, &joinstate,
+					THREAD_STATE_EXITING);
 
 #ifndef __ASSUME_SET_ROBUST_LIST
   /* We let the kernel do the notification if it is able to do so on the exit
@@ -519,6 +533,10 @@ start_thread (void *arg)
     }
 #endif
 
+  if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
+    /* This was the last thread.  */
+    exit (0);
+
   if (!pd->user_stack)
     advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd,
 			pd->guardsize);
@@ -539,17 +557,17 @@ start_thread (void *arg)
       pd->setxid_futex = 0;
     }
 
-  /* If the thread is detached free the TCB.  */
-  if (IS_DETACHED (pd))
-    /* Free the TCB.  */
+  if (joinstate == THREAD_STATE_DETACHED)
     __nptl_free_tcb (pd);
 
+  pd->tid = 0;
+
 out:
   /* We cannot call '_exit' here.  '_exit' will terminate the process.
 
      The 'exit' implementation in the kernel will signal when the
      process is really dead since 'clone' got passed the CLONE_CHILD_CLEARTID
-     flag.  The 'tid' field in the TCB will be set to zero.
+     flag.  The 'joinstate' field in the TCB will be set to zero.
 
      The exit code is zero since in case all threads exit by calling
      'pthread_exit' the exit status must be 0 (zero).  */
@@ -644,10 +662,9 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
   pd->flags = ((iattr->flags & ~(ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET))
 	       | (self->flags & (ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET)));
 
-  /* Initialize the field for the ID of the thread which is waiting
-     for us.  This is a self-reference in case the thread is created
-     detached.  */
-  pd->joinid = iattr->flags & ATTR_FLAG_DETACHSTATE ? pd : NULL;
+  pd->joinstate = iattr->flags & ATTR_FLAG_DETACHSTATE
+		  ? THREAD_STATE_DETACHED
+		  : THREAD_STATE_JOINABLE;
 
   /* The debug events are inherited from the parent.  */
   pd->eventbuf = self->eventbuf;
@@ -806,10 +823,11 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
 
 	  /* Similar to pthread_join, but since thread creation has failed at
 	     startup there is no need to handle all the steps.  */
-	  pid_t tid;
-	  while ((tid = atomic_load_acquire (&pd->tid)) != 0)
-	    __futex_abstimed_wait_cancelable64 ((unsigned int *) &pd->tid,
-						tid, 0, NULL, LLL_SHARED);
+	  unsigned int state;
+	  while ((state = atomic_load_acquire (&pd->joinstate))
+                 != THREAD_STATE_EXITED)
+	    __futex_abstimed_wait_cancelable64 (&pd->joinstate, state, 0,
+                                                NULL, LLL_SHARED);
         }
 
       /* State (c) or (d) and we have ownership of PD (see CONCURRENCY
diff --git a/nptl/pthread_detach.c b/nptl/pthread_detach.c
index ac50db9b0e..97d292cab8 100644
--- a/nptl/pthread_detach.c
+++ b/nptl/pthread_detach.c
@@ -26,32 +26,24 @@ ___pthread_detach (pthread_t th)
 {
   struct pthread *pd = (struct pthread *) th;
 
-  /* Make sure the descriptor is valid.  */
-  if (INVALID_NOT_TERMINATED_TD_P (pd))
-    /* Not a valid thread handle.  */
-    return ESRCH;
+  /* CONCURRENCY NOTES:
 
-  int result = 0;
+     Concurrent pthread_detach() will return EINVAL for the case the thread
+     is already detached (THREAD_STATE_DETACHED).  POSIX states it is
+     undefined to call pthread_detach if TH refers to a non joinable thread.
 
-  /* Mark the thread as detached.  */
-  if (atomic_compare_and_exchange_bool_acq (&pd->joinid, pd, NULL))
+     For the case the thread is being terminated (THREAD_STATE_EXITING),
+     pthread_detach() will responsible to clean up the stack.  */
+
+  int curstate = THREAD_STATE_JOINABLE;
+  if (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &curstate,
+					     THREAD_STATE_DETACHED))
     {
-      /* There are two possibilities here.  First, the thread might
-	 already be detached.  In this case we return EINVAL.
-	 Otherwise there might already be a waiter.  The standard does
-	 not mention what happens in this case.  */
-      if (IS_DETACHED (pd))
-	result = EINVAL;
+      if (curstate == THREAD_STATE_DETACHED)
+	return EINVAL;
+      return __pthread_join (th, 0);
     }
-  else
-    /* Check whether the thread terminated meanwhile.  In this case we
-       will just free the TCB.  */
-    if ((pd->cancelhandling & EXITING_BITMASK) != 0)
-      /* Note that the code in __free_tcb makes sure each thread
-	 control block is freed only once.  */
-      __nptl_free_tcb (pd);
-
-  return result;
+  return 0;
 }
 versioned_symbol (libc, ___pthread_detach, pthread_detach, GLIBC_2_34);
 libc_hidden_ver (___pthread_detach, __pthread_detach)
diff --git a/nptl/pthread_getattr_np.c b/nptl/pthread_getattr_np.c
index 25807cb529..560ba9ab98 100644
--- a/nptl/pthread_getattr_np.c
+++ b/nptl/pthread_getattr_np.c
@@ -53,7 +53,7 @@ __pthread_getattr_np (pthread_t thread_id, pthread_attr_t *attr)
   iattr->flags = thread->flags;
 
   /* The thread might be detached by now.  */
-  if (IS_DETACHED (thread))
+  if (atomic_load_acquire (&thread->joinstate) == THREAD_STATE_DETACHED)
     iattr->flags |= ATTR_FLAG_DETACHSTATE;
 
   /* This is the guardsize after adjusting it.  */
diff --git a/nptl/pthread_getcpuclockid.c b/nptl/pthread_getcpuclockid.c
index 0a6656ea4c..bd3d02379c 100644
--- a/nptl/pthread_getcpuclockid.c
+++ b/nptl/pthread_getcpuclockid.c
@@ -29,7 +29,8 @@ __pthread_getcpuclockid (pthread_t threadid, clockid_t *clockid)
   struct pthread *pd = (struct pthread *) threadid;
 
   /* Make sure the descriptor is valid.  */
-  if (INVALID_TD_P (pd))
+  int state = atomic_load_acquire (&pd->joinstate);
+  if (state == THREAD_STATE_EXITED || state == THREAD_STATE_EXITING)
     /* Not a valid thread handle.  */
     return ESRCH;
 
diff --git a/nptl/pthread_getschedparam.c b/nptl/pthread_getschedparam.c
index 94316cf897..878191bd6e 100644
--- a/nptl/pthread_getschedparam.c
+++ b/nptl/pthread_getschedparam.c
@@ -29,7 +29,8 @@ __pthread_getschedparam (pthread_t threadid, int *policy,
   struct pthread *pd = (struct pthread *) threadid;
 
   /* Make sure the descriptor is valid.  */
-  if (INVALID_TD_P (pd))
+  int state = atomic_load_acquire (&pd->joinstate);
+  if (state == THREAD_STATE_EXITED || state == THREAD_STATE_EXITING)
     /* Not a valid thread handle.  */
     return ESRCH;
 
diff --git a/nptl/pthread_join.c b/nptl/pthread_join.c
index d2b33de73d..195a537029 100644
--- a/nptl/pthread_join.c
+++ b/nptl/pthread_join.c
@@ -23,7 +23,7 @@ int
 ___pthread_join (pthread_t threadid, void **thread_return)
 {
   return __pthread_clockjoin_ex (threadid, thread_return, 0 /* Ignored */,
-				 NULL, true);
+				 NULL);
 }
 versioned_symbol (libc, ___pthread_join, pthread_join, GLIBC_2_34);
 libc_hidden_ver (___pthread_join, __pthread_join)
diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
index 7303069316..428ed35101 100644
--- a/nptl/pthread_join_common.c
+++ b/nptl/pthread_join_common.c
@@ -22,113 +22,73 @@
 #include <time.h>
 #include <futex-internal.h>
 
-static void
-cleanup (void *arg)
+/* Check for a possible deadlock situation where the threads are waiting for
+   each other to finish.  Note that this is a "may" error.  To be 100% sure we
+   catch this error we would have to lock the data structures but it is not
+   necessary.  In the unlikely case that two threads are really caught in this
+   situation they will deadlock.  It is the programmer's problem to figure
+   this out.  */
+static inline bool
+check_for_deadlock (int state, struct pthread *pd)
 {
-  /* If we already changed the waiter ID, reset it.  The call cannot
-     fail for any reason but the thread not having done that yet so
-     there is no reason for a loop.  */
   struct pthread *self = THREAD_SELF;
-  atomic_compare_exchange_weak_acquire (&arg, &self, NULL);
+  return ((pd == self
+	   || (atomic_load_acquire (&self->joinstate) == THREAD_STATE_DETACHED
+	       && (pd->cancelhandling
+		   & (CANCELED_BITMASK | EXITING_BITMASK
+		      | TERMINATED_BITMASK)) == 0))
+	   && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
+		&& (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
+					  | TERMINATED_BITMASK))
+		== CANCELED_BITMASK));
 }
 
 int
 __pthread_clockjoin_ex (pthread_t threadid, void **thread_return,
                         clockid_t clockid,
-                        const struct __timespec64 *abstime, bool block)
+                        const struct __timespec64 *abstime)
 {
   struct pthread *pd = (struct pthread *) threadid;
 
-  /* Make sure the descriptor is valid.  */
-  if (INVALID_NOT_TERMINATED_TD_P (pd))
-    /* Not a valid thread handle.  */
-    return ESRCH;
-
-  /* Is the thread joinable?.  */
-  if (IS_DETACHED (pd))
-    /* We cannot wait for the thread.  */
-    return EINVAL;
-
-  struct pthread *self = THREAD_SELF;
-  int result = 0;
-
   LIBC_PROBE (pthread_join, 1, threadid);
 
-  if ((pd == self
-       || (self->joinid == pd
-	   && (pd->cancelhandling
-	       & (CANCELED_BITMASK | EXITING_BITMASK
-		  | TERMINATED_BITMASK)) == 0))
-      && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
-	   && (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
-				     | TERMINATED_BITMASK))
-	       == CANCELED_BITMASK))
-    /* This is a deadlock situation.  The threads are waiting for each
-       other to finish.  Note that this is a "may" error.  To be 100%
-       sure we catch this error we would have to lock the data
-       structures but it is not necessary.  In the unlikely case that
-       two threads are really caught in this situation they will
-       deadlock.  It is the programmer's problem to figure this
-       out.  */
-    return EDEADLK;
-
-  /* Wait for the thread to finish.  If it is already locked something
-     is wrong.  There can only be one waiter.  */
-  else if (__glibc_unlikely (atomic_compare_exchange_weak_acquire (&pd->joinid,
-								   &self,
-								   NULL)))
-    /* There is already somebody waiting for the thread.  */
-    return EINVAL;
-
-  /* BLOCK waits either indefinitely or based on an absolute time.  POSIX also
-     states a cancellation point shall occur for pthread_join, and we use the
-     same rationale for posix_timedjoin_np.  Both clockwait_tid and the futex
-     call use the cancellable variant.  */
-  if (block)
+  int result = 0;
+  unsigned int state;
+  while ((state = atomic_load_acquire (&pd->joinstate))
+	 != THREAD_STATE_EXITED)
     {
-      /* During the wait we change to asynchronous cancellation.  If we
-	 are cancelled the thread we are waiting for must be marked as
-	 un-wait-ed for again.  */
-      pthread_cleanup_push (cleanup, &pd->joinid);
-
-      /* We need acquire MO here so that we synchronize with the
-         kernel's store to 0 when the clone terminates. (see above)  */
-      pid_t tid;
-      while ((tid = atomic_load_acquire (&pd->tid)) != 0)
-        {
-         /* The kernel notifies a process which uses CLONE_CHILD_CLEARTID via
-	    futex wake-up when the clone terminates.  The memory location
-	    contains the thread ID while the clone is running and is reset to
-	    zero by the kernel afterwards.  The kernel up to version 3.16.3
-	    does not use the private futex operations for futex wake-up when
-	    the clone terminates.  */
-	  int ret = __futex_abstimed_wait_cancelable64 (
-	    (unsigned int *) &pd->tid, tid, clockid, abstime, LLL_SHARED);
-	  if (ret == ETIMEDOUT || ret == EOVERFLOW)
-	    {
-	      result = ret;
-	      break;
-	    }
+      if (check_for_deadlock (state, pd))
+	return EDEADLK;
+
+      /* POSIX states calling pthread_join() on a non joinable thread is
+         undefined.  However, if PD is still in the cache we can still warn
+         the caller.  */
+      if (state == THREAD_STATE_DETACHED)
+	return EINVAL;
+
+      /* pthread_join() is a cancellation entrypoint and we use the same
+         rationale for pthread_timedjoin_np().
+
+	 The kernel notifies a process which uses CLONE_CHILD_CLEARTID via
+	 a memory zerouing and futex wake up when the process terminates.
+	 The futex operation is not private.  */
+      int ret = __futex_abstimed_wait_cancelable64 (&pd->joinstate, state,
+						    clockid, abstime,
+						    LLL_SHARED);
+      if (ret == ETIMEDOUT || ret == EOVERFLOW)
+	{
+	  result = ret;
+	  break;
 	}
-
-      pthread_cleanup_pop (0);
     }
 
   void *pd_result = pd->result;
-  if (__glibc_likely (result == 0))
+  if (result == 0)
     {
-      /* We mark the thread as terminated and as joined.  */
-      pd->tid = -1;
-
-      /* Store the return value if the caller is interested.  */
       if (thread_return != NULL)
 	*thread_return = pd_result;
-
-      /* Free the TCB.  */
       __nptl_free_tcb (pd);
     }
-  else
-    pd->joinid = NULL;
 
   LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd_result);
 
diff --git a/nptl/pthread_setschedparam.c b/nptl/pthread_setschedparam.c
index 70fa8378b8..3c26bbf59f 100644
--- a/nptl/pthread_setschedparam.c
+++ b/nptl/pthread_setschedparam.c
@@ -30,7 +30,8 @@ __pthread_setschedparam (pthread_t threadid, int policy,
   struct pthread *pd = (struct pthread *) threadid;
 
   /* Make sure the descriptor is valid.  */
-  if (INVALID_TD_P (pd))
+  int state = atomic_load_acquire (&pd->joinstate);
+  if (state == THREAD_STATE_EXITED || state == THREAD_STATE_EXITING)
     /* Not a valid thread handle.  */
     return ESRCH;
 
diff --git a/nptl/pthread_setschedprio.c b/nptl/pthread_setschedprio.c
index 7bb68d3231..565583fcce 100644
--- a/nptl/pthread_setschedprio.c
+++ b/nptl/pthread_setschedprio.c
@@ -30,7 +30,8 @@ __pthread_setschedprio (pthread_t threadid, int prio)
   struct pthread *pd = (struct pthread *) threadid;
 
   /* Make sure the descriptor is valid.  */
-  if (INVALID_TD_P (pd))
+  int state = atomic_load_acquire (&pd->joinstate);
+  if (state == THREAD_STATE_EXITED || state == THREAD_STATE_EXITING)
     /* Not a valid thread handle.  */
     return ESRCH;
 
diff --git a/nptl/pthread_timedjoin.c b/nptl/pthread_timedjoin.c
index 0b4026612f..5d561fb4bb 100644
--- a/nptl/pthread_timedjoin.c
+++ b/nptl/pthread_timedjoin.c
@@ -25,7 +25,7 @@ ___pthread_timedjoin_np64 (pthread_t threadid, void **thread_return,
                            const struct __timespec64 *abstime)
 {
   return __pthread_clockjoin_ex (threadid, thread_return,
-                                 CLOCK_REALTIME, abstime, true);
+                                 CLOCK_REALTIME, abstime);
 }
 
 #if __TIMESIZE == 64
diff --git a/nptl/pthread_tryjoin.c b/nptl/pthread_tryjoin.c
index fd938e8780..726f2abc68 100644
--- a/nptl/pthread_tryjoin.c
+++ b/nptl/pthread_tryjoin.c
@@ -22,15 +22,17 @@
 int
 __pthread_tryjoin_np (pthread_t threadid, void **thread_return)
 {
-  /* Return right away if the thread hasn't terminated yet.  */
-  struct pthread *pd = (struct pthread *) threadid;
-  if (pd->tid != 0)
-    return EBUSY;
+  /* The joinable state (THREAD_STATE_JOINABLE) is straigthforward since the
+     thread hasn't finished yet and trying to join might block.
+     Both detached (THREAD_STATE_DETACHED) and exiting (THREAD_STATE_EXITING)
+     might also result in a possible blocking call: a detached thread might
+     change its state to exiting and a exiting thread my take some time to
+     exit (and thus let the kernel set the state to THREAD_STATE_EXITED).  */
 
-  /* If pd->tid == 0 then lll_wait_tid will not block on futex
-     operation.  */
-  return __pthread_clockjoin_ex (threadid, thread_return, 0 /* Ignored */,
-				 NULL, false);
+  struct pthread *pd = (struct pthread *) threadid;
+  return atomic_load_acquire (&pd->joinstate) != THREAD_STATE_EXITED
+	 ? EBUSY
+	 : __pthread_clockjoin_ex (threadid, thread_return, 0, NULL);
 }
 versioned_symbol (libc, __pthread_tryjoin_np, pthread_tryjoin_np, GLIBC_2_34);
 
diff --git a/sysdeps/nptl/dl-tls_init_tp.c b/sysdeps/nptl/dl-tls_init_tp.c
index ca494dd3a5..35d54abd92 100644
--- a/sysdeps/nptl/dl-tls_init_tp.c
+++ b/sysdeps/nptl/dl-tls_init_tp.c
@@ -62,7 +62,7 @@ __tls_init_tp (void)
 
    /* Early initialization of the TCB.   */
    struct pthread *pd = THREAD_SELF;
-   pd->tid = INTERNAL_SYSCALL_CALL (set_tid_address, &pd->tid);
+   pd->tid = INTERNAL_SYSCALL_CALL (set_tid_address, &pd->joinstate);
    THREAD_SETMEM (pd, specific[0], &pd->specific_1stblock[0]);
    THREAD_SETMEM (pd, user_stack, true);
 
@@ -97,4 +97,6 @@ __tls_init_tp (void)
 
   THREAD_SETMEM (pd, cancelstate, PTHREAD_CANCEL_ENABLE);
   THREAD_SETMEM (pd, canceltype, PTHREAD_CANCEL_DEFERRED);
+
+  THREAD_SETMEM (pd, joinstate, THREAD_STATE_JOINABLE);
 }
diff --git a/sysdeps/nptl/libc_start_call_main.h b/sysdeps/nptl/libc_start_call_main.h
index 06d72c1e38..e7a7f7b4c4 100644
--- a/sysdeps/nptl/libc_start_call_main.h
+++ b/sysdeps/nptl/libc_start_call_main.h
@@ -18,6 +18,7 @@
 
 #include <atomic.h>
 #include <pthreadP.h>
+#include <futex-internal.h>
 
 _Noreturn static void
 __libc_start_call_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
@@ -65,6 +66,12 @@ __libc_start_call_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
       /* One less thread.  Decrement the counter.  If it is zero we
          terminate the entire process.  */
       result = 0;
+
+      /* For the case a thread is waiting for the main thread to finish.  */
+      struct pthread *self = THREAD_SELF;
+      atomic_store_release (&self->joinstate, THREAD_STATE_EXITED);
+      futex_wake (&self->joinstate, 1, FUTEX_SHARED);
+
       if (! atomic_decrement_and_test (&__nptl_nthreads))
         /* Not much left to do but to exit the thread, not the process.  */
 	while (1)
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index d968a91cfd..55f66a2b55 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -239,12 +239,6 @@ libc_hidden_proto (__pthread_tpp_change_priority)
 extern int __pthread_current_priority (void);
 libc_hidden_proto (__pthread_current_priority)
 
-/* This will not catch all invalid descriptors but is better than
-   nothing.  And if the test triggers the thread descriptor is
-   guaranteed to be invalid.  */
-#define INVALID_TD_P(pd) __builtin_expect ((pd)->tid <= 0, 0)
-#define INVALID_NOT_TERMINATED_TD_P(pd) __builtin_expect ((pd)->tid < 0, 0)
-
 extern void __pthread_unwind (__pthread_unwind_buf_t *__buf)
      __cleanup_fct_attribute __attribute ((__noreturn__))
 #if !defined SHARED && !IS_IN (libpthread)
@@ -537,7 +531,7 @@ libc_hidden_proto (__pthread_setcanceltype)
 extern void __pthread_testcancel (void);
 libc_hidden_proto (__pthread_testcancel)
 extern int __pthread_clockjoin_ex (pthread_t, void **, clockid_t,
-				   const struct __timespec64 *, bool)
+				   const struct __timespec64 *)
   attribute_hidden;
 extern int __pthread_sigmask (int, const sigset_t *, sigset_t *);
 libc_hidden_proto (__pthread_sigmask);
diff --git a/sysdeps/pthread/tst-thrd-detach.c b/sysdeps/pthread/tst-thrd-detach.c
index c844767748..e1906a0e10 100644
--- a/sysdeps/pthread/tst-thrd-detach.c
+++ b/sysdeps/pthread/tst-thrd-detach.c
@@ -20,14 +20,14 @@
 #include <time.h>
 #include <stdio.h>
 #include <unistd.h>
-
+#include <limits.h>
 #include <support/check.h>
 
 static int
 detach_thrd (void *arg)
 {
-  if (thrd_detach (thrd_current ()) != thrd_success)
-    FAIL_EXIT1 ("thrd_detach failed");
+  thrd_sleep (&(struct timespec) { .tv_sec = INT_MAX }, NULL);
+
   thrd_exit (thrd_success);
 }
 
@@ -36,15 +36,11 @@ do_test (void)
 {
   thrd_t id;
 
-  /* Create new thread.  */
-  if (thrd_create (&id, detach_thrd, NULL) != thrd_success)
-    FAIL_EXIT1 ("thrd_create failed");
+  TEST_COMPARE (thrd_create (&id, detach_thrd, NULL), thrd_success);
 
-  /* Give some time so the thread can finish.  */
-  thrd_sleep (&(struct timespec) {.tv_sec = 2}, NULL);
+  TEST_COMPARE (thrd_detach (id), thrd_success);
 
-  if (thrd_join (id, NULL) == thrd_success)
-    FAIL_EXIT1 ("thrd_join succeed where it should fail");
+  TEST_COMPARE (thrd_join (id, NULL), thrd_error);
 
   return 0;
 }


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [glibc/azanella/pthread-multiple-fixes] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
@ 2021-09-10 14:37 Adhemerval Zanella
  0 siblings, 0 replies; 3+ messages in thread
From: Adhemerval Zanella @ 2021-09-10 14:37 UTC (permalink / raw)
  To: glibc-cvs

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=c9246e98f3e53d07a6190f6415e03e45ea4be0b1

commit c9246e98f3e53d07a6190f6415e03e45ea4be0b1
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Jun 3 14:25:54 2021 -0300

    nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
    
    The use after free described in BZ#19951 is due the use of two different
    PD fields, 'joinid' and 'cancelhandling', to describe the thread state
    to synchronize the calls of pthread_join(), pthread_detach(),
    pthread_exit(), and normal thread exit.
    
    And any state change potentially requires to check for both fields
    atomically to handle partial state (such as pthread_join() with a
    cancellation handler to issue a 'joinstate' field rollback).
    
    This patch uses a different PD member with 4 possible states (JOINABLE,
    DETACHED, EXITING, and EXITED) instead of pthread 'tid' field:
    
      1. On pthread_create() the inital state is set either to JOINABLE or
         DETACHED depending of the pthread attribute used.
    
      2. On pthread_detach(), a CAS is issued on the state.  If the CAS
         fails it means that thread is already detached (DETACHED) or is
         being terminated (EXITING).  For former an EINVAL is returned,
         while for latter pthread_detach() should be reponsible to join the
         thread (and deallocate any internal resources).
    
      3. In the exit phase of the wrapper function for the thread start
         routine (reached either if the thread function has returned,
         pthread_exit() has being called, or cancellation handled has been
         acted upon) we issue a CAS on state to set to EXITING mode.  If the
         thread is previously on DETACHED mode the thread itself is
         responsible for arranging the deallocation of any resource,
         otherwise the thread needs to be joined (detached threads cannot
         immediately deallocate themselves)
    
      4. The clear_tid_field on 'clone' call is changed to set the new
         'state' field on thread exit (EXITED).  This state ins only
         reached at thread termination.
    
      5. The pthread_join() implementation is now simpler: the futex wait
         is done directly on thread state and there is no need to reset it
         in case of timeout (since the state is now set either by
         pthread_detach() or by the kernel on process termination).
    
    The race condition on pthread_detach is avoided with only one atomic
    operation on PD state: once the mode is set to THREAD_STATE_DETACHED
    it is up to thread itself to deallocate its memory (done on the exit
    phase at pthread_create()).
    
    Also, the INVALID_NOT_TERMINATED_TD_P is removed since a a negative
    tid is not possible and the macro is not used anywhere.
    
    This change trigger an invalid C11 thread tests: it crates a thread,
    which detaches itself, and after a timeout the creating thread checks
    if the join fails.  The issue is once thrd_join() is called the thread
    lifetime has already expired.  The test is changed so the sleep is done
    by the thread itself, so the creating thread will try to join a valid
    thread.
    
    Checked on x86_64-linux-gnu, i686-linux-gnu, aarch64-linux-gnu,
    arm-linux-gnueabihf, and powerpc64-linux-gnu.

Diff:
---
 nptl/descr.h                        |  26 +++++---
 nptl/nptl-stack.h                   |   2 +-
 nptl/pthread_cancel.c               |   3 +-
 nptl/pthread_clockjoin.c            |   2 +-
 nptl/pthread_create.c               |  47 +++++++++----
 nptl/pthread_detach.c               |  40 +++++------
 nptl/pthread_getattr_np.c           |   2 +-
 nptl/pthread_join.c                 |   2 +-
 nptl/pthread_join_common.c          | 130 +++++++++++++-----------------------
 nptl/pthread_timedjoin.c            |   2 +-
 nptl/pthread_tryjoin.c              |  18 ++---
 sysdeps/nptl/dl-tls_init_tp.c       |   4 +-
 sysdeps/nptl/libc_start_call_main.h |   7 ++
 sysdeps/nptl/pthreadP.h             |   3 +-
 sysdeps/pthread/tst-thrd-detach.c   |  16 ++---
 15 files changed, 145 insertions(+), 159 deletions(-)

diff --git a/nptl/descr.h b/nptl/descr.h
index 41ee56feb2..76522b85b3 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -123,6 +123,18 @@ struct priority_protection_data
 };
 
 
+/* Define a possible thread state on 'joinstate' field.  The value will be
+   cleared by the kernel when the thread terminates (CLONE_CHILD_CLEARTID),
+   so THREAD_STATE_EXITED must be 0.  */
+enum
+  {
+    THREAD_STATE_EXITED = 0,
+    THREAD_STATE_EXITING,
+    THREAD_STATE_JOINABLE,
+    THREAD_STATE_DETACHED,
+  };
+
+
 /* Thread descriptor data structure.  */
 struct pthread
 {
@@ -165,8 +177,7 @@ struct pthread
      GL (dl_stack_user) list.  */
   list_t list;
 
-  /* Thread ID - which is also a 'is this thread descriptor (and
-     therefore stack) used' flag.  */
+  /* Thread ID set by the kernel with CLONE_PARENT_SETTID.  */
   pid_t tid;
 
   /* Ununsed.  */
@@ -334,15 +345,8 @@ struct pthread
   hp_timing_t cpuclock_offset_ununsed;
 #endif
 
-  /* If the thread waits to join another one the ID of the latter is
-     stored here.
-
-     In case a thread is detached this field contains a pointer of the
-     TCB if the thread itself.  This is something which cannot happen
-     in normal operation.  */
-  struct pthread *joinid;
-  /* Check whether a thread is detached.  */
-#define IS_DETACHED(pd) ((pd)->joinid == (pd))
+  /* The current thread state defined by the THREAD_STATE_* enumeration.  */
+  unsigned int joinstate;
 
   /* The result of the thread function.  */
   void *result;
diff --git a/nptl/nptl-stack.h b/nptl/nptl-stack.h
index c0b8e48f70..f703ae449a 100644
--- a/nptl/nptl-stack.h
+++ b/nptl/nptl-stack.h
@@ -31,7 +31,7 @@ extern size_t __nptl_stack_cache_maxsize attribute_hidden;
 static inline bool
 __nptl_stack_in_use (struct pthread *pd)
 {
-  return pd->tid <= 0;
+  return atomic_load_relaxed (&pd->joinstate) == THREAD_STATE_EXITED;
 }
 
 /* Remove the stack ELEM from its list.  */
diff --git a/nptl/pthread_cancel.c b/nptl/pthread_cancel.c
index 868cf676a3..d09947807e 100644
--- a/nptl/pthread_cancel.c
+++ b/nptl/pthread_cancel.c
@@ -61,7 +61,8 @@ __pthread_cancel (pthread_t th)
 {
   volatile struct pthread *pd = (volatile struct pthread *) th;
 
-  if (pd->tid == 0)
+  int state = atomic_load_acquire (&pd->joinstate);
+  if (state == THREAD_STATE_EXITED || state == THREAD_STATE_EXITING)
     /* The thread has already exited on the kernel side.  Its outcome
        (regular exit, other cancelation) has already been
        determined.  */
diff --git a/nptl/pthread_clockjoin.c b/nptl/pthread_clockjoin.c
index 2d01ba03a2..76598ff0fa 100644
--- a/nptl/pthread_clockjoin.c
+++ b/nptl/pthread_clockjoin.c
@@ -30,7 +30,7 @@ ___pthread_clockjoin_np64 (pthread_t threadid, void **thread_return,
     return EINVAL;
 
   return __pthread_clockjoin_ex (threadid, thread_return,
-                                 clockid, abstime, true);
+                                 clockid, abstime);
 }
 
 #if __TIMESIZE == 64
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index eb46dfdefc..3bd9854e43 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -286,7 +286,7 @@ static int create_thread (struct pthread *pd, const struct pthread_attr *attr,
       .flags = clone_flags,
       .pidfd = (uintptr_t) &pd->tid,
       .parent_tid = (uintptr_t) &pd->tid,
-      .child_tid = (uintptr_t) &pd->tid,
+      .child_tid = (uintptr_t) &pd->joinstate,
       .stack = (uintptr_t) stackaddr,
       .stack_size = stacksize,
       .tls = (uintptr_t) tp,
@@ -351,12 +351,14 @@ start_thread (void *arg)
          and free any resource prior return to the pthread_create caller.  */
       setup_failed = pd->setup_failed == 1;
       if (setup_failed)
-	pd->joinid = NULL;
+	pd->joinstate = THREAD_STATE_JOINABLE;
 
       /* And give it up right away.  */
       lll_unlock (pd->lock, LLL_PRIVATE);
 
       if (setup_failed)
+	/* No need to clear the tid here, pthread_create() will join the
+	   thread prior returning to caller.  */
 	goto out;
     }
 
@@ -481,6 +483,23 @@ start_thread (void *arg)
      the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE.  */
   atomic_bit_set (&pd->cancelhandling, EXITING_BIT);
 
+
+  /* CONCURRENCY NOTES:
+
+     Concurrent pthread_detach() will either set state to
+     THREAD_STATE_DETACHED or wait for the thread to terminate.  The exiting
+     state set here is set so a pthread_join() wait until all the required
+     cleanup steps are done.
+
+     The 'prevstate' field will be used to determine who is responsible to
+     call __nptl_free_tcb below.  */
+
+  unsigned int prevstate;
+  do
+    prevstate = atomic_load_relaxed (&pd->joinstate);
+  while (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &prevstate,
+						THREAD_STATE_EXITING));
+
   if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
     /* This was the last thread.  */
     exit (0);
@@ -551,17 +570,17 @@ start_thread (void *arg)
       pd->setxid_futex = 0;
     }
 
-  /* If the thread is detached free the TCB.  */
-  if (IS_DETACHED (pd))
-    /* Free the TCB.  */
+  if (prevstate == THREAD_STATE_DETACHED)
     __nptl_free_tcb (pd);
 
+  pd->tid = 0;
+
 out:
   /* We cannot call '_exit' here.  '_exit' will terminate the process.
 
      The 'exit' implementation in the kernel will signal when the
      process is really dead since 'clone' got passed the CLONE_CHILD_CLEARTID
-     flag.  The 'tid' field in the TCB will be set to zero.
+     flag.  The 'joinstate' field in the TCB will be set to zero.
 
      The exit code is zero since in case all threads exit by calling
      'pthread_exit' the exit status must be 0 (zero).  */
@@ -656,10 +675,9 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
   pd->flags = ((iattr->flags & ~(ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET))
 	       | (self->flags & (ATTR_FLAG_SCHED_SET | ATTR_FLAG_POLICY_SET)));
 
-  /* Initialize the field for the ID of the thread which is waiting
-     for us.  This is a self-reference in case the thread is created
-     detached.  */
-  pd->joinid = iattr->flags & ATTR_FLAG_DETACHSTATE ? pd : NULL;
+  pd->joinstate = iattr->flags & ATTR_FLAG_DETACHSTATE
+		  ? THREAD_STATE_DETACHED
+		  : THREAD_STATE_JOINABLE;
 
   /* The debug events are inherited from the parent.  */
   pd->eventbuf = self->eventbuf;
@@ -818,10 +836,11 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
 
 	  /* Similar to pthread_join, but since thread creation has failed at
 	     startup there is no need to handle all the steps.  */
-	  pid_t tid;
-	  while ((tid = atomic_load_acquire (&pd->tid)) != 0)
-	    __futex_abstimed_wait_cancelable64 ((unsigned int *) &pd->tid,
-						tid, 0, NULL, LLL_SHARED);
+	  unsigned int state;
+	  while ((state = atomic_load_acquire (&pd->joinstate))
+                 != THREAD_STATE_EXITED)
+	    __futex_abstimed_wait_cancelable64 (&pd->joinstate, state, 0,
+                                                NULL, LLL_SHARED);
         }
 
       /* State (c) or (d) and we have ownership of PD (see CONCURRENCY
diff --git a/nptl/pthread_detach.c b/nptl/pthread_detach.c
index 3bfc3c2ff0..c0fe9618a4 100644
--- a/nptl/pthread_detach.c
+++ b/nptl/pthread_detach.c
@@ -25,32 +25,28 @@ ___pthread_detach (pthread_t th)
 {
   struct pthread *pd = (struct pthread *) th;
 
-  /* Make sure the descriptor is valid.  */
-  if (INVALID_NOT_TERMINATED_TD_P (pd))
-    /* Not a valid thread handle.  */
-    return ESRCH;
+  /* CONCURRENCY NOTES:
 
-  int result = 0;
+     Concurrent pthread_detach() will return EINVAL for the case the thread
+     is already detached (THREAD_STATE_DETACHED).  POSIX states it is
+     undefined to call pthread_detach if TH refers to a non joinable thread.
 
-  /* Mark the thread as detached.  */
-  if (atomic_compare_and_exchange_bool_acq (&pd->joinid, pd, NULL))
+     For the case the thread is being terminated (THREAD_STATE_EXITING),
+     pthread_detach() will responsible to clean up the stack.  */
+
+  unsigned int prevstate = atomic_load_relaxed (&pd->joinstate);
+  do
     {
-      /* There are two possibilities here.  First, the thread might
-	 already be detached.  In this case we return EINVAL.
-	 Otherwise there might already be a waiter.  The standard does
-	 not mention what happens in this case.  */
-      if (IS_DETACHED (pd))
-	result = EINVAL;
+      if (prevstate != THREAD_STATE_JOINABLE)
+	{
+	  if (prevstate == THREAD_STATE_DETACHED)
+	    return EINVAL;
+	  return __pthread_join (th, 0);
+	}
     }
-  else
-    /* Check whether the thread terminated meanwhile.  In this case we
-       will just free the TCB.  */
-    if ((pd->cancelhandling & EXITING_BITMASK) != 0)
-      /* Note that the code in __free_tcb makes sure each thread
-	 control block is freed only once.  */
-      __nptl_free_tcb (pd);
-
-  return result;
+  while (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &prevstate,
+						THREAD_STATE_DETACHED));
+  return 0;
 }
 versioned_symbol (libc, ___pthread_detach, pthread_detach, GLIBC_2_34);
 libc_hidden_ver (___pthread_detach, __pthread_detach)
diff --git a/nptl/pthread_getattr_np.c b/nptl/pthread_getattr_np.c
index 9d4e7ecff1..73e6137bb6 100644
--- a/nptl/pthread_getattr_np.c
+++ b/nptl/pthread_getattr_np.c
@@ -52,7 +52,7 @@ __pthread_getattr_np (pthread_t thread_id, pthread_attr_t *attr)
   iattr->flags = thread->flags;
 
   /* The thread might be detached by now.  */
-  if (IS_DETACHED (thread))
+  if (atomic_load_acquire (&thread->joinstate) == THREAD_STATE_DETACHED)
     iattr->flags |= ATTR_FLAG_DETACHSTATE;
 
   /* This is the guardsize after adjusting it.  */
diff --git a/nptl/pthread_join.c b/nptl/pthread_join.c
index d4cffccf49..37b809034a 100644
--- a/nptl/pthread_join.c
+++ b/nptl/pthread_join.c
@@ -22,7 +22,7 @@ int
 ___pthread_join (pthread_t threadid, void **thread_return)
 {
   return __pthread_clockjoin_ex (threadid, thread_return, 0 /* Ignored */,
-				 NULL, true);
+				 NULL);
 }
 versioned_symbol (libc, ___pthread_join, pthread_join, GLIBC_2_34);
 libc_hidden_ver (___pthread_join, __pthread_join)
diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
index 7303069316..978ca56af6 100644
--- a/nptl/pthread_join_common.c
+++ b/nptl/pthread_join_common.c
@@ -22,113 +22,73 @@
 #include <time.h>
 #include <futex-internal.h>
 
-static void
-cleanup (void *arg)
+/* Check for a possible deadlock situation where the threads are waiting for
+   each other to finish.  Note that this is a "may" error.  To be 100% sure we
+   catch this error we would have to lock the data structures but it is not
+   necessary.  In the unlikely case that two threads are really caught in this
+   situation they will deadlock.  It is the programmer's problem to figure
+   this out.  */
+static inline bool
+check_for_deadlock (struct pthread *pd)
 {
-  /* If we already changed the waiter ID, reset it.  The call cannot
-     fail for any reason but the thread not having done that yet so
-     there is no reason for a loop.  */
   struct pthread *self = THREAD_SELF;
-  atomic_compare_exchange_weak_acquire (&arg, &self, NULL);
+  return ((pd == self
+	   || (atomic_load_acquire (&self->joinstate) == THREAD_STATE_DETACHED
+	       && (pd->cancelhandling
+		   & (CANCELED_BITMASK | EXITING_BITMASK
+		      | TERMINATED_BITMASK)) == 0))
+	   && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
+		&& (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
+					  | TERMINATED_BITMASK))
+		== CANCELED_BITMASK));
 }
 
 int
 __pthread_clockjoin_ex (pthread_t threadid, void **thread_return,
                         clockid_t clockid,
-                        const struct __timespec64 *abstime, bool block)
+                        const struct __timespec64 *abstime)
 {
   struct pthread *pd = (struct pthread *) threadid;
 
-  /* Make sure the descriptor is valid.  */
-  if (INVALID_NOT_TERMINATED_TD_P (pd))
-    /* Not a valid thread handle.  */
-    return ESRCH;
-
-  /* Is the thread joinable?.  */
-  if (IS_DETACHED (pd))
-    /* We cannot wait for the thread.  */
-    return EINVAL;
-
-  struct pthread *self = THREAD_SELF;
-  int result = 0;
-
   LIBC_PROBE (pthread_join, 1, threadid);
 
-  if ((pd == self
-       || (self->joinid == pd
-	   && (pd->cancelhandling
-	       & (CANCELED_BITMASK | EXITING_BITMASK
-		  | TERMINATED_BITMASK)) == 0))
-      && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
-	   && (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
-				     | TERMINATED_BITMASK))
-	       == CANCELED_BITMASK))
-    /* This is a deadlock situation.  The threads are waiting for each
-       other to finish.  Note that this is a "may" error.  To be 100%
-       sure we catch this error we would have to lock the data
-       structures but it is not necessary.  In the unlikely case that
-       two threads are really caught in this situation they will
-       deadlock.  It is the programmer's problem to figure this
-       out.  */
-    return EDEADLK;
-
-  /* Wait for the thread to finish.  If it is already locked something
-     is wrong.  There can only be one waiter.  */
-  else if (__glibc_unlikely (atomic_compare_exchange_weak_acquire (&pd->joinid,
-								   &self,
-								   NULL)))
-    /* There is already somebody waiting for the thread.  */
-    return EINVAL;
-
-  /* BLOCK waits either indefinitely or based on an absolute time.  POSIX also
-     states a cancellation point shall occur for pthread_join, and we use the
-     same rationale for posix_timedjoin_np.  Both clockwait_tid and the futex
-     call use the cancellable variant.  */
-  if (block)
+  int result = 0;
+  unsigned int state;
+  while ((state = atomic_load_acquire (&pd->joinstate))
+	 != THREAD_STATE_EXITED)
     {
-      /* During the wait we change to asynchronous cancellation.  If we
-	 are cancelled the thread we are waiting for must be marked as
-	 un-wait-ed for again.  */
-      pthread_cleanup_push (cleanup, &pd->joinid);
-
-      /* We need acquire MO here so that we synchronize with the
-         kernel's store to 0 when the clone terminates. (see above)  */
-      pid_t tid;
-      while ((tid = atomic_load_acquire (&pd->tid)) != 0)
-        {
-         /* The kernel notifies a process which uses CLONE_CHILD_CLEARTID via
-	    futex wake-up when the clone terminates.  The memory location
-	    contains the thread ID while the clone is running and is reset to
-	    zero by the kernel afterwards.  The kernel up to version 3.16.3
-	    does not use the private futex operations for futex wake-up when
-	    the clone terminates.  */
-	  int ret = __futex_abstimed_wait_cancelable64 (
-	    (unsigned int *) &pd->tid, tid, clockid, abstime, LLL_SHARED);
-	  if (ret == ETIMEDOUT || ret == EOVERFLOW)
-	    {
-	      result = ret;
-	      break;
-	    }
+      if (check_for_deadlock (pd))
+	return EDEADLK;
+
+      /* POSIX states calling pthread_join() on a non joinable thread is
+         undefined.  However, if PD is still in the cache we can still warn
+         the caller.  */
+      if (state == THREAD_STATE_DETACHED)
+	return EINVAL;
+
+      /* pthread_join() is a cancellation entrypoint and we use the same
+         rationale for pthread_timedjoin_np().
+
+	 The kernel notifies a process which uses CLONE_CHILD_CLEARTID via
+	 a memory zeroing and futex wake-up when the process terminates.
+	 The futex operation is not private.  */
+      int ret = __futex_abstimed_wait_cancelable64 (&pd->joinstate, state,
+						    clockid, abstime,
+						    LLL_SHARED);
+      if (ret == ETIMEDOUT || ret == EOVERFLOW)
+	{
+	  result = ret;
+	  break;
 	}
-
-      pthread_cleanup_pop (0);
     }
 
   void *pd_result = pd->result;
-  if (__glibc_likely (result == 0))
+  if (result == 0)
     {
-      /* We mark the thread as terminated and as joined.  */
-      pd->tid = -1;
-
-      /* Store the return value if the caller is interested.  */
       if (thread_return != NULL)
 	*thread_return = pd_result;
-
-      /* Free the TCB.  */
       __nptl_free_tcb (pd);
     }
-  else
-    pd->joinid = NULL;
 
   LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd_result);
 
diff --git a/nptl/pthread_timedjoin.c b/nptl/pthread_timedjoin.c
index a30aef5903..32374170b6 100644
--- a/nptl/pthread_timedjoin.c
+++ b/nptl/pthread_timedjoin.c
@@ -24,7 +24,7 @@ ___pthread_timedjoin_np64 (pthread_t threadid, void **thread_return,
                            const struct __timespec64 *abstime)
 {
   return __pthread_clockjoin_ex (threadid, thread_return,
-                                 CLOCK_REALTIME, abstime, true);
+                                 CLOCK_REALTIME, abstime);
 }
 
 #if __TIMESIZE == 64
diff --git a/nptl/pthread_tryjoin.c b/nptl/pthread_tryjoin.c
index e31c83de9e..60130e9d64 100644
--- a/nptl/pthread_tryjoin.c
+++ b/nptl/pthread_tryjoin.c
@@ -21,15 +21,17 @@
 int
 __pthread_tryjoin_np (pthread_t threadid, void **thread_return)
 {
-  /* Return right away if the thread hasn't terminated yet.  */
-  struct pthread *pd = (struct pthread *) threadid;
-  if (pd->tid != 0)
-    return EBUSY;
+  /* The joinable state (THREAD_STATE_JOINABLE) is straigthforward since the
+     thread hasn't finished yet and trying to join might block.
+     The exiting thread (THREAD_STATE_EXITING) also migth result in ablocking
+     call: a detached thread might change its state to exiting and a exiting
+     thread my take some time to exit (and thus let the kernel set the state
+     to THREAD_STATE_EXITED).  */
 
-  /* If pd->tid == 0 then lll_wait_tid will not block on futex
-     operation.  */
-  return __pthread_clockjoin_ex (threadid, thread_return, 0 /* Ignored */,
-				 NULL, false);
+  struct pthread *pd = (struct pthread *) threadid;
+  return atomic_load_acquire (&pd->joinstate) != THREAD_STATE_EXITED
+	 ? EBUSY
+	 : __pthread_clockjoin_ex (threadid, thread_return, 0, NULL);
 }
 versioned_symbol (libc, __pthread_tryjoin_np, pthread_tryjoin_np, GLIBC_2_34);
 
diff --git a/sysdeps/nptl/dl-tls_init_tp.c b/sysdeps/nptl/dl-tls_init_tp.c
index ca494dd3a5..35d54abd92 100644
--- a/sysdeps/nptl/dl-tls_init_tp.c
+++ b/sysdeps/nptl/dl-tls_init_tp.c
@@ -62,7 +62,7 @@ __tls_init_tp (void)
 
    /* Early initialization of the TCB.   */
    struct pthread *pd = THREAD_SELF;
-   pd->tid = INTERNAL_SYSCALL_CALL (set_tid_address, &pd->tid);
+   pd->tid = INTERNAL_SYSCALL_CALL (set_tid_address, &pd->joinstate);
    THREAD_SETMEM (pd, specific[0], &pd->specific_1stblock[0]);
    THREAD_SETMEM (pd, user_stack, true);
 
@@ -97,4 +97,6 @@ __tls_init_tp (void)
 
   THREAD_SETMEM (pd, cancelstate, PTHREAD_CANCEL_ENABLE);
   THREAD_SETMEM (pd, canceltype, PTHREAD_CANCEL_DEFERRED);
+
+  THREAD_SETMEM (pd, joinstate, THREAD_STATE_JOINABLE);
 }
diff --git a/sysdeps/nptl/libc_start_call_main.h b/sysdeps/nptl/libc_start_call_main.h
index 06d72c1e38..e7a7f7b4c4 100644
--- a/sysdeps/nptl/libc_start_call_main.h
+++ b/sysdeps/nptl/libc_start_call_main.h
@@ -18,6 +18,7 @@
 
 #include <atomic.h>
 #include <pthreadP.h>
+#include <futex-internal.h>
 
 _Noreturn static void
 __libc_start_call_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
@@ -65,6 +66,12 @@ __libc_start_call_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
       /* One less thread.  Decrement the counter.  If it is zero we
          terminate the entire process.  */
       result = 0;
+
+      /* For the case a thread is waiting for the main thread to finish.  */
+      struct pthread *self = THREAD_SELF;
+      atomic_store_release (&self->joinstate, THREAD_STATE_EXITED);
+      futex_wake (&self->joinstate, 1, FUTEX_SHARED);
+
       if (! atomic_decrement_and_test (&__nptl_nthreads))
         /* Not much left to do but to exit the thread, not the process.  */
 	while (1)
diff --git a/sysdeps/nptl/pthreadP.h b/sysdeps/nptl/pthreadP.h
index aebddb361b..8dfd55e87d 100644
--- a/sysdeps/nptl/pthreadP.h
+++ b/sysdeps/nptl/pthreadP.h
@@ -242,7 +242,6 @@ libc_hidden_proto (__pthread_current_priority)
    nothing.  And if the test triggers the thread descriptor is
    guaranteed to be invalid.  */
 #define INVALID_TD_P(pd) __builtin_expect ((pd)->tid <= 0, 0)
-#define INVALID_NOT_TERMINATED_TD_P(pd) __builtin_expect ((pd)->tid < 0, 0)
 
 extern void __pthread_unwind (__pthread_unwind_buf_t *__buf)
      __cleanup_fct_attribute __attribute ((__noreturn__))
@@ -536,7 +535,7 @@ libc_hidden_proto (__pthread_setcanceltype)
 extern void __pthread_testcancel (void);
 libc_hidden_proto (__pthread_testcancel)
 extern int __pthread_clockjoin_ex (pthread_t, void **, clockid_t,
-				   const struct __timespec64 *, bool)
+				   const struct __timespec64 *)
   attribute_hidden;
 extern int __pthread_sigmask (int, const sigset_t *, sigset_t *);
 libc_hidden_proto (__pthread_sigmask);
diff --git a/sysdeps/pthread/tst-thrd-detach.c b/sysdeps/pthread/tst-thrd-detach.c
index c844767748..e1906a0e10 100644
--- a/sysdeps/pthread/tst-thrd-detach.c
+++ b/sysdeps/pthread/tst-thrd-detach.c
@@ -20,14 +20,14 @@
 #include <time.h>
 #include <stdio.h>
 #include <unistd.h>
-
+#include <limits.h>
 #include <support/check.h>
 
 static int
 detach_thrd (void *arg)
 {
-  if (thrd_detach (thrd_current ()) != thrd_success)
-    FAIL_EXIT1 ("thrd_detach failed");
+  thrd_sleep (&(struct timespec) { .tv_sec = INT_MAX }, NULL);
+
   thrd_exit (thrd_success);
 }
 
@@ -36,15 +36,11 @@ do_test (void)
 {
   thrd_t id;
 
-  /* Create new thread.  */
-  if (thrd_create (&id, detach_thrd, NULL) != thrd_success)
-    FAIL_EXIT1 ("thrd_create failed");
+  TEST_COMPARE (thrd_create (&id, detach_thrd, NULL), thrd_success);
 
-  /* Give some time so the thread can finish.  */
-  thrd_sleep (&(struct timespec) {.tv_sec = 2}, NULL);
+  TEST_COMPARE (thrd_detach (id), thrd_success);
 
-  if (thrd_join (id, NULL) == thrd_success)
-    FAIL_EXIT1 ("thrd_join succeed where it should fail");
+  TEST_COMPARE (thrd_join (id, NULL), thrd_error);
 
   return 0;
 }


^ permalink raw reply	[flat|nested] 3+ messages in thread

* [glibc/azanella/pthread-multiple-fixes] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
@ 2021-06-08 20:48 Adhemerval Zanella
  0 siblings, 0 replies; 3+ messages in thread
From: Adhemerval Zanella @ 2021-06-08 20:48 UTC (permalink / raw)
  To: glibc-cvs

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=04acef923704890d47076bc14f145adede02a3c9

commit 04acef923704890d47076bc14f145adede02a3c9
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Thu Jun 3 14:25:54 2021 -0300

    nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951)
    
    The use after free described in BZ #19951 is due the use of the
    set_tid_address address to synchronize the thread state (joined,
    detached, terminated).
    
    First point is both thread state and tid are logically disjoined fields,
    so on every state change operation (pthread_detach, pthread_join,
    and pthread_exit/thread exit) we potentially need to check for both
    fields and handle possible partial states (for instancfe on pthread_join
    which first sets the PD joinid field, wait for kernel, and resets
    the joinid in case of failure).
    
    Second point is we do not really need the kernel exit syscall to
    advertise through set_tid_address the thread has finish on
    pthread_join().  The synchronization is only required to setup which
    thread would be responsible to cleanup the allocated resources and
    the stack cache algorithm already handles the asynchronous case where
    the stack is not readily available (the detached thread mode).
    
    So this patch uses a different PD member with 4 possible states
    (JOINABLE, DETACHED, EXITING, and EXITED) to synchronize between
    pthread_create, pthread_join, pthread_detached, and pthread_exit:
    
      1. On pthread_create() the inital state is iset either to JOINABLE or
         DETACHED depending of the pthread attribute used.
    
      2. On pthread_detach() call, a CAS is done on the state.  If the
         CAS fails it means that thread is already detached (DETACHED)
         or is being terminated (EXITING or EXITED).  For former and EINVAL
         is return, for latter pthread_detach() should be reponsible to join
         the thread and deallocate the resources.
    
      3. On pthread_create() exit phase (reached either if the thread
         function has returned, pthread_exit() has being called, or
         cancellation handled has been acted upon) we issue a CAS on state
         to set to EXITING Mode.  If the thread is previous on DETACHED
         mode it will be the thread itself responsible to deallocate any
         resource, otherwise the threads needs to be joined.  In the final
         exit phase the state is set to EXITED and a futex wake is issued
         (to synchronize with pthread_join() functions).
    
         The EXITING mode exists to avoid a pthread_detach() to wrongly
         setup the mode while the thread calling the required exit
         functions, but has not while deallocate the stack.
    
      4. Finally the pthread_join() implementation is much simpler: the
         futex wait is done directly on thread state and there is no need to
         reset it in case of TIMEOUT (since the state is now set either by
         pthread_detached or pthread_create exit phase).
    
         Another advantage is we can now use private futex operation even
         on older kernels.
    
    The race condition on pthread_detach is avoided with an atomic
    operation on PD state: once the mode is set to THREAD_STATE_DETACHED
    it is up to thread itself to deallocate its memory (done on the exit
    phase at pthread_join()).
    
    Finally the robust mutexes should now be handled at userland on thread
    exit path so the pthread_join() see the result.  We could avoid is by
    adding another thread synchronization on pthread_join() to wait for
    kernel CLONE_CHILD_CLEARTID, however the usual case is to not have
    robust mutexes.
    
    Checked on x86_64-linux-gnu and aarch64-linux-gnu.

Diff:
---
 nptl/descr.h                        |  22 ++++---
 nptl/pthreadP.h                     |   2 +-
 nptl/pthread_clockjoin.c            |   2 +-
 nptl/pthread_create.c               |  56 ++++++++++------
 nptl/pthread_detach.c               |  39 +++++------
 nptl/pthread_getattr_np.c           |   2 +-
 nptl/pthread_join.c                 |   2 +-
 nptl/pthread_join_common.c          | 125 ++++++++++++------------------------
 nptl/pthread_timedjoin.c            |   2 +-
 nptl/pthread_tryjoin.c              |  17 ++---
 sysdeps/nptl/dl-tls_init_tp.c       |   2 +
 sysdeps/nptl/libc_start_call_main.h |   7 ++
 sysdeps/pthread/tst-thrd-detach.c   |  14 ++--
 13 files changed, 135 insertions(+), 157 deletions(-)

diff --git a/nptl/descr.h b/nptl/descr.h
index c85778d449..44aa0c53be 100644
--- a/nptl/descr.h
+++ b/nptl/descr.h
@@ -124,6 +124,16 @@ struct priority_protection_data
 };
 
 
+/* Define a possible thread detached state on 'joinstate' field.  */
+enum
+  {
+    THREAD_STATE_EXITED = 0,
+    THREAD_STATE_EXITING,
+    THREAD_STATE_JOINABLE,
+    THREAD_STATE_DETACHED,
+  };
+
+
 /* Thread descriptor data structure.  */
 struct pthread
 {
@@ -335,15 +345,9 @@ struct pthread
   hp_timing_t cpuclock_offset_ununsed;
 #endif
 
-  /* If the thread waits to join another one the ID of the latter is
-     stored here.
-
-     In case a thread is detached this field contains a pointer of the
-     TCB if the thread itself.  This is something which cannot happen
-     in normal operation.  */
-  struct pthread *joinid;
-  /* Check whether a thread is detached.  */
-#define IS_DETACHED(pd) ((pd)->joinid == (pd))
+  /* Indicate whether a thread is join, detached, exiting, or exit mode
+     (as indicate by PTHREAD_STATE_* enum above).  */
+  unsigned int joinstate;
 
   /* The result of the thread function.  */
   void *result;
diff --git a/nptl/pthreadP.h b/nptl/pthreadP.h
index ef0b367120..bfb0e40b44 100644
--- a/nptl/pthreadP.h
+++ b/nptl/pthreadP.h
@@ -536,7 +536,7 @@ libc_hidden_proto (__pthread_setcanceltype)
 extern void __pthread_testcancel (void);
 libc_hidden_proto (__pthread_testcancel)
 extern int __pthread_clockjoin_ex (pthread_t, void **, clockid_t,
-				   const struct __timespec64 *, bool)
+				   const struct __timespec64 *)
   attribute_hidden;
 extern int __pthread_sigmask (int, const sigset_t *, sigset_t *);
 libc_hidden_proto (__pthread_sigmask);
diff --git a/nptl/pthread_clockjoin.c b/nptl/pthread_clockjoin.c
index f5007d7831..8798f26fcf 100644
--- a/nptl/pthread_clockjoin.c
+++ b/nptl/pthread_clockjoin.c
@@ -30,7 +30,7 @@ ___pthread_clockjoin_np64 (pthread_t threadid, void **thread_return,
     return EINVAL;
 
   return __pthread_clockjoin_ex (threadid, thread_return,
-                                 clockid, abstime, true);
+                                 clockid, abstime);
 }
 
 #if __TIMESIZE == 64
diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c
index 5ade1bd461..b6e3198e16 100644
--- a/nptl/pthread_create.c
+++ b/nptl/pthread_create.c
@@ -343,7 +343,7 @@ start_thread (void *arg)
          and free any resource prior return to the pthread_create caller.  */
       setup_failed = pd->setup_failed == 1;
       if (setup_failed)
-	pd->joinid = NULL;
+	pd->joinstate = THREAD_STATE_JOINABLE;
 
       /* And give it up right away.  */
       lll_unlock (pd->lock, LLL_PRIVATE);
@@ -473,16 +473,24 @@ start_thread (void *arg)
      the breakpoint reports TD_THR_RUN state rather than TD_THR_ZOMBIE.  */
   atomic_bit_set (&pd->cancelhandling, EXITING_BIT);
 
-  if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
-    /* This was the last thread.  */
-    exit (0);
 
-#ifndef __ASSUME_SET_ROBUST_LIST
-  /* If this thread has any robust mutexes locked, handle them now.  */
+  /* CONCURRENCY NOTES:
+
+     Concurrent pthread_detach() will either set state to
+     THREAD_STATE_DETACHED or wait the thread to terminate.  The exiting state
+     set here is set so a pthread_join() wait until all the required cleanup
+     steps are done.
+
+     The joinstate will be used to determine who is reponsible to call
+     __nptl_free_tcb below.  */
+
+  unsigned int joinstate = THREAD_STATE_JOINABLE;
+  atomic_compare_exchange_weak_acquire (&pd->joinstate, &joinstate,
+					THREAD_STATE_EXITING);
+
+  /* We handle robuts mutexes on exit thread path so pthread_join() see the
+     states when the thread is joined.  */
   void **robust;
-  /* We let the kernel do the notification if it is able to do so.
-     If we have to do it here there for sure are no PI mutexes involved
-     since the kernel support for them is even more recent.  */
   while ((robust = pd->robust_head.list)
 	 && robust != (void *) &pd->robust_head)
     {
@@ -511,7 +519,10 @@ start_thread (void *arg)
 	    futex_wake ((unsigned int *) &mtx->__lock, 1, shared);
 	}
     }
-#endif
+
+  if (__glibc_unlikely (atomic_decrement_and_test (&__nptl_nthreads)))
+    /* This was the last thread.  */
+    exit (0);
 
   if (!pd->user_stack)
     advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd,
@@ -533,12 +544,16 @@ start_thread (void *arg)
       pd->setxid_futex = 0;
     }
 
-  /* If the thread is detached free the TCB.  */
-  if (IS_DETACHED (pd))
-    /* Free the TCB.  */
+  if (joinstate == THREAD_STATE_DETACHED)
     __nptl_free_tcb (pd);
-
+  else
+    {
 out:
+      /* Wake up a pthread_join() call.  */
+      atomic_store_release (&pd->joinstate, THREAD_STATE_EXITED);
+      futex_wake (&pd->joinstate, 1, FUTEX_PRIVATE);
+    }
+
   /* We cannot call '_exit' here.  '_exit' will terminate the process.
 
      The 'exit' implementation in the kernel will signal when the
@@ -640,7 +655,9 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
   /* Initialize the field for the ID of the thread which is waiting
      for us.  This is a self-reference in case the thread is created
      detached.  */
-  pd->joinid = iattr->flags & ATTR_FLAG_DETACHSTATE ? pd : NULL;
+  pd->joinstate = iattr->flags & ATTR_FLAG_DETACHSTATE
+		  ? THREAD_STATE_DETACHED
+		  : THREAD_STATE_JOINABLE;
 
   /* The debug events are inherited from the parent.  */
   pd->eventbuf = self->eventbuf;
@@ -799,10 +816,11 @@ __pthread_create_2_1 (pthread_t *newthread, const pthread_attr_t *attr,
 
 	  /* Similar to pthread_join, but since thread creation has failed at
 	     startup there is no need to handle all the steps.  */
-	  pid_t tid;
-	  while ((tid = atomic_load_acquire (&pd->tid)) != 0)
-	    __futex_abstimed_wait_cancelable64 ((unsigned int *) &pd->tid,
-						tid, 0, NULL, LLL_SHARED);
+	  unsigned int state;
+	  while ((state = atomic_load_acquire (&pd->joinstate))
+                 != THREAD_STATE_EXITED)
+	    __futex_abstimed_wait_cancelable64 (&pd->joinstate, state, 0,
+                                                NULL, LLL_PRIVATE);
         }
 
       /* State (c) or (d) and we have ownership of PD (see CONCURRENCY
diff --git a/nptl/pthread_detach.c b/nptl/pthread_detach.c
index ac50db9b0e..69e72b9443 100644
--- a/nptl/pthread_detach.c
+++ b/nptl/pthread_detach.c
@@ -26,32 +26,27 @@ ___pthread_detach (pthread_t th)
 {
   struct pthread *pd = (struct pthread *) th;
 
-  /* Make sure the descriptor is valid.  */
-  if (INVALID_NOT_TERMINATED_TD_P (pd))
-    /* Not a valid thread handle.  */
-    return ESRCH;
+  /* CONCURRENCY NOTES:
 
-  int result = 0;
+     Concurrent pthread_detach() will return EINVAL for the case the thread
+     is already detached (THREAD_STATE_DETACHED).  POSIX states it is
+     undefined to call pthread_detach if TH refers to a non joinable thread.
 
-  /* Mark the thread as detached.  */
-  if (atomic_compare_and_exchange_bool_acq (&pd->joinid, pd, NULL))
+     For the case the thread is being terminated (THREAD_STATE_EXITING)
+     or terminated (THREAD_STATE_EXITED), its resource are not being
+     deallocated (since thread as previously set to THREAD_STATE_JOINABLE).
+     In this case pthread_detach() will responsible to clean up the
+     stack.  */
+
+  int curstate = THREAD_STATE_JOINABLE;
+  if (!atomic_compare_exchange_weak_acquire (&pd->joinstate, &curstate,
+					     THREAD_STATE_DETACHED))
     {
-      /* There are two possibilities here.  First, the thread might
-	 already be detached.  In this case we return EINVAL.
-	 Otherwise there might already be a waiter.  The standard does
-	 not mention what happens in this case.  */
-      if (IS_DETACHED (pd))
-	result = EINVAL;
+      if (curstate == THREAD_STATE_DETACHED)
+	return EINVAL;
+      return __pthread_join (th, 0);
     }
-  else
-    /* Check whether the thread terminated meanwhile.  In this case we
-       will just free the TCB.  */
-    if ((pd->cancelhandling & EXITING_BITMASK) != 0)
-      /* Note that the code in __free_tcb makes sure each thread
-	 control block is freed only once.  */
-      __nptl_free_tcb (pd);
-
-  return result;
+  return 0;
 }
 versioned_symbol (libc, ___pthread_detach, pthread_detach, GLIBC_2_34);
 libc_hidden_ver (___pthread_detach, __pthread_detach)
diff --git a/nptl/pthread_getattr_np.c b/nptl/pthread_getattr_np.c
index 25807cb529..560ba9ab98 100644
--- a/nptl/pthread_getattr_np.c
+++ b/nptl/pthread_getattr_np.c
@@ -53,7 +53,7 @@ __pthread_getattr_np (pthread_t thread_id, pthread_attr_t *attr)
   iattr->flags = thread->flags;
 
   /* The thread might be detached by now.  */
-  if (IS_DETACHED (thread))
+  if (atomic_load_acquire (&thread->joinstate) == THREAD_STATE_DETACHED)
     iattr->flags |= ATTR_FLAG_DETACHSTATE;
 
   /* This is the guardsize after adjusting it.  */
diff --git a/nptl/pthread_join.c b/nptl/pthread_join.c
index d2b33de73d..195a537029 100644
--- a/nptl/pthread_join.c
+++ b/nptl/pthread_join.c
@@ -23,7 +23,7 @@ int
 ___pthread_join (pthread_t threadid, void **thread_return)
 {
   return __pthread_clockjoin_ex (threadid, thread_return, 0 /* Ignored */,
-				 NULL, true);
+				 NULL);
 }
 versioned_symbol (libc, ___pthread_join, pthread_join, GLIBC_2_34);
 libc_hidden_ver (___pthread_join, __pthread_join)
diff --git a/nptl/pthread_join_common.c b/nptl/pthread_join_common.c
index 7303069316..35b725b7de 100644
--- a/nptl/pthread_join_common.c
+++ b/nptl/pthread_join_common.c
@@ -22,113 +22,68 @@
 #include <time.h>
 #include <futex-internal.h>
 
-static void
-cleanup (void *arg)
+/* Check for a possible deadlock situation.  The threads are waiting for each
+   other to finish.  Note that this is a "may" error.  To be 100% sure we
+   catch this error we would have to lock the data structures but it is not
+   necessary.  In the unlikely case that two threads are really caught in this
+   situation they will deadlock.  It is the programmer's problem to figure
+   this  out.  */
+static inline bool
+check_deadlock (int state, struct pthread *pd)
 {
-  /* If we already changed the waiter ID, reset it.  The call cannot
-     fail for any reason but the thread not having done that yet so
-     there is no reason for a loop.  */
   struct pthread *self = THREAD_SELF;
-  atomic_compare_exchange_weak_acquire (&arg, &self, NULL);
+  return ((pd == self
+	   || (atomic_load_acquire (&self->joinstate) == THREAD_STATE_DETACHED
+	       && (pd->cancelhandling
+		   & (CANCELED_BITMASK | EXITING_BITMASK
+		      | TERMINATED_BITMASK)) == 0))
+	   && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
+		&& (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
+					  | TERMINATED_BITMASK))
+		== CANCELED_BITMASK));
 }
 
 int
 __pthread_clockjoin_ex (pthread_t threadid, void **thread_return,
                         clockid_t clockid,
-                        const struct __timespec64 *abstime, bool block)
+                        const struct __timespec64 *abstime)
 {
   struct pthread *pd = (struct pthread *) threadid;
 
-  /* Make sure the descriptor is valid.  */
-  if (INVALID_NOT_TERMINATED_TD_P (pd))
-    /* Not a valid thread handle.  */
-    return ESRCH;
-
-  /* Is the thread joinable?.  */
-  if (IS_DETACHED (pd))
-    /* We cannot wait for the thread.  */
-    return EINVAL;
-
-  struct pthread *self = THREAD_SELF;
-  int result = 0;
-
   LIBC_PROBE (pthread_join, 1, threadid);
 
-  if ((pd == self
-       || (self->joinid == pd
-	   && (pd->cancelhandling
-	       & (CANCELED_BITMASK | EXITING_BITMASK
-		  | TERMINATED_BITMASK)) == 0))
-      && !(self->cancelstate == PTHREAD_CANCEL_ENABLE
-	   && (pd->cancelhandling & (CANCELED_BITMASK | EXITING_BITMASK
-				     | TERMINATED_BITMASK))
-	       == CANCELED_BITMASK))
-    /* This is a deadlock situation.  The threads are waiting for each
-       other to finish.  Note that this is a "may" error.  To be 100%
-       sure we catch this error we would have to lock the data
-       structures but it is not necessary.  In the unlikely case that
-       two threads are really caught in this situation they will
-       deadlock.  It is the programmer's problem to figure this
-       out.  */
-    return EDEADLK;
-
-  /* Wait for the thread to finish.  If it is already locked something
-     is wrong.  There can only be one waiter.  */
-  else if (__glibc_unlikely (atomic_compare_exchange_weak_acquire (&pd->joinid,
-								   &self,
-								   NULL)))
-    /* There is already somebody waiting for the thread.  */
-    return EINVAL;
-
-  /* BLOCK waits either indefinitely or based on an absolute time.  POSIX also
-     states a cancellation point shall occur for pthread_join, and we use the
-     same rationale for posix_timedjoin_np.  Both clockwait_tid and the futex
-     call use the cancellable variant.  */
-  if (block)
+  int result = 0;
+  unsigned int state;
+  while ((state = atomic_load_acquire (&pd->joinstate))
+	 != THREAD_STATE_EXITED)
     {
-      /* During the wait we change to asynchronous cancellation.  If we
-	 are cancelled the thread we are waiting for must be marked as
-	 un-wait-ed for again.  */
-      pthread_cleanup_push (cleanup, &pd->joinid);
-
-      /* We need acquire MO here so that we synchronize with the
-         kernel's store to 0 when the clone terminates. (see above)  */
-      pid_t tid;
-      while ((tid = atomic_load_acquire (&pd->tid)) != 0)
-        {
-         /* The kernel notifies a process which uses CLONE_CHILD_CLEARTID via
-	    futex wake-up when the clone terminates.  The memory location
-	    contains the thread ID while the clone is running and is reset to
-	    zero by the kernel afterwards.  The kernel up to version 3.16.3
-	    does not use the private futex operations for futex wake-up when
-	    the clone terminates.  */
-	  int ret = __futex_abstimed_wait_cancelable64 (
-	    (unsigned int *) &pd->tid, tid, clockid, abstime, LLL_SHARED);
-	  if (ret == ETIMEDOUT || ret == EOVERFLOW)
-	    {
-	      result = ret;
-	      break;
-	    }
+      if (check_deadlock (state, pd))
+	return EDEADLK;
+
+      /* POSIX states calling pthread_join() on a thread identicator that does
+	 not refer to a joinble threads is undefined, however if PD is still
+	 in the cache we can still warn caller.  */
+      if (state == THREAD_STATE_DETACHED)
+	return EINVAL;
+
+      /* pthread_join is a cancellation entrypoint.  */
+      int ret = __futex_abstimed_wait_cancelable64 (&pd->joinstate, state,
+						    clockid, abstime,
+						    LLL_PRIVATE);
+      if (ret == ETIMEDOUT || ret == EOVERFLOW)
+	{
+	  result = ret;
+	  break;
 	}
-
-      pthread_cleanup_pop (0);
     }
 
   void *pd_result = pd->result;
-  if (__glibc_likely (result == 0))
+  if (result == 0)
     {
-      /* We mark the thread as terminated and as joined.  */
-      pd->tid = -1;
-
-      /* Store the return value if the caller is interested.  */
       if (thread_return != NULL)
 	*thread_return = pd_result;
-
-      /* Free the TCB.  */
       __nptl_free_tcb (pd);
     }
-  else
-    pd->joinid = NULL;
 
   LIBC_PROBE (pthread_join_ret, 3, threadid, result, pd_result);
 
diff --git a/nptl/pthread_timedjoin.c b/nptl/pthread_timedjoin.c
index ebc31f935a..b35c8f4279 100644
--- a/nptl/pthread_timedjoin.c
+++ b/nptl/pthread_timedjoin.c
@@ -25,7 +25,7 @@ ___pthread_timedjoin_np64 (pthread_t threadid, void **thread_return,
                            const struct __timespec64 *abstime)
 {
   return __pthread_clockjoin_ex (threadid, thread_return,
-                                 CLOCK_REALTIME, abstime, true);
+                                 CLOCK_REALTIME, abstime);
 }
 
 #if __TIMESIZE == 64
diff --git a/nptl/pthread_tryjoin.c b/nptl/pthread_tryjoin.c
index fd938e8780..8b2654e772 100644
--- a/nptl/pthread_tryjoin.c
+++ b/nptl/pthread_tryjoin.c
@@ -22,15 +22,16 @@
 int
 __pthread_tryjoin_np (pthread_t threadid, void **thread_return)
 {
-  /* Return right away if the thread hasn't terminated yet.  */
-  struct pthread *pd = (struct pthread *) threadid;
-  if (pd->tid != 0)
-    return EBUSY;
+  /* The THREAD_STATE_JOINABLE is straigthforward (the thread hasn't finished
+     yet).  Both THREAD_STATE_DETACHED and THREAD_STATE_EXITING might result
+     in a possible blocking call: a detached thread might change its state to
+     exiting and a exiting thread my take some time to change to
+     THREAD_STATE_EXITED.  */
 
-  /* If pd->tid == 0 then lll_wait_tid will not block on futex
-     operation.  */
-  return __pthread_clockjoin_ex (threadid, thread_return, 0 /* Ignored */,
-				 NULL, false);
+  struct pthread *pd = (struct pthread *) threadid;
+  return atomic_load_acquire (&pd->joinstate) != THREAD_STATE_EXITED
+	 ? EBUSY
+	 : __pthread_clockjoin_ex (threadid, thread_return, 0, NULL);
 }
 versioned_symbol (libc, __pthread_tryjoin_np, pthread_tryjoin_np, GLIBC_2_34);
 
diff --git a/sysdeps/nptl/dl-tls_init_tp.c b/sysdeps/nptl/dl-tls_init_tp.c
index b7b3bb1cdb..8692ac2289 100644
--- a/sysdeps/nptl/dl-tls_init_tp.c
+++ b/sysdeps/nptl/dl-tls_init_tp.c
@@ -97,4 +97,6 @@ __tls_init_tp (void)
 
   THREAD_SETMEM (pd, cancelstate, PTHREAD_CANCEL_ENABLE);
   THREAD_SETMEM (pd, canceltype, PTHREAD_CANCEL_DEFERRED);
+
+  THREAD_SETMEM (pd, joinstate, THREAD_STATE_JOINABLE);
 }
diff --git a/sysdeps/nptl/libc_start_call_main.h b/sysdeps/nptl/libc_start_call_main.h
index b56bf34325..656130b1ac 100644
--- a/sysdeps/nptl/libc_start_call_main.h
+++ b/sysdeps/nptl/libc_start_call_main.h
@@ -17,6 +17,7 @@
    <https://www.gnu.org/licenses/>.  */
 
 #include <atomic.h>
+#include <futex-internal.h>
 #include <nptl/pthreadP.h>
 
 _Noreturn static void
@@ -65,6 +66,12 @@ __libc_start_call_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL),
       /* One less thread.  Decrement the counter.  If it is zero we
          terminate the entire process.  */
       result = 0;
+
+      /* For the case a thread is waiting for the main thread to finish.  */
+      struct pthread *self = THREAD_SELF;
+      atomic_store_release (&self->joinstate, THREAD_STATE_EXITED);
+      futex_wake (&self->joinstate, 1, FUTEX_PRIVATE);
+
       if (! atomic_decrement_and_test (&__nptl_nthreads))
         /* Not much left to do but to exit the thread, not the process.  */
 	while (1)
diff --git a/sysdeps/pthread/tst-thrd-detach.c b/sysdeps/pthread/tst-thrd-detach.c
index c844767748..196768d96c 100644
--- a/sysdeps/pthread/tst-thrd-detach.c
+++ b/sysdeps/pthread/tst-thrd-detach.c
@@ -26,8 +26,8 @@
 static int
 detach_thrd (void *arg)
 {
-  if (thrd_detach (thrd_current ()) != thrd_success)
-    FAIL_EXIT1 ("thrd_detach failed");
+  thrd_sleep (&(struct timespec) { .tv_sec = 10 }, NULL);
+
   thrd_exit (thrd_success);
 }
 
@@ -36,15 +36,11 @@ do_test (void)
 {
   thrd_t id;
 
-  /* Create new thread.  */
-  if (thrd_create (&id, detach_thrd, NULL) != thrd_success)
-    FAIL_EXIT1 ("thrd_create failed");
+  TEST_COMPARE (thrd_create (&id, detach_thrd, NULL), thrd_success);
 
-  /* Give some time so the thread can finish.  */
-  thrd_sleep (&(struct timespec) {.tv_sec = 2}, NULL);
+  TEST_COMPARE (thrd_detach (id), thrd_success);
 
-  if (thrd_join (id, NULL) == thrd_success)
-    FAIL_EXIT1 ("thrd_join succeed where it should fail");
+  TEST_COMPARE (thrd_join (id, NULL), thrd_error);
 
   return 0;
 }


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-09-10 14:37 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-20 17:55 [glibc/azanella/pthread-multiple-fixes] nptl: Do not use pthread set_tid_address as state synchronization (BZ #19951) Adhemerval Zanella
  -- strict thread matches above, loose matches on Subject: below --
2021-09-10 14:37 Adhemerval Zanella
2021-06-08 20:48 Adhemerval Zanella

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).