public inbox for libc-alpha@sourceware.org
* [PATCH] powerpc: Fix race condition in lock elision
@ 2016-11-16 15:54 Rajalakshmi Srinivasaraghavan
  2016-11-17 14:51 ` Torvald Riegel
  0 siblings, 1 reply; 21+ messages in thread
From: Rajalakshmi Srinivasaraghavan @ 2016-11-16 15:54 UTC (permalink / raw)
  To: libc-alpha; +Cc: sjmunroe, Rajalakshmi Srinivasaraghavan

The update of *adapt_count after the release of the lock causes a race
condition: thread A unlocks, thread B continues and returns, destroying
the mutex on its stack, and then enters another function; thread A then
writes to *adapt_count and corrupts the stack.
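The interleaving can be sketched as follows (an illustrative standalone sketch, not the glibc sources; the struct and function names are hypothetical):

```c
/* Illustrative sketch of the bug described above (not glibc code).  */
typedef struct { int lock; short adapt_count; } elided_mutex;

/* Buggy order: the lock is released first, so another thread may acquire
   it, return, and destroy/reuse the mutex memory before adapt_count is
   written.  */
void
unlock_buggy (elided_mutex *m)
{
  m->lock = 0;               /* Thread B may now destroy *m ...    */
  if (m->adapt_count > 0)    /* ... making this a write to freed   */
    m->adapt_count--;        /* or reused (stack) memory.          */
}

/* Fixed order: adapt_count is updated while the lock is still held, so
   the mutex cannot be destroyed concurrently.  */
void
unlock_fixed (elided_mutex *m)
{
  if (m->adapt_count > 0)
    m->adapt_count--;
  m->lock = 0;               /* Last access to *m.  */
}
```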

2016-11-16  Rajalakshmi Srinivasaraghavan  <raji@linux.vnet.ibm.com>

	[BZ #20822]
	* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
	(__lll_unlock_elision):  Update adapt_count variable
	inside the critical section.
---
 sysdeps/unix/sysv/linux/powerpc/elision-unlock.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
index 43c5a67..e5e1afd 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
@@ -28,13 +28,12 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
     __libc_tend (0);
   else
     {
-      lll_unlock ((*lock), pshared);
-
-      /* Update the adapt count AFTER completing the critical section.
-         Doing this here prevents unneeded stalling when entering
-         a critical section.  Saving about 8% runtime on P8.  */
+      /* Update the adapt count in the critical section to
+         prevent a race condition as mentioned in BZ 20822.  */
       if (*adapt_count > 0)
 	(*adapt_count)--;
+      lll_unlock ((*lock), pshared);
+
     }
   return 0;
 }
-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] powerpc: Fix race condition in lock elision
  2016-11-16 15:54 [PATCH] powerpc: Fix race condition in lock elision Rajalakshmi Srinivasaraghavan
@ 2016-11-17 14:51 ` Torvald Riegel
  2016-12-09 19:39   ` [PATCHv2] powerpc: Fix write-after-destroy " Tulio Magno Quites Machado Filho
  0 siblings, 1 reply; 21+ messages in thread
From: Torvald Riegel @ 2016-11-17 14:51 UTC (permalink / raw)
  To: Rajalakshmi Srinivasaraghavan; +Cc: libc-alpha, sjmunroe

On Wed, 2016-11-16 at 21:24 +0530, Rajalakshmi Srinivasaraghavan wrote:
> The update of *adapt_count after the release of the lock causes a race
> condition: thread A unlocks, thread B continues and returns, destroying
> the mutex on its stack, and then enters another function; thread A then
> writes to *adapt_count and corrupts the stack.
> 
> 2016-11-16  Rajalakshmi Srinivasaraghavan  <raji@linux.vnet.ibm.com>
> 
> 	[BZ #20822]
> 	* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> 	(__lll_unlock_elision):  Update adapt_count variable
> 	inside the critical section.
> ---
>  sysdeps/unix/sysv/linux/powerpc/elision-unlock.c | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
> 
> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> index 43c5a67..e5e1afd 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> @@ -28,13 +28,12 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
>      __libc_tend (0);
>    else
>      {
> -      lll_unlock ((*lock), pshared);
> -
> -      /* Update the adapt count AFTER completing the critical section.
> -         Doing this here prevents unneeded stalling when entering
> -         a critical section.  Saving about 8% runtime on P8.  */
> +      /* Update the adapt count in the critical section to
> +         prevent a race condition as mentioned in BZ 20822.  */

I'd rather mention the mutex destruction requirements by POSIX.  This is
what we're saying in sysdeps/nptl/lowlevellock.h for unlock:

   Evaluate PRIVATE before releasing the lock so that we do not violate the
   mutex destruction requirements.  Specifically, we need to ensure that
   another thread can destroy the mutex (and reuse its memory) once it
   acquires the lock and when there will be no further lock acquisitions;
   thus, we must not access the lock after releasing it, or those accesses
   could be concurrent with mutex destruction or reuse of the memory.

>        if (*adapt_count > 0)
>  	(*adapt_count)--;
> +      lll_unlock ((*lock), pshared);
> +
>      }
>    return 0;
>  }

Please make sure that there are no data races on at least adapt_count
(ie, use atomic accesses consistently).  I'm aware that this requires
touching more files, but it makes the whole code more reliable and
cleaner.
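In glibc this is done with the internal atomic_load_relaxed/atomic_store_relaxed macros; a standalone equivalent in C11 stdatomic terms (illustrative accessor functions, not the glibc macros themselves) looks like:

```c
#include <stdatomic.h>

/* adapt_count accessed only through relaxed atomic operations: this
   documents the intentional lack of ordering while still ruling out
   data races in the C11 sense.  */
static _Atomic short adapt_count;

short
adapt_count_get (void)
{
  return atomic_load_explicit (&adapt_count, memory_order_relaxed);
}

void
adapt_count_set (short v)
{
  atomic_store_explicit (&adapt_count, v, memory_order_relaxed);
}
```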


* [PATCHv2] powerpc: Fix write-after-destroy in lock elision
  2016-11-17 14:51 ` Torvald Riegel
@ 2016-12-09 19:39   ` Tulio Magno Quites Machado Filho
  2016-12-12  4:36     ` Mike Frysinger
  0 siblings, 1 reply; 21+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2016-12-09 19:39 UTC (permalink / raw)
  To: libc-alpha; +Cc: raji, triegel, munroesj

From: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>

Changes since v1:
 - Removed references to data race by the actual error: write-after-destroy.
 - Enclosed adapt_count accesses in C11 atomics.

--- 8< ---

The update of *adapt_count after the release of the lock causes a race
condition when thread A unlocks, thread B continues and destroys the
mutex, and thread A writes to *adapt_count.

2016-12-09  Rajalakshmi Srinivasaraghavan  <raji@linux.vnet.ibm.com>
	    Steven Munroe  <sjmunroe@us.ibm.com>
	    Tulio Magno Quites Machado Filho  <tuliom@linux.vnet.ibm.com>

	[BZ #20822]
	* sysdeps/unix/sysv/linux/powerpc/elision-lock.c
	(__lll_lock_elision): Access adapt_count via C11 atomics.
	* sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
	(__lll_trylock_elision): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
	(__lll_unlock_elision):  Update adapt_count variable inside the
	critical section.
---
 sysdeps/unix/sysv/linux/powerpc/elision-lock.c    |  8 +++++---
 sysdeps/unix/sysv/linux/powerpc/elision-trylock.c |  7 ++++---
 sysdeps/unix/sysv/linux/powerpc/elision-unlock.c  | 14 +++++++++-----
 3 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
index dd1e4c3..c1ad1dd 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
@@ -45,7 +45,7 @@
 int
 __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 {
-  if (*adapt_count > 0)
+  if (atomic_load_relaxed(adapt_count) > 0)
     {
       goto use_lock;
     }
@@ -67,7 +67,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 	  if (_TEXASRU_FAILURE_PERSISTENT (__builtin_get_texasru ()))
 	    {
 	      if (aconf.skip_lock_internal_abort > 0)
-		*adapt_count = aconf.skip_lock_internal_abort;
+		atomic_store_relaxed (adapt_count,
+				      aconf.skip_lock_internal_abort);
 	      goto use_lock;
 	    }
 	}
@@ -75,7 +76,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 
   /* Fall back to locks for a bit if retries have been exhausted */
   if (aconf.try_tbegin > 0 && aconf.skip_lock_out_of_tbegin_retries > 0)
-    *adapt_count = aconf.skip_lock_out_of_tbegin_retries;
+    atomic_store_relaxed (adapt_count,
+			  aconf.skip_lock_out_of_tbegin_retries);
 
 use_lock:
   return LLL_LOCK ((*lock), pshared);
diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
index 0807a6a..6248e32 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
@@ -34,7 +34,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
   __libc_tabort (_ABORT_NESTED_TRYLOCK);
 
   /* Only try a transaction if it's worth it.  */
-  if (*adapt_count > 0)
+  if (atomic_load_relaxed(adapt_count) > 0)
     {
       goto use_lock;
     }
@@ -49,7 +49,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
       __libc_tend (0);
 
       if (aconf.skip_lock_busy > 0)
-	*adapt_count = aconf.skip_lock_busy;
+	atomic_store_relaxed (adapt_count, aconf.skip_lock_busy);
     }
   else
     {
@@ -59,7 +59,8 @@ __lll_trylock_elision (int *futex, short *adapt_count)
 	     result in another failure.  Use normal locking now and
 	     for the next couple of calls.  */
 	  if (aconf.skip_trylock_internal_abort > 0)
-	    *adapt_count = aconf.skip_trylock_internal_abort;
+	    atomic_store_relaxed (adapt_count,
+				aconf.skip_trylock_internal_abort);
 	}
     }
 
diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
index 43c5a67..0e0baf5 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
@@ -28,13 +28,17 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
     __libc_tend (0);
   else
     {
-      lll_unlock ((*lock), pshared);
-
-      /* Update the adapt count AFTER completing the critical section.
-         Doing this here prevents unneeded stalling when entering
-         a critical section.  Saving about 8% runtime on P8.  */
+      /* Update adapt_count in the critical section to prevent a
+	 write-after-destroy error as mentioned in BZ 20822.  The
+	 following update of adapt_count is clearly contained within
+	 the critical region of the fall-back lock as it immediately
+	 precedes the unlock.  Attempting to use C11 atomics to access
+	 adapt_count would be (at minimum) misleading and (at worst) a
+	 serious error forcing a larx-hit-stcx flush.  */
       if (*adapt_count > 0)
 	(*adapt_count)--;
+
+      lll_unlock ((*lock), pshared);
     }
   return 0;
 }
-- 
2.1.0


* Re: [PATCHv2] powerpc: Fix write-after-destroy in lock elision
  2016-12-09 19:39   ` [PATCHv2] powerpc: Fix write-after-destroy " Tulio Magno Quites Machado Filho
@ 2016-12-12  4:36     ` Mike Frysinger
  2016-12-21 18:03       ` [PATCHv3] " Tulio Magno Quites Machado Filho
  0 siblings, 1 reply; 21+ messages in thread
From: Mike Frysinger @ 2016-12-12  4:36 UTC (permalink / raw)
  To: Tulio Magno Quites Machado Filho; +Cc: libc-alpha, raji, triegel, munroesj


On 09 Dec 2016 17:37, Tulio Magno Quites Machado Filho wrote:
> +  if (atomic_load_relaxed(adapt_count) > 0)

missing space before the (

comes up a few times
-mike



* [PATCHv3] powerpc: Fix write-after-destroy in lock elision
  2016-12-12  4:36     ` Mike Frysinger
@ 2016-12-21 18:03       ` Tulio Magno Quites Machado Filho
  2017-01-02 14:13         ` Ping " Tulio Magno Quites Machado Filho
                           ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2016-12-21 18:03 UTC (permalink / raw)
  To: libc-alpha; +Cc: vapier, raji, triegel, munroesj

From: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>

Changes since v2:
 - Fixed coding style.

Changes since v1:
 - Removed references to data race by the actual error: write-after-destroy.
 - Enclosed adapt_count accesses in C11 atomics.

--- 8< ---

The update of *adapt_count after the release of the lock causes a race
condition when thread A unlocks, thread B continues and destroys the
mutex, and thread A writes to *adapt_count.

2016-12-21  Rajalakshmi Srinivasaraghavan  <raji@linux.vnet.ibm.com>
	    Steven Munroe  <sjmunroe@us.ibm.com>
	    Tulio Magno Quites Machado Filho  <tuliom@linux.vnet.ibm.com>

	[BZ #20822]
	* sysdeps/unix/sysv/linux/powerpc/elision-lock.c
	(__lll_lock_elision): Access adapt_count via C11 atomics.
	* sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
	(__lll_trylock_elision): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
	(__lll_unlock_elision):  Update adapt_count variable inside the
	critical section.
---
 sysdeps/unix/sysv/linux/powerpc/elision-lock.c    |  8 +++++---
 sysdeps/unix/sysv/linux/powerpc/elision-trylock.c |  7 ++++---
 sysdeps/unix/sysv/linux/powerpc/elision-unlock.c  | 14 +++++++++-----
 3 files changed, 18 insertions(+), 11 deletions(-)

diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
index dd1e4c3..86adbc9 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
@@ -45,7 +45,7 @@
 int
 __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 {
-  if (*adapt_count > 0)
+  if (atomic_load_relaxed (adapt_count) > 0)
     {
       goto use_lock;
     }
@@ -67,7 +67,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 	  if (_TEXASRU_FAILURE_PERSISTENT (__builtin_get_texasru ()))
 	    {
 	      if (aconf.skip_lock_internal_abort > 0)
-		*adapt_count = aconf.skip_lock_internal_abort;
+		atomic_store_relaxed (adapt_count,
+				      aconf.skip_lock_internal_abort);
 	      goto use_lock;
 	    }
 	}
@@ -75,7 +76,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 
   /* Fall back to locks for a bit if retries have been exhausted */
   if (aconf.try_tbegin > 0 && aconf.skip_lock_out_of_tbegin_retries > 0)
-    *adapt_count = aconf.skip_lock_out_of_tbegin_retries;
+    atomic_store_relaxed (adapt_count,
+			  aconf.skip_lock_out_of_tbegin_retries);
 
 use_lock:
   return LLL_LOCK ((*lock), pshared);
diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
index 0807a6a..6061856 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
@@ -34,7 +34,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
   __libc_tabort (_ABORT_NESTED_TRYLOCK);
 
   /* Only try a transaction if it's worth it.  */
-  if (*adapt_count > 0)
+  if (atomic_load_relaxed (adapt_count) > 0)
     {
       goto use_lock;
     }
@@ -49,7 +49,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
       __libc_tend (0);
 
       if (aconf.skip_lock_busy > 0)
-	*adapt_count = aconf.skip_lock_busy;
+	atomic_store_relaxed (adapt_count, aconf.skip_lock_busy);
     }
   else
     {
@@ -59,7 +59,8 @@ __lll_trylock_elision (int *futex, short *adapt_count)
 	     result in another failure.  Use normal locking now and
 	     for the next couple of calls.  */
 	  if (aconf.skip_trylock_internal_abort > 0)
-	    *adapt_count = aconf.skip_trylock_internal_abort;
+	    atomic_store_relaxed (adapt_count,
+				aconf.skip_trylock_internal_abort);
 	}
     }
 
diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
index 43c5a67..0e0baf5 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
@@ -28,13 +28,17 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
     __libc_tend (0);
   else
     {
-      lll_unlock ((*lock), pshared);
-
-      /* Update the adapt count AFTER completing the critical section.
-         Doing this here prevents unneeded stalling when entering
-         a critical section.  Saving about 8% runtime on P8.  */
+      /* Update adapt_count in the critical section to prevent a
+	 write-after-destroy error as mentioned in BZ 20822.  The
+	 following update of adapt_count is clearly contained within
+	 the critical region of the fall-back lock as it immediately
+	 precedes the unlock.  Attempting to use C11 atomics to access
+	 adapt_count would be (at minimum) misleading and (at worst) a
+	 serious error forcing a larx-hit-stcx flush.  */
       if (*adapt_count > 0)
 	(*adapt_count)--;
+
+      lll_unlock ((*lock), pshared);
     }
   return 0;
 }
-- 
2.1.0


* Ping Re: [PATCHv3] powerpc: Fix write-after-destroy in lock elision
  2016-12-21 18:03       ` [PATCHv3] " Tulio Magno Quites Machado Filho
@ 2017-01-02 14:13         ` Tulio Magno Quites Machado Filho
  2017-01-02 16:19         ` Wainer S. Moschetta
  2017-01-02 17:04         ` Adhemerval Zanella
  2 siblings, 0 replies; 21+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2017-01-02 14:13 UTC (permalink / raw)
  To: libc-alpha, vapier, triegel; +Cc: raji, munroesj

Ping?

Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> writes:

> From: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
>
> Changes since v2:
>  - Fixed coding style.
>
> Changes since v1:
>  - Removed references to data race by the actual error: write-after-destroy.
>  - Enclosed adapt_count accesses in C11 atomics.
>
> --- 8< ---
>
> The update of *adapt_count after the release of the lock causes a race
> condition when thread A unlocks, thread B continues and destroys the
> mutex, and thread A writes to *adapt_count.
>
> 2016-12-21  Rajalakshmi Srinivasaraghavan  <raji@linux.vnet.ibm.com>
> 	    Steven Munroe  <sjmunroe@us.ibm.com>
> 	    Tulio Magno Quites Machado Filho  <tuliom@linux.vnet.ibm.com>
>
> 	[BZ #20822]
> 	* sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> 	(__lll_lock_elision): Access adapt_count via C11 atomics.
> 	* sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
> 	(__lll_trylock_elision): Likewise.
> 	* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> 	(__lll_unlock_elision):  Update adapt_count variable inside the
> 	critical section.

-- 
Tulio Magno


* Re: [PATCHv3] powerpc: Fix write-after-destroy in lock elision
  2016-12-21 18:03       ` [PATCHv3] " Tulio Magno Quites Machado Filho
  2017-01-02 14:13         ` Ping " Tulio Magno Quites Machado Filho
@ 2017-01-02 16:19         ` Wainer S. Moschetta
  2017-01-02 17:04         ` Adhemerval Zanella
  2 siblings, 0 replies; 21+ messages in thread
From: Wainer S. Moschetta @ 2017-01-02 16:19 UTC (permalink / raw)
  To: libc-alpha

LGTM.

On 12/21/2016 04:03 PM, Tulio Magno Quites Machado Filho wrote:
> From: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
>
> Changes since v2:
>  - Fixed coding style.
>
> Changes since v1:
>  - Removed references to data race by the actual error: write-after-destroy.
>  - Enclosed adapt_count accesses in C11 atomics.
>
> --- 8< ---
>
> The update of *adapt_count after the release of the lock causes a race
> condition when thread A unlocks, thread B continues and destroys the
> mutex, and thread A writes to *adapt_count.
>
> 2016-12-21  Rajalakshmi Srinivasaraghavan  <raji@linux.vnet.ibm.com>
> 	    Steven Munroe  <sjmunroe@us.ibm.com>
> 	    Tulio Magno Quites Machado Filho  <tuliom@linux.vnet.ibm.com>
>
> 	[BZ #20822]
> 	* sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> 	(__lll_lock_elision): Access adapt_count via C11 atomics.
> 	* sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
> 	(__lll_trylock_elision): Likewise.
> 	* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> 	(__lll_unlock_elision):  Update adapt_count variable inside the
> 	critical section.
> ---
>  sysdeps/unix/sysv/linux/powerpc/elision-lock.c    |  8 +++++---
>  sysdeps/unix/sysv/linux/powerpc/elision-trylock.c |  7 ++++---
>  sysdeps/unix/sysv/linux/powerpc/elision-unlock.c  | 14 +++++++++-----
>  3 files changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> index dd1e4c3..86adbc9 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> @@ -45,7 +45,7 @@
>  int
>  __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
>  {
> -  if (*adapt_count > 0)
> +  if (atomic_load_relaxed (adapt_count) > 0)
>      {
>        goto use_lock;
>      }
> @@ -67,7 +67,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
>  	  if (_TEXASRU_FAILURE_PERSISTENT (__builtin_get_texasru ()))
>  	    {
>  	      if (aconf.skip_lock_internal_abort > 0)
> -		*adapt_count = aconf.skip_lock_internal_abort;
> +		atomic_store_relaxed (adapt_count,
> +				      aconf.skip_lock_internal_abort);
>  	      goto use_lock;
>  	    }
>  	}
> @@ -75,7 +76,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
>
>    /* Fall back to locks for a bit if retries have been exhausted */
>    if (aconf.try_tbegin > 0 && aconf.skip_lock_out_of_tbegin_retries > 0)
> -    *adapt_count = aconf.skip_lock_out_of_tbegin_retries;
> +    atomic_store_relaxed (adapt_count,
> +			  aconf.skip_lock_out_of_tbegin_retries);
>
>  use_lock:
>    return LLL_LOCK ((*lock), pshared);
> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
> index 0807a6a..6061856 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
> @@ -34,7 +34,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
>    __libc_tabort (_ABORT_NESTED_TRYLOCK);
>
>    /* Only try a transaction if it's worth it.  */
> -  if (*adapt_count > 0)
> +  if (atomic_load_relaxed (adapt_count) > 0)
>      {
>        goto use_lock;
>      }
> @@ -49,7 +49,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
>        __libc_tend (0);
>
>        if (aconf.skip_lock_busy > 0)
> -	*adapt_count = aconf.skip_lock_busy;
> +	atomic_store_relaxed (adapt_count, aconf.skip_lock_busy);
>      }
>    else
>      {
> @@ -59,7 +59,8 @@ __lll_trylock_elision (int *futex, short *adapt_count)
>  	     result in another failure.  Use normal locking now and
>  	     for the next couple of calls.  */
>  	  if (aconf.skip_trylock_internal_abort > 0)
> -	    *adapt_count = aconf.skip_trylock_internal_abort;
> +	    atomic_store_relaxed (adapt_count,
> +				aconf.skip_trylock_internal_abort);
>  	}
>      }
>
> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> index 43c5a67..0e0baf5 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> @@ -28,13 +28,17 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
>      __libc_tend (0);
>    else
>      {
> -      lll_unlock ((*lock), pshared);
> -
> -      /* Update the adapt count AFTER completing the critical section.
> -         Doing this here prevents unneeded stalling when entering
> -         a critical section.  Saving about 8% runtime on P8.  */
> +      /* Update adapt_count in the critical section to prevent a
> +	 write-after-destroy error as mentioned in BZ 20822.  The
> +	 following update of adapt_count is clearly contained within
> +	 the critical region of the fall-back lock as it immediately
> +	 precedes the unlock.  Attempting to use C11 atomics to access
> +	 adapt_count would be (at minimum) misleading and (at worst) a
> +	 serious error forcing a larx-hit-stcx flush.  */
>        if (*adapt_count > 0)
>  	(*adapt_count)--;
> +
> +      lll_unlock ((*lock), pshared);
>      }
>    return 0;
>  }


* Re: [PATCHv3] powerpc: Fix write-after-destroy in lock elision
  2016-12-21 18:03       ` [PATCHv3] " Tulio Magno Quites Machado Filho
  2017-01-02 14:13         ` Ping " Tulio Magno Quites Machado Filho
  2017-01-02 16:19         ` Wainer S. Moschetta
@ 2017-01-02 17:04         ` Adhemerval Zanella
  2017-01-02 17:22           ` Torvald Riegel
  2 siblings, 1 reply; 21+ messages in thread
From: Adhemerval Zanella @ 2017-01-02 17:04 UTC (permalink / raw)
  To: libc-alpha

LGTM with a nit only on the comment about the adapt_count decrement in
unlock.

On 21/12/2016 16:03, Tulio Magno Quites Machado Filho wrote:
> From: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
> 
> Changes since v2:
>  - Fixed coding style.
> 
> Changes since v1:
>  - Removed references to data race by the actual error: write-after-destroy.
>  - Enclosed adapt_count accesses in C11 atomics.
> 
> --- 8< ---
> 
> The update of *adapt_count after the release of the lock causes a race
> condition when thread A unlocks, thread B continues and destroys the
> mutex, and thread A writes to *adapt_count.


>  
> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> index 43c5a67..0e0baf5 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> @@ -28,13 +28,17 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
>      __libc_tend (0);
>    else
>      {
> -      lll_unlock ((*lock), pshared);
> -
> -      /* Update the adapt count AFTER completing the critical section.
> -         Doing this here prevents unneeded stalling when entering
> -         a critical section.  Saving about 8% runtime on P8.  */
> +      /* Update adapt_count in the critical section to prevent a
> +	 write-after-destroy error as mentioned in BZ 20822.  The
> +	 following update of adapt_count is clearly contained within
> +	 the critical region of the fall-back lock as it immediately
> +	 precedes the unlock.  Attempting to use C11 atomics to access
> +	 adapt_count would be (at minimum) misleading and (at worst) a
> +	 serious error forcing a larx-hit-stcx flush.  */
>        if (*adapt_count > 0)
>  	(*adapt_count)--;

I think the C11 atomics comment should state that atomic_fetch_add_relaxed
generates suboptimal code, not that it is misleading, since it generates an
atomic LL/SC sequence using lharx/sthcx. in a transaction region.  Something like:

      /* Update adapt_count in the critical section to prevent a
	 write-after-destroy error as mentioned in BZ 20822.  The
	 following update of adapt_count is clearly contained within
	 the critical region of the fall-back lock as it immediately
	 precedes the unlock.  
	 Using C11 atomics like atomic_fetch_add_relaxed generates
	 suboptimal code (a lharx/sthcx. sequence) where a plain load,
	 update, and store suffice inside an HTM region (the whole
	 idea of using this construction).  */

> +
> +      lll_unlock ((*lock), pshared);
>      }
>    return 0;
>  }
> 


* Re: [PATCHv3] powerpc: Fix write-after-destroy in lock elision
  2017-01-02 17:04         ` Adhemerval Zanella
@ 2017-01-02 17:22           ` Torvald Riegel
  2017-01-02 17:25             ` Adhemerval Zanella
  0 siblings, 1 reply; 21+ messages in thread
From: Torvald Riegel @ 2017-01-02 17:22 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha

On Mon, 2017-01-02 at 15:03 -0200, Adhemerval Zanella wrote:
> LGTM with a nit only on the comment about the adapt_count decrement in
> unlock.
> 
> On 21/12/2016 16:03, Tulio Magno Quites Machado Filho wrote:
> > From: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
> > 
> > Changes since v2:
> >  - Fixed coding style.
> > 
> > Changes since v1:
> >  - Removed references to data race by the actual error: write-after-destroy.
> >  - Enclosed adapt_count accesses in C11 atomics.
> > 
> > --- 8< ---
> > 
> > The update of *adapt_count after the release of the lock causes a race
> > condition when thread A unlocks, thread B continues and destroys the
> > mutex, and thread A writes to *adapt_count.
> 
> 
> >  
> > diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> > index 43c5a67..0e0baf5 100644
> > --- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> > +++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> > @@ -28,13 +28,17 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
> >      __libc_tend (0);
> >    else
> >      {
> > -      lll_unlock ((*lock), pshared);
> > -
> > -      /* Update the adapt count AFTER completing the critical section.
> > -         Doing this here prevents unneeded stalling when entering
> > -         a critical section.  Saving about 8% runtime on P8.  */
> > +      /* Update adapt_count in the critical section to prevent a
> > +	 write-after-destroy error as mentioned in BZ 20822.  The
> > +	 following update of adapt_count is clearly contained within
> > +	 the critical region of the fall-back lock as it immediately
> > +	 precedes the unlock.  Attempting to use C11 atomics to access
> > +	 adapt_count would be (at minimum) misleading and (at worst) a
> > +	 serious error forcing a larx-hit-stcx flush.  */
> >        if (*adapt_count > 0)
> >  	(*adapt_count)--;
> 
> I think the C11 atomics comment should state that atomic_fetch_add_relaxed
> generates suboptimal code, not that it is misleading, since it generates an
> atomic LL/SC sequence using lharx/sthcx. in a transaction region.  Something like:
> 
>       /* Update adapt_count in the critical section to prevent a
> 	 write-after-destroy error as mentioned in BZ 20822.  The
> 	 following update of adapt_count is clearly contained within
> 	 the critical region of the fall-back lock as it immediately
> 	 precedes the unlock.  
> 	 Using C11 atomics like atomic_fetch_add_relaxed generates
> 	 suboptimal code (a lharx/sthcx. sequence) where a plain load,
> 	 update, and store suffice inside an HTM region (the whole
> 	 idea of using this construction).  */

Yes.

For the record, I'm still of the opinion that it should just use atomic
relaxed MO loads and stores -- simply for consistency and to make future
transition to C11 atomic types easier.


* Re: [PATCHv3] powerpc: Fix write-after-destroy in lock elision
  2017-01-02 17:22           ` Torvald Riegel
@ 2017-01-02 17:25             ` Adhemerval Zanella
  2017-01-02 17:39               ` Torvald Riegel
  0 siblings, 1 reply; 21+ messages in thread
From: Adhemerval Zanella @ 2017-01-02 17:25 UTC (permalink / raw)
  To: Torvald Riegel; +Cc: libc-alpha



On 02/01/2017 15:21, Torvald Riegel wrote:
> On Mon, 2017-01-02 at 15:03 -0200, Adhemerval Zanella wrote:
>> LGTM with a nit only on the comment about the adapt_count decrement in
>> unlock.
>>
>> On 21/12/2016 16:03, Tulio Magno Quites Machado Filho wrote:
>>> From: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
>>>
>>> Changes since v2:
>>>  - Fixed coding style.
>>>
>>> Changes since v1:
>>>  - Removed references to data race by the actual error: write-after-destroy.
>>>  - Enclosed adapt_count accesses in C11 atomics.
>>>
>>> --- 8< ---
>>>
>>> The update of *adapt_count after the release of the lock causes a race
>>> condition when thread A unlocks, thread B continues and destroys the
>>> mutex, and thread A writes to *adapt_count.
>>
>>
>>>  
>>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>>> index 43c5a67..0e0baf5 100644
>>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>>> @@ -28,13 +28,17 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
>>>      __libc_tend (0);
>>>    else
>>>      {
>>> -      lll_unlock ((*lock), pshared);
>>> -
>>> -      /* Update the adapt count AFTER completing the critical section.
>>> -         Doing this here prevents unneeded stalling when entering
>>> -         a critical section.  Saving about 8% runtime on P8.  */
>>> +      /* Update adapt_count in the critical section to prevent a
>>> +	 write-after-destroy error as mentioned in BZ 20822.  The
>>> +	 following update of adapt_count is clearly contained within
>>> +	 the critical region of the fall-back lock as it immediately
>>> +	 precedes the unlock.  Attempting to use C11 atomics to access
>>> +	 adapt_count would be (at minimum) misleading and (at worse) a
>>> +	 serious error forcing a larx-hit-stcx flush.  */
>>>        if (*adapt_count > 0)
>>>  	(*adapt_count)--;
>>
>> I think the C11 atomic comment should outline that using atomic_fetch_add_relaxed
>> generates suboptimal code, not that it is misleading, since it generates an atomic
>> sequence using lharx/sthcx. in a transaction region.  Something like:
>>
>>       /* Update adapt_count in the critical section to prevent a
>> 	 write-after-destroy error as mentioned in BZ 20822.  The
>> 	 following update of adapt_count is clearly contained within
>> 	 the critical region of the fall-back lock as it immediately
>> 	 precedes the unlock.  
>> 	 Using C11 atomics like atomic_fetch_add_relaxed generates
>> 	 suboptimal code (a lharx/sthcx. sequence) where a plain load,
>> 	 update, and store suffice inside an HTM region (the whole
>> 	 idea of using this construction).  */
> 
> Yes.
> 
> For the record, I'm still of the opinion that it should just use atomic
> relaxed MO loads and stores -- simply for consistency and to make future
> transition to C11 atomic types easier.
> 

I think it can be changed to:

       if (atomic_load_relaxed (adapt_count) > 0)
  	 (*adapt_count)--;

This can be done without performance issues, and the comment will explain why
C11 atomics are not being used in this specific snippet.
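For illustration only, the ordering constraint under discussion — decrement the hint first, then release the fall-back lock — can be sketched standalone with C11 stdatomic.  The struct and function names below are hypothetical stand-ins, not glibc internals (`unlock_elision_sketch` plays the role of `__lll_unlock_elision`'s fall-back path, and the release store stands in for `lll_unlock`):

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the elided mutex and its adaptation hint.  */
struct elided_lock
{
  atomic_int lock;            /* 0 = free, 1 = taken (fall-back lock) */
  _Atomic short adapt_count;  /* elision adaptation hint */
};

static void
unlock_elision_sketch (struct elided_lock *m)
{
  /* Decrement the hint BEFORE releasing the lock: once the lock is
     released, another thread may acquire and then destroy the mutex,
     so no further access to *m is allowed (POSIX mutex destruction
     requirements).  Relaxed MO is enough because adapt_count is only
     a hint.  */
  short tmp = atomic_load_explicit (&m->adapt_count, memory_order_relaxed);
  if (tmp > 0)
    atomic_store_explicit (&m->adapt_count, tmp - 1, memory_order_relaxed);

  /* Release the fall-back lock (stand-in for lll_unlock).  */
  atomic_store_explicit (&m->lock, 0, memory_order_release);
}
```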


* Re: [PATCHv3] powerpc: Fix write-after-destroy in lock elision
  2017-01-02 17:25             ` Adhemerval Zanella
@ 2017-01-02 17:39               ` Torvald Riegel
  2017-01-02 18:15                 ` Adhemerval Zanella
  0 siblings, 1 reply; 21+ messages in thread
From: Torvald Riegel @ 2017-01-02 17:39 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha, Tulio Magno Quites Machado Filho

On Mon, 2017-01-02 at 15:25 -0200, Adhemerval Zanella wrote:
> 
> On 02/01/2017 15:21, Torvald Riegel wrote:
> > On Mon, 2017-01-02 at 15:03 -0200, Adhemerval Zanella wrote:
> >> LGTM with a nit only on the comment about the adapt_count decrement in
> >> unlock.
> >>
> >> On 21/12/2016 16:03, Tulio Magno Quites Machado Filho wrote:
> >>> From: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
> >>>
> >>> Changes since v2:
> >>>  - Fixed coding style.
> >>>
> >>> Changes since v1:
> >>>  - Removed references to data race by the actual error: write-after-destroy.
> >>>  - Enclosed adapt_count accesses in C11 atomics.
> >>>
> >>> --- 8< ---
> >>>
> >>> The update of *adapt_count after the release of the lock causes a race
> >>> condition when thread A unlocks, thread B continues and destroys the
> >>> mutex, and thread A writes to *adapt_count.
> >>
> >>
> >>>  
> >>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> >>> index 43c5a67..0e0baf5 100644
> >>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> >>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> >>> @@ -28,13 +28,17 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
> >>>      __libc_tend (0);
> >>>    else
> >>>      {
> >>> -      lll_unlock ((*lock), pshared);
> >>> -
> >>> -      /* Update the adapt count AFTER completing the critical section.
> >>> -         Doing this here prevents unneeded stalling when entering
> >>> -         a critical section.  Saving about 8% runtime on P8.  */
> >>> +      /* Update adapt_count in the critical section to prevent a
> >>> +	 write-after-destroy error as mentioned in BZ 20822.  The
> >>> +	 following update of adapt_count is clearly contained within
> >>> +	 the critical region of the fall-back lock as it immediately
> >>> +	 precedes the unlock.  Attempting to use C11 atomics to access
> >>> +	 adapt_count would be (at minimum) misleading and (at worse) a
> >>> +	 serious error forcing a larx-hit-stcx flush.  */
> >>>        if (*adapt_count > 0)
> >>>  	(*adapt_count)--;
> >>
> >> I think the C11 atomic comment should outline that using atomic_fetch_add_relaxed
> >> generates suboptimal code, not that it is misleading, since it generates an atomic
> >> sequence using lharx/sthcx. in a transaction region.  Something like:
> >>
> >>       /* Update adapt_count in the critical section to prevent a
> >> 	 write-after-destroy error as mentioned in BZ 20822.  The
> >> 	 following update of adapt_count is clearly contained within
> >> 	 the critical region of the fall-back lock as it immediately
> >> 	 precedes the unlock.  
> >> 	 Using C11 atomics like atomic_fetch_add_relaxed generates
> >> 	 suboptimal code (a lharx/sthcx. sequence) where a plain load,
> >> 	 update, and store suffice inside an HTM region (the whole
> >> 	 idea of using this construction).  */
> > 

Looking at this again, this is not guaranteed to be in a transaction.
(Wasn't all this supposed to always be in transactional code?  Or am I
mixing this up with another patch?)

It uses the lock, so can be concurrent with other accesses (that use
atomics).  Thus, it must use atomics too to prevent data races.

The comment should also mention the mutex destruction requirements by
POSIX because this is how we name that requirement everywhere else.


* Re: [PATCHv3] powerpc: Fix write-after-destroy in lock elision
  2017-01-02 17:39               ` Torvald Riegel
@ 2017-01-02 18:15                 ` Adhemerval Zanella
  2017-01-02 19:33                   ` Torvald Riegel
  0 siblings, 1 reply; 21+ messages in thread
From: Adhemerval Zanella @ 2017-01-02 18:15 UTC (permalink / raw)
  To: Torvald Riegel; +Cc: libc-alpha, Tulio Magno Quites Machado Filho



On 02/01/2017 15:39, Torvald Riegel wrote:
> On Mon, 2017-01-02 at 15:25 -0200, Adhemerval Zanella wrote:
>>
>> On 02/01/2017 15:21, Torvald Riegel wrote:
>>> On Mon, 2017-01-02 at 15:03 -0200, Adhemerval Zanella wrote:
>>>> LGTM with a nit only on the comment about the adapt_count decrement in
>>>> unlock.
>>>>
>>>> On 21/12/2016 16:03, Tulio Magno Quites Machado Filho wrote:
>>>>> From: Rajalakshmi Srinivasaraghavan <raji@linux.vnet.ibm.com>
>>>>>
>>>>> Changes since v2:
>>>>>  - Fixed coding style.
>>>>>
>>>>> Changes since v1:
>>>>>  - Removed references to data race by the actual error: write-after-destroy.
>>>>>  - Enclosed adapt_count accesses in C11 atomics.
>>>>>
>>>>> --- 8< ---
>>>>>
>>>>> The update of *adapt_count after the release of the lock causes a race
>>>>> condition when thread A unlocks, thread B continues and destroys the
>>>>> mutex, and thread A writes to *adapt_count.
>>>>
>>>>
>>>>>  
>>>>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>>>>> index 43c5a67..0e0baf5 100644
>>>>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>>>>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>>>>> @@ -28,13 +28,17 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
>>>>>      __libc_tend (0);
>>>>>    else
>>>>>      {
>>>>> -      lll_unlock ((*lock), pshared);
>>>>> -
>>>>> -      /* Update the adapt count AFTER completing the critical section.
>>>>> -         Doing this here prevents unneeded stalling when entering
>>>>> -         a critical section.  Saving about 8% runtime on P8.  */
>>>>> +      /* Update adapt_count in the critical section to prevent a
>>>>> +	 write-after-destroy error as mentioned in BZ 20822.  The
>>>>> +	 following update of adapt_count is clearly contained within
>>>>> +	 the critical region of the fall-back lock as it immediately
>>>>> +	 precedes the unlock.  Attempting to use C11 atomics to access
>>>>> +	 adapt_count would be (at minimum) misleading and (at worse) a
>>>>> +	 serious error forcing a larx-hit-stcx flush.  */
>>>>>        if (*adapt_count > 0)
>>>>>  	(*adapt_count)--;
>>>>
> >>>> I think the C11 atomic comment should outline that using atomic_fetch_add_relaxed
> >>>> generates suboptimal code, not that it is misleading, since it generates an atomic
> >>>> sequence using lharx/sthcx. in a transaction region.  Something like:
>>>>
>>>>       /* Update adapt_count in the critical section to prevent a
>>>> 	 write-after-destroy error as mentioned in BZ 20822.  The
>>>> 	 following update of adapt_count is clearly contained within
>>>> 	 the critical region of the fall-back lock as it immediately
>>>> 	 precedes the unlock.  
>>>> 	 Using C11 atomics like atomic_fetch_add_relaxed generates
>>>> 	 suboptimal code (lharx/sthcx. sequence) where an plain load, 
>>>> 	 update, store is suffice inside a HTM region (the whole
>>>> 	 idea of using this construction).  */
>>>
> 
> Looking at this again, this is not guaranteed to be in a transaction.
> (Wasn't all this supposed to always be in transactional code?  Or am I
> mixing this up with another patch?)

You are right, the adapt_count decrement is indeed on the default LL/SC
path, not the HTM one.

> 
> It uses the lock, so can be concurrent with other accesses (that use
> atomics).  Thus, it must use atomics too to prevent data races.

Indeed, we need to use an atomic decrement on this one as well.  I am not
sure whether it is necessary to use stronger semantics than relaxed.
> 
> The comment should also mention the mutex destruction requirements by
> POSIX because this is how we name that requirement everywhere else.
> 

Nod.


* Re: [PATCHv3] powerpc: Fix write-after-destroy in lock elision
  2017-01-02 18:15                 ` Adhemerval Zanella
@ 2017-01-02 19:33                   ` Torvald Riegel
  2017-01-03 12:35                     ` [PATCHv4] " Tulio Magno Quites Machado Filho
  0 siblings, 1 reply; 21+ messages in thread
From: Torvald Riegel @ 2017-01-02 19:33 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha, Tulio Magno Quites Machado Filho

On Mon, 2017-01-02 at 16:15 -0200, Adhemerval Zanella wrote:
> 
> On 02/01/2017 15:39, Torvald Riegel wrote:
> > It uses the lock, so can be concurrent with other accesses (that use
> > atomics).  Thus, it must use atomics too to prevent data races.
> 
> Indeed, we need to use atomic decrement on this one as well.  I am not
> sure if is needed to use a stronger semantic than relaxed.

adapt_count is just a hint, as is explained in the comments in the
similar patch I had sent for x86.  Given that it's just a hint, lost
updates are harmless for correctness, and the decrement does not need to
be atomic.  The individual loads and stores of the decrement should be
atomic, though, to avoid data races.
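Concretely, the non-atomic decrement built from race-free relaxed accesses might look like the following sketch (plain C11 stdatomic standing in for glibc's `atomic_load_relaxed`/`atomic_store_relaxed` wrappers):

```c
#include <stdatomic.h>

/* Lost updates are tolerable because adapt_count is only a hint, so a
   relaxed load followed by a relaxed store is enough; no atomic
   read-modify-write (and hence no larx/stcx. sequence) is required.
   The individual accesses are still atomic, which avoids data races.  */
static void
hint_decrement (_Atomic short *adapt_count)
{
  short tmp = atomic_load_explicit (adapt_count, memory_order_relaxed);
  if (tmp > 0)
    atomic_store_explicit (adapt_count, tmp - 1, memory_order_relaxed);
}
```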


* [PATCHv4] powerpc: Fix write-after-destroy in lock elision
  2017-01-02 19:33                   ` Torvald Riegel
@ 2017-01-03 12:35                     ` Tulio Magno Quites Machado Filho
  2017-01-03 13:02                       ` Torvald Riegel
  2017-01-11 15:22                       ` Ulrich Weigand
  0 siblings, 2 replies; 21+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2017-01-03 12:35 UTC (permalink / raw)
  To: triegel, libc-alpha; +Cc: adhemerval.zanella, raji, munroesj

Changes since v3:
 - Use atomics to update adapt_count in __lll_unlock_elision.
 - Improved the source code comments in __lll_unlock_elision to mention
   the mutex destroy requirements.

Changes since v2:
 - Fixed coding style.

Changes since v1:
 - Removed references to data race by the actual error: write-after-destroy.
 - Enclosed adapt_count accesses in C11 atomics.

--- 8< ---

The update of *adapt_count after the release of the lock causes a race
condition when thread A unlocks, thread B continues and destroys the
mutex, and thread A writes to *adapt_count.

2017-01-03  Rajalakshmi Srinivasaraghavan  <raji@linux.vnet.ibm.com>
	    Steven Munroe  <sjmunroe@us.ibm.com>
	    Tulio Magno Quites Machado Filho  <tuliom@linux.vnet.ibm.com>

	[BZ #20822]
	* sysdeps/unix/sysv/linux/powerpc/elision-lock.c
	(__lll_lock_elision): Access adapt_count via C11 atomics.
	* sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
	(__lll_trylock_elision): Likewise.
	* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
	(__lll_unlock_elision):  Update adapt_count variable inside the
	critical section using C11 atomics.
---
 sysdeps/unix/sysv/linux/powerpc/elision-lock.c    |  8 +++++---
 sysdeps/unix/sysv/linux/powerpc/elision-trylock.c |  7 ++++---
 sysdeps/unix/sysv/linux/powerpc/elision-unlock.c  | 15 +++++++++------
 3 files changed, 18 insertions(+), 12 deletions(-)

diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
index 4589491..044803a 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
@@ -45,7 +45,7 @@
 int
 __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 {
-  if (*adapt_count > 0)
+  if (atomic_load_relaxed (adapt_count) > 0)
     {
       goto use_lock;
     }
@@ -67,7 +67,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 	  if (_TEXASRU_FAILURE_PERSISTENT (__builtin_get_texasru ()))
 	    {
 	      if (aconf.skip_lock_internal_abort > 0)
-		*adapt_count = aconf.skip_lock_internal_abort;
+		atomic_store_relaxed (adapt_count,
+				      aconf.skip_lock_internal_abort);
 	      goto use_lock;
 	    }
 	}
@@ -75,7 +76,8 @@ __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
 
   /* Fall back to locks for a bit if retries have been exhausted */
   if (aconf.try_tbegin > 0 && aconf.skip_lock_out_of_tbegin_retries > 0)
-    *adapt_count = aconf.skip_lock_out_of_tbegin_retries;
+    atomic_store_relaxed (adapt_count,
+			  aconf.skip_lock_out_of_tbegin_retries);
 
 use_lock:
   return LLL_LOCK ((*lock), pshared);
diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
index 1e5cbe8..ed244d3 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-trylock.c
@@ -34,7 +34,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
   __libc_tabort (_ABORT_NESTED_TRYLOCK);
 
   /* Only try a transaction if it's worth it.  */
-  if (*adapt_count > 0)
+  if (atomic_load_relaxed (adapt_count) > 0)
     {
       goto use_lock;
     }
@@ -49,7 +49,7 @@ __lll_trylock_elision (int *futex, short *adapt_count)
       __libc_tend (0);
 
       if (aconf.skip_lock_busy > 0)
-	*adapt_count = aconf.skip_lock_busy;
+	atomic_store_relaxed (adapt_count, aconf.skip_lock_busy);
     }
   else
     {
@@ -59,7 +59,8 @@ __lll_trylock_elision (int *futex, short *adapt_count)
 	     result in another failure.  Use normal locking now and
 	     for the next couple of calls.  */
 	  if (aconf.skip_trylock_internal_abort > 0)
-	    *adapt_count = aconf.skip_trylock_internal_abort;
+	    atomic_store_relaxed (adapt_count,
+				aconf.skip_trylock_internal_abort);
 	}
     }
 
diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
index 6f45a9c..759c146 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
@@ -28,13 +28,16 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
     __libc_tend (0);
   else
     {
-      lll_unlock ((*lock), pshared);
+      /* Update adapt_count in the critical section to prevent a
+	 write-after-destroy error as mentioned in BZ 20822.  The
+	 following update of adapt_count has to be contained within
+	 the critical region of the fall-back lock in order to not violate
+	 the mutex destruction requirements.  */
+      short __tmp = atomic_load_relaxed (adapt_count);
+      if (__tmp > 0)
+	atomic_store_relaxed (adapt_count, __tmp--);
 
-      /* Update the adapt count AFTER completing the critical section.
-         Doing this here prevents unneeded stalling when entering
-         a critical section.  Saving about 8% runtime on P8.  */
-      if (*adapt_count > 0)
-	(*adapt_count)--;
+      lll_unlock ((*lock), pshared);
     }
   return 0;
 }
-- 
2.1.0


* Re: [PATCHv4] powerpc: Fix write-after-destroy in lock elision
  2017-01-03 12:35                     ` [PATCHv4] " Tulio Magno Quites Machado Filho
@ 2017-01-03 13:02                       ` Torvald Riegel
  2017-01-03 19:25                         ` Tulio Magno Quites Machado Filho
  2017-01-11 15:22                       ` Ulrich Weigand
  1 sibling, 1 reply; 21+ messages in thread
From: Torvald Riegel @ 2017-01-03 13:02 UTC (permalink / raw)
  To: Tulio Magno Quites Machado Filho
  Cc: libc-alpha, adhemerval.zanella, raji, munroesj

On Tue, 2017-01-03 at 10:34 -0200, Tulio Magno Quites Machado Filho
wrote:
> Changes since v3:
>  - Use atomics to update adapt_count in __lll_unlock_elision.
>  - Improved the source code comments in __lll_unlock_elision to mention
>    the mutex destroy requirements.
> 
> Changes since v2:
>  - Fixed coding style.
> 
> Changes since v1:
>  - Removed references to data race by the actual error: write-after-destroy.
>  - Enclosed adapt_count accesses in C11 atomics.
> 
> --- 8< ---

This is okay, thanks.

For the next time, or when you change non-arch-specific code, please
remember to document memory orders; a brief comment is often sufficient
but reveals the intent behind a specific bit of implementation.  For
example:

> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> index 4589491..044803a 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
> @@ -45,7 +45,7 @@
>  int
>  __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
>  {
> -  if (*adapt_count > 0)
> +  if (atomic_load_relaxed (adapt_count) > 0)

Here, you could have just addded:
    /* adapt_count is accessed concurrently but is just a hint.  Thus,
       use atomic accesses but relaxed MO is sufficient.  */

This helps others to more quickly understand what was intended.  This
isn't critical in this case because we have well-documented similar code
for other architectures, but generally it is better to document why a
choice was made if it is not immediately obvious, or cannot be checked
for consistency based on just local information (eg, the source code
lines around the statement in question); concurrent code obviously can't
be cross-checked with just local information...


* Re: [PATCHv4] powerpc: Fix write-after-destroy in lock elision
  2017-01-03 13:02                       ` Torvald Riegel
@ 2017-01-03 19:25                         ` Tulio Magno Quites Machado Filho
  0 siblings, 0 replies; 21+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2017-01-03 19:25 UTC (permalink / raw)
  To: Torvald Riegel; +Cc: libc-alpha, adhemerval.zanella, raji, munroesj

Torvald Riegel <triegel@redhat.com> writes:

> For the next time, or when you change non-arch-specific code, please
> remember to document memory orders; a brief comment is often sufficient
> but reveals the intent behind a specific bit of implementation.  For
> example:
>
>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
>> index 4589491..044803a 100644
>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-lock.c
>> @@ -45,7 +45,7 @@
>>  int
>>  __lll_lock_elision (int *lock, short *adapt_count, EXTRAARG int pshared)
>>  {
>> -  if (*adapt_count > 0)
>> +  if (atomic_load_relaxed (adapt_count) > 0)
>
> Here, you could have just addded:
>     /* adapt_count is accessed concurrently but is just a hint.  Thus,
>        use atomic accesses but relaxed MO is sufficient.  */
>
> This helps others to more quickly understand what was intended.  This
> isn't critical in this case because we have well-documented similar code
> for other architectures, but generally it is better to document why a
> choice was made if it is not immediately obvious, or cannot be checked
> for consistency based on just local information (eg, the source code
> lines around the statement in question); concurrent code obviously can't
> be cross-checked with just local information...

I agree with you.
I added this comment in this patch and pushed it as e9a96ea.

Thanks!

-- 
Tulio Magno


* Re: [PATCHv4] powerpc: Fix write-after-destroy in lock elision
  2017-01-03 12:35                     ` [PATCHv4] " Tulio Magno Quites Machado Filho
  2017-01-03 13:02                       ` Torvald Riegel
@ 2017-01-11 15:22                       ` Ulrich Weigand
  2017-01-11 16:41                         ` Andreas Schwab
  1 sibling, 1 reply; 21+ messages in thread
From: Ulrich Weigand @ 2017-01-11 15:22 UTC (permalink / raw)
  To: Tulio Magno Quites Machado Filho
  Cc: triegel, libc-alpha, adhemerval.zanella, raji, munroesj

Tulio Magno Quites Machado Filho wrote:

> +      /* Update adapt_count in the critical section to prevent a
> +	 write-after-destroy error as mentioned in BZ 20822.  The
> +	 following update of adapt_count has to be contained within
> +	 the critical region of the fall-back lock in order to not violate
> +	 the mutex destruction requirements.  */
> +      short __tmp = atomic_load_relaxed (adapt_count);
> +      if (__tmp > 0)
> +        atomic_store_relaxed (adapt_count, __tmp--);

Doesn't this have to be --__tmp to have any effect?

Bye,
Ulrich

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  Ulrich.Weigand@de.ibm.com


* Re: [PATCHv4] powerpc: Fix write-after-destroy in lock elision
  2017-01-11 15:22                       ` Ulrich Weigand
@ 2017-01-11 16:41                         ` Andreas Schwab
  2017-01-16 13:23                           ` [PATCH 2.25] powerpc: Fix adapt_count update in __lll_unlock_elision Tulio Magno Quites Machado Filho
  0 siblings, 1 reply; 21+ messages in thread
From: Andreas Schwab @ 2017-01-11 16:41 UTC (permalink / raw)
  To: Ulrich Weigand
  Cc: Tulio Magno Quites Machado Filho, triegel, libc-alpha,
	adhemerval.zanella, raji, munroesj

On Jan 11 2017, "Ulrich Weigand" <uweigand@de.ibm.com> wrote:

> Tulio Magno Quites Machado Filho wrote:
>
>> +      /* Update adapt_count in the critical section to prevent a
>> +	 write-after-destroy error as mentioned in BZ 20822.  The
>> +	 following update of adapt_count has to be contained within
>> +	 the critical region of the fall-back lock in order to not violate
>> +	 the mutex destruction requirements.  */
>> +      short __tmp = atomic_load_relaxed (adapt_count);
>> +      if (__tmp > 0)
>> +        atomic_store_relaxed (adapt_count, __tmp--);
>
> Doesn't this have to be --__tmp to have any effect?

It should be just __tmp - 1.
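The difference is easy to demonstrate in isolation: a post-decrement `__tmp--` evaluates to the old value, so the store writes back the unchanged count.  A standalone sketch (plain accesses used purely for illustration, not glibc code):

```c
/* Buggy version: `tmp--` yields the OLD value of tmp, so the store
   writes back the pre-decrement count and the update is a no-op.  */
static void
buggy_decrement (short *adapt_count)
{
  short tmp = *adapt_count;
  if (tmp > 0)
    *adapt_count = tmp--;   /* stores the pre-decrement value */
}

/* Fixed version: store the decremented value.  */
static void
fixed_decrement (short *adapt_count)
{
  short tmp = *adapt_count;
  if (tmp > 0)
    *adapt_count = tmp - 1;
}
```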

Andreas.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


* [PATCH 2.25] powerpc: Fix adapt_count update in __lll_unlock_elision
  2017-01-11 16:41                         ` Andreas Schwab
@ 2017-01-16 13:23                           ` Tulio Magno Quites Machado Filho
  2017-01-16 18:57                             ` Adhemerval Zanella
  0 siblings, 1 reply; 21+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2017-01-16 13:23 UTC (permalink / raw)
  To: libc-alpha, uweigand, schwab; +Cc: siddhesh

Thanks for catching this.

--- 8< ---

Commit e9a96ea1aca4ebaa7c86e8b83b766f118d689d0f had an error that
prevents adapt_count from being updated in __lll_unlock_elision.

2017-01-16  Tulio Magno Quites Machado Filho  <tuliom@linux.vnet.ibm.com>

	* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
	(__lll_unlock_elision): Fix adapt_count decrement.
---
 sysdeps/unix/sysv/linux/powerpc/elision-unlock.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
index 759c146..e3fe58e 100644
--- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
+++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
@@ -35,7 +35,7 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
 	 the mutex destruction requirements.  */
       short __tmp = atomic_load_relaxed (adapt_count);
       if (__tmp > 0)
-	atomic_store_relaxed (adapt_count, __tmp--);
+	atomic_store_relaxed (adapt_count, __tmp - 1);
 
       lll_unlock ((*lock), pshared);
     }
-- 
2.1.0


* Re: [PATCH 2.25] powerpc: Fix adapt_count update in __lll_unlock_elision
  2017-01-16 13:23                           ` [PATCH 2.25] powerpc: Fix adapt_count update in __lll_unlock_elision Tulio Magno Quites Machado Filho
@ 2017-01-16 18:57                             ` Adhemerval Zanella
  2017-01-20 20:14                               ` Tulio Magno Quites Machado Filho
  0 siblings, 1 reply; 21+ messages in thread
From: Adhemerval Zanella @ 2017-01-16 18:57 UTC (permalink / raw)
  To: libc-alpha



On 16/01/2017 11:22, Tulio Magno Quites Machado Filho wrote:
> Thanks for catching this.
> 
> --- 8< ---
> 
> Commit e9a96ea1aca4ebaa7c86e8b83b766f118d689d0f had an error that
> prevents adapt_count from being updated in __lll_unlock_elision.
> 
> 2017-01-16  Tulio Magno Quites Machado Filho  <tuliom@linux.vnet.ibm.com>
> 
> 	* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> 	(__lll_unlock_elision): Fix adapt_count decrement.
> ---
>  sysdeps/unix/sysv/linux/powerpc/elision-unlock.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> index 759c146..e3fe58e 100644
> --- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
> @@ -35,7 +35,7 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
>  	 the mutex destruction requirements.  */
>        short __tmp = atomic_load_relaxed (adapt_count);
>        if (__tmp > 0)
> -	atomic_store_relaxed (adapt_count, __tmp--);
> +	atomic_store_relaxed (adapt_count, __tmp - 1);
>  
>        lll_unlock ((*lock), pshared);
>      }
> 

LGTM.


* Re: [PATCH 2.25] powerpc: Fix adapt_count update in __lll_unlock_elision
  2017-01-16 18:57                             ` Adhemerval Zanella
@ 2017-01-20 20:14                               ` Tulio Magno Quites Machado Filho
  0 siblings, 0 replies; 21+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2017-01-20 20:14 UTC (permalink / raw)
  To: libc-alpha

Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:

> On 16/01/2017 11:22, Tulio Magno Quites Machado Filho wrote:
>> Commit e9a96ea1aca4ebaa7c86e8b83b766f118d689d0f had an error that
>> prevents adapt_count from being updated in __lll_unlock_elision.
>> 
>> 2017-01-16  Tulio Magno Quites Machado Filho  <tuliom@linux.vnet.ibm.com>
>> 
>> 	* sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>> 	(__lll_unlock_elision): Fix adapt_count decrement.
>> ---
>>  sysdeps/unix/sysv/linux/powerpc/elision-unlock.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>> index 759c146..e3fe58e 100644
>> --- a/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>> +++ b/sysdeps/unix/sysv/linux/powerpc/elision-unlock.c
>> @@ -35,7 +35,7 @@ __lll_unlock_elision (int *lock, short *adapt_count, int pshared)
>>  	 the mutex destruction requirements.  */
>>        short __tmp = atomic_load_relaxed (adapt_count);
>>        if (__tmp > 0)
>> -	atomic_store_relaxed (adapt_count, __tmp--);
>> +	atomic_store_relaxed (adapt_count, __tmp - 1);
>>  
>>        lll_unlock ((*lock), pshared);
>>      }
>> 
>
> LGTM.

Pushed as eb1321f2.

Thanks,

-- 
Tulio Magno


end of thread, other threads:[~2017-01-20 20:14 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-16 15:54 [PATCH] powerpc: Fix race condition in lock elision Rajalakshmi Srinivasaraghavan
2016-11-17 14:51 ` Torvald Riegel
2016-12-09 19:39   ` [PATCHv2] powerpc: Fix write-after-destroy " Tulio Magno Quites Machado Filho
2016-12-12  4:36     ` Mike Frysinger
2016-12-21 18:03       ` [PATCHv3] " Tulio Magno Quites Machado Filho
2017-01-02 14:13         ` Ping " Tulio Magno Quites Machado Filho
2017-01-02 16:19         ` Wainer S. Moschetta
2017-01-02 17:04         ` Adhemerval Zanella
2017-01-02 17:22           ` Torvald Riegel
2017-01-02 17:25             ` Adhemerval Zanella
2017-01-02 17:39               ` Torvald Riegel
2017-01-02 18:15                 ` Adhemerval Zanella
2017-01-02 19:33                   ` Torvald Riegel
2017-01-03 12:35                     ` [PATCHv4] " Tulio Magno Quites Machado Filho
2017-01-03 13:02                       ` Torvald Riegel
2017-01-03 19:25                         ` Tulio Magno Quites Machado Filho
2017-01-11 15:22                       ` Ulrich Weigand
2017-01-11 16:41                         ` Andreas Schwab
2017-01-16 13:23                           ` [PATCH 2.25] powerpc: Fix adapt_count update in __lll_unlock_elision Tulio Magno Quites Machado Filho
2017-01-16 18:57                             ` Adhemerval Zanella
2017-01-20 20:14                               ` Tulio Magno Quites Machado Filho
