public inbox for libc-ports@sourceware.org
 help / color / mirror / Atom feed
* Re: [PATCH] Memory fencing problem in pthread cancellation
       [not found] <50F46969.1000305@redhat.com>
@ 2013-01-14 22:11 ` Carlos O'Donell
  2013-01-14 22:21   ` Joseph S. Myers
  2013-01-31 22:05   ` Jeff Law
  0 siblings, 2 replies; 6+ messages in thread
From: Carlos O'Donell @ 2013-01-14 22:11 UTC (permalink / raw)
  To: Jeff Law, Joseph S. Myers, Mike Frysinger
  Cc: libc-alpha, libc-ports, Marcus Shawcroft

On 01/14/2013 03:24 PM, Jeff Law wrote:
> Let's consider these two hunks of code in unwind-forcedunwind.c

I'm adding libc-ports, Joseph Myers, and Marcus Shawcroft
since this involves arm ARM devices that have a weakly ordered
memory model. I'm also adding Mike Frysinger since this involes
IA64.

See my review below.

> void
> __attribute_noinline__
> pthread_cancel_init (void)
> {
>   void *resume;
>   void *personality;
>   void *forcedunwind;
>   void *getcfa;
>   void *handle;
> 
>   if (__builtin_expect (libgcc_s_handle != NULL, 1))
>     {
>       /* Force gcc to reload all values.  */
>       asm volatile ("" ::: "memory");
>       return;
>     }
>   [ ... ]
> 
>   PTR_MANGLE (resume);
>   libgcc_s_resume = resume;
>   PTR_MANGLE (personality);
>   libgcc_s_personality = personality;
>   PTR_MANGLE (forcedunwind);
>   libgcc_s_forcedunwind = forcedunwind;
>   PTR_MANGLE (getcfa);
>   libgcc_s_getcfa = getcfa;
>   /* Make sure libgcc_s_handle is written last.  Otherwise,
>      pthread_cancel_init might return early even when the pointer the
>      caller is interested in is not initialized yet.  */
>   atomic_write_barrier ();
>   libgcc_s_handle = handle;
> }
> 
> void
> _Unwind_Resume (struct _Unwind_Exception *exc)
> {
>   if (__builtin_expect (libgcc_s_handle == NULL, 0))
>     pthread_cancel_init ();
>   else
>     atomic_read_barrier ();
> 
>   void (*resume) (struct _Unwind_Exception *exc) = libgcc_s_resume;
>   PTR_DEMANGLE (resume);
>   resume (exc);
> }
> 
> 
> If we're in Unwind_Resume and libgcc_s_handle is null, we call
> pthread_cancel_init. pthread_cancel_init first checks if
> libgcc_s_handle is non-null. If so, then pthread_cancel_init exits
> early. Upon returning to Unwind_Resume we'll proceed to load &
> demangle libgcc_s_resume and call it.
> 
> Note carefully in this case we never executed an atomic_read_barrier
> between the load of libgcc_s_handle and libgcc_s_resume loads. On a
> weakly ordered target such as power7, the processor might speculate
> the libgcc_s_resume load to a points prior to loading
> libgcc_s_handle.
> 
> So if another thread happens to update libgcc_s_handle between the
> loads & checks of libgcc_s_handle in Unwind_Resume and
> pthread_cancel_init, then we can end up using stale data for
> libgcc_s_resume and blow up.
> 
> This has been observed on a 32 processor power7 machine running 16
> instances of the attached testcase in parallel after a period of
> several hours.
> 
> There's a couple ways to fix this. One could be to eliminate the
> early return from pthread_cancel_init. Unfortunately there's a call
> to pthread_cancel_init from pthread_cancel. So every one of those
> calls would suffer a performance penalty unless libgcc_s_handle as
> exported from unwind-forcedunwind.c
> 
> Another is to add a call to atomic_read_barrier just before the early
> return from pthread_cancel_init. That's precisely what this patch
> does.
> 
> 
> While investigating, Carlos identified that the IA64 and ARM ports
> have their own unwind-forcedunwind.c implementations and that they
> were buggy in regards to fencing as well. Both fail to test
> libgcc_s_handle and do not have the necessary calls to
> atomic_read_barrier. This patch fixes those issues as well.
> 
> While I have extensively tested the generic unwind-forcedunwind.c
> change backported to a glibc-2.12 base, I have not checked the ARM or
> IA64 bits in any way.> 
> 
> 
> patch
> 
> 
> diff --git a/nptl/ChangeLog b/nptl/ChangeLog
> index 4aacc17..90d9408 100644
> --- a/nptl/ChangeLog
> +++ b/nptl/ChangeLog
> @@ -1,3 +1,8 @@
> +2013-01-14  Jeff Law  <law@redhat.com>
> +
> +	* sysdeps/pthrad/unwind-forcedunwind.c (pthread_cancel_init): Add
> +	missing call to atomic_read_barrier in early return path.
> +
>  2013-01-11  Carlos O'Donell  <codonell@redhat.com>
>  
>  	* allocatestack.c (allocate_stack): Add comment. Remove assert
> diff --git a/nptl/sysdeps/pthread/unwind-forcedunwind.c b/nptl/sysdeps/pthread/unwind-forcedunwind.c
> index 9718606..6b72e3e 100644
> --- a/nptl/sysdeps/pthread/unwind-forcedunwind.c
> +++ b/nptl/sysdeps/pthread/unwind-forcedunwind.c
> @@ -46,6 +46,10 @@ pthread_cancel_init (void)
>      {
>        /* Force gcc to reload all values.  */
>        asm volatile ("" ::: "memory");
> +      /* Order reads so as to prevent speculation of loads
> +	 of libgcc_s_{resume,personality,forcedunwind,getcfa}
> +	 to points prior to loading libgcc_s_handle.  */

I'd say:

s/loading libgcc_s_handle/the write barrier./g

It is the writes of the other thread to globals 
libgcc_s_{resume,personality,forcedunwind,getcfa}
which have not yet been seen by this thread which
we are synchronizing using the read barrier.

> +      atomic_read_barrier ();
>        return;
>      }
>  
> diff --git a/ports/ChangeLog.arm b/ports/ChangeLog.arm
> index d44ea76..cdd2210 100644
> --- a/ports/ChangeLog.arm
> +++ b/ports/ChangeLog.arm
> @@ -1,3 +1,13 @@
> +2013-01-14  Jeff Law  <law@redhat.com>
> +
> +	* sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> +	(pthread_cancel_init):  Add missing call to atomic_read_barrier
> +	in early return path.
> +	(__gcc_personality_v0): Guard call to pthread_cancel_init with
> +	a test of libgcc_s_handle, not libgcc_s_personality.  Add
> +	missing call to atomic_read_barrier.
> +	(_Unwind_ForcedUnwind, _Unwind_GetCFA): Similarly.
> +
>  2013-01-02  Joseph Myers  <joseph@codesourcery.com>
>  
>  	* All files with FSF copyright notices: Update copyright dates
> diff --git a/ports/ChangeLog.ia64 b/ports/ChangeLog.ia64
> index 28d5076..b6e1b9c 100644
> --- a/ports/ChangeLog.ia64
> +++ b/ports/ChangeLog.ia64
> @@ -1,3 +1,9 @@
> +2013-01-14  Jeff Law  <law@redhat.com>
> +
> +        * sysdeps/unix/sysv/linux/ia64/nptl/unwind-forcedunwind.c
> +        (_Unwind_GetBSP): Guard call to pthread_cancel_init with
> +        a test of libgcc_s_handle, not libgcc_s_getbsp.
> +
>  2013-01-02  Joseph Myers  <joseph@codesourcery.com>
>  
>  	* All files with FSF copyright notices: Update copyright dates
> diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> index 58ca9ac..1a081c8 100644
> --- a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> +++ b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c

Please also fix this incorrect comment:

 63   libgcc_s_getcfa = getcfa;
 64   /* Make sure libgcc_s_getcfa is written last.  Otherwise,
 65      pthread_cancel_init might return early even when the pointer the
 66      caller is interested in is not initialized yet.  */
 67   atomic_write_barrier ();
 68   libgcc_s_handle = handle;
 69 }

It should talk about libcc_s_handle not libgcc_s_getcfa.


> @@ -40,6 +40,10 @@ pthread_cancel_init (void)
>      {
>        /* Force gcc to reload all values.  */
>        asm volatile ("" ::: "memory");
> +     /* Order reads so as to prevent speculation of loads
> +       of libgcc_s_{resume,personality,forcedunwind,getcfa}
> +       to points prior to loading libgcc_s_handle.  */
> +      atomic_read_barrier ();

Same comment as above.

>        return;
>      }
>  
> @@ -132,8 +136,10 @@ __gcc_personality_v0 (_Unwind_State state,
>  		      struct _Unwind_Exception *ue_header,
>  		      struct _Unwind_Context *context)
>  {
> -  if (__builtin_expect (libgcc_s_personality == NULL, 0))
> +  if (__builtin_expect (libgcc_s_handle == NULL, 0))
>      pthread_cancel_init ();
> +  else
> +    atomic_read_barrier ();
>  
>    return libgcc_s_personality (state, ue_header, context);
>  }
> @@ -142,8 +148,10 @@ _Unwind_Reason_Code
>  _Unwind_ForcedUnwind (struct _Unwind_Exception *exc, _Unwind_Stop_Fn stop,
>  		      void *stop_argument)
>  {
> -  if (__builtin_expect (libgcc_s_forcedunwind == NULL, 0))
> +  if (__builtin_expect (libgcc_s_handle == NULL, 0))
>      pthread_cancel_init ();
> +  else
> +    atomic_read_barrier ();
>  
>    return libgcc_s_forcedunwind (exc, stop, stop_argument);
>  }
> @@ -151,8 +159,10 @@ _Unwind_ForcedUnwind (struct _Unwind_Exception *exc, _Unwind_Stop_Fn stop,
>  _Unwind_Word
>  _Unwind_GetCFA (struct _Unwind_Context *context)
>  {
> -  if (__builtin_expect (libgcc_s_getcfa == NULL, 0))
> +  if (__builtin_expect (libgcc_s_handle == NULL, 0))
>      pthread_cancel_init ();
> +  else
> +    atomic_read_barrier ();
>  
>    return libgcc_s_getcfa (context);
>  }

Not sufficient.

You need to fix the assembly also.

 83 /* It's vitally important that _Unwind_Resume not have a stack frame; the
 84    ARM unwinder relies on register state at entrance.  So we write this in
 85    assembly.  */
 86 
 87 asm (
 88 "       .globl  _Unwind_Resume\n"
 89 "       .type   _Unwind_Resume, %function\n"
 90 "_Unwind_Resume:\n"
 91 "       .cfi_sections .debug_frame\n"
 92 "       " CFI_STARTPROC "\n"
 93 "       stmfd   sp!, {r4, r5, r6, lr}\n"
 94 "       " CFI_ADJUST_CFA_OFFSET (16)" \n"
 95 "       " CFI_REL_OFFSET (r4, 0) "\n"
 96 "       " CFI_REL_OFFSET (r5, 4) "\n"
 97 "       " CFI_REL_OFFSET (r6, 8) "\n"
 98 "       " CFI_REL_OFFSET (lr, 12) "\n"
 99 "       " CFI_REMEMBER_STATE "\n"
100 "       ldr     r4, 1f\n"
101 "       ldr     r5, 2f\n"
102 "3:     add     r4, pc, r4\n"
103 "       ldr     r3, [r4, r5]\n"
104 "       mov     r6, r0\n"
105 "       cmp     r3, #0\n"
106 "       beq     4f\n"

The above assembly is equivalent to:

if (libgcc_s_resume == NULL)
  pthread_cancel_init ();

Which is missing the atomic_read_barrier ().

107 "5:     mov     r0, r6\n"
108 "       ldmfd   sp!, {r4, r5, r6, lr}\n"
109 "       " CFI_ADJUST_CFA_OFFSET (-16) "\n"
110 "       " CFI_RESTORE (r4) "\n"
111 "       " CFI_RESTORE (r5) "\n"
112 "       " CFI_RESTORE (r6) "\n"
113 "       " CFI_RESTORE (lr) "\n"
114 "       bx      r3\n"
115 "       " CFI_RESTORE_STATE "\n"
116 "4:     bl      pthread_cancel_init\n"
117 "       ldr     r3, [r4, r5]\n"
118 "       b       5b\n"
119 "       " CFI_ENDPROC "\n"
120 "       .align 2\n"
121 #ifdef __thumb2__
122 "1:     .word   _GLOBAL_OFFSET_TABLE_ - 3b - 4\n"
123 #else
124 "1:     .word   _GLOBAL_OFFSET_TABLE_ - 3b - 8\n"
125 #endif
126 "2:     .word   libgcc_s_resume(GOTOFF)\n"
127 "       .size   _Unwind_Resume, .-_Unwind_Resume\n"
128 );

Thus you need something like this completely untested patch:

diff --git a/ports/sysdeps/arm/bits/atomic.h b/ports/sysdeps/arm/bits/atomic.h
index 6e20df7..b0f45ad 100644
--- a/ports/sysdeps/arm/bits/atomic.h
+++ b/ports/sysdeps/arm/bits/atomic.h
@@ -42,6 +42,7 @@ void __arm_link_error (void);
 # define atomic_full_barrier() __sync_synchronize ()
 #else
 # define atomic_full_barrier() __arm_assisted_full_barrier ()
+# define atomic_asm_full_barrier() __arm_asm_assisted_full_barrier ()
 #endif
 
 /* An OS-specific bits/atomic.h file will define this macro if
diff --git a/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h b/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h
index c76b8f3..726d3ce 100644
--- a/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h
+++ b/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h
@@ -17,7 +17,8 @@
    <http://www.gnu.org/licenses/>.  */
 
 /* If the compiler doesn't provide a primitive, we'll use this macro
-   to get assistance from the kernel.  */
+   to get assistance from the kernel. This should call the
+   __kuser_memory_barrier helper in the kernel.  */
 #ifdef __thumb2__
 # define __arm_assisted_full_barrier() \
      __asm__ __volatile__                                                    \
@@ -25,6 +26,11 @@
              "movt\tip, #0xffff\n\t"                                         \
              "blx\tip"                                                       \
              : : : "ip", "lr", "cc", "memory");
+/* The asm variant is used as an insert into existing asm statements.  */
+# define __arm_asm_assisted_full_barrier() \
+       "       movw    ip, #0x0fa0\n" \
+       "       movt    ip, #0xffff\n" \
+       "       blx     ip"
 #else
 # define __arm_assisted_full_barrier() \
      __asm__ __volatile__                                                    \
@@ -32,6 +38,11 @@
              "mov\tlr, pc\n\t"                                               \
              "add\tpc, ip, #(0xffff0fa0 - 0xffff0fff)"                       \
              : : : "ip", "lr", "cc", "memory");
+/* The asm variant is used as an insert into existing asm statements.  */
+# define __arm_asm_assisted_full_barrier() \
+       "       mov     ip, #0xffff0fff\n" \
+       "       mov     lr, pc\n" \
+       "       add     pc, ip, #(0xffff0fa0 - 0xffff0fff)"
 #endif
 
 /* Atomic compare and exchange.  This sequence relies on the kernel to
diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
index 58ca9ac..d7ac847 100644
--- a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
+++ b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
@@ -104,6 +104,7 @@ asm (
 "      mov     r6, r0\n"
 "      cmp     r3, #0\n"
 "      beq     4f\n"
+       atomic_asm_full_barrier()
 "5:    mov     r0, r6\n"
 "      ldmfd   sp!, {r4, r5, r6, lr}\n"
 "      " CFI_ADJUST_CFA_OFFSET (-16) "\n"
@@ -123,7 +124,7 @@ asm (
 #else
 "1:    .word   _GLOBAL_OFFSET_TABLE_ - 3b - 8\n"
 #endif
-"2:    .word   libgcc_s_resume(GOTOFF)\n"
+"2:    .word   libgcc_s_handle(GOTOFF)\n"
 "      .size   _Unwind_Resume, .-_Unwind_Resume\n"
 );
 
---

Note: 
* As of ARMv6 we should really have used: `MCR p15, 0, r0, c7, c10, 5'
  instead of calling the kernel helper.
* As of ARMv7 we should really have used: `DMB' instead of calling the
  kernel barrier.
* As of ARMv7 we should be using DMB with specific completers 
  to implement atomic_read_barrier() and atomic_write_barrier().

> diff --git a/ports/sysdeps/unix/sysv/linux/ia64/nptl/unwind-forcedunwind.c b/ports/sysdeps/unix/sysv/linux/ia64/nptl/unwind-forcedunwind.c
> index 8562afd..1d197e1 100644
> --- a/ports/sysdeps/unix/sysv/linux/ia64/nptl/unwind-forcedunwind.c
> +++ b/ports/sysdeps/unix/sysv/linux/ia64/nptl/unwind-forcedunwind.c
> @@ -31,8 +31,10 @@ static _Unwind_Word (*libgcc_s_getbsp) (struct _Unwind_Context *);
>  _Unwind_Word
>  _Unwind_GetBSP (struct _Unwind_Context *context)
>  {
> -  if (__builtin_expect (libgcc_s_getbsp == NULL, 0))
> +  if (__builtin_expect (libgcc_s_handle == NULL, 0))
>      pthread_cancel_init ();
> +  else
> +    atomic_read_barrier ();
>  
>    return libgcc_s_getbsp (context);
>  }

The IA64 fix looks OK to me.

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Memory fencing problem in pthread cancellation
  2013-01-14 22:11 ` [PATCH] Memory fencing problem in pthread cancellation Carlos O'Donell
@ 2013-01-14 22:21   ` Joseph S. Myers
  2013-01-14 22:58     ` Carlos O'Donell
  2013-01-31 22:05   ` Jeff Law
  1 sibling, 1 reply; 6+ messages in thread
From: Joseph S. Myers @ 2013-01-14 22:21 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Jeff Law, Mike Frysinger, libc-alpha, libc-ports, Marcus Shawcroft

On Mon, 14 Jan 2013, Carlos O'Donell wrote:

> diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> index 58ca9ac..d7ac847 100644
> --- a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> +++ b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> @@ -104,6 +104,7 @@ asm (
>  "      mov     r6, r0\n"
>  "      cmp     r3, #0\n"
>  "      beq     4f\n"
> +       atomic_asm_full_barrier()
>  "5:    mov     r0, r6\n"
>  "      ldmfd   sp!, {r4, r5, r6, lr}\n"
>  "      " CFI_ADJUST_CFA_OFFSET (-16) "\n"
> @@ -123,7 +124,7 @@ asm (
>  #else
>  "1:    .word   _GLOBAL_OFFSET_TABLE_ - 3b - 8\n"
>  #endif
> -"2:    .word   libgcc_s_resume(GOTOFF)\n"
> +"2:    .word   libgcc_s_handle(GOTOFF)\n"
>  "      .size   _Unwind_Resume, .-_Unwind_Resume\n"
>  );

That last change doesn't seem right.  Note the subsequent "bx r3" if r3 
isn't zero.  If you want to test libgcc_s_handle for being zero, you then 
need, separately (after the barrier?) to load the libgcc_s_resume value.

(Whatever you do in unwind-forcedunwind.c, presumably much the same change 
should go in unwind-resume.c as well.)

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Memory fencing problem in pthread cancellation
  2013-01-14 22:21   ` Joseph S. Myers
@ 2013-01-14 22:58     ` Carlos O'Donell
  2013-01-14 23:50       ` Joseph S. Myers
  2013-01-31 22:08       ` Jeff Law
  0 siblings, 2 replies; 6+ messages in thread
From: Carlos O'Donell @ 2013-01-14 22:58 UTC (permalink / raw)
  To: Joseph S. Myers
  Cc: Jeff Law, Mike Frysinger, libc-alpha, libc-ports, Marcus Shawcroft

On 01/14/2013 05:21 PM, Joseph S. Myers wrote:
> On Mon, 14 Jan 2013, Carlos O'Donell wrote:
> 
>> diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
>> index 58ca9ac..d7ac847 100644
>> --- a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
>> +++ b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
>> @@ -104,6 +104,7 @@ asm (
>>  "      mov     r6, r0\n"
>>  "      cmp     r3, #0\n"
>>  "      beq     4f\n"
>> +       atomic_asm_full_barrier()
>>  "5:    mov     r0, r6\n"
>>  "      ldmfd   sp!, {r4, r5, r6, lr}\n"
>>  "      " CFI_ADJUST_CFA_OFFSET (-16) "\n"
>> @@ -123,7 +124,7 @@ asm (
>>  #else
>>  "1:    .word   _GLOBAL_OFFSET_TABLE_ - 3b - 8\n"
>>  #endif
>> -"2:    .word   libgcc_s_resume(GOTOFF)\n"
>> +"2:    .word   libgcc_s_handle(GOTOFF)\n"
>>  "      .size   _Unwind_Resume, .-_Unwind_Resume\n"
>>  );
> 
> That last change doesn't seem right.  Note the subsequent "bx r3" if r3 
> isn't zero.  If you want to test libgcc_s_handle for being zero, you then 
> need, separately (after the barrier?) to load the libgcc_s_resume value.

You are correct, we need another entry for libgcc_s_resume, and we need
to load that after the barrier, otherwise we won't call libgcc_s_resume.

Something like this:

diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
index 58ca9ac..1e03b13 100644
--- a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
+++ b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
@@ -104,7 +104,12 @@ asm (
 "      mov     r6, r0\n"
 "      cmp     r3, #0\n"
 "      beq     4f\n"
-"5:    mov     r0, r6\n"
+       atomic_asm_full_barrier()
+"5:    ldr     r4, 6f\n"
+"      ldr     r5, 7f\n"
+"8:    add     r4, pc, r4\n"
+"      ldr     r3, [r4, r5]\n"
+"      mov     r0, r6\n"
 "      ldmfd   sp!, {r4, r5, r6, lr}\n"
 "      " CFI_ADJUST_CFA_OFFSET (-16) "\n"
 "      " CFI_RESTORE (r4) "\n"
@@ -123,7 +128,13 @@ asm (
 #else
 "1:    .word   _GLOBAL_OFFSET_TABLE_ - 3b - 8\n"
 #endif
-"2:    .word   libgcc_s_resume(GOTOFF)\n"
+"2:    .word   libgcc_s_handle(GOTOFF)\n"
+#ifdef __thumb2__
+"6:    .word   _GLOBAL_OFFSET_TABLE_ - 8b - 4\n"
+#else
+"6:    .word   _GLOBAL_OFFSET_TABLE_ - 8b - 8\n"
+#endif
+"7:    .word   libgcc_s_resume(GOTOFF)\n"
 "      .size   _Unwind_Resume, .-_Unwind_Resume\n"
 );
---

I was hoping that someone with an ARM cross-build environment
could test this and tell us how it goes? :-)

> (Whatever you do in unwind-forcedunwind.c, presumably much the same change 
> should go in unwind-resume.c as well.)

Yes, all the same changes are required in:

./sysdeps/gnu/unwind-resume.c
./ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-resume.c

I'd say something like this?

diff --git a/sysdeps/gnu/unwind-resume.c b/sysdeps/gnu/unwind-resume.c
index df845cd..fecfc94 100644
--- a/sysdeps/gnu/unwind-resume.c
+++ b/sysdeps/gnu/unwind-resume.c
@@ -21,6 +21,7 @@
 #include <unwind.h>
 #include <gnu/lib-names.h>
 
+static void *libgcc_s_handle;
 static void (*libgcc_s_resume) (struct _Unwind_Exception *exc);
 static _Unwind_Reason_Code (*libgcc_s_personality)
   (int, _Unwind_Action, _Unwind_Exception_Class, struct _Unwind_Exception *,
@@ -41,13 +42,25 @@ init (void)
 
   libgcc_s_resume = resume;
   libgcc_s_personality = personality;
+  atomic_write_barrier ();
+  /* At the point at which any thread writes the handle
+     to libgcc_s_handle, the initialization is complete.
+     The writing of libgcc_s_handle is atomic. All other
+     threads reading libgcc_s_handle do so atomically. Any
+     thread that does not execute this function must issue
+     a read barrier to ensure that all of the above has
+     actually completed and that the values of the
+     function pointers are correct.   */
+  libgcc_s_handle = handle;
 }
 
 void
 _Unwind_Resume (struct _Unwind_Exception *exc)
 {
-  if (__builtin_expect (libgcc_s_resume == NULL, 0))
+  if (__builtin_expect (libgcc_s_handle == NULL, 0))
     init ();
+  else
+    atomic_read_barrier ();
   libgcc_s_resume (exc);
 }
 
@@ -57,8 +70,10 @@ __gcc_personality_v0 (int version, _Unwind_Action actions,
                       struct _Unwind_Exception *ue_header,
                       struct _Unwind_Context *context)
 {
-  if (__builtin_expect (libgcc_s_personality == NULL, 0))
+  if (__builtin_expect (libgcc_s_handle == NULL, 0))
     init ();
+  else
+    atomic_read_barrier ();
   return libgcc_s_personality (version, actions, exception_class,
                               ue_header, context);
 }
---
diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-resume.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-resume.c
index 0a3ad95..a67190d 100644
--- a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-resume.c
+++ b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-resume.c
@@ -20,6 +20,7 @@
 #include <stdio.h>
 #include <unwind.h>
 
+static void *libgcc_s_handle;
 static void (*libgcc_s_resume) (struct _Unwind_Exception *exc);
 static _Unwind_Reason_Code (*libgcc_s_personality)
   (_Unwind_State, struct _Unwind_Exception *, struct _Unwind_Context *);
@@ -41,6 +42,16 @@ init (void)
 
   libgcc_s_resume = resume;
   libgcc_s_personality = personality;
+  atomic_write_barrier ();
+  /* At the point at which any thread writes the handle
+     to libgcc_s_handle, the initialization is complete.
+     The writing of libgcc_s_handle is atomic. All other
+     threads reading libgcc_s_handle do so atomically. Any
+     thread that does not execute this function must issue
+     a read barrier to ensure that all of the above has
+     actually completed and that the values of the
+     function pointers are correct.   */
+  libgcc_s_handle = handle;
 }
 
 /* It's vitally important that _Unwind_Resume not have a stack frame; the
@@ -67,7 +78,12 @@ asm (
 "      mov     r6, r0\n"
 "      cmp     r3, #0\n"
 "      beq     4f\n"
-"5:    mov     r0, r6\n"
+       atomic_asm_full_barrier()
+"5:    ldr     r4, 6f\n"
+"      ldr     r5, 7f\n"
+"8:    add     r4, pc, r4\n"
+"      ldr     r3, [r4, r5]\n"
+"      mov     r0, r6\n"
 "      ldmfd   sp!, {r4, r5, r6, lr}\n"
 "      " CFI_ADJUST_CFA_OFFSET (-16) "\n"
 "      " CFI_RESTORE (r4) "\n"
@@ -76,7 +92,7 @@ asm (
 "      " CFI_RESTORE (lr) "\n"
 "      bx      r3\n"
 "      " CFI_RESTORE_STATE "\n"
-"4:    bl      init\n"
+"4:    bl      pthread_cancel_init\n"
 "      ldr     r3, [r4, r5]\n"
 "      b       5b\n"
 "      " CFI_ENDPROC "\n"
@@ -86,7 +102,13 @@ asm (
 #else
 "1:    .word   _GLOBAL_OFFSET_TABLE_ - 3b - 8\n"
 #endif
-"2:    .word   libgcc_s_resume(GOTOFF)\n"
+"2:    .word   libgcc_s_handle(GOTOFF)\n"
+#ifdef __thumb2__
+"6:    .word   _GLOBAL_OFFSET_TABLE_ - 8b - 4\n"
+#else
+"6:    .word   _GLOBAL_OFFSET_TABLE_ - 8b - 8\n"
+#endif
+"7:    .word   libgcc_s_resume(GOTOFF)\n"
 "      .size   _Unwind_Resume, .-_Unwind_Resume\n"
 );
 
@@ -95,7 +117,9 @@ __gcc_personality_v0 (_Unwind_State state,
                      struct _Unwind_Exception *ue_header,
                      struct _Unwind_Context *context)
 {
-  if (__builtin_expect (libgcc_s_personality == NULL, 0))
+  if (__builtin_expect (libgcc_s_handle == NULL, 0))
     init ();
+  else
+    atomic_read_barrier ();
   return libgcc_s_personality (state, ue_header, context);
 }
---

It looks like the assembly in unwind-resume.c and unwind-forcedunwind.c
are identical and should be shared?

Cheers,
Carlos.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Memory fencing problem in pthread cancellation
  2013-01-14 22:58     ` Carlos O'Donell
@ 2013-01-14 23:50       ` Joseph S. Myers
  2013-01-31 22:08       ` Jeff Law
  1 sibling, 0 replies; 6+ messages in thread
From: Joseph S. Myers @ 2013-01-14 23:50 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Jeff Law, Mike Frysinger, libc-alpha, libc-ports, Marcus Shawcroft

On Mon, 14 Jan 2013, Carlos O'Donell wrote:

> It looks like the assembly in unwind-resume.c and unwind-forcedunwind.c
> are identical and should be shared?

They call different functions (init versus pthread_cancel_init).

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Memory fencing problem in pthread cancellation
  2013-01-14 22:11 ` [PATCH] Memory fencing problem in pthread cancellation Carlos O'Donell
  2013-01-14 22:21   ` Joseph S. Myers
@ 2013-01-31 22:05   ` Jeff Law
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff Law @ 2013-01-31 22:05 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Joseph S. Myers, Mike Frysinger, libc-alpha, libc-ports,
	Marcus Shawcroft

On 01/14/2013 03:11 PM, Carlos O'Donell wrote:
> On 01/14/2013 03:24 PM, Jeff Law wrote:
>
>
> You need to fix the assembly also.
[ ... ]

>
> Thus you need something like this completely untested patch:
>
> diff --git a/ports/sysdeps/arm/bits/atomic.h b/ports/sysdeps/arm/bits/atomic.h
> index 6e20df7..b0f45ad 100644
> --- a/ports/sysdeps/arm/bits/atomic.h
> +++ b/ports/sysdeps/arm/bits/atomic.h
> @@ -42,6 +42,7 @@ void __arm_link_error (void);
>   # define atomic_full_barrier() __sync_synchronize ()
>   #else
>   # define atomic_full_barrier() __arm_assisted_full_barrier ()
> +# define atomic_asm_full_barrier() __arm_asm_assisted_full_barrier ()
>   #endif
Added.

>
>   /* An OS-specific bits/atomic.h file will define this macro if
> diff --git a/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h b/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h
> index c76b8f3..726d3ce 100644
> --- a/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h
> +++ b/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h
> @@ -17,7 +17,8 @@
>      <http://www.gnu.org/licenses/>.  */
>
>   /* If the compiler doesn't provide a primitive, we'll use this macro
> -   to get assistance from the kernel.  */
> +   to get assistance from the kernel. This should call the
> +   __kuser_memory_barrier helper in the kernel.  */
>   #ifdef __thumb2__
>   # define __arm_assisted_full_barrier() \
>        __asm__ __volatile__                                                    \
> @@ -25,6 +26,11 @@
>                "movt\tip, #0xffff\n\t"                                         \
>                "blx\tip"                                                       \
>                : : : "ip", "lr", "cc", "memory");
> +/* The asm variant is used as an insert into existing asm statements.  */
> +# define __arm_asm_assisted_full_barrier() \
> +       "       movw    ip, #0x0fa0\n" \
> +       "       movt    ip, #0xffff\n" \
> +       "       blx     ip"
>   #else
>   # define __arm_assisted_full_barrier() \
>        __asm__ __volatile__                                                    \
> @@ -32,6 +38,11 @@
>                "mov\tlr, pc\n\t"                                               \
>                "add\tpc, ip, #(0xffff0fa0 - 0xffff0fff)"                       \
>                : : : "ip", "lr", "cc", "memory");
> +/* The asm variant is used as an insert into existing asm statements.  */
> +# define __arm_asm_assisted_full_barrier() \
> +       "       mov     ip, #0xffff0fff\n" \
> +       "       mov     lr, pc\n" \
> +       "       add     pc, ip, #(0xffff0fa0 - 0xffff0fff)"
>   #endif
Added.


>
>   /* Atomic compare and exchange.  This sequence relies on the kernel to
> diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> index 58ca9ac..d7ac847 100644
> --- a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> +++ b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> @@ -104,6 +104,7 @@ asm (
>   "      mov     r6, r0\n"
>   "      cmp     r3, #0\n"
>   "      beq     4f\n"
> +       atomic_asm_full_barrier()
>   "5:    mov     r0, r6\n"
>   "      ldmfd   sp!, {r4, r5, r6, lr}\n"
>   "      " CFI_ADJUST_CFA_OFFSET (-16) "\n"
> @@ -123,7 +124,7 @@ asm (
>   #else
>   "1:    .word   _GLOBAL_OFFSET_TABLE_ - 3b - 8\n"
>   #endif
> -"2:    .word   libgcc_s_resume(GOTOFF)\n"
> +"2:    .word   libgcc_s_handle(GOTOFF)\n"
>   "      .size   _Unwind_Resume, .-_Unwind_Resume\n"
>   );
Not added as there's a later revision.

jeff


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] Memory fencing problem in pthread cancellation
  2013-01-14 22:58     ` Carlos O'Donell
  2013-01-14 23:50       ` Joseph S. Myers
@ 2013-01-31 22:08       ` Jeff Law
  1 sibling, 0 replies; 6+ messages in thread
From: Jeff Law @ 2013-01-31 22:08 UTC (permalink / raw)
  To: Carlos O'Donell
  Cc: Joseph S. Myers, Mike Frysinger, libc-alpha, libc-ports,
	Marcus Shawcroft

On 01/14/2013 03:58 PM, Carlos O'Donell wrote:
> You are correct, we need another entry for libgcc_s_resume, and we need
> to load that after the barrier, otherwise we won't call libgcc_s_resume.
>
> Something like this:
>
> diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> index 58ca9ac..1e03b13 100644
> --- a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> +++ b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> @@ -104,7 +104,12 @@ asm (
>   "      mov     r6, r0\n"
>   "      cmp     r3, #0\n"
>   "      beq     4f\n"
> -"5:    mov     r0, r6\n"
> +       atomic_asm_full_barrier()
> +"5:    ldr     r4, 6f\n"
> +"      ldr     r5, 7f\n"
> +"8:    add     r4, pc, r4\n"
> +"      ldr     r3, [r4, r5]\n"
> +"      mov     r0, r6\n"
>   "      ldmfd   sp!, {r4, r5, r6, lr}\n"
>   "      " CFI_ADJUST_CFA_OFFSET (-16) "\n"
>   "      " CFI_RESTORE (r4) "\n"
> @@ -123,7 +128,13 @@ asm (
>   #else
>   "1:    .word   _GLOBAL_OFFSET_TABLE_ - 3b - 8\n"
>   #endif
> -"2:    .word   libgcc_s_resume(GOTOFF)\n"
> +"2:    .word   libgcc_s_handle(GOTOFF)\n"
> +#ifdef __thumb2__
> +"6:    .word   _GLOBAL_OFFSET_TABLE_ - 8b - 4\n"
> +#else
> +"6:    .word   _GLOBAL_OFFSET_TABLE_ - 8b - 8\n"
> +#endif
> +"7:    .word   libgcc_s_resume(GOTOFF)\n"
>   "      .size   _Unwind_Resume, .-_Unwind_Resume\n"
>   );
Added.  Likewise for the other changes.  With one tweak:

[ ... ]

> @@ -76,7 +92,7 @@ asm (
>   "      " CFI_RESTORE (lr) "\n"
>   "      bx      r3\n"
>   "      " CFI_RESTORE_STATE "\n"
> -"4:    bl      init\n"
> +"4:    bl      pthread_cancel_init\n"
>   "      ldr     r3, [r4, r5]\n"
>   "      b       5b\n"
>   "      " CFI_ENDPROC "\n"
I got the impression via a later message from Joseph that this code was 
supposed to call init rather than pthread_cancel_init.  So I didn't 
apply that hunk.

Updated patch with all the updates/fixes to be posted tomorrow AM.

Jeff

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2013-01-31 22:08 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <50F46969.1000305@redhat.com>
2013-01-14 22:11 ` [PATCH] Memory fencing problem in pthread cancellation Carlos O'Donell
2013-01-14 22:21   ` Joseph S. Myers
2013-01-14 22:58     ` Carlos O'Donell
2013-01-14 23:50       ` Joseph S. Myers
2013-01-31 22:08       ` Jeff Law
2013-01-31 22:05   ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).