From: Carlos O'Donell <carlos@systemhalted.org>
To: Jeff Law <law@redhat.com>,
"Joseph S. Myers" <joseph@codesourcery.com>,
Mike Frysinger <vapier@gentoo.org>
Cc: libc-alpha <libc-alpha@sourceware.org>,
libc-ports@sourceware.org,
"Marcus Shawcroft" <marcus.shawcroft@linaro.org>
Subject: Re: [PATCH] Memory fencing problem in pthread cancellation
Date: Mon, 14 Jan 2013 22:11:00 -0000 [thread overview]
Message-ID: <50F48281.3070207@systemhalted.org> (raw)
In-Reply-To: <50F46969.1000305@redhat.com>
On 01/14/2013 03:24 PM, Jeff Law wrote:
> Let's consider these two hunks of code in unwind-forcedunwind.c
I'm adding libc-ports, Joseph Myers, and Marcus Shawcroft
since this involves arm ARM devices that have a weakly ordered
memory model. I'm also adding Mike Frysinger since this involes
IA64.
See my review below.
> void
> __attribute_noinline__
> pthread_cancel_init (void)
> {
> void *resume;
> void *personality;
> void *forcedunwind;
> void *getcfa;
> void *handle;
>
> if (__builtin_expect (libgcc_s_handle != NULL, 1))
> {
> /* Force gcc to reload all values. */
> asm volatile ("" ::: "memory");
> return;
> }
> [ ... ]
>
> PTR_MANGLE (resume);
> libgcc_s_resume = resume;
> PTR_MANGLE (personality);
> libgcc_s_personality = personality;
> PTR_MANGLE (forcedunwind);
> libgcc_s_forcedunwind = forcedunwind;
> PTR_MANGLE (getcfa);
> libgcc_s_getcfa = getcfa;
> /* Make sure libgcc_s_handle is written last. Otherwise,
> pthread_cancel_init might return early even when the pointer the
> caller is interested in is not initialized yet. */
> atomic_write_barrier ();
> libgcc_s_handle = handle;
> }
>
> void
> _Unwind_Resume (struct _Unwind_Exception *exc)
> {
> if (__builtin_expect (libgcc_s_handle == NULL, 0))
> pthread_cancel_init ();
> else
> atomic_read_barrier ();
>
> void (*resume) (struct _Unwind_Exception *exc) = libgcc_s_resume;
> PTR_DEMANGLE (resume);
> resume (exc);
> }
>
>
> If we're in Unwind_Resume and libgcc_s_handle is null, we call
> pthread_cancel_init. pthread_cancel_init first checks if
> libgcc_s_handle is non-null. If so, then pthread_cancel_init exits
> early. Upon returning to Unwind_Resume we'll proceed to load &
> demangle libgcc_s_resume and call it.
>
> Note carefully in this case we never executed an atomic_read_barrier
> between the load of libgcc_s_handle and libgcc_s_resume loads. On a
> weakly ordered target such as power7, the processor might speculate
> the libgcc_s_resume load to a points prior to loading
> libgcc_s_handle.
>
> So if another thread happens to update libgcc_s_handle between the
> loads & checks of libgcc_s_handle in Unwind_Resume and
> pthread_cancel_init, then we can end up using stale data for
> libgcc_s_resume and blow up.
>
> This has been observed on a 32 processor power7 machine running 16
> instances of the attached testcase in parallel after a period of
> several hours.
>
> There's a couple ways to fix this. One could be to eliminate the
> early return from pthread_cancel_init. Unfortunately there's a call
> to pthread_cancel_init from pthread_cancel. So every one of those
> calls would suffer a performance penalty unless libgcc_s_handle as
> exported from unwind-forcedunwind.c
>
> Another is to add a call to atomic_read_barrier just before the early
> return from pthread_cancel_init. That's precisely what this patch
> does.
>
>
> While investigating, Carlos identified that the IA64 and ARM ports
> have their own unwind-forcedunwind.c implementations and that they
> were buggy in regards to fencing as well. Both fail to test
> libgcc_s_handle and do not have the necessary calls to
> atomic_read_barrier. This patch fixes those issues as well.
>
> While I have extensively tested the generic unwind-forcedunwind.c
> change backported to a glibc-2.12 base, I have not checked the ARM or
> IA64 bits in any way.>
>
>
> patch
>
>
> diff --git a/nptl/ChangeLog b/nptl/ChangeLog
> index 4aacc17..90d9408 100644
> --- a/nptl/ChangeLog
> +++ b/nptl/ChangeLog
> @@ -1,3 +1,8 @@
> +2013-01-14 Jeff Law <law@redhat.com>
> +
> + * sysdeps/pthrad/unwind-forcedunwind.c (pthread_cancel_init): Add
> + missing call to atomic_read_barrier in early return path.
> +
> 2013-01-11 Carlos O'Donell <codonell@redhat.com>
>
> * allocatestack.c (allocate_stack): Add comment. Remove assert
> diff --git a/nptl/sysdeps/pthread/unwind-forcedunwind.c b/nptl/sysdeps/pthread/unwind-forcedunwind.c
> index 9718606..6b72e3e 100644
> --- a/nptl/sysdeps/pthread/unwind-forcedunwind.c
> +++ b/nptl/sysdeps/pthread/unwind-forcedunwind.c
> @@ -46,6 +46,10 @@ pthread_cancel_init (void)
> {
> /* Force gcc to reload all values. */
> asm volatile ("" ::: "memory");
> + /* Order reads so as to prevent speculation of loads
> + of libgcc_s_{resume,personality,forcedunwind,getcfa}
> + to points prior to loading libgcc_s_handle. */
I'd say:
s/loading libgcc_s_handle/the write barrier./g
It is the writes of the other thread to globals
libgcc_s_{resume,personality,forcedunwind,getcfa}
which have not yet been seen by this thread which
we are synchronizing using the read barrier.
> + atomic_read_barrier ();
> return;
> }
>
> diff --git a/ports/ChangeLog.arm b/ports/ChangeLog.arm
> index d44ea76..cdd2210 100644
> --- a/ports/ChangeLog.arm
> +++ b/ports/ChangeLog.arm
> @@ -1,3 +1,13 @@
> +2013-01-14 Jeff Law <law@redhat.com>
> +
> + * sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> + (pthread_cancel_init): Add missing call to atomic_read_barrier
> + in early return path.
> + (__gcc_personality_v0): Guard call to pthread_cancel_init with
> + a test of libgcc_s_handle, not libgcc_s_personality. Add
> + missing call to atomic_read_barrier.
> + (_Unwind_ForcedUnwind, _Unwind_GetCFA): Similarly.
> +
> 2013-01-02 Joseph Myers <joseph@codesourcery.com>
>
> * All files with FSF copyright notices: Update copyright dates
> diff --git a/ports/ChangeLog.ia64 b/ports/ChangeLog.ia64
> index 28d5076..b6e1b9c 100644
> --- a/ports/ChangeLog.ia64
> +++ b/ports/ChangeLog.ia64
> @@ -1,3 +1,9 @@
> +2013-01-14 Jeff Law <law@redhat.com>
> +
> + * sysdeps/unix/sysv/linux/ia64/nptl/unwind-forcedunwind.c
> + (_Unwind_GetBSP): Guard call to pthread_cancel_init with
> + a test of libgcc_s_handle, not libgcc_s_getbsp.
> +
> 2013-01-02 Joseph Myers <joseph@codesourcery.com>
>
> * All files with FSF copyright notices: Update copyright dates
> diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> index 58ca9ac..1a081c8 100644
> --- a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
> +++ b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
Please also fix this incorrect comment:
63 libgcc_s_getcfa = getcfa;
64 /* Make sure libgcc_s_getcfa is written last. Otherwise,
65 pthread_cancel_init might return early even when the pointer the
66 caller is interested in is not initialized yet. */
67 atomic_write_barrier ();
68 libgcc_s_handle = handle;
69 }
It should talk about libcc_s_handle not libgcc_s_getcfa.
> @@ -40,6 +40,10 @@ pthread_cancel_init (void)
> {
> /* Force gcc to reload all values. */
> asm volatile ("" ::: "memory");
> + /* Order reads so as to prevent speculation of loads
> + of libgcc_s_{resume,personality,forcedunwind,getcfa}
> + to points prior to loading libgcc_s_handle. */
> + atomic_read_barrier ();
Same comment as above.
> return;
> }
>
> @@ -132,8 +136,10 @@ __gcc_personality_v0 (_Unwind_State state,
> struct _Unwind_Exception *ue_header,
> struct _Unwind_Context *context)
> {
> - if (__builtin_expect (libgcc_s_personality == NULL, 0))
> + if (__builtin_expect (libgcc_s_handle == NULL, 0))
> pthread_cancel_init ();
> + else
> + atomic_read_barrier ();
>
> return libgcc_s_personality (state, ue_header, context);
> }
> @@ -142,8 +148,10 @@ _Unwind_Reason_Code
> _Unwind_ForcedUnwind (struct _Unwind_Exception *exc, _Unwind_Stop_Fn stop,
> void *stop_argument)
> {
> - if (__builtin_expect (libgcc_s_forcedunwind == NULL, 0))
> + if (__builtin_expect (libgcc_s_handle == NULL, 0))
> pthread_cancel_init ();
> + else
> + atomic_read_barrier ();
>
> return libgcc_s_forcedunwind (exc, stop, stop_argument);
> }
> @@ -151,8 +159,10 @@ _Unwind_ForcedUnwind (struct _Unwind_Exception *exc, _Unwind_Stop_Fn stop,
> _Unwind_Word
> _Unwind_GetCFA (struct _Unwind_Context *context)
> {
> - if (__builtin_expect (libgcc_s_getcfa == NULL, 0))
> + if (__builtin_expect (libgcc_s_handle == NULL, 0))
> pthread_cancel_init ();
> + else
> + atomic_read_barrier ();
>
> return libgcc_s_getcfa (context);
> }
Not sufficient.
You need to fix the assembly also.
83 /* It's vitally important that _Unwind_Resume not have a stack frame; the
84 ARM unwinder relies on register state at entrance. So we write this in
85 assembly. */
86
87 asm (
88 " .globl _Unwind_Resume\n"
89 " .type _Unwind_Resume, %function\n"
90 "_Unwind_Resume:\n"
91 " .cfi_sections .debug_frame\n"
92 " " CFI_STARTPROC "\n"
93 " stmfd sp!, {r4, r5, r6, lr}\n"
94 " " CFI_ADJUST_CFA_OFFSET (16)" \n"
95 " " CFI_REL_OFFSET (r4, 0) "\n"
96 " " CFI_REL_OFFSET (r5, 4) "\n"
97 " " CFI_REL_OFFSET (r6, 8) "\n"
98 " " CFI_REL_OFFSET (lr, 12) "\n"
99 " " CFI_REMEMBER_STATE "\n"
100 " ldr r4, 1f\n"
101 " ldr r5, 2f\n"
102 "3: add r4, pc, r4\n"
103 " ldr r3, [r4, r5]\n"
104 " mov r6, r0\n"
105 " cmp r3, #0\n"
106 " beq 4f\n"
The above assembly is equivalent to:
if (libgcc_s_resume == NULL)
pthread_cancel_init ();
Which is missing the atomic_read_barrier ().
107 "5: mov r0, r6\n"
108 " ldmfd sp!, {r4, r5, r6, lr}\n"
109 " " CFI_ADJUST_CFA_OFFSET (-16) "\n"
110 " " CFI_RESTORE (r4) "\n"
111 " " CFI_RESTORE (r5) "\n"
112 " " CFI_RESTORE (r6) "\n"
113 " " CFI_RESTORE (lr) "\n"
114 " bx r3\n"
115 " " CFI_RESTORE_STATE "\n"
116 "4: bl pthread_cancel_init\n"
117 " ldr r3, [r4, r5]\n"
118 " b 5b\n"
119 " " CFI_ENDPROC "\n"
120 " .align 2\n"
121 #ifdef __thumb2__
122 "1: .word _GLOBAL_OFFSET_TABLE_ - 3b - 4\n"
123 #else
124 "1: .word _GLOBAL_OFFSET_TABLE_ - 3b - 8\n"
125 #endif
126 "2: .word libgcc_s_resume(GOTOFF)\n"
127 " .size _Unwind_Resume, .-_Unwind_Resume\n"
128 );
Thus you need something like this completely untested patch:
diff --git a/ports/sysdeps/arm/bits/atomic.h b/ports/sysdeps/arm/bits/atomic.h
index 6e20df7..b0f45ad 100644
--- a/ports/sysdeps/arm/bits/atomic.h
+++ b/ports/sysdeps/arm/bits/atomic.h
@@ -42,6 +42,7 @@ void __arm_link_error (void);
# define atomic_full_barrier() __sync_synchronize ()
#else
# define atomic_full_barrier() __arm_assisted_full_barrier ()
+# define atomic_asm_full_barrier() __arm_asm_assisted_full_barrier ()
#endif
/* An OS-specific bits/atomic.h file will define this macro if
diff --git a/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h b/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h
index c76b8f3..726d3ce 100644
--- a/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h
+++ b/ports/sysdeps/unix/sysv/linux/arm/bits/atomic.h
@@ -17,7 +17,8 @@
<http://www.gnu.org/licenses/>. */
/* If the compiler doesn't provide a primitive, we'll use this macro
- to get assistance from the kernel. */
+ to get assistance from the kernel. This should call the
+ __kuser_memory_barrier helper in the kernel. */
#ifdef __thumb2__
# define __arm_assisted_full_barrier() \
__asm__ __volatile__ \
@@ -25,6 +26,11 @@
"movt\tip, #0xffff\n\t" \
"blx\tip" \
: : : "ip", "lr", "cc", "memory");
+/* The asm variant is used as an insert into existing asm statements. */
+# define __arm_asm_assisted_full_barrier() \
+ " movw ip, #0x0fa0\n" \
+ " movt ip, #0xffff\n" \
+ " blx ip"
#else
# define __arm_assisted_full_barrier() \
__asm__ __volatile__ \
@@ -32,6 +38,11 @@
"mov\tlr, pc\n\t" \
"add\tpc, ip, #(0xffff0fa0 - 0xffff0fff)" \
: : : "ip", "lr", "cc", "memory");
+/* The asm variant is used as an insert into existing asm statements. */
+# define __arm_asm_assisted_full_barrier() \
+ " mov ip, #0xffff0fff\n" \
+ " mov lr, pc\n" \
+ " add pc, ip, #(0xffff0fa0 - 0xffff0fff)"
#endif
/* Atomic compare and exchange. This sequence relies on the kernel to
diff --git a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
index 58ca9ac..d7ac847 100644
--- a/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
+++ b/ports/sysdeps/unix/sysv/linux/arm/nptl/unwind-forcedunwind.c
@@ -104,6 +104,7 @@ asm (
" mov r6, r0\n"
" cmp r3, #0\n"
" beq 4f\n"
+ atomic_asm_full_barrier()
"5: mov r0, r6\n"
" ldmfd sp!, {r4, r5, r6, lr}\n"
" " CFI_ADJUST_CFA_OFFSET (-16) "\n"
@@ -123,7 +124,7 @@ asm (
#else
"1: .word _GLOBAL_OFFSET_TABLE_ - 3b - 8\n"
#endif
-"2: .word libgcc_s_resume(GOTOFF)\n"
+"2: .word libgcc_s_handle(GOTOFF)\n"
" .size _Unwind_Resume, .-_Unwind_Resume\n"
);
---
Note:
* As of ARMv6 we should really have used: `MCR p15, 0, r0, c7, c10, 5'
instead of calling the kernel helper.
* As of ARMv7 we should really have used: `DMB' instead of calling the
kernel barrier.
* As of ARMv7 we should be using DMB with specific completers
to implement atomic_read_barrier() and atomic_write_barrier().
> diff --git a/ports/sysdeps/unix/sysv/linux/ia64/nptl/unwind-forcedunwind.c b/ports/sysdeps/unix/sysv/linux/ia64/nptl/unwind-forcedunwind.c
> index 8562afd..1d197e1 100644
> --- a/ports/sysdeps/unix/sysv/linux/ia64/nptl/unwind-forcedunwind.c
> +++ b/ports/sysdeps/unix/sysv/linux/ia64/nptl/unwind-forcedunwind.c
> @@ -31,8 +31,10 @@ static _Unwind_Word (*libgcc_s_getbsp) (struct _Unwind_Context *);
> _Unwind_Word
> _Unwind_GetBSP (struct _Unwind_Context *context)
> {
> - if (__builtin_expect (libgcc_s_getbsp == NULL, 0))
> + if (__builtin_expect (libgcc_s_handle == NULL, 0))
> pthread_cancel_init ();
> + else
> + atomic_read_barrier ();
>
> return libgcc_s_getbsp (context);
> }
The IA64 fix looks OK to me.
Cheers,
Carlos.
next parent reply other threads:[~2013-01-14 22:11 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <50F46969.1000305@redhat.com>
2013-01-14 22:11 ` Carlos O'Donell [this message]
2013-01-14 22:21 ` Joseph S. Myers
2013-01-14 22:58 ` Carlos O'Donell
2013-01-14 23:50 ` Joseph S. Myers
2013-01-31 22:08 ` Jeff Law
2013-01-31 22:05 ` Jeff Law
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=50F48281.3070207@systemhalted.org \
--to=carlos@systemhalted.org \
--cc=joseph@codesourcery.com \
--cc=law@redhat.com \
--cc=libc-alpha@sourceware.org \
--cc=libc-ports@sourceware.org \
--cc=marcus.shawcroft@linaro.org \
--cc=vapier@gentoo.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).