public inbox for libc-alpha@sourceware.org
 help / color / mirror / Atom feed
* [PATCH] powerpc: Use lwsync on 64bit
@ 2017-02-12 23:22 Anton Blanchard
  2017-02-13 13:42 ` Tulio Magno Quites Machado Filho
  2017-02-13 14:24 ` Carlos Eduardo Seo
  0 siblings, 2 replies; 6+ messages in thread
From: Anton Blanchard @ 2017-02-12 23:22 UTC (permalink / raw)
  To: tuliom, cseo; +Cc: libc-alpha

Either an isync or an lwsync can be used as an acquire barrier after
a larx/stcx/bne sequence. All 64bit CPUs support lwsync and since the
isync instruction has other side effects that we don't need, use lwsync.

2017-02-12  Anton Blanchard  <anton@samba.org>

	* sysdeps/powerpc/atomic-machine.h: Allow __ARCH_ACQ_INSTR to be
	overridden.
	* sysdeps/powerpc/powerpc64/atomic-machine.h: define __ARCH_ACQ_INSTR

diff --git a/sysdeps/powerpc/atomic-machine.h b/sysdeps/powerpc/atomic-machine.h
index 0a58203..31d67fa 100644
--- a/sysdeps/powerpc/atomic-machine.h
+++ b/sysdeps/powerpc/atomic-machine.h
@@ -57,7 +57,9 @@ typedef uintmax_t uatomic_max_t;
 # define __ARCH_ACQ_INSTR	""
 # define __ARCH_REL_INSTR	""
 #else
-# define __ARCH_ACQ_INSTR	"isync"
+# ifndef __ARCH_ACQ_INSTR
+#  define __ARCH_ACQ_INSTR	"isync"
+# endif
 # ifndef __ARCH_REL_INSTR
 #  define __ARCH_REL_INSTR	"sync"
 # endif
diff --git a/sysdeps/powerpc/powerpc64/atomic-machine.h b/sysdeps/powerpc/powerpc64/atomic-machine.h
index 40c308e..76c586a 100644
--- a/sysdeps/powerpc/powerpc64/atomic-machine.h
+++ b/sysdeps/powerpc/powerpc64/atomic-machine.h
@@ -230,6 +230,7 @@
  * "light weight" sync can also be used for the release barrier.
  */
 #ifndef UP
+# define __ARCH_ACQ_INSTR	"lwsync"
 # define __ARCH_REL_INSTR	"lwsync"
 #endif
 #define atomic_write_barrier()	__asm ("lwsync" ::: "memory")

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] powerpc: Use lwsync on 64bit
  2017-02-12 23:22 [PATCH] powerpc: Use lwsync on 64bit Anton Blanchard
@ 2017-02-13 13:42 ` Tulio Magno Quites Machado Filho
  2017-02-13 14:23   ` Adhemerval Zanella
  2017-02-13 14:24 ` Carlos Eduardo Seo
  1 sibling, 1 reply; 6+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2017-02-13 13:42 UTC (permalink / raw)
  To: Anton Blanchard, cseo; +Cc: libc-alpha

Anton Blanchard <anton@samba.org> writes:

> Either an isync or an lwsync can be used as an acquire barrier after
> a larx/stcx/bne sequence. All 64bit CPUs support lwsync and since the
> isync instruction has other side effects that we don't need, use lwsync.
>
> 2017-02-12  Anton Blanchard  <anton@samba.org>
>
> 	* sysdeps/powerpc/atomic-machine.h: Allow __ARCH_ACQ_INSTR to be
> 	overridden.
> 	* sysdeps/powerpc/powerpc64/atomic-machine.h: define __ARCH_ACQ_INSTR

LGTM.

-- 
Tulio Magno

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] powerpc: Use lwsync on 64bit
  2017-02-13 13:42 ` Tulio Magno Quites Machado Filho
@ 2017-02-13 14:23   ` Adhemerval Zanella
  2017-02-13 15:25     ` Tulio Magno Quites Machado Filho
  2017-02-13 19:44     ` Torvald Riegel
  0 siblings, 2 replies; 6+ messages in thread
From: Adhemerval Zanella @ 2017-02-13 14:23 UTC (permalink / raw)
  To: libc-alpha



On 13/02/2017 11:42, Tulio Magno Quites Machado Filho wrote:
> Anton Blanchard <anton@samba.org> writes:
> 
>> Either an isync or an lwsync can be used as an acquire barrier after
>> a larx/stcx/bne sequence. All 64bit CPUs support lwsync and since the
>> isync instruction has other side effects that we don't need, use lwsync.
>>
>> 2017-02-12  Anton Blanchard  <anton@samba.org>
>>
>> 	* sysdeps/powerpc/atomic-machine.h: Allow __ARCH_ACQ_INSTR to be
>> 	overridden.
>> 	* sysdeps/powerpc/powerpc64/atomic-machine.h: define __ARCH_ACQ_INSTR
> 
> LGTM.
> 

I do not think this is correct for all the primitives that use
__ARCH_ACQ_INSTR.  For instance __arch_atomic_exchange_32_acq is defined as

#define __arch_atomic_exchange_32_acq(mem, value)                             \
  ({                                                                          \
    __typeof (*mem) __val;                                                    \
    __asm __volatile (                                                        \
                      "1:       lwarx   %0,0,%2" MUTEX_HINT_ACQ "\n"          \
                      "         stwcx.  %3,0,%2\n"                            \
                      "         bne-    1b\n"                                 \
                      "   " __ARCH_ACQ_INSTR                                  \
                      : "=&r" (__val), "=m" (*mem)                            \
                      : "b" (mem), "r" (value), "m" (*mem)                    \
                      : "cr0", "memory");                                     \
    __val;                                                                    \
  })


Which is analogous to C11:

#include <stdatomic.h>
#include <stdint.h>

uint32_t foo (uint32_t *x)
{
  return atomic_exchange_explicit (x, 0, memory_order_acquire);
}

And GCC 6.2.1 uses and 'isync' a memory barrier.  It is also on par with [1]
which defines acquire semantics to require 'isync' instructions.


[1] https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] powerpc: Use lwsync on 64bit
  2017-02-12 23:22 [PATCH] powerpc: Use lwsync on 64bit Anton Blanchard
  2017-02-13 13:42 ` Tulio Magno Quites Machado Filho
@ 2017-02-13 14:24 ` Carlos Eduardo Seo
  1 sibling, 0 replies; 6+ messages in thread
From: Carlos Eduardo Seo @ 2017-02-13 14:24 UTC (permalink / raw)
  To: Anton Blanchard; +Cc: libc-alpha, tuliom



On 2/12/17, 9:22 PM, "Anton Blanchard" <libc-alpha-owner@sourceware.org on behalf of anton@samba.org> wrote:

    Either an isync or an lwsync can be used as an acquire barrier after
    a larx/stcx/bne sequence. All 64bit CPUs support lwsync and since the
    isync instruction has other side effects that we don't need, use lwsync.
    

OK


--
Carlos Eduardo Seo
Software Engineer - Linux on Power Toolchain
cseo@linux.vnet.ibm.com
 



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] powerpc: Use lwsync on 64bit
  2017-02-13 14:23   ` Adhemerval Zanella
@ 2017-02-13 15:25     ` Tulio Magno Quites Machado Filho
  2017-02-13 19:44     ` Torvald Riegel
  1 sibling, 0 replies; 6+ messages in thread
From: Tulio Magno Quites Machado Filho @ 2017-02-13 15:25 UTC (permalink / raw)
  To: Adhemerval Zanella, libc-alpha

Adhemerval Zanella <adhemerval.zanella@linaro.org> writes:

> I do not think this is correct for all the primitives that use
> __ARCH_ACQ_INSTR.  For instance __arch_atomic_exchange_32_acq is defined as
>
> #define __arch_atomic_exchange_32_acq(mem, value)                             \
>   ({                                                                          \
>     __typeof (*mem) __val;                                                    \
>     __asm __volatile (                                                        \
>                       "1:       lwarx   %0,0,%2" MUTEX_HINT_ACQ "\n"          \
>                       "         stwcx.  %3,0,%2\n"                            \
>                       "         bne-    1b\n"                                 \
>                       "   " __ARCH_ACQ_INSTR                                  \
>                       : "=&r" (__val), "=m" (*mem)                            \
>                       : "b" (mem), "r" (value), "m" (*mem)                    \
>                       : "cr0", "memory");                                     \
>     __val;                                                                    \
>   })
>
>
> Which is analogous to C11:
>
> #include <stdatomic.h>
> #include <stdint.h>
>
> uint32_t foo (uint32_t *x)
> {
>   return atomic_exchange_explicit (x, 0, memory_order_acquire);
> }
>
> And GCC 6.2.1 uses and 'isync' a memory barrier.  It is also on par with [1]
> which defines acquire semantics to require 'isync' instructions.
>
>
> [1] https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

Good point.
Per this document, it would also mess with the spinlock.

Anton, are you trying to fix a more specific issue?

-- 
Tulio Magno

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] powerpc: Use lwsync on 64bit
  2017-02-13 14:23   ` Adhemerval Zanella
  2017-02-13 15:25     ` Tulio Magno Quites Machado Filho
@ 2017-02-13 19:44     ` Torvald Riegel
  1 sibling, 0 replies; 6+ messages in thread
From: Torvald Riegel @ 2017-02-13 19:44 UTC (permalink / raw)
  To: Adhemerval Zanella; +Cc: libc-alpha

On Mon, 2017-02-13 at 12:23 -0200, Adhemerval Zanella wrote:
> 
> On 13/02/2017 11:42, Tulio Magno Quites Machado Filho wrote:
> > Anton Blanchard <anton@samba.org> writes:
> > 
> >> Either an isync or an lwsync can be used as an acquire barrier after
> >> a larx/stcx/bne sequence. All 64bit CPUs support lwsync and since the
> >> isync instruction has other side effects that we don't need, use lwsync.
> >>
> >> 2017-02-12  Anton Blanchard  <anton@samba.org>
> >>
> >> 	* sysdeps/powerpc/atomic-machine.h: Allow __ARCH_ACQ_INSTR to be
> >> 	overridden.
> >> 	* sysdeps/powerpc/powerpc64/atomic-machine.h: define __ARCH_ACQ_INSTR
> > 
> > LGTM.
> > 
> 
> I do not think this is correct for all the primitives that use
> __ARCH_ACQ_INSTR.  For instance __arch_atomic_exchange_32_acq is defined as
> 
> #define __arch_atomic_exchange_32_acq(mem, value)                             \
>   ({                                                                          \
>     __typeof (*mem) __val;                                                    \
>     __asm __volatile (                                                        \
>                       "1:       lwarx   %0,0,%2" MUTEX_HINT_ACQ "\n"          \
>                       "         stwcx.  %3,0,%2\n"                            \
>                       "         bne-    1b\n"                                 \
>                       "   " __ARCH_ACQ_INSTR                                  \
>                       : "=&r" (__val), "=m" (*mem)                            \
>                       : "b" (mem), "r" (value), "m" (*mem)                    \
>                       : "cr0", "memory");                                     \
>     __val;                                                                    \
>   })
> 
> 
> Which is analogous to C11:
> 
> #include <stdatomic.h>
> #include <stdint.h>
> 
> uint32_t foo (uint32_t *x)
> {
>   return atomic_exchange_explicit (x, 0, memory_order_acquire);
> }
> 
> And GCC 6.2.1 uses and 'isync' a memory barrier.  It is also on par with [1]
> which defines acquire semantics to require 'isync' instructions.
> 
> 
> [1] https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html

I don't know enough about the details of powerpc to comment in detail.
However, I'd like to make a few general comments:

* We should strive to be as close to C11 atomics as produced by the
compiler.  For example, if GCC issues an isync instead of lwsync, we
should be fine doing the same.  If you think GCC is generating
inefficient code, please take the change to GCC first and reach
consensus for a change there.  We'll just use GCC's atomics eventually
(though MUTEX_HINT_ACQ may be a reason for a special case), so applying
any change in GCC is necessary anyway to make such a change effective.

* Any changes in the atomics implementation should be reviewed carefully
by people deeply familiar with the HW memory models and how they map to
C11.  If in doubt, take your time.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2017-02-13 19:44 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-02-12 23:22 [PATCH] powerpc: Use lwsync on 64bit Anton Blanchard
2017-02-13 13:42 ` Tulio Magno Quites Machado Filho
2017-02-13 14:23   ` Adhemerval Zanella
2017-02-13 15:25     ` Tulio Magno Quites Machado Filho
2017-02-13 19:44     ` Torvald Riegel
2017-02-13 14:24 ` Carlos Eduardo Seo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).