public inbox for gcc-patches@gcc.gnu.org
* [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins.
@ 2015-05-21 15:59 Matthew Wahab
  2015-05-21 16:00 ` [AArch64][PATCH 2/3] Strengthen barriers for sync-compare-swap builtins Matthew Wahab
                   ` (4 more replies)
  0 siblings, 5 replies; 15+ messages in thread
From: Matthew Wahab @ 2015-05-21 15:59 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1905 bytes --]

On AArch64, the __sync builtins are implemented using the __atomic operations
and barriers. This makes the __sync builtins inconsistent with their
documentation, which requires stronger barriers than those for the __atomic
builtins.

The difference between the __sync and __atomic builtins is that the restrictions
imposed by a __sync operation's barrier apply to all memory references, while the
restrictions of an __atomic operation's barrier only need to apply to a
subset. This affects AArch64 in particular because, although its implementation
of the __atomic builtins is correct, the barriers generated are too weak for the
__sync builtins.
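
As an illustration (a sketch only, not part of the patch; the names are
made up), consider a plain store that follows one of these operations:

   int flag = 0;
   int unrelated = 0;

   void
   sync_op (void)
   {
     __sync_fetch_and_add (&flag, 1);
     /* A __sync operation is documented as a full barrier, so this
        store must not become visible before the atomic update
        completes.  */
     unrelated = 1;
   }

With the current load-acquire/store-release sequence, the store to
unrelated can move ahead of the store-release that completes the atomic
sequence; the trailing barrier added by these patches rules that out.
An __atomic operation, by contrast, only has to order the accesses that
take part in the synchronization.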

The affected __sync builtins are the __sync_fetch_and_op (and
__sync_op_and_fetch) functions, __sync_compare_and_swap and
__sync_lock_test_and_set. This and a following patch modify the code generated
for these functions to weaken the initial load-acquire to a simple load and to
add a final fence to prevent code hoisting. The last patch will add tests for
the code generated by the AArch64 backend for the __sync builtins.

- Full barriers:  __sync_fetch_and_op, __sync_op_and_fetch
   __sync_*_compare_and_swap

   [load-acquire; code; store-release]
   becomes
   [load; code; store-release; fence]

- Acquire barriers:  __sync_lock_test_and_set

   [load-acquire; code; store]
   becomes
   [load; code; store; fence]

The code generated for release barriers and for the __atomic builtins is
unchanged.
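
As a concrete sketch of the full-barrier case (the instruction sequence
is the intent of the change; register numbers are illustrative),
__sync_fetch_and_add on an int currently expands to

   .L2:
        ldaxr   w1, [x0]
        add     w2, w1, w3
        stlxr   w4, w2, [x0]
        cbnz    w4, .L2

and with this change becomes

   .L2:
        ldxr    w1, [x0]
        add     w2, w1, w3
        stlxr   w4, w2, [x0]
        cbnz    w4, .L2
        dmb     ish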

This patch changes the code generated for __sync_fetch_and_<op> and
__sync_<op>_and_fetch builtins.

Tested with check-gcc for aarch64-none-linux-gnu.

Ok for trunk?
Matthew

gcc/
2015-05-21  Matthew Wahab  <matthew.wahab@arm.com>

	* config/aarch64/aarch64.c (aarch64_emit_post_barrier): New.
	(aarch64_split_atomic_op): Check for __sync memory models, emit
	appropriate initial and final barriers.



[-- Attachment #2: 0001-AArch64-Strengthen-barriers-for-sync-fetch-op-builti.patch --]
[-- Type: text/x-patch;  name=0001-AArch64-Strengthen-barriers-for-sync-fetch-op-builti.patch, Size: 2668 bytes --]

From 2092902d2738b0c24a6272e0b3480bb9cffd275c Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Fri, 15 May 2015 09:26:28 +0100
Subject: [PATCH 1/3] [AArch64] Strengthen barriers for sync-fetch-op builtin.

Change-Id: I3342a572d672163ffc703e4e51603744680334fc
---
 gcc/config/aarch64/aarch64.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 7f0cc0d..778571f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9249,6 +9249,22 @@ aarch64_expand_compare_and_swap (rtx operands[])
   emit_insn (gen_rtx_SET (bval, x));
 }
 
+/* Emit a post-operation barrier.  */
+
+static void
+aarch64_emit_post_barrier (enum memmodel model)
+{
+  const enum memmodel base_model = memmodel_base (model);
+
+  if (is_mm_sync (model)
+      && (base_model == MEMMODEL_ACQUIRE
+	  || base_model == MEMMODEL_ACQ_REL
+	  || base_model == MEMMODEL_SEQ_CST))
+    {
+      emit_insn (gen_mem_thread_fence (GEN_INT (MEMMODEL_SEQ_CST)));
+    }
+}
+
 /* Split a compare and swap pattern.  */
 
 void
@@ -9311,12 +9327,20 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
 {
   machine_mode mode = GET_MODE (mem);
   machine_mode wmode = (mode == DImode ? DImode : SImode);
+  const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
+  const bool is_sync = is_mm_sync (model);
+  rtx load_model_rtx = model_rtx;
   rtx_code_label *label;
   rtx x;
 
   label = gen_label_rtx ();
   emit_label (label);
 
+  /* A __sync operation will emit a final fence to stop code hoisting, so the
+     load can be relaxed.  */
+  if (is_sync)
+    load_model_rtx = GEN_INT (MEMMODEL_RELAXED);
+
   if (new_out)
     new_out = gen_lowpart (wmode, new_out);
   if (old_out)
@@ -9325,7 +9349,7 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
     old_out = new_out;
   value = simplify_gen_subreg (wmode, value, mode, 0);
 
-  aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);
+  aarch64_emit_load_exclusive (mode, old_out, mem, load_model_rtx);
 
   switch (code)
     {
@@ -9361,6 +9385,10 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
   x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
 			    gen_rtx_LABEL_REF (Pmode, label), pc_rtx);
   aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+
+  /* Emit any fence needed for a __sync operation.  */
+  if (is_sync)
+    aarch64_emit_post_barrier (model);
 }
 
 static void
-- 
1.9.1



* [AArch64][PATCH 2/3] Strengthen barriers for sync-compare-swap builtins.
  2015-05-21 15:59 [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins Matthew Wahab
@ 2015-05-21 16:00 ` Matthew Wahab
  2015-05-22  8:50   ` [PATCH 2/3][AArch64][PR target/65697] " Matthew Wahab
  2015-05-21 16:03 ` [PATCH 3/3][Aarch64] Add tests for __sync_builtins Matthew Wahab
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Matthew Wahab @ 2015-05-21 16:00 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 853 bytes --]

This patch changes the code generated for __sync_type_compare_and_swap to

   ldxr reg; cmp; bne label; stlxr; cbnz; label: dmb ish; mov .., reg

This removes the acquire-barrier from the load and ends the operation with a
fence to prevent memory references appearing after the __sync operation from
being moved ahead of the store-release.

This also strengthens the acquire barrier generated for __sync_lock_test_and_set
(which, like compare-and-swap, is implemented as a form of atomic exchange):

   ldaxr; stxr; cbnz
becomes
   ldxr; stxr; cbnz; dmb ish
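
To illustrate why the trailing barrier matters once the load is relaxed
(a sketch, not part of the patch), take the usual lock-acquire idiom:

   static int lock;

   void
   spin_lock (void)
   {
     while (__sync_lock_test_and_set (&lock, 1))
       ;
     /* Accesses to the protected data follow here.  The documented
        acquire semantics forbid them from moving before the
        test-and-set; with the initial ldaxr weakened to an ldxr, that
        guarantee now comes from the final dmb ish.  */
   }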

Tested with check-gcc for aarch64-none-linux-gnu.

Ok for trunk?
Matthew

2015-05-21  Matthew Wahab  <matthew.wahab@arm.com>

	* config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Check
	for __sync memory models, emit appropriate initial and final
	barriers.


[-- Attachment #2: 0002-AArch64-Strengthen-barriers-for-sync-compare-swap-bu.patch --]
[-- Type: text/x-patch;  name=0002-AArch64-Strengthen-barriers-for-sync-compare-swap-bu.patch, Size: 2334 bytes --]

From 6f748034d25b75ea7829192d94e54189c2fbf99e Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Fri, 15 May 2015 09:31:06 +0100
Subject: [PATCH 2/3] [AArch64] Strengthen barriers for sync-compare-swap
 builtins.

Change-Id: I335771f2f42ea951d227f20f6cb9daa07330614d
---
 gcc/config/aarch64/aarch64.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 778571f..11a8cd0 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9275,14 +9275,19 @@ aarch64_split_compare_and_swap (rtx operands[])
   bool is_weak;
   rtx_code_label *label1, *label2;
   rtx x, cond;
+  enum memmodel model;
+  rtx model_rtx;
+  rtx load_model_rtx;
 
   rval = operands[0];
   mem = operands[1];
   oldval = operands[2];
   newval = operands[3];
   is_weak = (operands[4] != const0_rtx);
+  model_rtx = operands[5];
   scratch = operands[7];
   mode = GET_MODE (mem);
+  model = memmodel_from_int (INTVAL (model_rtx));
 
   label1 = NULL;
   if (!is_weak)
@@ -9292,7 +9297,13 @@ aarch64_split_compare_and_swap (rtx operands[])
     }
   label2 = gen_label_rtx ();
 
-  aarch64_emit_load_exclusive (mode, rval, mem, operands[5]);
+  /* A __sync operation will end with a fence so the load can be relaxed.  */
+  if (is_mm_sync (model))
+    load_model_rtx = GEN_INT (MEMMODEL_RELAXED);
+  else
+    load_model_rtx = model_rtx;
+
+  aarch64_emit_load_exclusive (mode, rval, mem, load_model_rtx);
 
   cond = aarch64_gen_compare_reg (NE, rval, oldval);
   x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
@@ -9300,7 +9311,7 @@ aarch64_split_compare_and_swap (rtx operands[])
 			    gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
   aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
 
-  aarch64_emit_store_exclusive (mode, scratch, mem, newval, operands[5]);
+  aarch64_emit_store_exclusive (mode, scratch, mem, newval, model_rtx);
 
   if (!is_weak)
     {
@@ -9317,6 +9328,10 @@ aarch64_split_compare_and_swap (rtx operands[])
     }
 
   emit_label (label2);
+
+  /* A __sync operation may need a final fence.  */
+  if (is_mm_sync (model))
+    aarch64_emit_post_barrier (model);
 }
 
 /* Split an atomic operation.  */
-- 
1.9.1



* [PATCH 3/3][Aarch64] Add tests for __sync_builtins.
  2015-05-21 15:59 [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins Matthew Wahab
  2015-05-21 16:00 ` [AArch64][PATCH 2/3] Strengthen barriers for sync-compare-swap builtins Matthew Wahab
@ 2015-05-21 16:03 ` Matthew Wahab
  2015-05-22  8:51   ` [PATCH 3/3][Aarch64][PR target/65697] " Matthew Wahab
  2015-05-22  8:30 ` [PATCH 1/3][AArch64][PR target/65697] Strengthen barriers for sync-fetch-op builtins Matthew Wahab
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 15+ messages in thread
From: Matthew Wahab @ 2015-05-21 16:03 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 612 bytes --]

This patch adds tests for the code generated by the AArch64 backend for the
__sync builtins.

Tested aarch64-none-linux-gnu with check-gcc.
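
For reference, the new tests can be run on their own from the gcc build
directory with something like the following (the exact invocation
depends on the build setup):

   make check-gcc RUNTESTFLAGS="aarch64.exp=sync-*.c"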

Ok for trunk?
Matthew

gcc/testsuite/
2015-05-21  Matthew Wahab  <matthew.wahab@arm.com>

	* gcc.target/aarch64/sync-comp-swap.c: New.
	* gcc.target/aarch64/sync-comp-swap.x: New.
	* gcc.target/aarch64/sync-op-acquire.c: New.
	* gcc.target/aarch64/sync-op-acquire.x: New.
	* gcc.target/aarch64/sync-op-full.c: New.
	* gcc.target/aarch64/sync-op-full.x: New.
	* gcc.target/aarch64/sync-op-release.c: New.
	* gcc.target/aarch64/sync-op-release.x: New.


[-- Attachment #2: 0003-Aarch64-Add-tests-for-__sync_builtins.patch --]
[-- Type: text/x-patch;  name=0003-Aarch64-Add-tests-for-__sync_builtins.patch, Size: 6062 bytes --]

From 74738b2c0ceb9d5cae281b9609c134fde1d459e9 Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Fri, 15 May 2015 09:31:42 +0100
Subject: [PATCH 3/3] [Aarch64] Add tests for __sync_builtins.

Change-Id: I9f7cde85613dfe2cb6df55cbc732e683092f14d8
---
 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c  |  8 +++
 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x  | 13 ++++
 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c |  8 +++
 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x |  7 +++
 gcc/testsuite/gcc.target/aarch64/sync-op-full.c    |  8 +++
 gcc/testsuite/gcc.target/aarch64/sync-op-full.x    | 73 ++++++++++++++++++++++
 gcc/testsuite/gcc.target/aarch64/sync-op-release.c |  6 ++
 gcc/testsuite/gcc.target/aarch64/sync-op-release.x |  7 +++
 8 files changed, 130 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-full.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-full.x
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-release.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-release.x

diff --git a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
new file mode 100644
index 0000000..126b997
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-ipa-icf" } */
+
+#include "sync-comp-swap.x"
+
+/* { dg-final { scan-assembler-times "ldxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 2 } } */
+/* { dg-final { scan-assembler-times "stlxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 2 } } */
+/* { dg-final { scan-assembler-times "dmb\tish" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x
new file mode 100644
index 0000000..eda52e40
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x
@@ -0,0 +1,13 @@
+int v = 0;
+
+int
+sync_bool_compare_swap (int a, int b)
+{
+  return __sync_bool_compare_and_swap (&v, &a, &b);
+}
+
+int
+sync_val_compare_swap (int a, int b)
+{
+  return __sync_val_compare_and_swap (&v, &a, &b);
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
new file mode 100644
index 0000000..2639f9f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "sync-op-acquire.x"
+
+/* { dg-final { scan-assembler-times "ldxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 1 } } */
+/* { dg-final { scan-assembler-times "stxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 1 } } */
+/* { dg-final { scan-assembler-times "dmb\tish" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x
new file mode 100644
index 0000000..4c4548c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x
@@ -0,0 +1,7 @@
+int v;
+
+int
+sync_lock_test_and_set (int a)
+{
+  return __sync_lock_test_and_set (&v, a);
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-full.c b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
new file mode 100644
index 0000000..10fc8fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "sync-op-full.x"
+
+/* { dg-final { scan-assembler-times "ldxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 12 } } */
+/* { dg-final { scan-assembler-times "stlxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 12 } } */
+/* { dg-final { scan-assembler-times "dmb\tish" 12 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-full.x b/gcc/testsuite/gcc.target/aarch64/sync-op-full.x
new file mode 100644
index 0000000..c24223d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-full.x
@@ -0,0 +1,73 @@
+int v = 0;
+
+int
+sync_fetch_and_add (int a)
+{
+  return __sync_fetch_and_add (&v, a);
+}
+
+int
+sync_fetch_and_sub (int a)
+{
+  return __sync_fetch_and_sub (&v, a);
+}
+
+int
+sync_fetch_and_and (int a)
+{
+  return __sync_fetch_and_and (&v, a);
+}
+
+int
+sync_fetch_and_nand (int a)
+{
+  return __sync_fetch_and_nand (&v, a);
+}
+
+int
+sync_fetch_and_xor (int a)
+{
+  return __sync_fetch_and_xor (&v, a);
+}
+
+int
+sync_fetch_and_or (int a)
+{
+  return __sync_fetch_and_or (&v, a);
+}
+
+int
+sync_add_and_fetch (int a)
+{
+  return __sync_add_and_fetch (&v, a);
+}
+
+int
+sync_sub_and_fetch (int a)
+{
+  return __sync_sub_and_fetch (&v, a);
+}
+
+int
+sync_and_and_fetch (int a)
+{
+  return __sync_and_and_fetch (&v, a);
+}
+
+int
+sync_nand_and_fetch (int a)
+{
+  return __sync_nand_and_fetch (&v, a);
+}
+
+int
+sync_xor_and_fetch (int a)
+{
+  return __sync_xor_and_fetch (&v, a);
+}
+
+int
+sync_or_and_fetch (int a)
+{
+  return __sync_or_and_fetch (&v, a);
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-release.c b/gcc/testsuite/gcc.target/aarch64/sync-op-release.c
new file mode 100644
index 0000000..d25b46f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-release.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "sync-op-release.x"
+
+/* { dg-final { scan-assembler-times "stlr" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-release.x b/gcc/testsuite/gcc.target/aarch64/sync-op-release.x
new file mode 100644
index 0000000..704bcff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-release.x
@@ -0,0 +1,7 @@
+int v;
+
+void
+sync_lock_release (void)
+{
+  __sync_lock_release (&v);
+}
-- 
1.9.1



* Re: [PATCH 1/3][AArch64][PR target/65697] Strengthen barriers for sync-fetch-op builtins.
  2015-05-21 15:59 [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins Matthew Wahab
  2015-05-21 16:00 ` [AArch64][PATCH 2/3] Strengthen barriers for sync-compare-swap builtins Matthew Wahab
  2015-05-21 16:03 ` [PATCH 3/3][Aarch64] Add tests for __sync_builtins Matthew Wahab
@ 2015-05-22  8:30 ` Matthew Wahab
  2015-05-22 11:47 ` [PATCH 1/3][AArch64] " Ramana Radhakrishnan
  2015-05-26  9:52 ` James Greenhalgh
  4 siblings, 0 replies; 15+ messages in thread
From: Matthew Wahab @ 2015-05-22  8:30 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1974 bytes --]

[Added PR number and updated patches]

On AArch64, the __sync builtins are implemented using the __atomic operations
and barriers. This makes the __sync builtins inconsistent with their
documentation, which requires stronger barriers than those for the __atomic
builtins.

The difference between the __sync and __atomic builtins is that the restrictions
imposed by a __sync operation's barrier apply to all memory references, while the
restrictions of an __atomic operation's barrier only need to apply to a
subset. This affects AArch64 in particular because, although its implementation
of the __atomic builtins is correct, the barriers generated are too weak for the
__sync builtins.

The affected __sync builtins are the __sync_fetch_and_op (and
__sync_op_and_fetch) functions, __sync_compare_and_swap and
__sync_lock_test_and_set. This and a following patch modify the code generated
for these functions to weaken the initial load-acquire to a simple load and to
add a final fence to prevent code hoisting. The last patch will add tests for
the code generated by the AArch64 backend for the __sync builtins.

- Full barriers:  __sync_fetch_and_op, __sync_op_and_fetch
   __sync_*_compare_and_swap

   [load-acquire; code; store-release]
   becomes
   [load; code; store-release; fence]

- Acquire barriers:  __sync_lock_test_and_set

   [load-acquire; code; store]
   becomes
   [load; code; store; fence]

The code generated for release barriers and for the __atomic builtins is
unchanged.

This patch changes the code generated for __sync_fetch_and_<op> and
__sync_<op>_and_fetch builtins.

Tested with check-gcc for aarch64-none-linux-gnu.

Ok for trunk?
Matthew

gcc/
2015-05-22  Matthew Wahab  <matthew.wahab@arm.com>

	PR target/65697
	* config/aarch64/aarch64.c (aarch64_emit_post_barrier): New.
	(aarch64_split_atomic_op): Check for __sync memory models, emit
	appropriate initial and final barriers.




[-- Attachment #2: 0001-AArch64-Strengthen-barriers-for-sync-fetch-op-builti.patch --]
[-- Type: text/x-patch;  name=0001-AArch64-Strengthen-barriers-for-sync-fetch-op-builti.patch, Size: 2668 bytes --]

From 5a2d546359f78cd3f304a62617f0fc385664374e Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Fri, 15 May 2015 09:26:28 +0100
Subject: [PATCH 1/3] [AArch64] Strengthen barriers for sync-fetch-op builtin.

Change-Id: I3342a572d672163ffc703e4e51603744680334fc
---
 gcc/config/aarch64/aarch64.c | 30 +++++++++++++++++++++++++++++-
 1 file changed, 29 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 8c25d75..182dbad 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9407,6 +9407,22 @@ aarch64_expand_compare_and_swap (rtx operands[])
   emit_insn (gen_rtx_SET (bval, x));
 }
 
+/* Emit a post-operation barrier.  */
+
+static void
+aarch64_emit_post_barrier (enum memmodel model)
+{
+  const enum memmodel base_model = memmodel_base (model);
+
+  if (is_mm_sync (model)
+      && (base_model == MEMMODEL_ACQUIRE
+	  || base_model == MEMMODEL_ACQ_REL
+	  || base_model == MEMMODEL_SEQ_CST))
+    {
+      emit_insn (gen_mem_thread_fence (GEN_INT (MEMMODEL_SEQ_CST)));
+    }
+}
+
 /* Split a compare and swap pattern.  */
 
 void
@@ -9469,12 +9485,20 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
 {
   machine_mode mode = GET_MODE (mem);
   machine_mode wmode = (mode == DImode ? DImode : SImode);
+  const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
+  const bool is_sync = is_mm_sync (model);
+  rtx load_model_rtx = model_rtx;
   rtx_code_label *label;
   rtx x;
 
   label = gen_label_rtx ();
   emit_label (label);
 
+  /* A __sync operation will emit a final fence to stop code hoisting, so the
+     load can be relaxed.  */
+  if (is_sync)
+    load_model_rtx = GEN_INT (MEMMODEL_RELAXED);
+
   if (new_out)
     new_out = gen_lowpart (wmode, new_out);
   if (old_out)
@@ -9483,7 +9507,7 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
     old_out = new_out;
   value = simplify_gen_subreg (wmode, value, mode, 0);
 
-  aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);
+  aarch64_emit_load_exclusive (mode, old_out, mem, load_model_rtx);
 
   switch (code)
     {
@@ -9519,6 +9543,10 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
   x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
 			    gen_rtx_LABEL_REF (Pmode, label), pc_rtx);
   aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+
+  /* Emit any fence needed for a __sync operation.  */
+  if (is_sync)
+    aarch64_emit_post_barrier (model);
 }
 
 static void
-- 
1.9.1



* Re: [PATCH 2/3][AArch64][PR target/65697] Strengthen barriers for sync-compare-swap builtins.
  2015-05-21 16:00 ` [AArch64][PATCH 2/3] Strengthen barriers for sync-compare-swap builtins Matthew Wahab
@ 2015-05-22  8:50   ` Matthew Wahab
  2015-06-01 12:08     ` Matthew Wahab
  0 siblings, 1 reply; 15+ messages in thread
From: Matthew Wahab @ 2015-05-22  8:50 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 914 bytes --]

[Added PR number and updated patches]

This patch changes the code generated for __sync_type_compare_and_swap to

   ldxr reg; cmp; bne label; stlxr; cbnz; label: dmb ish; mov .., reg

This removes the acquire-barrier from the load and ends the operation with a
fence to prevent memory references appearing after the __sync operation from
being moved ahead of the store-release.

This also strengthens the acquire barrier generated for __sync_lock_test_and_set
(which, like compare-and-swap, is implemented as a form of atomic exchange):

   ldaxr; stxr; cbnz
becomes
   ldxr; stxr; cbnz; dmb ish


Tested with check-gcc for aarch64-none-linux-gnu.

Ok for trunk?
Matthew

2015-05-22  Matthew Wahab  <matthew.wahab@arm.com>

	PR target/65697
	* config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Check
	for __sync memory models, emit appropriate initial and final
	barriers.


[-- Attachment #2: 0002-AArch64-Strengthen-barriers-for-sync-compare-swap-bu.patch --]
[-- Type: text/x-patch;  name=0002-AArch64-Strengthen-barriers-for-sync-compare-swap-bu.patch, Size: 2334 bytes --]

From 1e5cda95944e7176b8934296b1bb1ec4c9fb1362 Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Fri, 15 May 2015 09:31:06 +0100
Subject: [PATCH 2/3] [AArch64] Strengthen barriers for sync-compare-swap
 builtins.

Change-Id: I335771f2f42ea951d227f20f6cb9daa07330614d
---
 gcc/config/aarch64/aarch64.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 182dbad..5b9feee 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9433,14 +9433,19 @@ aarch64_split_compare_and_swap (rtx operands[])
   bool is_weak;
   rtx_code_label *label1, *label2;
   rtx x, cond;
+  enum memmodel model;
+  rtx model_rtx;
+  rtx load_model_rtx;
 
   rval = operands[0];
   mem = operands[1];
   oldval = operands[2];
   newval = operands[3];
   is_weak = (operands[4] != const0_rtx);
+  model_rtx = operands[5];
   scratch = operands[7];
   mode = GET_MODE (mem);
+  model = memmodel_from_int (INTVAL (model_rtx));
 
   label1 = NULL;
   if (!is_weak)
@@ -9450,7 +9455,13 @@ aarch64_split_compare_and_swap (rtx operands[])
     }
   label2 = gen_label_rtx ();
 
-  aarch64_emit_load_exclusive (mode, rval, mem, operands[5]);
+  /* A __sync operation will end with a fence so the load can be relaxed.  */
+  if (is_mm_sync (model))
+    load_model_rtx = GEN_INT (MEMMODEL_RELAXED);
+  else
+    load_model_rtx = model_rtx;
+
+  aarch64_emit_load_exclusive (mode, rval, mem, load_model_rtx);
 
   cond = aarch64_gen_compare_reg (NE, rval, oldval);
   x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
@@ -9458,7 +9469,7 @@ aarch64_split_compare_and_swap (rtx operands[])
 			    gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
   aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
 
-  aarch64_emit_store_exclusive (mode, scratch, mem, newval, operands[5]);
+  aarch64_emit_store_exclusive (mode, scratch, mem, newval, model_rtx);
 
   if (!is_weak)
     {
@@ -9475,6 +9486,10 @@ aarch64_split_compare_and_swap (rtx operands[])
     }
 
   emit_label (label2);
+
+  /* A __sync operation may need a final fence.  */
+  if (is_mm_sync (model))
+    aarch64_emit_post_barrier (model);
 }
 
 /* Split an atomic operation.  */
-- 
1.9.1



* Re: [PATCH 3/3][Aarch64][PR target/65697] Add tests for __sync_builtins.
  2015-05-21 16:03 ` [PATCH 3/3][Aarch64] Add tests for __sync_builtins Matthew Wahab
@ 2015-05-22  8:51   ` Matthew Wahab
  2015-05-26 10:11     ` James Greenhalgh
  0 siblings, 1 reply; 15+ messages in thread
From: Matthew Wahab @ 2015-05-22  8:51 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 673 bytes --]

[Added PR number and updated patches]

This patch adds tests for the code generated by the AArch64 backend for the
__sync builtins.

Tested aarch64-none-linux-gnu with check-gcc.

Ok for trunk?
Matthew

gcc/testsuite/
2015-05-21  Matthew Wahab  <matthew.wahab@arm.com>

	PR target/65697
	* gcc.target/aarch64/sync-comp-swap.c: New.
	* gcc.target/aarch64/sync-comp-swap.x: New.
	* gcc.target/aarch64/sync-op-acquire.c: New.
	* gcc.target/aarch64/sync-op-acquire.x: New.
	* gcc.target/aarch64/sync-op-full.c: New.
	* gcc.target/aarch64/sync-op-full.x: New.
	* gcc.target/aarch64/sync-op-release.c: New.
	* gcc.target/aarch64/sync-op-release.x: New.



[-- Attachment #2: 0003-Aarch64-Add-tests-for-__sync_builtins.patch --]
[-- Type: text/x-patch;  name=0003-Aarch64-Add-tests-for-__sync_builtins.patch, Size: 6062 bytes --]

From a3e8df9afce1098c1c616d66a309ce5bc5b95593 Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Fri, 15 May 2015 09:31:42 +0100
Subject: [PATCH 3/3] [Aarch64] Add tests for __sync_builtins.

Change-Id: I9f7cde85613dfe2cb6df55cbc732e683092f14d8
---
 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c  |  8 +++
 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x  | 13 ++++
 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c |  8 +++
 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x |  7 +++
 gcc/testsuite/gcc.target/aarch64/sync-op-full.c    |  8 +++
 gcc/testsuite/gcc.target/aarch64/sync-op-full.x    | 73 ++++++++++++++++++++++
 gcc/testsuite/gcc.target/aarch64/sync-op-release.c |  6 ++
 gcc/testsuite/gcc.target/aarch64/sync-op-release.x |  7 +++
 8 files changed, 130 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-full.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-full.x
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-release.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sync-op-release.x

diff --git a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
new file mode 100644
index 0000000..126b997
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fno-ipa-icf" } */
+
+#include "sync-comp-swap.x"
+
+/* { dg-final { scan-assembler-times "ldxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 2 } } */
+/* { dg-final { scan-assembler-times "stlxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 2 } } */
+/* { dg-final { scan-assembler-times "dmb\tish" 2 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x
new file mode 100644
index 0000000..eda52e40
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.x
@@ -0,0 +1,13 @@
+int v = 0;
+
+int
+sync_bool_compare_swap (int a, int b)
+{
+  return __sync_bool_compare_and_swap (&v, &a, &b);
+}
+
+int
+sync_val_compare_swap (int a, int b)
+{
+  return __sync_val_compare_and_swap (&v, &a, &b);
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
new file mode 100644
index 0000000..2639f9f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "sync-op-acquire.x"
+
+/* { dg-final { scan-assembler-times "ldxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 1 } } */
+/* { dg-final { scan-assembler-times "stxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 1 } } */
+/* { dg-final { scan-assembler-times "dmb\tish" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x
new file mode 100644
index 0000000..4c4548c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.x
@@ -0,0 +1,7 @@
+int v;
+
+int
+sync_lock_test_and_set (int a)
+{
+  return __sync_lock_test_and_set (&v, a);
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-full.c b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
new file mode 100644
index 0000000..10fc8fc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "sync-op-full.x"
+
+/* { dg-final { scan-assembler-times "ldxr\tw\[0-9\]+, \\\[x\[0-9\]+\\\]" 12 } } */
+/* { dg-final { scan-assembler-times "stlxr\tw\[0-9\]+, w\[0-9\]+, \\\[x\[0-9\]+\\\]" 12 } } */
+/* { dg-final { scan-assembler-times "dmb\tish" 12 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-full.x b/gcc/testsuite/gcc.target/aarch64/sync-op-full.x
new file mode 100644
index 0000000..c24223d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-full.x
@@ -0,0 +1,73 @@
+int v = 0;
+
+int
+sync_fetch_and_add (int a)
+{
+  return __sync_fetch_and_add (&v, a);
+}
+
+int
+sync_fetch_and_sub (int a)
+{
+  return __sync_fetch_and_sub (&v, a);
+}
+
+int
+sync_fetch_and_and (int a)
+{
+  return __sync_fetch_and_and (&v, a);
+}
+
+int
+sync_fetch_and_nand (int a)
+{
+  return __sync_fetch_and_nand (&v, a);
+}
+
+int
+sync_fetch_and_xor (int a)
+{
+  return __sync_fetch_and_xor (&v, a);
+}
+
+int
+sync_fetch_and_or (int a)
+{
+  return __sync_fetch_and_or (&v, a);
+}
+
+int
+sync_add_and_fetch (int a)
+{
+  return __sync_add_and_fetch (&v, a);
+}
+
+int
+sync_sub_and_fetch (int a)
+{
+  return __sync_sub_and_fetch (&v, a);
+}
+
+int
+sync_and_and_fetch (int a)
+{
+  return __sync_and_and_fetch (&v, a);
+}
+
+int
+sync_nand_and_fetch (int a)
+{
+  return __sync_nand_and_fetch (&v, a);
+}
+
+int
+sync_xor_and_fetch (int a)
+{
+  return __sync_xor_and_fetch (&v, a);
+}
+
+int
+sync_or_and_fetch (int a)
+{
+  return __sync_or_and_fetch (&v, a);
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-release.c b/gcc/testsuite/gcc.target/aarch64/sync-op-release.c
new file mode 100644
index 0000000..d25b46f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-release.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#include "sync-op-release.x"
+
+/* { dg-final { scan-assembler-times "stlr" 1 } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-release.x b/gcc/testsuite/gcc.target/aarch64/sync-op-release.x
new file mode 100644
index 0000000..704bcff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-release.x
@@ -0,0 +1,7 @@
+int v;
+
+void
+sync_lock_release (void)
+{
+  __sync_lock_release (&v);
+}
-- 
1.9.1



* Re: [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins.
  2015-05-21 15:59 [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins Matthew Wahab
                   ` (2 preceding siblings ...)
  2015-05-22  8:30 ` [PATCH 1/3][AArch64][PR target/65697] Strengthen barriers for sync-fetch-op builtins Matthew Wahab
@ 2015-05-22 11:47 ` Ramana Radhakrishnan
  2015-05-22 12:15   ` Matthew Wahab
  2015-05-26  9:52 ` James Greenhalgh
  4 siblings, 1 reply; 15+ messages in thread
From: Ramana Radhakrishnan @ 2015-05-22 11:47 UTC (permalink / raw)
  To: Matthew Wahab; +Cc: gcc-patches

>
> Ok for trunk?

I can't approve, but do you mind taking care of -march=armv8-a in the
arm backend too, as that would have the same issues?

Ramana

> Matthew
>
> gcc/
> 2015-05-21  Matthew Wahab  <matthew.wahab@arm.com>
>
>         * config/aarch64/aarch64.c (aarch64_emit_post_barrier): New.
>         (aarch64_split_atomic_op): Check for __sync memory models, emit
>         appropriate initial and final barriers.
>
>


* Re: [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins.
  2015-05-22 11:47 ` [PATCH 1/3][AArch64] " Ramana Radhakrishnan
@ 2015-05-22 12:15   ` Matthew Wahab
  0 siblings, 0 replies; 15+ messages in thread
From: Matthew Wahab @ 2015-05-22 12:15 UTC (permalink / raw)
  To: Ramana Radhakrishnan; +Cc: gcc-patches

On 22/05/15 12:26, Ramana Radhakrishnan wrote:
>>
>> Ok for trunk?
>
> I can't approve, but do you mind taking care of -march=armv8-a in the
> arm backend too, as that would have the same issues?
>

Will do,
Matthew


* Re: [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins.
  2015-05-21 15:59 [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins Matthew Wahab
                   ` (3 preceding siblings ...)
  2015-05-22 11:47 ` [PATCH 1/3][AArch64] " Ramana Radhakrishnan
@ 2015-05-26  9:52 ` James Greenhalgh
  2015-06-01 12:06   ` [PATCH 1/3][AArch64][PR target/65697] " Matthew Wahab
  4 siblings, 1 reply; 15+ messages in thread
From: James Greenhalgh @ 2015-05-26  9:52 UTC (permalink / raw)
  To: Matthew Wahab; +Cc: gcc-patches

On Thu, May 21, 2015 at 04:57:00PM +0100, Matthew Wahab wrote:
> On AArch64, the __sync builtins are implemented using the __atomic operations
> and barriers. This makes the __sync builtins inconsistent with their
> documentation, which requires stronger barriers than those for the __atomic
> builtins.

<snip>

> Ok for trunk?

Close, but I have some comments on style.

Please tie this to the PR which was open in the ChangeLog entry.

> 
> gcc/
> 2015-05-21  Matthew Wahab  <matthew.wahab@arm.com>
> 
> 	* config/aarch64/aarch64.c (aarch64_emit_post_barrier): New.

If I was being picky, this should be something like
aarch64_emit_sync_op_epilogue, but I don't want to bikeshed too much.


> 	(aarch64_split_atomic_op): Check for __sync memory models, emit
> 	appropriate initial and final barriers.

I don't see any new initial barriers. I think you are referring to
relaxing the ldaxr to an ldxr for __sync primitives, in which case, say
that.

> From 2092902d2738b0c24a6272e0b3480bb9cffd275c Mon Sep 17 00:00:00 2001
> From: Matthew Wahab <matthew.wahab@arm.com>
> Date: Fri, 15 May 2015 09:26:28 +0100
> Subject: [PATCH 1/3] [AArch64] Strengthen barriers for sync-fetch-op builtin.
> 
> Change-Id: I3342a572d672163ffc703e4e51603744680334fc
> ---
>  gcc/config/aarch64/aarch64.c | 30 +++++++++++++++++++++++++++++-
>  1 file changed, 29 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 7f0cc0d..778571f 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9249,6 +9249,22 @@ aarch64_expand_compare_and_swap (rtx operands[])
>    emit_insn (gen_rtx_SET (bval, x));
>  }
>  
> +/* Emit a post-operation barrier.  */

This comment could do with some more detail. What is a post-operation
barrier? When do we need one? What is the MODEL parameter?

> +static void
> +aarch64_emit_post_barrier (enum memmodel model)
> +{
> +  const enum memmodel base_model = memmodel_base (model);
> +
> +  if (is_mm_sync (model)
> +      && (base_model == MEMMODEL_ACQUIRE
> +	  || base_model == MEMMODEL_ACQ_REL
> +	  || base_model == MEMMODEL_SEQ_CST))
> +    {
> +      emit_insn (gen_mem_thread_fence (GEN_INT (MEMMODEL_SEQ_CST)));
> +    }
> +}
> +
>  /* Split a compare and swap pattern.  */
>  
>  void
> @@ -9311,12 +9327,20 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
>  {
>    machine_mode mode = GET_MODE (mem);
>    machine_mode wmode = (mode == DImode ? DImode : SImode);
> +  const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
> +  const bool is_sync = is_mm_sync (model);
> +  rtx load_model_rtx = model_rtx;
>    rtx_code_label *label;
>    rtx x;
>  
>    label = gen_label_rtx ();
>    emit_label (label);
>  
> +  /* A __sync operation will emit a final fence to stop code hoisting, so the

Can we pick a consistent terminology between fence/barrier? They are
currently used interchangeably, but I think we generally prefer barrier
in the AArch64 port.

> +     load can be relaxed.  */
> +  if (is_sync)
> +    load_model_rtx = GEN_INT (MEMMODEL_RELAXED);
> +
>    if (new_out)
>      new_out = gen_lowpart (wmode, new_out);
>    if (old_out)
> @@ -9325,7 +9349,7 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
>      old_out = new_out;
>    value = simplify_gen_subreg (wmode, value, mode, 0);
>  
> -  aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);
> +  aarch64_emit_load_exclusive (mode, old_out, mem, load_model_rtx);

To my mind, these two hunks would be marginally easier to follow if
we combined them, as so:

  /* A __sync operation will emit a final barrier to stop code hoisting,
     so the load can be relaxed.  */
  if (is_sync)
    aarch64_emit_load_exclusive (mode, old_out,
				 mem, GEN_INT (MEMMODEL_RELAXED));
  else
    aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);

it is just one less piece of information to juggle when thinking through
what we are emitting here.

>  
>    switch (code)
>      {
> @@ -9361,6 +9385,10 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
>    x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
>  			    gen_rtx_LABEL_REF (Pmode, label), pc_rtx);
>    aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
> +
> +  /* Emit any fence needed for a __sync operation.  */
> +  if (is_sync)
> +    aarch64_emit_post_barrier (model);
>  }

Cheers,
James


* Re: [PATCH 3/3][Aarch64][PR target/65697] Add tests for __sync_builtins.
  2015-05-22  8:51   ` [PATCH 3/3][Aarch64][PR target/65697] " Matthew Wahab
@ 2015-05-26 10:11     ` James Greenhalgh
  0 siblings, 0 replies; 15+ messages in thread
From: James Greenhalgh @ 2015-05-26 10:11 UTC (permalink / raw)
  To: Matthew Wahab; +Cc: gcc-patches

On Fri, May 22, 2015 at 09:30:18AM +0100, Matthew Wahab wrote:
> [Added PR number and updated patches]
> 
> This patch adds tests for the code generated by the AArch64 backend for the
> __sync builtins.
> 
> Tested aarch64-none-linux-gnu with check-gcc.
> 
> Ok for trunk?
> Matthew

This is OK once the rest of the set has been committed.

Thanks,
James

> 
> gcc/testsuite/
> 2015-05-21  Matthew Wahab  <matthew.wahab@arm.com>
> 
> 	PR target/65697
> 	* gcc.target/aarch64/sync-comp-swap.c: New.
> 	* gcc.target/aarch64/sync-comp-swap.x: New.
> 	* gcc.target/aarch64/sync-op-acquire.c: New.
> 	* gcc.target/aarch64/sync-op-acquire.x: New.
> 	* gcc.target/aarch64/sync-op-full.c: New.
> 	* gcc.target/aarch64/sync-op-full.x: New.
> 	* gcc.target/aarch64/sync-op-release.c: New.
> 	* gcc.target/aarch64/sync-op-release.x: New.
> 


* Re: [PATCH 1/3][AArch64][PR target/65697] Strengthen barriers for sync-fetch-op builtins.
  2015-05-26  9:52 ` James Greenhalgh
@ 2015-06-01 12:06   ` Matthew Wahab
  2015-06-01 12:15     ` James Greenhalgh
  0 siblings, 1 reply; 15+ messages in thread
From: Matthew Wahab @ 2015-06-01 12:06 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1551 bytes --]

On 26/05/15 10:32, James Greenhalgh wrote:
> Please tie this to the PR which was open in the ChangeLog entry.
>
>> 	(aarch64_split_atomic_op): Check for __sync memory models, emit
>> 	appropriate initial and final barriers.
>
> I don't see any new initial barriers. I think you are referring to
> relaxing the ldaxr to an ldxr for __sync primitives, in which case, say
> that.
>
>> +/* Emit a post-operation barrier.  */
>
> This comment could do with some more detail. What is a post-operation
> barrier? When do we need one? What is the MODEL parameter?
>
>> +  /* A __sync operation will emit a final fence to stop code hoisting, so the
>
> Can we pick a consistent terminology between fence/barrier? They are
> currently used interchangeably, but I think we generally prefer barrier
> in the AArch64 port.
>
>> -  aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);
>> +  aarch64_emit_load_exclusive (mode, old_out, mem, load_model_rtx);
>
> To my mind, these two hunks would be marginally easier to follow if
> we combined them, as so:

Attached updated patch:
- Expanded the comment for aarch64_emit_post_barrier.
- Used 'barrier' rather than 'fence' in comments.
- Simplified the code for the initial load.

Tested with check-gcc for aarch64-none-linux-gnu.

Ok?
Matthew

2015-06-01  Matthew Wahab  <matthew.wahab@arm.com>

	PR target/65697
	* config/aarch64/aarch64.c (aarch64_emit_post_barrier): New.
	(aarch64_split_atomic_op): Check for __sync memory models, emit
	initial loads and final barriers as appropriate.


[-- Attachment #2: 0001-AArch64-Strengthen-barriers-for-sync-fetch-op-builti.patch --]
[-- Type: text/x-patch;  name=0001-AArch64-Strengthen-barriers-for-sync-fetch-op-builti.patch, Size: 2660 bytes --]

From 02effa4c1e3e219f727c88091ebd9938d90c3f8a Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Fri, 15 May 2015 09:26:28 +0100
Subject: [PATCH 1/3] [AArch64] Strengthen barriers for sync-fetch-op builtin.

Change-Id: I3342a572d672163ffc703e4e51603744680334fc
---
 gcc/config/aarch64/aarch64.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 083b9b4..b083e12 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9409,6 +9409,23 @@ aarch64_expand_compare_and_swap (rtx operands[])
   emit_insn (gen_rtx_SET (bval, x));
 }
 
+/* Emit a barrier, that is appropriate for memory model MODEL, at the end of a
+   sequence implementing an atomic operation.  */
+
+static void
+aarch64_emit_post_barrier (enum memmodel model)
+{
+  const enum memmodel base_model = memmodel_base (model);
+
+  if (is_mm_sync (model)
+      && (base_model == MEMMODEL_ACQUIRE
+	  || base_model == MEMMODEL_ACQ_REL
+	  || base_model == MEMMODEL_SEQ_CST))
+    {
+      emit_insn (gen_mem_thread_fence (GEN_INT (MEMMODEL_SEQ_CST)));
+    }
+}
+
 /* Split a compare and swap pattern.  */
 
 void
@@ -9471,6 +9488,8 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
 {
   machine_mode mode = GET_MODE (mem);
   machine_mode wmode = (mode == DImode ? DImode : SImode);
+  const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
+  const bool is_sync = is_mm_sync (model);
   rtx_code_label *label;
   rtx x;
 
@@ -9485,7 +9504,13 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
     old_out = new_out;
   value = simplify_gen_subreg (wmode, value, mode, 0);
 
-  aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);
+  /* The initial load can be relaxed for a __sync operation since a final
+     barrier will be emitted to stop code hoisting.  */
+  if (is_sync)
+    aarch64_emit_load_exclusive (mode, old_out, mem,
+				 GEN_INT (MEMMODEL_RELAXED));
+  else
+    aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);
 
   switch (code)
     {
@@ -9521,6 +9546,10 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
   x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
 			    gen_rtx_LABEL_REF (Pmode, label), pc_rtx);
   aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+
+  /* Emit any final barrier needed for a __sync operation.  */
+  if (is_sync)
+    aarch64_emit_post_barrier (model);
 }
 
 static void
-- 
1.9.1



* Re: [PATCH 2/3][AArch64][PR target/65697] Strengthen barriers for sync-compare-swap builtins.
  2015-05-22  8:50   ` [PATCH 2/3][AArch64][PR target/65697] " Matthew Wahab
@ 2015-06-01 12:08     ` Matthew Wahab
  2015-06-01 12:14       ` James Greenhalgh
  0 siblings, 1 reply; 15+ messages in thread
From: Matthew Wahab @ 2015-06-01 12:08 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1109 bytes --]

On 22/05/15 09:28, Matthew Wahab wrote:
> [Added PR number and updated patches]
>
> This patch changes the code generated for __sync_type_compare_and_swap to
>
>     ldxr reg; cmp; bne label; stlxr; cbnz; label: dmb ish; mov .., reg
>
> This removes the acquire-barrier from the load and ends the operation with a
> fence to prevent memory references appearing after the __sync operation from
> being moved ahead of the store-release.
>
> This also strengthens the acquire barrier generated for __sync_lock_test_and_set
> (which, like compare-and-swap, is implemented as a form of atomic exchange):
>
>     ldaxr; stxr; cbnz
> becomes
>     ldxr; stxr; cbnz; dmb ish
>

Updated patch:
- Used 'barrier' rather than 'fence' in comments.
- Simplified the code for the initial load.

Tested with check-gcc for aarch64-none-linux-gnu.

Ok for trunk?
Matthew

2015-06-01  Matthew Wahab  <matthew.wahab@arm.com>

	PR target/65697
	* config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Check
	for __sync memory models, emit initial loads and final barriers as
	appropriate.


[-- Attachment #2: 0002-AArch64-Strengthen-barriers-for-sync-compare-swap-bu.patch --]
[-- Type: text/x-patch;  name=0002-AArch64-Strengthen-barriers-for-sync-compare-swap-bu.patch, Size: 2371 bytes --]

From e1f68896db3b367d43a7ae863339dbe100244360 Mon Sep 17 00:00:00 2001
From: Matthew Wahab <matthew.wahab@arm.com>
Date: Fri, 15 May 2015 09:31:06 +0100
Subject: [PATCH 2/3] [AArch64] Strengthen barriers for sync-compare-swap
 builtins.

Change-Id: I335771f2f42ea951d227f20f6cb9daa07330614d
---
 gcc/config/aarch64/aarch64.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b083e12..2db8f4f 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -9436,14 +9436,18 @@ aarch64_split_compare_and_swap (rtx operands[])
   bool is_weak;
   rtx_code_label *label1, *label2;
   rtx x, cond;
+  enum memmodel model;
+  rtx model_rtx;
 
   rval = operands[0];
   mem = operands[1];
   oldval = operands[2];
   newval = operands[3];
   is_weak = (operands[4] != const0_rtx);
+  model_rtx = operands[5];
   scratch = operands[7];
   mode = GET_MODE (mem);
+  model = memmodel_from_int (INTVAL (model_rtx));
 
   label1 = NULL;
   if (!is_weak)
@@ -9453,7 +9457,13 @@ aarch64_split_compare_and_swap (rtx operands[])
     }
   label2 = gen_label_rtx ();
 
-  aarch64_emit_load_exclusive (mode, rval, mem, operands[5]);
+  /* The initial load can be relaxed for a __sync operation since a final
+     barrier will be emitted to stop code hoisting.  */
+  if (is_mm_sync (model))
+    aarch64_emit_load_exclusive (mode, rval, mem,
+				 GEN_INT (MEMMODEL_RELAXED));
+  else
+    aarch64_emit_load_exclusive (mode, rval, mem, model_rtx);
 
   cond = aarch64_gen_compare_reg (NE, rval, oldval);
   x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
@@ -9461,7 +9471,7 @@ aarch64_split_compare_and_swap (rtx operands[])
 			    gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
   aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
 
-  aarch64_emit_store_exclusive (mode, scratch, mem, newval, operands[5]);
+  aarch64_emit_store_exclusive (mode, scratch, mem, newval, model_rtx);
 
   if (!is_weak)
     {
@@ -9478,6 +9488,10 @@ aarch64_split_compare_and_swap (rtx operands[])
     }
 
   emit_label (label2);
+
+  /* Emit any final barrier needed for a __sync operation.  */
+  if (is_mm_sync (model))
+    aarch64_emit_post_barrier (model);
 }
 
 /* Split an atomic operation.  */
-- 
1.9.1



* Re: [PATCH 2/3][AArch64][PR target/65697] Strengthen barriers for sync-compare-swap builtins.
  2015-06-01 12:08     ` Matthew Wahab
@ 2015-06-01 12:14       ` James Greenhalgh
  0 siblings, 0 replies; 15+ messages in thread
From: James Greenhalgh @ 2015-06-01 12:14 UTC (permalink / raw)
  To: Matthew Wahab; +Cc: gcc-patches

On Mon, Jun 01, 2015 at 01:08:15PM +0100, Matthew Wahab wrote:
> On 22/05/15 09:28, Matthew Wahab wrote:
> > [Added PR number and updated patches]
> >
> > This patch changes the code generated for __sync_type_compare_and_swap to
> >
> >     ldxr reg; cmp; bne label; stlxr; cbnz; label: dmb ish; mov .., reg
> >
> > This removes the acquire-barrier from the load and ends the operation with a
> > fence to prevent memory references appearing after the __sync operation from
> > being moved ahead of the store-release.
> >
> > This also strengthens the acquire barrier generated for __sync_lock_test_and_set
> > (which, like compare-and-swap, is implemented as a form of atomic exchange):
> >
> >     ldaxr; stxr; cbnz
> > becomes
> >     ldxr; stxr; cbnz; dmb ish
> >
> 
> Updated patch:
> - Used 'barrier' rather than 'fence' in comments.
> - Simplified the code for the initial load.
> 
> Tested with check-gcc for aarch64-none-linux-gnu.
> 
> Ok for trunk?
> Matthew
> 
> 2015-06-01  Matthew Wahab  <matthew.wahab@arm.com>
> 
> 	PR target/65697
> 	* config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Check
> 	for __sync memory models, emit initial loads and final barriers as
> 	appropriate.
> 

OK,

Thanks,
James


> From e1f68896db3b367d43a7ae863339dbe100244360 Mon Sep 17 00:00:00 2001
> From: Matthew Wahab <matthew.wahab@arm.com>
> Date: Fri, 15 May 2015 09:31:06 +0100
> Subject: [PATCH 2/3] [AArch64] Strengthen barriers for sync-compare-swap
>  builtins.
> 
> Change-Id: I335771f2f42ea951d227f20f6cb9daa07330614d
> ---
>  gcc/config/aarch64/aarch64.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index b083e12..2db8f4f 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9436,14 +9436,18 @@ aarch64_split_compare_and_swap (rtx operands[])
>    bool is_weak;
>    rtx_code_label *label1, *label2;
>    rtx x, cond;
> +  enum memmodel model;
> +  rtx model_rtx;
>  
>    rval = operands[0];
>    mem = operands[1];
>    oldval = operands[2];
>    newval = operands[3];
>    is_weak = (operands[4] != const0_rtx);
> +  model_rtx = operands[5];
>    scratch = operands[7];
>    mode = GET_MODE (mem);
> +  model = memmodel_from_int (INTVAL (model_rtx));
>  
>    label1 = NULL;
>    if (!is_weak)
> @@ -9453,7 +9457,13 @@ aarch64_split_compare_and_swap (rtx operands[])
>      }
>    label2 = gen_label_rtx ();
>  
> -  aarch64_emit_load_exclusive (mode, rval, mem, operands[5]);
> +  /* The initial load can be relaxed for a __sync operation since a final
> +     barrier will be emitted to stop code hoisting.  */
> +  if (is_mm_sync (model))
> +    aarch64_emit_load_exclusive (mode, rval, mem,
> +				 GEN_INT (MEMMODEL_RELAXED));
> +  else
> +    aarch64_emit_load_exclusive (mode, rval, mem, model_rtx);
>  
>    cond = aarch64_gen_compare_reg (NE, rval, oldval);
>    x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
> @@ -9461,7 +9471,7 @@ aarch64_split_compare_and_swap (rtx operands[])
>  			    gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
>    aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
>  
> -  aarch64_emit_store_exclusive (mode, scratch, mem, newval, operands[5]);
> +  aarch64_emit_store_exclusive (mode, scratch, mem, newval, model_rtx);
>  
>    if (!is_weak)
>      {
> @@ -9478,6 +9488,10 @@ aarch64_split_compare_and_swap (rtx operands[])
>      }
>  
>    emit_label (label2);
> +
> +  /* Emit any final barrier needed for a __sync operation.  */
> +  if (is_mm_sync (model))
> +    aarch64_emit_post_barrier (model);
>  }
>  
>  /* Split an atomic operation.  */
> -- 
> 1.9.1
> 


* Re: [PATCH 1/3][AArch64][PR target/65697] Strengthen barriers for sync-fetch-op builtins.
  2015-06-01 12:06   ` [PATCH 1/3][AArch64][PR target/65697] " Matthew Wahab
@ 2015-06-01 12:15     ` James Greenhalgh
  2015-06-01 14:00       ` Marcus Shawcroft
  0 siblings, 1 reply; 15+ messages in thread
From: James Greenhalgh @ 2015-06-01 12:15 UTC (permalink / raw)
  To: Matthew Wahab; +Cc: gcc-patches

On Mon, Jun 01, 2015 at 01:06:08PM +0100, Matthew Wahab wrote:
> On 26/05/15 10:32, James Greenhalgh wrote:
> > Please tie this to the PR which was open in the ChangLog entry.
> >
> >> 	(aarch64_split_atomic_op): Check for __sync memory models, emit
> >> 	appropriate initial and final barriers.
> >
> > I don't see any new initial barriers. I think you are referring to
> > relaxing the ldaxr to an ldxr for __sync primitives, in which case, say
> > that.
> >
> >> +/* Emit a post-operation barrier.  */
> >
> > This comment could do with some more detail. What is a post-operation
> > barrier? When do we need one? What is the MODEL parameter?
> >
> >> +  /* A __sync operation will emit a final fence to stop code hoisting, so the
> >
> > Can we pick a consistent terminology between fence/barrier? They are
> > currently used interchangeably, but I think we generally prefer barrier
> > in the AArch64 port.
> >
> >> -  aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);
> >> +  aarch64_emit_load_exclusive (mode, old_out, mem, load_model_rtx);
> >
> > To my mind, these two hunks would be marginally easier to follow if
> > we combined them, as so:
> 
> Attached updated patch:
> - Expanded the comment for aarch64_emit_post_barrier.
> - Used 'barrier' rather than 'fence' in comments.
> - Simplified the code for the initial load.
> 
> Tested with check-gcc for aarch64-none-linux-gnu.
> 
> Ok?
> Matthew
> 
> 2015-06-01  Matthew Wahab  <matthew.wahab@arm.com>
> 
> 	PR target/65697
> 	* config/aarch64/aarch64.c (aarch64_emit_post_barrier): New.
> 	(aarch64_split_atomic_op): Check for __sync memory models, emit
> 	initial loads and final barriers as appropriate.
> 

OK,

Thanks,
James


> From 02effa4c1e3e219f727c88091ebd9938d90c3f8a Mon Sep 17 00:00:00 2001
> From: Matthew Wahab <matthew.wahab@arm.com>
> Date: Fri, 15 May 2015 09:26:28 +0100
> Subject: [PATCH 1/3] [AArch64] Strengthen barriers for sync-fetch-op builtin.
> 
> Change-Id: I3342a572d672163ffc703e4e51603744680334fc
> ---
>  gcc/config/aarch64/aarch64.c | 31 ++++++++++++++++++++++++++++++-
>  1 file changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 083b9b4..b083e12 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -9409,6 +9409,23 @@ aarch64_expand_compare_and_swap (rtx operands[])
>    emit_insn (gen_rtx_SET (bval, x));
>  }
>  
> +/* Emit a barrier, that is appropriate for memory model MODEL, at the end of a
> +   sequence implementing an atomic operation.  */
> +
> +static void
> +aarch64_emit_post_barrier (enum memmodel model)
> +{
> +  const enum memmodel base_model = memmodel_base (model);
> +
> +  if (is_mm_sync (model)
> +      && (base_model == MEMMODEL_ACQUIRE
> +	  || base_model == MEMMODEL_ACQ_REL
> +	  || base_model == MEMMODEL_SEQ_CST))
> +    {
> +      emit_insn (gen_mem_thread_fence (GEN_INT (MEMMODEL_SEQ_CST)));
> +    }
> +}
> +
>  /* Split a compare and swap pattern.  */
>  
>  void
> @@ -9471,6 +9488,8 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
>  {
>    machine_mode mode = GET_MODE (mem);
>    machine_mode wmode = (mode == DImode ? DImode : SImode);
> +  const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
> +  const bool is_sync = is_mm_sync (model);
>    rtx_code_label *label;
>    rtx x;
>  
> @@ -9485,7 +9504,13 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
>      old_out = new_out;
>    value = simplify_gen_subreg (wmode, value, mode, 0);
>  
> -  aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);
> +  /* The initial load can be relaxed for a __sync operation since a final
> +     barrier will be emitted to stop code hoisting.  */
> +  if (is_sync)
> +    aarch64_emit_load_exclusive (mode, old_out, mem,
> +				 GEN_INT (MEMMODEL_RELAXED));
> +  else
> +    aarch64_emit_load_exclusive (mode, old_out, mem, model_rtx);
>  
>    switch (code)
>      {
> @@ -9521,6 +9546,10 @@ aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
>    x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
>  			    gen_rtx_LABEL_REF (Pmode, label), pc_rtx);
>    aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
> +
> +  /* Emit any final barrier needed for a __sync operation.  */
> +  if (is_sync)
> +    aarch64_emit_post_barrier (model);
>  }
>  
>  static void
> -- 
> 1.9.1
> 


* Re: [PATCH 1/3][AArch64][PR target/65697] Strengthen barriers for sync-fetch-op builtins.
  2015-06-01 12:15     ` James Greenhalgh
@ 2015-06-01 14:00       ` Marcus Shawcroft
  0 siblings, 0 replies; 15+ messages in thread
From: Marcus Shawcroft @ 2015-06-01 14:00 UTC (permalink / raw)
  To: James Greenhalgh; +Cc: Matthew Wahab, gcc-patches

>> Attached updated patch:
>> - Expanded the comment for aarch64_emit_post_barrier.
>> - Used 'barrier' rather than 'fence' in comments.
>> - Simplified the code for the initial load.
>>
>> Tested with check-gcc for aarch64-none-linux-gnu.
>>
>> Ok?
>> Matthew
>>
>> 2015-06-01  Matthew Wahab  <matthew.wahab@arm.com>
>>
>>       PR target/65697
>>       * config/aarch64/aarch64.c (aarch64_emit_post_barrier): New.
>>       (aarch64_split_atomic_op): Check for __sync memory models, emit
>>       initial loads and final barriers as appropriate.
>>
>
> OK,
>
> Thanks,
> James

We are emitting __sync primitives with insufficient barriers in gcc5
(and gcc4.9?).  I would expect that anyone hitting this issue in
production will have a tough job debugging it.  We need to look at
backporting this patch set and the prerequisite patches.

Cheers
/Marcus


end of thread

Thread overview: 15+ messages
2015-05-21 15:59 [PATCH 1/3][AArch64] Strengthen barriers for sync-fetch-op builtins Matthew Wahab
2015-05-21 16:00 ` [AArch64][PATCH 2/3] Strengthen barriers for sync-compare-swap builtins Matthew Wahab
2015-05-22  8:50   ` [PATCH 2/3][AArch64][PR target/65697] " Matthew Wahab
2015-06-01 12:08     ` Matthew Wahab
2015-06-01 12:14       ` James Greenhalgh
2015-05-21 16:03 ` [PATCH 3/3][Aarch64] Add tests for __sync_builtins Matthew Wahab
2015-05-22  8:51   ` [PATCH 3/3][Aarch64][PR target/65697] " Matthew Wahab
2015-05-26 10:11     ` James Greenhalgh
2015-05-22  8:30 ` [PATCH 1/3][AArch64][PR target/65697] Strengthen barriers for sync-fetch-op builtins Matthew Wahab
2015-05-22 11:47 ` [PATCH 1/3][AArch64] " Ramana Radhakrishnan
2015-05-22 12:15   ` Matthew Wahab
2015-05-26  9:52 ` James Greenhalgh
2015-06-01 12:06   ` [PATCH 1/3][AArch64][PR target/65697] " Matthew Wahab
2015-06-01 12:15     ` James Greenhalgh
2015-06-01 14:00       ` Marcus Shawcroft
