Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x

From: Pop, Sebastian @ 2020-03-09 21:47 UTC
  To: Kyrill Tkachov, gcc-patches; +Cc: richard.henderson, Wilco Dijkstra

[-- Attachment #1: Type: text/plain, Size: 5905 bytes --]

Hi,

Please find attached the patches that add -moutline-atomics to the gcc-9 branch.
Tested on Graviton2 (aarch64-linux): bootstrap succeeds and `make check`
passes with no new failures.
I also ran `make check` on glibc built with gcc-9, both with and without
"-moutline-atomics", using CFLAGS=" -O2 -g -fno-stack-protector -U_FORTIFY_SOURCE".
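
For context, here is a minimal sketch of the kind of source the flag affects.
It is not part of the attached patches, and the function name counter_bump is
made up for illustration. When built for AArch64 with -moutline-atomics and
without +lse in -march, the relaxed 4-byte fetch-and-add below is expected to
become a call into libgcc instead of an inline load/store-exclusive loop.
Going by the NAME macro in lse.S, the helper would be __aarch64_ldadd4_relax,
but that is an inference worth confirming with objdump.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical example function: bump a shared counter and return the
   value it held before the increment.  The memory order is relaxed, so
   on AArch64 the outlined helper is assumed to be the "_relax" variant.  */
uint32_t counter_bump(_Atomic uint32_t *counter)
{
    return atomic_fetch_add_explicit(counter, 1, memory_order_relaxed);
}
```

The code is plain C11 and builds on any target; only the generated
instructions differ with and without the flag.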

Ok to commit to gcc-9 branch?

Does this mechanical `git am *.patch` require a copyright assignment?
I am still working with my employer on getting the FSF assignment signed.

Thanks,
Sebastian

PS: For the gcc-8 backports, 5 cleanup and improvement patches are needed
for the -moutline-atomics patches to apply cleanly.
Should these patches be backported at the same time as the flag patches,
or should I update the patches to apply to the older code base?
Here is the list of the extra patches:

From 77f33f44baf24c22848197aa80962c003dd7b3e2 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 31 Oct 2018 09:29:29 +0000
Subject: [PATCH] aarch64: Simplify LSE cas generation

The cas insn is a single insn, and if expanded properly need not
be split after reload.  Use the proper inputs for the insn.

        * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
        Force oldval into the rval register for TARGET_LSE; emit the compare
        during initial expansion so that it may be deleted if unused.
        (aarch64_gen_atomic_cas): Remove.
        * config/aarch64/atomics.md (@aarch64_compare_and_swap<SHORT>_lse):
        Change =&r to +r for operand 0; use match_dup for operand 2;
        remove is_weak and mod_f operands as unused.  Drop the split
        and merge with...
        (@aarch64_atomic_cas<SHORT>): ... this pattern's output; remove.
        (@aarch64_compare_and_swap<GPI>_lse): Similarly.
        (@aarch64_atomic_cas<GPI>): Similarly.

From-SVN: r265656

From d400fda3a8c3330f77eb9d51874f5482d3819a9f Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 31 Oct 2018 09:42:39 +0000
Subject: [PATCH] aarch64: Improve cas generation

Do not zero-extend the input to the cas for subword operations;
instead, use the appropriate zero-extending compare insns.
Correct the predicates and constraints for immediate expected operand.
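
A rough C model of the subword comparison described above, offered as a
sketch rather than as what the compiler emits. It assumes the loaded value is
already zero-extended (as a byte-sized exclusive load would leave it) and that
the expected value sits in a 32-bit register whose upper bits may hold
anything. The function name is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compare a zero-extended byte against the low byte of EXPECTED.
   The cast plays the role of a zero-extending compare: the upper 24 bits
   of EXPECTED are ignored instead of being cleared by a separate insn.  */
bool subword_equal_u8(uint32_t loaded, uint32_t expected)
{
    return loaded == (uint32_t)(uint8_t)expected;
}
```

The point of the design is that the extension is folded into the compare, so
no extra instruction is needed to clean up the input beforehand.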

        * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New.
        (aarch64_split_compare_and_swap): Use it.
        (aarch64_expand_compare_and_swap): Likewise.  Remove convert_modes;
        test oldval against the proper predicate.
        * config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI>):
        Use nonmemory_operand for expected.
        (cas_short_expected_pred): New.
        (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not "rI" to match.
        (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for expected.
        * config/aarch64/predicates.md (aarch64_plushi_immediate): New.
        (aarch64_plushi_operand): New.

From-SVN: r265657

From 8f5603d363a4e0453d2c38c7103aeb0bdca85c4e Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 31 Oct 2018 09:47:21 +0000
Subject: [PATCH] aarch64: Improve swp generation

Allow zero as an input; fix constraints; avoid unnecessary split.

        * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove.
        (aarch64_gen_atomic_ldop): Don't call it.
        * config/aarch64/atomics.md (atomic_exchange<ALLI>):
        Use aarch64_reg_or_zero.
        (aarch64_atomic_exchange<ALLI>): Likewise.
        (aarch64_atomic_exchange<ALLI>_lse): Remove split; remove & from
        operand 0; use aarch64_reg_or_zero for input; merge ...
        (@aarch64_atomic_swp<ALLI>): ... this and remove.

From-SVN: r265659

From 7803ec5ee2a547043fb6708a08ddb1361ba91202 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 31 Oct 2018 09:58:48 +0000
Subject: [PATCH] aarch64: Improve atomic-op lse generation

Fix constraints; avoid unnecessary split.  Drop the use of the atomic_op
iterator in favor of the ATOMIC_LDOP iterator; this is simpler and more
logical for ldclr aka bic.
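
To illustrate the "ldclr aka bic" remark: ldclr atomically clears the bits
that are set in its operand, which at the C level is a fetch-and with the
complemented mask. A minimal sketch follows; the function name is
hypothetical and the memory order is chosen arbitrarily.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically clear the bits of MASK in *WORD and return the old value.
   The new value is old & ~mask, i.e. a bit-clear (bic) of MASK.  */
uint32_t fetch_clear_bits(_Atomic uint32_t *word, uint32_t mask)
{
    return atomic_fetch_and_explicit(word, ~mask, memory_order_acq_rel);
}
```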

        * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
        (aarch64_atomic_ldop_supported_p): Remove.
        (aarch64_gen_atomic_ldop): Remove.
        * config/aarch64/atomics.md (atomic_<atomic_optab><ALLI>):
        Fully expand LSE operations here.
        (atomic_fetch_<atomic_optab><ALLI>): Likewise.
        (atomic_<atomic_optab>_fetch<ALLI>): Likewise.
        (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): Drop atomic_op iterator
        and use ATOMIC_LDOP instead; use register_operand for the input;
        drop the split and emit insns directly.
        (aarch64_atomic_fetch_<ATOMIC_LDOP><ALLI>_lse): Likewise.
        (aarch64_atomic_<atomic_op>_fetch<ALLI>_lse): Remove.
        (@aarch64_atomic_load<ATOMIC_LDOP><ALLI>): Remove.

From-SVN: r265660

From 53de1ea800db54b47290d578c43892799b66c8dc Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 31 Oct 2018 23:11:22 +0000
Subject: [PATCH] aarch64: Remove early clobber from ATOMIC_LDOP scratch

        * config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
        The scratch register need not be early-clobber.  Document the reason
        why we cannot use ST<OP>.

From-SVN: r265703





On 2/27/20, 12:06 PM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:

    Hi Sebastian,
    
    On 2/27/20 4:53 PM, Pop, Sebastian wrote:
    >
    > Hi,
    >
    > is somebody already working on backporting -moutline-atomics to gcc 
    > 8.x and 9.x branches?
    >
    I'm not aware of such work going on.
    
    Thanks,
    
    Kyrill
    
    > Thanks,
    >
    > Sebastian
    >
    


[-- Attachment #2: 0001-aarch64-Extend-R-for-integer-registers.patch --]
[-- Type: application/octet-stream, Size: 1766 bytes --]

From 6f2dba3cd5a72fda9682105ec3f3e8af5c0c1533 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:24 +0000
Subject: [PATCH 1/6] aarch64: Extend %R for integer registers

	* config/aarch64/aarch64.c (aarch64_print_operand): Allow integer
	registers with %R.

From-SVN: r275964
---
 gcc/config/aarch64/aarch64.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index b452a53af99..d7f6e1ebb63 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7541,7 +7541,7 @@ sizetochar (int size)
      'S/T/U/V':		Print a FP/SIMD register name for a register list.
 			The register printed is the FP/SIMD register name
 			of X + 0/1/2/3 for S/T/U/V.
-     'R':		Print a scalar FP/SIMD register name + 1.
+     'R':		Print a scalar Integer/FP/SIMD register name + 1.
      'X':		Print bottom 16 bits of integer constant in hex.
      'w/x':		Print a general register name or the zero register
 			(32-bit or 64-bit).
@@ -7733,12 +7733,13 @@ aarch64_print_operand (FILE *f, rtx x, int code)
       break;
 
     case 'R':
-      if (!REG_P (x) || !FP_REGNUM_P (REGNO (x)))
-	{
-	  output_operand_lossage ("incompatible floating point / vector register operand for '%%%c'", code);
-	  return;
-	}
-      asm_fprintf (f, "q%d", REGNO (x) - V0_REGNUM + 1);
+      if (REG_P (x) && FP_REGNUM_P (REGNO (x)))
+	asm_fprintf (f, "q%d", REGNO (x) - V0_REGNUM + 1);
+      else if (REG_P (x) && GP_REGNUM_P (REGNO (x)))
+	asm_fprintf (f, "x%d", REGNO (x) - R0_REGNUM + 1);
+      else
+	output_operand_lossage ("incompatible register operand for '%%%c'",
+				code);
       break;
 
     case 'X':
-- 
2.20.1


[-- Attachment #3: 0002-aarch64-Implement-TImode-compare-and-swap.patch --]
[-- Type: application/octet-stream, Size: 9163 bytes --]

From 4f080a93985d1e92741fa3865a91d519d3c85170 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:29 +0000
Subject: [PATCH 2/6] aarch64: Implement TImode compare-and-swap

This pattern will only be used with the __sync functions, because
we do not yet have a bare TImode atomic load.

	* config/aarch64/aarch64.c (aarch64_gen_compare_reg): Add support
	for NE comparison of TImode values.
	(aarch64_emit_load_exclusive): Add support for TImode.
	(aarch64_emit_store_exclusive): Likewise.
	(aarch64_split_compare_and_swap): Disable strong_zero_p for TImode.
	* config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI_TI>):
	Change iterator from ALLI to ALLI_TI.
	(@atomic_compare_and_swap<JUST_TI>): New.
	(@atomic_compare_and_swap<JUST_TI>_lse): New.
	(aarch64_load_exclusive_pair): New.
	(aarch64_store_exclusive_pair): New.
	* config/aarch64/iterators.md (JUST_TI): New.
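
The NE comparison of TImode values mentioned above works on the two 64-bit
halves: compare the low halves, then conditionally compare the high halves
only if the low halves were equal. A C sketch of that logic is below. The
struct and function names are hypothetical, and the comments only indicate
which instruction each step corresponds to.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical 128-bit value split into two 64-bit halves.  */
struct u128 { uint64_t lo, hi; };

/* Return true if X and Y differ.  */
bool u128_ne(struct u128 x, struct u128 y)
{
    bool eq = (x.lo == y.lo);      /* cmp: compare the low halves */
    if (eq)
        eq = (x.hi == y.hi);       /* ccmp: only evaluated when still equal */
    return !eq;                    /* NE is the final condition tested */
}
```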

From-SVN: r275965
---
 gcc/config/aarch64/aarch64.c    | 47 ++++++++++++++---
 gcc/config/aarch64/atomics.md   | 91 +++++++++++++++++++++++++++++++--
 gcc/config/aarch64/iterators.md |  3 ++
 3 files changed, 130 insertions(+), 11 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d7f6e1ebb63..01efdcd82e1 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1879,10 +1879,33 @@ emit_set_insn (rtx x, rtx y)
 rtx
 aarch64_gen_compare_reg (RTX_CODE code, rtx x, rtx y)
 {
-  machine_mode mode = SELECT_CC_MODE (code, x, y);
-  rtx cc_reg = gen_rtx_REG (mode, CC_REGNUM);
+  machine_mode cmp_mode = GET_MODE (x);
+  machine_mode cc_mode;
+  rtx cc_reg;
 
-  emit_set_insn (cc_reg, gen_rtx_COMPARE (mode, x, y));
+  if (cmp_mode == TImode)
+    {
+      gcc_assert (code == NE);
+
+      cc_mode = CCmode;
+      cc_reg = gen_rtx_REG (cc_mode, CC_REGNUM);
+
+      rtx x_lo = operand_subword (x, 0, 0, TImode);
+      rtx y_lo = operand_subword (y, 0, 0, TImode);
+      emit_set_insn (cc_reg, gen_rtx_COMPARE (cc_mode, x_lo, y_lo));
+
+      rtx x_hi = operand_subword (x, 1, 0, TImode);
+      rtx y_hi = operand_subword (y, 1, 0, TImode);
+      emit_insn (gen_ccmpdi (cc_reg, cc_reg, x_hi, y_hi,
+			     gen_rtx_EQ (cc_mode, cc_reg, const0_rtx),
+			     GEN_INT (AARCH64_EQ)));
+    }
+  else
+    {
+      cc_mode = SELECT_CC_MODE (code, x, y);
+      cc_reg = gen_rtx_REG (cc_mode, CC_REGNUM);
+      emit_set_insn (cc_reg, gen_rtx_COMPARE (cc_mode, x, y));
+    }
   return cc_reg;
 }
 
@@ -15427,16 +15450,26 @@ static void
 aarch64_emit_load_exclusive (machine_mode mode, rtx rval,
 			     rtx mem, rtx model_rtx)
 {
-  emit_insn (gen_aarch64_load_exclusive (mode, rval, mem, model_rtx));
+  if (mode == TImode)
+    emit_insn (gen_aarch64_load_exclusive_pair (gen_lowpart (DImode, rval),
+						gen_highpart (DImode, rval),
+						mem, model_rtx));
+  else
+    emit_insn (gen_aarch64_load_exclusive (mode, rval, mem, model_rtx));
 }
 
 /* Emit store exclusive.  */
 
 static void
 aarch64_emit_store_exclusive (machine_mode mode, rtx bval,
-			      rtx rval, rtx mem, rtx model_rtx)
+			      rtx mem, rtx rval, rtx model_rtx)
 {
-  emit_insn (gen_aarch64_store_exclusive (mode, bval, rval, mem, model_rtx));
+  if (mode == TImode)
+    emit_insn (gen_aarch64_store_exclusive_pair
+	       (bval, mem, operand_subword (rval, 0, 0, TImode),
+		operand_subword (rval, 1, 0, TImode), model_rtx));
+  else
+    emit_insn (gen_aarch64_store_exclusive (mode, bval, mem, rval, model_rtx));
 }
 
 /* Mark the previous jump instruction as unlikely.  */
@@ -15566,7 +15599,7 @@ aarch64_split_compare_and_swap (rtx operands[])
 	CBNZ	scratch, .label1
     .label2:
 	CMP	rval, 0.  */
-  bool strong_zero_p = !is_weak && oldval == const0_rtx;
+  bool strong_zero_p = !is_weak && oldval == const0_rtx && mode != TImode;
 
   label1 = NULL;
   if (!is_weak)
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 0f357662ac3..09d2a63c620 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -22,10 +22,10 @@
 
 (define_expand "@atomic_compare_and_swap<mode>"
   [(match_operand:SI 0 "register_operand" "")			;; bool out
-   (match_operand:ALLI 1 "register_operand" "")			;; val out
-   (match_operand:ALLI 2 "aarch64_sync_memory_operand" "")	;; memory
-   (match_operand:ALLI 3 "nonmemory_operand" "")		;; expected
-   (match_operand:ALLI 4 "aarch64_reg_or_zero" "")		;; desired
+   (match_operand:ALLI_TI 1 "register_operand" "")		;; val out
+   (match_operand:ALLI_TI 2 "aarch64_sync_memory_operand" "")	;; memory
+   (match_operand:ALLI_TI 3 "nonmemory_operand" "")		;; expected
+   (match_operand:ALLI_TI 4 "aarch64_reg_or_zero" "")		;; desired
    (match_operand:SI 5 "const_int_operand")			;; is_weak
    (match_operand:SI 6 "const_int_operand")			;; mod_s
    (match_operand:SI 7 "const_int_operand")]			;; mod_f
@@ -88,6 +88,30 @@
   }
 )
 
+(define_insn_and_split "@aarch64_compare_and_swap<mode>"
+  [(set (reg:CC CC_REGNUM)					;; bool out
+    (unspec_volatile:CC [(const_int 0)] UNSPECV_ATOMIC_CMPSW))
+   (set (match_operand:JUST_TI 0 "register_operand" "=&r")	;; val out
+    (match_operand:JUST_TI 1 "aarch64_sync_memory_operand" "+Q")) ;; memory
+   (set (match_dup 1)
+    (unspec_volatile:JUST_TI
+      [(match_operand:JUST_TI 2 "aarch64_reg_or_zero" "rZ")	;; expect
+       (match_operand:JUST_TI 3 "aarch64_reg_or_zero" "rZ")	;; desired
+       (match_operand:SI 4 "const_int_operand")			;; is_weak
+       (match_operand:SI 5 "const_int_operand")			;; mod_s
+       (match_operand:SI 6 "const_int_operand")]		;; mod_f
+      UNSPECV_ATOMIC_CMPSW))
+   (clobber (match_scratch:SI 7 "=&r"))]
+  ""
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+  {
+    aarch64_split_compare_and_swap (operands);
+    DONE;
+  }
+)
+
 (define_insn "@aarch64_compare_and_swap<mode>_lse"
   [(set (match_operand:SI 0 "register_operand" "+r")		;; val out
     (zero_extend:SI
@@ -133,6 +157,28 @@
     return "casal<atomic_sfx>\t%<w>0, %<w>2, %1";
 })
 
+(define_insn "@aarch64_compare_and_swap<mode>_lse"
+  [(set (match_operand:JUST_TI 0 "register_operand" "+r")	;; val out
+    (match_operand:JUST_TI 1 "aarch64_sync_memory_operand" "+Q")) ;; memory
+   (set (match_dup 1)
+    (unspec_volatile:JUST_TI
+      [(match_dup 0)						;; expect
+       (match_operand:JUST_TI 2 "register_operand" "r")		;; desired
+       (match_operand:SI 3 "const_int_operand")]		;; mod_s
+      UNSPECV_ATOMIC_CMPSW))]
+  "TARGET_LSE"
+{
+  enum memmodel model = memmodel_from_int (INTVAL (operands[3]));
+  if (is_mm_relaxed (model))
+    return "casp\t%0, %R0, %2, %R2, %1";
+  else if (is_mm_acquire (model) || is_mm_consume (model))
+    return "caspa\t%0, %R0, %2, %R2, %1";
+  else if (is_mm_release (model))
+    return "caspl\t%0, %R0, %2, %R2, %1";
+  else
+    return "caspal\t%0, %R0, %2, %R2, %1";
+})
+
 (define_expand "atomic_exchange<mode>"
  [(match_operand:ALLI 0 "register_operand" "")
   (match_operand:ALLI 1 "aarch64_sync_memory_operand" "")
@@ -581,6 +627,24 @@
   }
 )
 
+(define_insn "aarch64_load_exclusive_pair"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec_volatile:DI
+	  [(match_operand:TI 2 "aarch64_sync_memory_operand" "Q")
+	   (match_operand:SI 3 "const_int_operand")]
+	  UNSPECV_LX))
+   (set (match_operand:DI 1 "register_operand" "=r")
+	(unspec_volatile:DI [(match_dup 2) (match_dup 3)] UNSPECV_LX))]
+  ""
+  {
+    enum memmodel model = memmodel_from_int (INTVAL (operands[3]));
+    if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_release (model))
+      return "ldxp\t%0, %1, %2";
+    else
+      return "ldaxp\t%0, %1, %2";
+  }
+)
+
 (define_insn "@aarch64_store_exclusive<mode>"
   [(set (match_operand:SI 0 "register_operand" "=&r")
     (unspec_volatile:SI [(const_int 0)] UNSPECV_SX))
@@ -599,6 +663,25 @@
   }
 )
 
+(define_insn "aarch64_store_exclusive_pair"
+  [(set (match_operand:SI 0 "register_operand" "=&r")
+	(unspec_volatile:SI [(const_int 0)] UNSPECV_SX))
+   (set (match_operand:TI 1 "aarch64_sync_memory_operand" "=Q")
+	(unspec_volatile:TI
+	  [(match_operand:DI 2 "aarch64_reg_or_zero" "rZ")
+	   (match_operand:DI 3 "aarch64_reg_or_zero" "rZ")
+	   (match_operand:SI 4 "const_int_operand")]
+	  UNSPECV_SX))]
+  ""
+  {
+    enum memmodel model = memmodel_from_int (INTVAL (operands[4]));
+    if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_acquire (model))
+      return "stxp\t%w0, %x2, %x3, %1";
+    else
+      return "stlxp\t%w0, %x2, %x3, %1";
+  }
+)
+
 (define_expand "mem_thread_fence"
   [(match_operand:SI 0 "const_int_operand" "")]
   ""
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 6caeeac8086..3bc49ea0238 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -29,6 +29,9 @@
 ;; Iterator for HI, SI, DI, some instructions can only work on these modes.
 (define_mode_iterator GPI_I16 [(HI "AARCH64_ISA_F16") SI DI])
 
+;; "Iterator" for just TI -- features like @pattern only work with iterators.
+(define_mode_iterator JUST_TI [TI])
+
 ;; Iterator for QI and HI modes
 (define_mode_iterator SHORT [QI HI])
 
-- 
2.20.1


[-- Attachment #4: 0003-aarch64-Tidy-aarch64_split_compare_and_swap.patch --]
[-- Type: application/octet-stream, Size: 4393 bytes --]

From 7271cd81918fbf45b5060b48fc295da156ff6ad7 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:33 +0000
Subject: [PATCH 3/6] aarch64: Tidy aarch64_split_compare_and_swap

With aarch64_track_speculation, we had extra code to do exactly what the
!strong_zero_p path already did.  The rest is reducing code duplication.

	* config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Disable
	strong_zero_p for aarch64_track_speculation; unify some code paths;
	use aarch64_gen_compare_reg instead of open-coding.

From-SVN: r275966
---
 gcc/config/aarch64/aarch64.c | 50 ++++++++++--------------------------
 1 file changed, 14 insertions(+), 36 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 01efdcd82e1..131747d4616 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15572,13 +15572,11 @@ aarch64_split_compare_and_swap (rtx operands[])
   /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  */
   gcc_assert (epilogue_completed);
 
-  rtx rval, mem, oldval, newval, scratch;
+  rtx rval, mem, oldval, newval, scratch, x, model_rtx;
   machine_mode mode;
   bool is_weak;
   rtx_code_label *label1, *label2;
-  rtx x, cond;
   enum memmodel model;
-  rtx model_rtx;
 
   rval = operands[0];
   mem = operands[1];
@@ -15599,7 +15597,8 @@ aarch64_split_compare_and_swap (rtx operands[])
 	CBNZ	scratch, .label1
     .label2:
 	CMP	rval, 0.  */
-  bool strong_zero_p = !is_weak && oldval == const0_rtx && mode != TImode;
+  bool strong_zero_p = (!is_weak && !aarch64_track_speculation &&
+			oldval == const0_rtx && mode != TImode);
 
   label1 = NULL;
   if (!is_weak)
@@ -15612,35 +15611,20 @@ aarch64_split_compare_and_swap (rtx operands[])
   /* The initial load can be relaxed for a __sync operation since a final
      barrier will be emitted to stop code hoisting.  */
   if (is_mm_sync (model))
-    aarch64_emit_load_exclusive (mode, rval, mem,
-				 GEN_INT (MEMMODEL_RELAXED));
+    aarch64_emit_load_exclusive (mode, rval, mem, GEN_INT (MEMMODEL_RELAXED));
   else
     aarch64_emit_load_exclusive (mode, rval, mem, model_rtx);
 
   if (strong_zero_p)
-    {
-      if (aarch64_track_speculation)
-	{
-	  /* Emit an explicit compare instruction, so that we can correctly
-	     track the condition codes.  */
-	  rtx cc_reg = aarch64_gen_compare_reg (NE, rval, const0_rtx);
-	  x = gen_rtx_NE (GET_MODE (cc_reg), cc_reg, const0_rtx);
-	}
-      else
-	x = gen_rtx_NE (VOIDmode, rval, const0_rtx);
-
-      x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-				gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
-      aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
-    }
+    x = gen_rtx_NE (VOIDmode, rval, const0_rtx);
   else
     {
-      cond = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
-      x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
-      x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-				gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
-      aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+      rtx cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
+      x = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
     }
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			    gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
+  aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
 
   aarch64_emit_store_exclusive (mode, scratch, mem, newval, model_rtx);
 
@@ -15661,22 +15645,16 @@ aarch64_split_compare_and_swap (rtx operands[])
       aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
     }
   else
-    {
-      cond = gen_rtx_REG (CCmode, CC_REGNUM);
-      x = gen_rtx_COMPARE (CCmode, scratch, const0_rtx);
-      emit_insn (gen_rtx_SET (cond, x));
-    }
+    aarch64_gen_compare_reg (NE, scratch, const0_rtx);
 
   emit_label (label2);
+
   /* If we used a CBNZ in the exchange loop emit an explicit compare with RVAL
      to set the condition flags.  If this is not used it will be removed by
      later passes.  */
   if (strong_zero_p)
-    {
-      cond = gen_rtx_REG (CCmode, CC_REGNUM);
-      x = gen_rtx_COMPARE (CCmode, rval, const0_rtx);
-      emit_insn (gen_rtx_SET (cond, x));
-    }
+    aarch64_gen_compare_reg (NE, rval, const0_rtx);
+
   /* Emit any final barrier needed for a __sync operation.  */
   if (is_mm_sync (model))
     aarch64_emit_post_barrier (model);
-- 
2.20.1


[-- Attachment #5: 0004-aarch64-Add-out-of-line-functions-for-LSE-atomics.patch --]
[-- Type: application/octet-stream, Size: 12042 bytes --]

From fc82d9fa403f562991792be557c3c9996fb34529 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:38 +0000
Subject: [PATCH 4/6] aarch64: Add out-of-line functions for LSE atomics

This is the libgcc part of the interface -- providing the functions.
Rationale is provided at the top of libgcc/config/aarch64/lse.S.
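
The dispatch that these functions perform can be modelled in portable C as
follows. This is a sketch under stated assumptions, not the libgcc code: the
real helpers are written in assembly and test __aarch64_have_lse_atomics,
which lse-init.c sets from getauxval (AT_HWCAP) & HWCAP_ATOMICS. Here a plain
flag stands in for that symbol, a C11 fetch-add stands in for the single LSE
instruction, and a compare-exchange loop stands in for the load/store-exclusive
retry loop. All names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for __aarch64_have_lse_atomics.  */
static bool have_lse_stub;

/* Hypothetical stand-in for the constructor in lse-init.c.  */
void set_have_lse_stub(bool detected)
{
    have_lse_stub = detected;
}

/* Fetch-and-add that picks its implementation with one flag test.  */
uint32_t outlined_fetch_add(_Atomic uint32_t *ptr, uint32_t val)
{
    if (have_lse_stub)
        /* Stand-in for the single-instruction LSE path.  */
        return atomic_fetch_add_explicit(ptr, val, memory_order_relaxed);

    /* Stand-in for the exclusive load/store retry loop.  On failure the
       compare-exchange refreshes OLD with the current value.  */
    uint32_t old = atomic_load_explicit(ptr, memory_order_relaxed);
    while (!atomic_compare_exchange_weak_explicit(ptr, &old, old + val,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
        ;
    return old;
}
```

Both paths return the previous value, so callers cannot tell which one ran;
the flag is written once at start-up and only read afterwards, which is what
makes the branch well predicted.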

	* config/aarch64/lse-init.c: New file.
	* config/aarch64/lse.S: New file.
	* config/aarch64/t-lse: New file.
	* config.host: Add t-lse to all aarch64 tuples.

From-SVN: r275967
---
 libgcc/config.host               |   4 +
 libgcc/config/aarch64/lse-init.c |  45 ++++++
 libgcc/config/aarch64/lse.S      | 235 +++++++++++++++++++++++++++++++
 libgcc/config/aarch64/t-lse      |  44 ++++++
 4 files changed, 328 insertions(+)
 create mode 100644 libgcc/config/aarch64/lse-init.c
 create mode 100644 libgcc/config/aarch64/lse.S
 create mode 100644 libgcc/config/aarch64/t-lse

diff --git a/libgcc/config.host b/libgcc/config.host
index 0f15fda3612..18e306b48a5 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -347,23 +347,27 @@ aarch64*-*-elf | aarch64*-*-rtems*)
 	extra_parts="$extra_parts crtbegin.o crtend.o crti.o crtn.o"
 	extra_parts="$extra_parts crtfastmath.o"
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	md_unwind_header=aarch64/aarch64-unwind.h
 	;;
 aarch64*-*-freebsd*)
 	extra_parts="$extra_parts crtfastmath.o"
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	md_unwind_header=aarch64/freebsd-unwind.h
 	;;
 aarch64*-*-fuchsia*)
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp"
 	;;
 aarch64*-*-linux*)
 	extra_parts="$extra_parts crtfastmath.o"
 	md_unwind_header=aarch64/linux-unwind.h
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	;;
 alpha*-*-linux*)
diff --git a/libgcc/config/aarch64/lse-init.c b/libgcc/config/aarch64/lse-init.c
new file mode 100644
index 00000000000..33d29147479
--- /dev/null
+++ b/libgcc/config/aarch64/lse-init.c
@@ -0,0 +1,45 @@
+/* Out-of-line LSE atomics for AArch64 architecture, Init.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   Contributed by Linaro Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/* Define the symbol gating the LSE implementations.  */
+_Bool __aarch64_have_lse_atomics
+  __attribute__((visibility("hidden"), nocommon));
+
+/* Disable initialization of __aarch64_have_lse_atomics during bootstrap.  */
+#ifndef inhibit_libc
+# include <sys/auxv.h>
+
+/* Disable initialization if the system headers are too old.  */
+# if defined(AT_HWCAP) && defined(HWCAP_ATOMICS)
+
+static void __attribute__((constructor))
+init_have_lse_atomics (void)
+{
+  unsigned long hwcap = getauxval (AT_HWCAP);
+  __aarch64_have_lse_atomics = (hwcap & HWCAP_ATOMICS) != 0;
+}
+
+# endif /* HWCAP */
+#endif /* inhibit_libc */
diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
new file mode 100644
index 00000000000..a5f6673596c
--- /dev/null
+++ b/libgcc/config/aarch64/lse.S
@@ -0,0 +1,235 @@
+/* Out-of-line LSE atomics for AArch64 architecture.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   Contributed by Linaro Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/*
+ * The problem that we are trying to solve is operating system deployment
+ * of ARMv8.1-Atomics, also known as Large System Extensions (LSE).
+ *
+ * There are a number of potential solutions for this problem which have
+ * been proposed and rejected for various reasons.  To recap:
+ *
+ * (1) Multiple builds.  The dynamic linker will examine /lib64/atomics/
+ * if HWCAP_ATOMICS is set, allowing entire libraries to be overwritten.
+ * However, not all Linux distributions are happy with multiple builds,
+ * and anyway it has no effect on main applications.
+ *
+ * (2) IFUNC.  We could put these functions into libgcc_s.so, and have
+ * a single copy of each function for all DSOs.  However, ARM is concerned
+ * that the branch-to-indirect-branch that is implied by using a PLT,
+ * as required by IFUNC, is too much overhead for smaller cpus.
+ *
+ * (3) Statically predicted direct branches.  This is the approach that
+ * is taken here.  These functions are linked into every DSO that uses them.
+ * All of the symbols are hidden, so that the functions are called via a
+ * direct branch.  The choice of LSE vs non-LSE is done via one byte load
+ * followed by a well-predicted direct branch.  The functions are compiled
+ * separately to minimize code size.
+ */
+
+/* Tell the assembler to accept LSE instructions.  */
+	.arch armv8-a+lse
+
+/* Declare the symbol gating the LSE implementations.  */
+	.hidden	__aarch64_have_lse_atomics
+
+/* Turn size and memory model defines into mnemonic fragments.  */
+#if SIZE == 1
+# define S     b
+# define UXT   uxtb
+#elif SIZE == 2
+# define S     h
+# define UXT   uxth
+#elif SIZE == 4 || SIZE == 8 || SIZE == 16
+# define S
+# define UXT   mov
+#else
+# error
+#endif
+
+#if MODEL == 1
+# define SUFF  _relax
+# define A
+# define L
+#elif MODEL == 2
+# define SUFF  _acq
+# define A     a
+# define L
+#elif MODEL == 3
+# define SUFF  _rel
+# define A
+# define L     l
+#elif MODEL == 4
+# define SUFF  _acq_rel
+# define A     a
+# define L     l
+#else
+# error
+#endif
+
+/* Concatenate symbols.  */
+#define glue2_(A, B)		A ## B
+#define glue2(A, B)		glue2_(A, B)
+#define glue3_(A, B, C)		A ## B ## C
+#define glue3(A, B, C)		glue3_(A, B, C)
+#define glue4_(A, B, C, D)	A ## B ## C ## D
+#define glue4(A, B, C, D)	glue4_(A, B, C, D)
+
+/* Select the size of a register, given a regno.  */
+#define x(N)			glue2(x, N)
+#define w(N)			glue2(w, N)
+#if SIZE < 8
+# define s(N)			w(N)
+#else
+# define s(N)			x(N)
+#endif
+
+#define NAME(BASE)		glue4(__aarch64_, BASE, SIZE, SUFF)
+#define LDXR			glue4(ld, A, xr, S)
+#define STXR			glue4(st, L, xr, S)
+
+/* Temporary registers used.  Other than these, only the return value
+   register (x0) and the flags are modified.  */
+#define tmp0	16
+#define tmp1	17
+#define tmp2	15
+
+/* Start and end a function.  */
+.macro	STARTFN name
+	.text
+	.balign	16
+	.globl	\name
+	.hidden	\name
+	.type	\name, %function
+	.cfi_startproc
+\name:
+.endm
+
+.macro	ENDFN name
+	.cfi_endproc
+	.size	\name, . - \name
+.endm
+
+/* Branch to LABEL if LSE is disabled.  */
+.macro	JUMP_IF_NOT_LSE label
+	adrp	x(tmp0), __aarch64_have_lse_atomics
+	ldrb	w(tmp0), [x(tmp0), :lo12:__aarch64_have_lse_atomics]
+	cbz	w(tmp0), \label
+.endm
+
+#ifdef L_cas
+
+STARTFN	NAME(cas)
+	JUMP_IF_NOT_LSE	8f
+
+#if SIZE < 16
+#define CAS	glue4(cas, A, L, S)
+
+	CAS		s(0), s(1), [x2]
+	ret
+
+8:	UXT		s(tmp0), s(0)
+0:	LDXR		s(0), [x2]
+	cmp		s(0), s(tmp0)
+	bne		1f
+	STXR		w(tmp1), s(1), [x2]
+	cbnz		w(tmp1), 0b
+1:	ret
+
+#else
+#define LDXP	glue3(ld, A, xp)
+#define STXP	glue3(st, L, xp)
+#define CASP	glue3(casp, A, L)
+
+	CASP		x0, x1, x2, x3, [x4]
+	ret
+
+8:	mov		x(tmp0), x0
+	mov		x(tmp1), x1
+0:	LDXP		x0, x1, [x4]
+	cmp		x0, x(tmp0)
+	ccmp		x1, x(tmp1), #0, eq
+	bne		1f
+	STXP		w(tmp2), x(tmp0), x(tmp1), [x4]
+	cbnz		w(tmp2), 0b
+1:	ret
+
+#endif
+
+ENDFN	NAME(cas)
+#endif
+
+#ifdef L_swp
+#define SWP	glue4(swp, A, L, S)
+
+STARTFN	NAME(swp)
+	JUMP_IF_NOT_LSE	8f
+
+	SWP		s(0), s(0), [x1]
+	ret
+
+8:	mov		s(tmp0), s(0)
+0:	LDXR		s(0), [x1]
+	STXR		w(tmp1), s(tmp0), [x1]
+	cbnz		w(tmp1), 0b
+	ret
+
+ENDFN	NAME(swp)
+#endif
+
+#if defined(L_ldadd) || defined(L_ldclr) \
+    || defined(L_ldeor) || defined(L_ldset)
+
+#ifdef L_ldadd
+#define LDNM	ldadd
+#define OP	add
+#elif defined(L_ldclr)
+#define LDNM	ldclr
+#define OP	bic
+#elif defined(L_ldeor)
+#define LDNM	ldeor
+#define OP	eor
+#elif defined(L_ldset)
+#define LDNM	ldset
+#define OP	orr
+#else
+#error
+#endif
+#define LDOP	glue4(LDNM, A, L, S)
+
+STARTFN	NAME(LDNM)
+	JUMP_IF_NOT_LSE	8f
+
+	LDOP		s(0), s(0), [x1]
+	ret
+
+8:	mov		s(tmp0), s(0)
+0:	LDXR		s(0), [x1]
+	OP		s(tmp1), s(0), s(tmp0)
+	STXR		w(tmp2), s(tmp1), [x1]
+	cbnz		w(tmp2), 0b
+	ret
+
+ENDFN	NAME(LDNM)
+#endif
diff --git a/libgcc/config/aarch64/t-lse b/libgcc/config/aarch64/t-lse
new file mode 100644
index 00000000000..fe3868dacbf
--- /dev/null
+++ b/libgcc/config/aarch64/t-lse
@@ -0,0 +1,44 @@
+# Out-of-line LSE atomics for AArch64 architecture.
+# Copyright (C) 2019 Free Software Foundation, Inc.
+# Contributed by Linaro Ltd.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Compare-and-swap has 5 sizes and 4 memory models.
+S0 := $(foreach s, 1 2 4 8 16, $(addsuffix _$(s), cas))
+O0 := $(foreach m, 1 2 3 4, $(addsuffix _$(m)$(objext), $(S0)))
+
+# Swap, Load-and-operate have 4 sizes and 4 memory models
+S1 := $(foreach s, 1 2 4 8, $(addsuffix _$(s), swp ldadd ldclr ldeor ldset))
+O1 := $(foreach m, 1 2 3 4, $(addsuffix _$(m)$(objext), $(S1)))
+
+LSE_OBJS := $(O0) $(O1)
+
+libgcc-objects += $(LSE_OBJS) lse-init$(objext)
+
+empty      =
+space      = $(empty) $(empty)
+PAT_SPLIT  = $(subst _,$(space),$(*F))
+PAT_BASE   = $(word 1,$(PAT_SPLIT))
+PAT_N      = $(word 2,$(PAT_SPLIT))
+PAT_M      = $(word 3,$(PAT_SPLIT))
+
+lse-init$(objext): $(srcdir)/config/aarch64/lse-init.c
+	$(gcc_compile) -c $<
+
+$(LSE_OBJS): $(srcdir)/config/aarch64/lse.S
+	$(gcc_compile) -DL_$(PAT_BASE) -DSIZE=$(PAT_N) -DMODEL=$(PAT_M) -c $<
-- 
2.20.1

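For readers following the t-lse rules above, here is a minimal C sketch of the naming scheme they rely on. It assumes each object stem has the form BASE_SIZE_MODEL (for example cas_16_3), which is what the PAT_SPLIT/PAT_BASE/PAT_N/PAT_M variables take apart. The struct and function names (lse_obj, split_lse_stem, lse_flags, count_lse_objects) are hypothetical and exist only for illustration; they are not part of libgcc.

```c
#include <stdio.h>
#include <string.h>

/* One libgcc object stem, e.g. "ldadd_8_2" (hypothetical type).  */
struct lse_obj {
    char base[8];   /* cas, swp, ldadd, ldclr, ldeor or ldset */
    int size;       /* becomes -DSIZE=...  */
    int model;      /* becomes -DMODEL=... */
};

/* Split BASE_SIZE_MODEL the way PAT_BASE / PAT_N / PAT_M do.
   Returns 0 on success, -1 if the stem is malformed.  */
int split_lse_stem(const char *stem, struct lse_obj *out)
{
    int n = sscanf(stem, "%7[a-z]_%d_%d", out->base, &out->size, &out->model);
    return n == 3 ? 0 : -1;
}

/* Format the preprocessor flags that the $(LSE_OBJS) rule passes
   when compiling lse.S for one stem.  */
int lse_flags(const char *stem, char *buf, size_t len)
{
    struct lse_obj o;
    if (split_lse_stem(stem, &o) != 0)
        return -1;
    snprintf(buf, len, "-DL_%s -DSIZE=%d -DMODEL=%d", o.base, o.size, o.model);
    return 0;
}

/* Count the objects in LSE_OBJS: cas has 5 sizes, the other five
   operations have 4 sizes, and each combination has 4 memory models.  */
int count_lse_objects(void)
{
    int cas = 5 * 4;          /* O0 */
    int others = 5 * 4 * 4;   /* O1 */
    return cas + others;
}
```

Under these assumptions a single lse.S source is compiled once per stem, so the helper set grows with the number of (operation, size, model) combinations rather than with the number of source files.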

[-- Attachment #6: 0005-aarch64-Implement-moutline-atomics.patch --]
[-- Type: application/octet-stream, Size: 20911 bytes --]

From 091d5523f6237a13941878202364a638c5d26ab3 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:43 +0000
Subject: [PATCH 5/6] aarch64: Implement -moutline-atomics

	* config/aarch64/aarch64.opt (-moutline-atomics): New.
	* config/aarch64/aarch64.c (aarch64_atomic_ool_func): New.
	(aarch64_ool_cas_names, aarch64_ool_swp_names): New.
	(aarch64_ool_ldadd_names, aarch64_ool_ldset_names): New.
	(aarch64_ool_ldclr_names, aarch64_ool_ldeor_names): New.
	(aarch64_expand_compare_and_swap): Honor TARGET_OUTLINE_ATOMICS.
	* config/aarch64/atomics.md (atomic_exchange<ALLI>): Likewise.
	(atomic_<atomic_op><ALLI>): Likewise.
	(atomic_fetch_<atomic_op><ALLI>): Likewise.
	(atomic_<atomic_op>_fetch<ALLI>): Likewise.
	* doc/invoke.texi: Document -moutline-atomics.
testsuite/
	* gcc.target/aarch64/atomic-op-acq_rel.c: Use -mno-outline-atomics.
	* gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Likewise.
	* gcc.target/aarch64/atomic-op-acquire.c: Likewise.
	* gcc.target/aarch64/atomic-op-char.c: Likewise.
	* gcc.target/aarch64/atomic-op-consume.c: Likewise.
	* gcc.target/aarch64/atomic-op-imm.c: Likewise.
	* gcc.target/aarch64/atomic-op-int.c: Likewise.
	* gcc.target/aarch64/atomic-op-long.c: Likewise.
	* gcc.target/aarch64/atomic-op-relaxed.c: Likewise.
	* gcc.target/aarch64/atomic-op-release.c: Likewise.
	* gcc.target/aarch64/atomic-op-seq_cst.c: Likewise.
	* gcc.target/aarch64/atomic-op-short.c: Likewise.
	* gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: Likewise.
	* gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: Likewise.
	* gcc.target/aarch64/sync-comp-swap.c: Likewise.
	* gcc.target/aarch64/sync-op-acquire.c: Likewise.
	* gcc.target/aarch64/sync-op-full.c: Likewise.

From-SVN: r275968
---
 gcc/config/aarch64/aarch64-protos.h           | 13 +++
 gcc/config/aarch64/aarch64.c                  | 87 +++++++++++++++++
 gcc/config/aarch64/aarch64.opt                |  3 +
 gcc/config/aarch64/atomics.md                 | 94 +++++++++++++++++--
 gcc/doc/invoke.texi                           | 16 +++-
 .../atomic-comp-swap-release-acquire.c        |  2 +-
 .../gcc.target/aarch64/atomic-op-acq_rel.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-acquire.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-char.c       |  2 +-
 .../gcc.target/aarch64/atomic-op-consume.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-imm.c        |  2 +-
 .../gcc.target/aarch64/atomic-op-int.c        |  2 +-
 .../gcc.target/aarch64/atomic-op-long.c       |  2 +-
 .../gcc.target/aarch64/atomic-op-relaxed.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-release.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-seq_cst.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-short.c      |  2 +-
 .../aarch64/atomic_cmp_exchange_zero_reg_1.c  |  2 +-
 .../atomic_cmp_exchange_zero_strong_1.c       |  2 +-
 .../gcc.target/aarch64/sync-comp-swap.c       |  2 +-
 .../gcc.target/aarch64/sync-op-acquire.c      |  2 +-
 .../gcc.target/aarch64/sync-op-full.c         |  2 +-
 22 files changed, 221 insertions(+), 26 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index c083cad5327..b9bfb281275 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -644,4 +644,17 @@ poly_uint64 aarch64_regmode_natural_size (machine_mode);
 
 bool aarch64_high_bits_all_ones_p (HOST_WIDE_INT);
 
+struct atomic_ool_names
+{
+    const char *str[5][4];
+};
+
+rtx aarch64_atomic_ool_func(machine_mode mode, rtx model_rtx,
+			    const atomic_ool_names *names);
+extern const atomic_ool_names aarch64_ool_swp_names;
+extern const atomic_ool_names aarch64_ool_ldadd_names;
+extern const atomic_ool_names aarch64_ool_ldset_names;
+extern const atomic_ool_names aarch64_ool_ldclr_names;
+extern const atomic_ool_names aarch64_ool_ldeor_names;
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 131747d4616..639666c5d08 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15481,6 +15481,82 @@ aarch64_emit_unlikely_jump (rtx insn)
   add_reg_br_prob_note (jump, profile_probability::very_unlikely ());
 }
 
+/* We store the names of the various atomic helpers in a 5x4 array.
+   Return the libcall function given MODE, MODEL and NAMES.  */
+
+rtx
+aarch64_atomic_ool_func(machine_mode mode, rtx model_rtx,
+			const atomic_ool_names *names)
+{
+  memmodel model = memmodel_base (INTVAL (model_rtx));
+  int mode_idx, model_idx;
+
+  switch (mode)
+    {
+    case E_QImode:
+      mode_idx = 0;
+      break;
+    case E_HImode:
+      mode_idx = 1;
+      break;
+    case E_SImode:
+      mode_idx = 2;
+      break;
+    case E_DImode:
+      mode_idx = 3;
+      break;
+    case E_TImode:
+      mode_idx = 4;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  switch (model)
+    {
+    case MEMMODEL_RELAXED:
+      model_idx = 0;
+      break;
+    case MEMMODEL_CONSUME:
+    case MEMMODEL_ACQUIRE:
+      model_idx = 1;
+      break;
+    case MEMMODEL_RELEASE:
+      model_idx = 2;
+      break;
+    case MEMMODEL_ACQ_REL:
+    case MEMMODEL_SEQ_CST:
+      model_idx = 3;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  return init_one_libfunc_visibility (names->str[mode_idx][model_idx],
+				      VISIBILITY_HIDDEN);
+}
+
+#define DEF0(B, N) \
+  { "__aarch64_" #B #N "_relax", \
+    "__aarch64_" #B #N "_acq", \
+    "__aarch64_" #B #N "_rel", \
+    "__aarch64_" #B #N "_acq_rel" }
+
+#define DEF4(B)  DEF0(B, 1), DEF0(B, 2), DEF0(B, 4), DEF0(B, 8), \
+		 { NULL, NULL, NULL, NULL }
+#define DEF5(B)  DEF0(B, 1), DEF0(B, 2), DEF0(B, 4), DEF0(B, 8), DEF0(B, 16)
+
+static const atomic_ool_names aarch64_ool_cas_names = { { DEF5(cas) } };
+const atomic_ool_names aarch64_ool_swp_names = { { DEF4(swp) } };
+const atomic_ool_names aarch64_ool_ldadd_names = { { DEF4(ldadd) } };
+const atomic_ool_names aarch64_ool_ldset_names = { { DEF4(ldset) } };
+const atomic_ool_names aarch64_ool_ldclr_names = { { DEF4(ldclr) } };
+const atomic_ool_names aarch64_ool_ldeor_names = { { DEF4(ldeor) } };
+
+#undef DEF0
+#undef DEF4
+#undef DEF5
+
 /* Expand a compare and swap pattern.  */
 
 void
@@ -15527,6 +15603,17 @@ aarch64_expand_compare_and_swap (rtx operands[])
 						   newval, mod_s));
       cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
     }
+  else if (TARGET_OUTLINE_ATOMICS)
+    {
+      /* Oldval must satisfy compare afterward.  */
+      if (!aarch64_plus_operand (oldval, mode))
+	oldval = force_reg (mode, oldval);
+      rtx func = aarch64_atomic_ool_func (mode, mod_s, &aarch64_ool_cas_names);
+      rval = emit_library_call_value (func, NULL_RTX, LCT_NORMAL, r_mode,
+				      oldval, mode, newval, mode,
+				      XEXP (mem, 0), Pmode);
+      cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
+    }
   else
     {
       /* The oldval predicate varies by mode.  Test it and force to reg.  */
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 3c6d1cc90ad..f474a28eb92 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -255,3 +255,6 @@ user-land code.
 TargetVariable
 long aarch64_stack_protector_guard_offset = 0
 
+moutline-atomics
+Target Report Mask(OUTLINE_ATOMICS) Save
+Generate local calls to out-of-line atomic operations.
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 09d2a63c620..cabcc58f1a0 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -186,16 +186,27 @@
   (match_operand:SI 3 "const_int_operand" "")]
   ""
   {
-    rtx (*gen) (rtx, rtx, rtx, rtx);
-
     /* Use an atomic SWP when available.  */
     if (TARGET_LSE)
-      gen = gen_aarch64_atomic_exchange<mode>_lse;
+      {
+	emit_insn (gen_aarch64_atomic_exchange<mode>_lse
+		   (operands[0], operands[1], operands[2], operands[3]));
+      }
+    else if (TARGET_OUTLINE_ATOMICS)
+      {
+	machine_mode mode = <MODE>mode;
+	rtx func = aarch64_atomic_ool_func (mode, operands[3],
+					    &aarch64_ool_swp_names);
+	rtx rval = emit_library_call_value (func, operands[0], LCT_NORMAL,
+					    mode, operands[2], mode,
+					    XEXP (operands[1], 0), Pmode);
+        emit_move_insn (operands[0], rval);
+      }
     else
-      gen = gen_aarch64_atomic_exchange<mode>;
-
-    emit_insn (gen (operands[0], operands[1], operands[2], operands[3]));
-
+      {
+	emit_insn (gen_aarch64_atomic_exchange<mode>
+		   (operands[0], operands[1], operands[2], operands[3]));
+      }
     DONE;
   }
 )
@@ -280,6 +291,39 @@
 	  }
 	operands[1] = force_reg (<MODE>mode, operands[1]);
       }
+    else if (TARGET_OUTLINE_ATOMICS)
+      {
+        const atomic_ool_names *names;
+	switch (<CODE>)
+	  {
+	  case MINUS:
+	    operands[1] = expand_simple_unop (<MODE>mode, NEG, operands[1],
+					      NULL, 1);
+	    /* fallthru */
+	  case PLUS:
+	    names = &aarch64_ool_ldadd_names;
+	    break;
+	  case IOR:
+	    names = &aarch64_ool_ldset_names;
+	    break;
+	  case XOR:
+	    names = &aarch64_ool_ldeor_names;
+	    break;
+	  case AND:
+	    operands[1] = expand_simple_unop (<MODE>mode, NOT, operands[1],
+					      NULL, 1);
+	    names = &aarch64_ool_ldclr_names;
+	    break;
+	  default:
+	    gcc_unreachable ();
+	  }
+        machine_mode mode = <MODE>mode;
+	rtx func = aarch64_atomic_ool_func (mode, operands[2], names);
+	emit_library_call_value (func, NULL_RTX, LCT_NORMAL, mode,
+				 operands[1], mode,
+				 XEXP (operands[0], 0), Pmode);
+        DONE;
+      }
     else
       gen = gen_aarch64_atomic_<atomic_optab><mode>;
 
@@ -405,6 +449,40 @@
 	}
       operands[2] = force_reg (<MODE>mode, operands[2]);
     }
+  else if (TARGET_OUTLINE_ATOMICS)
+    {
+      const atomic_ool_names *names;
+      switch (<CODE>)
+	{
+	case MINUS:
+	  operands[2] = expand_simple_unop (<MODE>mode, NEG, operands[2],
+					    NULL, 1);
+	  /* fallthru */
+	case PLUS:
+	  names = &aarch64_ool_ldadd_names;
+	  break;
+	case IOR:
+	  names = &aarch64_ool_ldset_names;
+	  break;
+	case XOR:
+	  names = &aarch64_ool_ldeor_names;
+	  break;
+	case AND:
+	  operands[2] = expand_simple_unop (<MODE>mode, NOT, operands[2],
+					    NULL, 1);
+	  names = &aarch64_ool_ldclr_names;
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+      machine_mode mode = <MODE>mode;
+      rtx func = aarch64_atomic_ool_func (mode, operands[3], names);
+      rtx rval = emit_library_call_value (func, operands[0], LCT_NORMAL, mode,
+					  operands[2], mode,
+					  XEXP (operands[1], 0), Pmode);
+      emit_move_insn (operands[0], rval);
+      DONE;
+    }
   else
     gen = gen_aarch64_atomic_fetch_<atomic_optab><mode>;
 
@@ -494,7 +572,7 @@
 {
   /* Use an atomic load-operate instruction when possible.  In this case
      we will re-compute the result from the original mem value. */
-  if (TARGET_LSE)
+  if (TARGET_LSE || TARGET_OUTLINE_ATOMICS)
     {
       rtx tmp = gen_reg_rtx (<MODE>mode);
       operands[2] = force_reg (<MODE>mode, operands[2]);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0ab6c9c6449..eadefff52a9 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -637,7 +637,8 @@ Objective-C and Objective-C++ Dialects}.
 -march=@var{name}  -mcpu=@var{name}  -mtune=@var{name}  @gol
 -moverride=@var{string}  -mverbose-cost-dump @gol
 -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{sysreg} @gol
--mstack-protector-guard-offset=@var{offset} -mtrack-speculation }
+-mstack-protector-guard-offset=@var{offset} -mtrack-speculation @gol
+-moutline-atomics }
 
 @emph{Adapteva Epiphany Options}
 @gccoptlist{-mhalf-reg-file  -mprefer-short-insn-regs @gol
@@ -15777,6 +15778,19 @@ be used by the compiler when expanding calls to
 @code{__builtin_speculation_safe_copy} to permit a more efficient code
 sequence to be generated.
 
+@item -moutline-atomics
+@itemx -mno-outline-atomics
+Enable or disable calls to out-of-line helpers to implement atomic operations.
+These helpers will, at runtime, determine if the LSE instructions from
+ARMv8.1-A can be used; if not, they will use the load/store-exclusive
+instructions that are present in the base ARMv8.0 ISA.
+
+This option is only applicable when compiling for the base ARMv8.0
+instruction set.  If using a later revision, e.g. @option{-march=armv8.1-a}
+or @option{-march=armv8-a+lse}, the ARMv8.1-Atomics instructions will be
+used directly.  The same applies when using @option{-mcpu=} when the
+selected cpu supports the @samp{lse} feature.
+
 @item -march=@var{name}
 @opindex march
 Specify the name of the target architecture and, optionally, one or
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
index 49ca5d0d09c..a828a72aa75 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf -mno-outline-atomics" } */
 
 #include "atomic-comp-swap-release-acquire.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c
index 74f26348e42..6823ce381b2 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-acq_rel.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c
index 66c1b1efe20..87937de378a 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-acquire.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c
index c09d0434ecf..60955e57da3 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-char.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
index 5783ab84f5c..16cb11aeeaf 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-consume.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c
index 18b8f0b04e9..bcab4e481e3 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 int v = 0;
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c
index 8520f0839ba..040e4a8d168 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-int.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c
index d011f8c5ce2..fc88b92cd3e 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 long v = 0;
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c
index ed96bfdb978..503d62b0280 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-relaxed.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c
index fc4be17de89..efe14aea7e4 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-release.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c
index 613000fe490..09973bf82ba 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-seq_cst.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c
index e82c8118ece..e1dcebb0f89 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-short.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c
index f2a21ddf2e1..29246979bfb 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv8-a+nolse" } */
+/* { dg-options "-O2 -march=armv8-a+nolse -mno-outline-atomics" } */
 /* { dg-skip-if "" { *-*-* } { "-mcpu=*" } { "" } } */
 
 int
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c
index 8d2ae67dfbe..6daf9b08f5a 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv8-a+nolse" } */
+/* { dg-options "-O2 -march=armv8-a+nolse -mno-outline-atomics" } */
 /* { dg-skip-if "" { *-*-* } { "-mcpu=*" } { "" } } */
 
 int
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
index e571b2f13b3..f56415f3354 100644
--- a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
+++ b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf -mno-outline-atomics" } */
 
 #include "sync-comp-swap.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
index 357bf1be3b2..39b3144aa36 100644
--- a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "sync-op-acquire.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-full.c b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
index c6ba1629965..6b8b2043f40 100644
--- a/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "sync-op-full.x"
 
-- 
2.20.1
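
The helper-name table built by DEF0/DEF4/DEF5 in the patch above can be sketched in plain C as follows. This is only an illustration of the mapping, not the GCC implementation: it uses C11 memory_order values in place of GCC's internal memmodel enum, and the function names ool_suffix and ool_name are hypothetical.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <string.h>

/* Map a C11 memory order onto the four suffixes used by DEF0.
   Consume is treated as acquire and seq_cst as acq_rel, mirroring
   the switch in aarch64_atomic_ool_func.  */
const char *ool_suffix(memory_order mo)
{
    switch (mo) {
    case memory_order_relaxed: return "_relax";
    case memory_order_consume:
    case memory_order_acquire: return "_acq";
    case memory_order_release: return "_rel";
    case memory_order_acq_rel:
    case memory_order_seq_cst: return "_acq_rel";
    }
    return NULL;
}

/* Build "__aarch64_<base><size><suffix>" into buf.  Returns 0 on
   success, -1 when no helper exists for that size (only cas has a
   16-byte variant; the DEF4 tables leave that row as NULL).  */
int ool_name(const char *base, int size, memory_order mo,
             char *buf, size_t len)
{
    int is_cas = strcmp(base, "cas") == 0;
    if (!(size == 1 || size == 2 || size == 4 || size == 8
          || (size == 16 && is_cas)))
        return -1;

    const char *suffix = ool_suffix(mo);
    if (suffix == NULL)
        return -1;

    snprintf(buf, len, "__aarch64_%s%d%s", base, size, suffix);
    return 0;
}
```

For example, a 4-byte sequentially consistent fetch-add would resolve to __aarch64_ldadd4_acq_rel under this scheme.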

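The atomics.md changes in the patch above reuse two helpers for operations that have no helper of their own: MINUS negates its operand and calls ldadd, and AND inverts its operand and calls ldclr. The sketch below models that rewrite in plain, non-atomic C purely to check the arithmetic; the model_* and *_via_* names are hypothetical and the real helpers operate atomically.

```c
#include <stdint.h>

/* Non-atomic models of the helper semantics, for illustration only.
   Each returns the old memory value, as the real helpers do.  */
uint32_t model_ldadd(uint32_t *mem, uint32_t v)
{
    uint32_t old = *mem;
    *mem = old + v;
    return old;
}

uint32_t model_ldclr(uint32_t *mem, uint32_t v)
{
    uint32_t old = *mem;
    *mem = old & ~v;    /* ldclr clears the bits set in v (bic) */
    return old;
}

/* fetch_sub (mem, v) expressed as ldadd (mem, -v).  */
uint32_t fetch_sub_via_ldadd(uint32_t *mem, uint32_t v)
{
    return model_ldadd(mem, 0u - v);
}

/* fetch_and (mem, v) expressed as ldclr (mem, ~v).  */
uint32_t fetch_and_via_ldclr(uint32_t *mem, uint32_t v)
{
    return model_ldclr(mem, ~v);
}
```

Because old & ~(~v) equals old & v, and old + (0 - v) equals old - v in modular arithmetic, the four load-operate helpers are enough to cover add, sub, or, xor and and.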

[-- Attachment #7: 0006-Fix-shrinkwrapping-interactions-with-atomics-PR92692.patch --]
[-- Type: application/octet-stream, Size: 1529 bytes --]

From 4209a50711866d87b9c43addae698f98ba348ff3 Mon Sep 17 00:00:00 2001
From: Wilco Dijkstra <wdijkstr@arm.com>
Date: Fri, 17 Jan 2020 13:17:21 +0000
Subject: [PATCH 6/6] Fix shrinkwrapping interactions with atomics (PR92692)

The separate shrinkwrapping pass may insert stores in the middle
of atomics loops which can cause issues on some implementations.
Avoid this by delaying splitting atomics patterns until after
prolog/epilog generation.

gcc/
	PR target/92692
	* config/aarch64/aarch64.c (aarch64_split_compare_and_swap):
	Add assert to ensure prolog has been emitted.
	(aarch64_split_atomic_op): Likewise.
	* config/aarch64/atomics.md (aarch64_compare_and_swap<mode>):
	Use epilogue_completed rather than reload_completed.
	(aarch64_atomic_exchange<mode>): Likewise.
	(aarch64_atomic_<atomic_optab><mode>): Likewise.
	(atomic_nand<mode>): Likewise.
	(aarch64_atomic_fetch_<atomic_optab><mode>): Likewise.
	(atomic_fetch_nand<mode>): Likewise.
	(aarch64_atomic_<atomic_optab>_fetch<mode>): Likewise.
	(atomic_nand_fetch<mode>): Likewise.
---
 gcc/config/aarch64/atomics.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index cabcc58f1a0..1458bc00095 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -104,7 +104,7 @@
    (clobber (match_scratch:SI 7 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_compare_and_swap (operands);
-- 
2.20.1



* Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
  2020-03-09 21:47   ` [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x Pop, Sebastian
@ 2020-03-11 10:10     ` Kyrill Tkachov
  0 siblings, 0 replies; 12+ messages in thread
From: Kyrill Tkachov @ 2020-03-11 10:10 UTC (permalink / raw)
  To: Pop, Sebastian, gcc-patches; +Cc: richard.henderson, Wilco Dijkstra

Hi Sebastian,

On 3/9/20 9:47 PM, Pop, Sebastian wrote:
> Hi,
>
> Please see attached the patches to add -moutline-atomics to the gcc-9 branch.
> Tested on graviton2 aarch64-linux with bootstrap and
> `make check` passes with no new fails.
> Tested `make check` on glibc built with gcc-9 with and without "-moutline-atomics"
> and CFLAGS=" -O2 -g -fno-stack-protector -U_FORTIFY_SOURCE".
>
> Ok to commit to gcc-9 branch?

Since this feature enables backwards-compatible deployment of LSE 
atomics, I'd support that.

That is okay with me in principle after GCC 9.3 is released (the branch 
is currently frozen).

However, there have been a few follow-up patches to fix some bugs 
revealed by testing.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91833

and

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91834

come to mind.

Can you please make sure the fixes for those are included as well?


>
> Does this mechanical `git am *.patch` require a copyright assignment?
> I am still working with my employer on getting the FSF assignment signed.
>
> Thanks,
> Sebastian
>
> PS: For gcc-8 backports there are 5 cleanup and improvement patches
> that are needed for -moutline-atomics patches to apply cleanly.
> Should these patches be back-ported in the same time as the flag patches,
> or should I update the patches to apply to the older code base?

Hmm... normally I'd be for them. In this case I'd want to make sure that 
there aren't any fallout fixes that we're missing.

Did these patches have any bug reports against them?

Thanks,

Kyrill


> Here is the list of the extra patches:
>
>  From 77f33f44baf24c22848197aa80962c003dd7b3e2 Mon Sep 17 00:00:00 2001
> From: Richard Henderson <richard.henderson@linaro.org>
> Date: Wed, 31 Oct 2018 09:29:29 +0000
> Subject: [PATCH] aarch64: Simplify LSE cas generation
>
> The cas insn is a single insn, and if expanded properly need not
> be split after reload.  Use the proper inputs for the insn.
>
>          * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
>          Force oldval into the rval register for TARGET_LSE; emit the compare
>          during initial expansion so that it may be deleted if unused.
>          (aarch64_gen_atomic_cas): Remove.
>          * config/aarch64/atomics.md (@aarch64_compare_and_swap<SHORT>_lse):
>          Change =&r to +r for operand 0; use match_dup for operand 2;
>          remove is_weak and mod_f operands as unused.  Drop the split
>          and merge with...
>          (@aarch64_atomic_cas<SHORT>): ... this pattern's output; remove.
>          (@aarch64_compare_and_swap<GPI>_lse): Similarly.
>          (@aarch64_atomic_cas<GPI>): Similarly.
>
> From-SVN: r265656
>
>  From d400fda3a8c3330f77eb9d51874f5482d3819a9f Mon Sep 17 00:00:00 2001
> From: Richard Henderson <richard.henderson@linaro.org>
> Date: Wed, 31 Oct 2018 09:42:39 +0000
> Subject: [PATCH] aarch64: Improve cas generation
>
> Do not zero-extend the input to the cas for subword operations;
> instead, use the appropriate zero-extending compare insns.
> Correct the predicates and constraints for immediate expected operand.
>
>          * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New.
>          (aarch64_split_compare_and_swap): Use it.
>          (aarch64_expand_compare_and_swap): Likewise.  Remove convert_modes;
>          test oldval against the proper predicate.
>          * config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI>):
>          Use nonmemory_operand for expected.
>          (cas_short_expected_pred): New.
>          (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not "rI" to match.
>          (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for expected.
>          * config/aarch64/predicates.md (aarch64_plushi_immediate): New.
>          (aarch64_plushi_operand): New.
>
> From-SVN: r265657
>
>  From 8f5603d363a4e0453d2c38c7103aeb0bdca85c4e Mon Sep 17 00:00:00 2001
> From: Richard Henderson <richard.henderson@linaro.org>
> Date: Wed, 31 Oct 2018 09:47:21 +0000
> Subject: [PATCH] aarch64: Improve swp generation
>
> Allow zero as an input; fix constraints; avoid unnecessary split.
>
>          * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove.
>          (aarch64_gen_atomic_ldop): Don't call it.
>          * config/aarch64/atomics.md (atomic_exchange<ALLI>):
>          Use aarch64_reg_or_zero.
>          (aarch64_atomic_exchange<ALLI>): Likewise.
>          (aarch64_atomic_exchange<ALLI>_lse): Remove split; remove & from
>          operand 0; use aarch64_reg_or_zero for input; merge ...
>          (@aarch64_atomic_swp<ALLI>): ... this and remove.
>
> From-SVN: r265659
>
>  From 7803ec5ee2a547043fb6708a08ddb1361ba91202 Mon Sep 17 00:00:00 2001
> From: Richard Henderson <richard.henderson@linaro.org>
> Date: Wed, 31 Oct 2018 09:58:48 +0000
> Subject: [PATCH] aarch64: Improve atomic-op lse generation
>
> Fix constraints; avoid unnecessary split.  Drop the use of the atomic_op
> iterator in favor of the ATOMIC_LDOP iterator; this is simpler and more
> logical for ldclr aka bic.
>
>          * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
>          (aarch64_atomic_ldop_supported_p): Remove.
>          (aarch64_gen_atomic_ldop): Remove.
>          * config/aarch64/atomic.md (atomic_<atomic_optab><ALLI>):
>          Fully expand LSE operations here.
>          (atomic_fetch_<atomic_optab><ALLI>): Likewise.
>          (atomic_<atomic_optab>_fetch<ALLI>): Likewise.
>          (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): Drop atomic_op iterator
>          and use ATOMIC_LDOP instead; use register_operand for the input;
>          drop the split and emit insns directly.
>          (aarch64_atomic_fetch_<ATOMIC_LDOP><ALLI>_lse): Likewise.
>          (aarch64_atomic_<atomic_op>_fetch<ALLI>_lse): Remove.
>          (@aarch64_atomic_load<ATOMIC_LDOP><ALLI>): Remove.
>
> From-SVN: r265660
>
>  From 53de1ea800db54b47290d578c43892799b66c8dc Mon Sep 17 00:00:00 2001
> From: Richard Henderson <richard.henderson@linaro.org>
> Date: Wed, 31 Oct 2018 23:11:22 +0000
> Subject: [PATCH] aarch64: Remove early clobber from ATOMIC_LDOP scratch
>
>          * config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
>          The scratch register need not be early-clobber.  Document the reason
>          why we cannot use ST<OP>.
>
> From-SVN: r265703
>
>
>
>
>
> On 2/27/20, 12:06 PM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
>
>      Hi Sebastian,
>      
>      On 2/27/20 4:53 PM, Pop, Sebastian wrote:
>      >
>      > Hi,
>      >
>      > is somebody already working on backporting -moutline-atomics to gcc
>      > 8.x and 9.x branches?
>      >
>      I'm not aware of such work going on.
>      
>      Thanks,
>      
>      Kyrill
>      
>      > Thanks,
>      >
>      > Sebastian
>      >
>      
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
@ 2020-03-25  0:24 Pop, Sebastian
  2020-03-31 15:47 ` Pop, Sebastian
  2020-04-01 22:13 ` Christophe Lyon
  0 siblings, 2 replies; 12+ messages in thread
From: Pop, Sebastian @ 2020-03-25  0:24 UTC (permalink / raw)
  To: Kyrill Tkachov, gcc-patches; +Cc: richard.henderson, Wilco Dijkstra

[-- Attachment #1: Type: text/plain, Size: 9502 bytes --]

Hi Kyrill,

Thanks for pointing out the two missing bug fixes.
Please see attached all the back-ported patches.
All the patches from trunk applied cleanly to the gcc-9 branch, with no conflicts except in the ChangeLog files.
An up-to-date gcc-9 branch with the attached patches applied passed bootstrap on aarch64-linux (Graviton2 with 64 N1 cores), and `make check` showed no new failures.
Kyrill, could you please commit the attached patches to the gcc-9 branch?

As we still don't have a copyright assignment on file, would it be possible for ARM to finish the backport to the gcc-8 branch of these patches and the atomics cleanup patches mentioned below?

I ran `git log config/aarch64/atomics.md` and found a follow-up patch to the atomics cleanup patches:

commit e21679a8bb17aac603b8704891e60ac502200629
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Nov 21 17:41:03 2018 +0100

    re PR target/87839 (ICE in final_scan_insn_1, at final.c:3070)

            PR target/87839
            * config/aarch64/atomics.md (@aarch64_compare_and_swap<mode>): Use
            rIJ constraint for aarch64_plus_operand rather than rn.

            * gcc.target/aarch64/pr87839.c: New test.

    From-SVN: r266346

That commit fixes code modified in this cleanup patch:

commit d400fda3a8c3330f77eb9d51874f5482d3819a9f
Author: Richard Henderson <richard.henderson@linaro.org>
Date:   Wed Oct 31 09:42:39 2018 +0000

    aarch64: Improve cas generation


Thanks,
Sebastian


On 3/11/20, 5:11 AM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:

    Hi Sebastian,
    
    On 3/9/20 9:47 PM, Pop, Sebastian wrote:
    > Hi,
    >
    > Please see attached the patches to add -moutline-atomics to the gcc-9 branch.
    > Tested on graviton2 aarch64-linux with bootstrap and
    > `make check` passes with no new fails.
    > Tested `make check` on glibc built with gcc-9 with and without "-moutline-atomics"
    > and CFLAGS=" -O2 -g -fno-stack-protector -U_FORTIFY_SOURCE".
    >
    > Ok to commit to gcc-9 branch?
    
    Since this feature enables backwards-compatible deployment of LSE
    atomics, I'd support that.
    
    That is okay with me in principle after GCC 9.3 is released (the branch
    is currently frozen).
    
    However, there have been a few follow-up patches to fix some bugs
    revealed by testing.
    
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91833
    
    and
    
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91834
    
    come to mind.
    
    Can you please make sure the fixes for those are included as well?
    
    
    >
    > Does this mechanical `git am *.patch` require a copyright assignment?
    > I am still working with my employer on getting the FSF assignment signed.
    >
    > Thanks,
    > Sebastian
    >
    > PS: For gcc-8 backports there are 5 cleanup and improvement patches
    > that are needed for -moutline-atomics patches to apply cleanly.
    > Should these patches be back-ported in the same time as the flag patches,
    > or should I update the patches to apply to the older code base?
    
    Hmm... normally I'd be for them. In this case I'd want to make sure that
    there aren't any fallout fixes that we're missing.
    
    Did these patches have any bug reports against them?
    
    Thanks,
    
    Kyrill
    
    
    > Here is the list of the extra patches:
    >
    >  From 77f33f44baf24c22848197aa80962c003dd7b3e2 Mon Sep 17 00:00:00 2001
    > From: Richard Henderson <richard.henderson@linaro.org>
    > Date: Wed, 31 Oct 2018 09:29:29 +0000
    > Subject: [PATCH] aarch64: Simplify LSE cas generation
    >
    > The cas insn is a single insn, and if expanded properly need not
    > be split after reload.  Use the proper inputs for the insn.
    >
    >          * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
    >          Force oldval into the rval register for TARGET_LSE; emit the compare
    >          during initial expansion so that it may be deleted if unused.
    >          (aarch64_gen_atomic_cas): Remove.
    >          * config/aarch64/atomics.md (@aarch64_compare_and_swap<SHORT>_lse):
    >          Change =&r to +r for operand 0; use match_dup for operand 2;
    >          remove is_weak and mod_f operands as unused.  Drop the split
    >          and merge with...
    >          (@aarch64_atomic_cas<SHORT>): ... this pattern's output; remove.
    >          (@aarch64_compare_and_swap<GPI>_lse): Similarly.
    >          (@aarch64_atomic_cas<GPI>): Similarly.
    >
    > From-SVN: r265656
    >
    >  From d400fda3a8c3330f77eb9d51874f5482d3819a9f Mon Sep 17 00:00:00 2001
    > From: Richard Henderson <richard.henderson@linaro.org>
    > Date: Wed, 31 Oct 2018 09:42:39 +0000
    > Subject: [PATCH] aarch64: Improve cas generation
    >
    > Do not zero-extend the input to the cas for subword operations;
    > instead, use the appropriate zero-extending compare insns.
    > Correct the predicates and constraints for immediate expected operand.
    >
    >          * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New.
    >          (aarch64_split_compare_and_swap): Use it.
    >          (aarch64_expand_compare_and_swap): Likewise.  Remove convert_modes;
    >          test oldval against the proper predicate.
    >          * config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI>):
    >          Use nonmemory_operand for expected.
    >          (cas_short_expected_pred): New.
    >          (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not "rI" to match.
    >          (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for expected.
    >          * config/aarch64/predicates.md (aarch64_plushi_immediate): New.
    >          (aarch64_plushi_operand): New.
    >
    > From-SVN: r265657
    >
    >  From 8f5603d363a4e0453d2c38c7103aeb0bdca85c4e Mon Sep 17 00:00:00 2001
    > From: Richard Henderson <richard.henderson@linaro.org>
    > Date: Wed, 31 Oct 2018 09:47:21 +0000
    > Subject: [PATCH] aarch64: Improve swp generation
    >
    > Allow zero as an input; fix constraints; avoid unnecessary split.
    >
    >          * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove.
    >          (aarch64_gen_atomic_ldop): Don't call it.
    >          * config/aarch64/atomics.md (atomic_exchange<ALLI>):
    >          Use aarch64_reg_or_zero.
    >          (aarch64_atomic_exchange<ALLI>): Likewise.
    >          (aarch64_atomic_exchange<ALLI>_lse): Remove split; remove & from
    >          operand 0; use aarch64_reg_or_zero for input; merge ...
    >          (@aarch64_atomic_swp<ALLI>): ... this and remove.
    >
    > From-SVN: r265659
    >
    >  From 7803ec5ee2a547043fb6708a08ddb1361ba91202 Mon Sep 17 00:00:00 2001
    > From: Richard Henderson <richard.henderson@linaro.org>
    > Date: Wed, 31 Oct 2018 09:58:48 +0000
    > Subject: [PATCH] aarch64: Improve atomic-op lse generation
    >
    > Fix constraints; avoid unnecessary split.  Drop the use of the atomic_op
    > iterator in favor of the ATOMIC_LDOP iterator; this is simpler and more
    > logical for ldclr aka bic.
    >
    >          * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
    >          (aarch64_atomic_ldop_supported_p): Remove.
    >          (aarch64_gen_atomic_ldop): Remove.
    >          * config/aarch64/atomics.md (atomic_<atomic_optab><ALLI>):
    >          Fully expand LSE operations here.
    >          (atomic_fetch_<atomic_optab><ALLI>): Likewise.
    >          (atomic_<atomic_optab>_fetch<ALLI>): Likewise.
    >          (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): Drop atomic_op iterator
    >          and use ATOMIC_LDOP instead; use register_operand for the input;
    >          drop the split and emit insns directly.
    >          (aarch64_atomic_fetch_<ATOMIC_LDOP><ALLI>_lse): Likewise.
    >          (aarch64_atomic_<atomic_op>_fetch<ALLI>_lse): Remove.
    >          (@aarch64_atomic_load<ATOMIC_LDOP><ALLI>): Remove.
    >
    > From-SVN: r265660
    >
    >  From 53de1ea800db54b47290d578c43892799b66c8dc Mon Sep 17 00:00:00 2001
    > From: Richard Henderson <richard.henderson@linaro.org>
    > Date: Wed, 31 Oct 2018 23:11:22 +0000
    > Subject: [PATCH] aarch64: Remove early clobber from ATOMIC_LDOP scratch
    >
    >          * config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
    >          The scratch register need not be early-clobber.  Document the reason
    >          why we cannot use ST<OP>.
    >
    > From-SVN: r265703
    >
    >
    >
    >
    >
    > On 2/27/20, 12:06 PM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
    >
    >      Hi Sebastian,
    >
    >      On 2/27/20 4:53 PM, Pop, Sebastian wrote:
    >      >
    >      > Hi,
    >      >
    >      > is somebody already working on backporting -moutline-atomics to gcc
    >      > 8.x and 9.x branches?
    >      >
    >      I'm not aware of such work going on.
    >
    >      Thanks,
    >
    >      Kyrill
    >
    >      > Thanks,
    >      >
    >      > Sebastian
    >      >
    >
    >
    

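For readers following the thread: the -moutline-atomics scheme discussed above comes down to a one-byte flag that is set once at startup and tested before each atomic helper runs. What follows is a minimal C sketch of that idea, not the libgcc code. The names `have_lse_atomics`, `init_have_lse_atomics` and `fetch_add_dispatch` are hypothetical, and both branches use C11 atomics because portable C cannot name the LSE or load/store-exclusive instructions directly. On anything other than AArch64 Linux the flag simply stays false.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__aarch64__) && defined(__linux__)
# include <sys/auxv.h>
#endif

/* Hypothetical stand-in for the gating flag; false until initialized.  */
static bool have_lse_atomics;

/* Query the kernel's hardware capability bits once, at startup.  */
static void
init_have_lse_atomics (void)
{
#if defined(__aarch64__) && defined(AT_HWCAP) && defined(HWCAP_ATOMICS)
  have_lse_atomics = (getauxval (AT_HWCAP) & HWCAP_ATOMICS) != 0;
#else
  have_lse_atomics = false;	/* Old headers or another target.  */
#endif
}

/* Add V to *P and return the previous value.  */
static int
fetch_add_dispatch (atomic_int *p, int v)
{
  /* In the real helpers this branch would be a single LSE instruction.  */
  if (have_lse_atomics)
    return atomic_fetch_add_explicit (p, v, memory_order_seq_cst);

  /* Otherwise a retry loop, modeled here with a weak CAS.  */
  int old = atomic_load_explicit (p, memory_order_relaxed);
  while (!atomic_compare_exchange_weak_explicit (p, &old, old + v,
						 memory_order_seq_cst,
						 memory_order_relaxed))
    ;
  return old;
}
```

Both paths must give the same result; only the instruction sequence differs, which is why a binary built this way can run on cores with or without LSE.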

[-- Attachment #2: 0001-aarch64-Extend-R-for-integer-registers.patch --]
[-- Type: application/octet-stream, Size: 1766 bytes --]

From 8f806d9464ae450a209d94e98dac91a808674364 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:24 +0000
Subject: [PATCH 1/8] aarch64: Extend %R for integer registers

	* config/aarch64/aarch64.c (aarch64_print_operand): Allow integer
	registers with %R.

From-SVN: r275964
---
 gcc/config/aarch64/aarch64.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 2b09f317978..23846b2339d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7542,7 +7542,7 @@ sizetochar (int size)
      'S/T/U/V':		Print a FP/SIMD register name for a register list.
 			The register printed is the FP/SIMD register name
 			of X + 0/1/2/3 for S/T/U/V.
-     'R':		Print a scalar FP/SIMD register name + 1.
+     'R':		Print a scalar Integer/FP/SIMD register name + 1.
      'X':		Print bottom 16 bits of integer constant in hex.
      'w/x':		Print a general register name or the zero register
 			(32-bit or 64-bit).
@@ -7734,12 +7734,13 @@ aarch64_print_operand (FILE *f, rtx x, int code)
       break;
 
     case 'R':
-      if (!REG_P (x) || !FP_REGNUM_P (REGNO (x)))
-	{
-	  output_operand_lossage ("incompatible floating point / vector register operand for '%%%c'", code);
-	  return;
-	}
-      asm_fprintf (f, "q%d", REGNO (x) - V0_REGNUM + 1);
+      if (REG_P (x) && FP_REGNUM_P (REGNO (x)))
+	asm_fprintf (f, "q%d", REGNO (x) - V0_REGNUM + 1);
+      else if (REG_P (x) && GP_REGNUM_P (REGNO (x)))
+	asm_fprintf (f, "x%d", REGNO (x) - R0_REGNUM + 1);
+      else
+	output_operand_lossage ("incompatible register operand for '%%%c'",
+				code);
       break;
 
     case 'X':
-- 
2.20.1
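
As an aside on patch 1/8: the extended %R modifier prints the name of the register that follows the operand, which is what a paired instruction such as `casp` needs for the second half of a register pair. A minimal standalone model of that selection logic is sketched below. The register numbering (32 general registers followed by 32 FP/SIMD registers) and the function name `print_next_reg` are assumptions made for this sketch, not GCC's definitions.

```c
#include <stdio.h>

/* Assumed numbering for this sketch only.  */
enum { R0_REGNUM = 0, V0_REGNUM = 32, NUM_REGS = 64 };

/* Write the name of the register after REGNO into BUF.
   Return 0 on success, -1 if REGNO is in neither register class.  */
static int
print_next_reg (char *buf, size_t len, int regno)
{
  if (regno >= V0_REGNUM && regno < NUM_REGS)
    snprintf (buf, len, "q%d", regno - V0_REGNUM + 1);
  else if (regno >= R0_REGNUM && regno < V0_REGNUM)
    snprintf (buf, len, "x%d", regno - R0_REGNUM + 1);
  else
    return -1;		/* Corresponds to the output_operand_lossage case.  */
  return 0;
}
```

With this numbering, register 2 yields "x3" and register 32 yields "q1".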


[-- Attachment #3: 0002-aarch64-Implement-TImode-compare-and-swap.patch --]
[-- Type: application/octet-stream, Size: 9163 bytes --]

From 6d9363c6a96016f460cb17e18e17d29000f1db52 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:29 +0000
Subject: [PATCH 2/8] aarch64: Implement TImode compare-and-swap

This pattern will only be used with the __sync functions, because
we do not yet have a bare TImode atomic load.

	* config/aarch64/aarch64.c (aarch64_gen_compare_reg): Add support
	for NE comparison of TImode values.
	(aarch64_emit_load_exclusive): Add support for TImode.
	(aarch64_emit_store_exclusive): Likewise.
	(aarch64_split_compare_and_swap): Disable strong_zero_p for TImode.
	* config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI_TI>):
	Change iterator from ALLI to ALLI_TI.
	(@atomic_compare_and_swap<JUST_TI>): New.
	(@atomic_compare_and_swap<JUST_TI>_lse): New.
	(aarch64_load_exclusive_pair): New.
	(aarch64_store_exclusive_pair): New.
	* config/aarch64/iterators.md (JUST_TI): New.

From-SVN: r275965
---
 gcc/config/aarch64/aarch64.c    | 47 ++++++++++++++---
 gcc/config/aarch64/atomics.md   | 91 +++++++++++++++++++++++++++++++--
 gcc/config/aarch64/iterators.md |  3 ++
 3 files changed, 130 insertions(+), 11 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 23846b2339d..ae9f567cac7 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1879,10 +1879,33 @@ emit_set_insn (rtx x, rtx y)
 rtx
 aarch64_gen_compare_reg (RTX_CODE code, rtx x, rtx y)
 {
-  machine_mode mode = SELECT_CC_MODE (code, x, y);
-  rtx cc_reg = gen_rtx_REG (mode, CC_REGNUM);
+  machine_mode cmp_mode = GET_MODE (x);
+  machine_mode cc_mode;
+  rtx cc_reg;
 
-  emit_set_insn (cc_reg, gen_rtx_COMPARE (mode, x, y));
+  if (cmp_mode == TImode)
+    {
+      gcc_assert (code == NE);
+
+      cc_mode = CCmode;
+      cc_reg = gen_rtx_REG (cc_mode, CC_REGNUM);
+
+      rtx x_lo = operand_subword (x, 0, 0, TImode);
+      rtx y_lo = operand_subword (y, 0, 0, TImode);
+      emit_set_insn (cc_reg, gen_rtx_COMPARE (cc_mode, x_lo, y_lo));
+
+      rtx x_hi = operand_subword (x, 1, 0, TImode);
+      rtx y_hi = operand_subword (y, 1, 0, TImode);
+      emit_insn (gen_ccmpdi (cc_reg, cc_reg, x_hi, y_hi,
+			     gen_rtx_EQ (cc_mode, cc_reg, const0_rtx),
+			     GEN_INT (AARCH64_EQ)));
+    }
+  else
+    {
+      cc_mode = SELECT_CC_MODE (code, x, y);
+      cc_reg = gen_rtx_REG (cc_mode, CC_REGNUM);
+      emit_set_insn (cc_reg, gen_rtx_COMPARE (cc_mode, x, y));
+    }
   return cc_reg;
 }
 
@@ -15428,16 +15451,26 @@ static void
 aarch64_emit_load_exclusive (machine_mode mode, rtx rval,
 			     rtx mem, rtx model_rtx)
 {
-  emit_insn (gen_aarch64_load_exclusive (mode, rval, mem, model_rtx));
+  if (mode == TImode)
+    emit_insn (gen_aarch64_load_exclusive_pair (gen_lowpart (DImode, rval),
+						gen_highpart (DImode, rval),
+						mem, model_rtx));
+  else
+    emit_insn (gen_aarch64_load_exclusive (mode, rval, mem, model_rtx));
 }
 
 /* Emit store exclusive.  */
 
 static void
 aarch64_emit_store_exclusive (machine_mode mode, rtx bval,
-			      rtx rval, rtx mem, rtx model_rtx)
+			      rtx mem, rtx rval, rtx model_rtx)
 {
-  emit_insn (gen_aarch64_store_exclusive (mode, bval, rval, mem, model_rtx));
+  if (mode == TImode)
+    emit_insn (gen_aarch64_store_exclusive_pair
+	       (bval, mem, operand_subword (rval, 0, 0, TImode),
+		operand_subword (rval, 1, 0, TImode), model_rtx));
+  else
+    emit_insn (gen_aarch64_store_exclusive (mode, bval, mem, rval, model_rtx));
 }
 
 /* Mark the previous jump instruction as unlikely.  */
@@ -15567,7 +15600,7 @@ aarch64_split_compare_and_swap (rtx operands[])
 	CBNZ	scratch, .label1
     .label2:
 	CMP	rval, 0.  */
-  bool strong_zero_p = !is_weak && oldval == const0_rtx;
+  bool strong_zero_p = !is_weak && oldval == const0_rtx && mode != TImode;
 
   label1 = NULL;
   if (!is_weak)
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 0f357662ac3..09d2a63c620 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -22,10 +22,10 @@
 
 (define_expand "@atomic_compare_and_swap<mode>"
   [(match_operand:SI 0 "register_operand" "")			;; bool out
-   (match_operand:ALLI 1 "register_operand" "")			;; val out
-   (match_operand:ALLI 2 "aarch64_sync_memory_operand" "")	;; memory
-   (match_operand:ALLI 3 "nonmemory_operand" "")		;; expected
-   (match_operand:ALLI 4 "aarch64_reg_or_zero" "")		;; desired
+   (match_operand:ALLI_TI 1 "register_operand" "")		;; val out
+   (match_operand:ALLI_TI 2 "aarch64_sync_memory_operand" "")	;; memory
+   (match_operand:ALLI_TI 3 "nonmemory_operand" "")		;; expected
+   (match_operand:ALLI_TI 4 "aarch64_reg_or_zero" "")		;; desired
    (match_operand:SI 5 "const_int_operand")			;; is_weak
    (match_operand:SI 6 "const_int_operand")			;; mod_s
    (match_operand:SI 7 "const_int_operand")]			;; mod_f
@@ -88,6 +88,30 @@
   }
 )
 
+(define_insn_and_split "@aarch64_compare_and_swap<mode>"
+  [(set (reg:CC CC_REGNUM)					;; bool out
+    (unspec_volatile:CC [(const_int 0)] UNSPECV_ATOMIC_CMPSW))
+   (set (match_operand:JUST_TI 0 "register_operand" "=&r")	;; val out
+    (match_operand:JUST_TI 1 "aarch64_sync_memory_operand" "+Q")) ;; memory
+   (set (match_dup 1)
+    (unspec_volatile:JUST_TI
+      [(match_operand:JUST_TI 2 "aarch64_reg_or_zero" "rZ")	;; expect
+       (match_operand:JUST_TI 3 "aarch64_reg_or_zero" "rZ")	;; desired
+       (match_operand:SI 4 "const_int_operand")			;; is_weak
+       (match_operand:SI 5 "const_int_operand")			;; mod_s
+       (match_operand:SI 6 "const_int_operand")]		;; mod_f
+      UNSPECV_ATOMIC_CMPSW))
+   (clobber (match_scratch:SI 7 "=&r"))]
+  ""
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+  {
+    aarch64_split_compare_and_swap (operands);
+    DONE;
+  }
+)
+
 (define_insn "@aarch64_compare_and_swap<mode>_lse"
   [(set (match_operand:SI 0 "register_operand" "+r")		;; val out
     (zero_extend:SI
@@ -133,6 +157,28 @@
     return "casal<atomic_sfx>\t%<w>0, %<w>2, %1";
 })
 
+(define_insn "@aarch64_compare_and_swap<mode>_lse"
+  [(set (match_operand:JUST_TI 0 "register_operand" "+r")	;; val out
+    (match_operand:JUST_TI 1 "aarch64_sync_memory_operand" "+Q")) ;; memory
+   (set (match_dup 1)
+    (unspec_volatile:JUST_TI
+      [(match_dup 0)						;; expect
+       (match_operand:JUST_TI 2 "register_operand" "r")		;; desired
+       (match_operand:SI 3 "const_int_operand")]		;; mod_s
+      UNSPECV_ATOMIC_CMPSW))]
+  "TARGET_LSE"
+{
+  enum memmodel model = memmodel_from_int (INTVAL (operands[3]));
+  if (is_mm_relaxed (model))
+    return "casp\t%0, %R0, %2, %R2, %1";
+  else if (is_mm_acquire (model) || is_mm_consume (model))
+    return "caspa\t%0, %R0, %2, %R2, %1";
+  else if (is_mm_release (model))
+    return "caspl\t%0, %R0, %2, %R2, %1";
+  else
+    return "caspal\t%0, %R0, %2, %R2, %1";
+})
+
 (define_expand "atomic_exchange<mode>"
  [(match_operand:ALLI 0 "register_operand" "")
   (match_operand:ALLI 1 "aarch64_sync_memory_operand" "")
@@ -581,6 +627,24 @@
   }
 )
 
+(define_insn "aarch64_load_exclusive_pair"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec_volatile:DI
+	  [(match_operand:TI 2 "aarch64_sync_memory_operand" "Q")
+	   (match_operand:SI 3 "const_int_operand")]
+	  UNSPECV_LX))
+   (set (match_operand:DI 1 "register_operand" "=r")
+	(unspec_volatile:DI [(match_dup 2) (match_dup 3)] UNSPECV_LX))]
+  ""
+  {
+    enum memmodel model = memmodel_from_int (INTVAL (operands[3]));
+    if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_release (model))
+      return "ldxp\t%0, %1, %2";
+    else
+      return "ldaxp\t%0, %1, %2";
+  }
+)
+
 (define_insn "@aarch64_store_exclusive<mode>"
   [(set (match_operand:SI 0 "register_operand" "=&r")
     (unspec_volatile:SI [(const_int 0)] UNSPECV_SX))
@@ -599,6 +663,25 @@
   }
 )
 
+(define_insn "aarch64_store_exclusive_pair"
+  [(set (match_operand:SI 0 "register_operand" "=&r")
+	(unspec_volatile:SI [(const_int 0)] UNSPECV_SX))
+   (set (match_operand:TI 1 "aarch64_sync_memory_operand" "=Q")
+	(unspec_volatile:TI
+	  [(match_operand:DI 2 "aarch64_reg_or_zero" "rZ")
+	   (match_operand:DI 3 "aarch64_reg_or_zero" "rZ")
+	   (match_operand:SI 4 "const_int_operand")]
+	  UNSPECV_SX))]
+  ""
+  {
+    enum memmodel model = memmodel_from_int (INTVAL (operands[4]));
+    if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_acquire (model))
+      return "stxp\t%w0, %x2, %x3, %1";
+    else
+      return "stlxp\t%w0, %x2, %x3, %1";
+  }
+)
+
 (define_expand "mem_thread_fence"
   [(match_operand:SI 0 "const_int_operand" "")]
   ""
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 6caeeac8086..3bc49ea0238 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -29,6 +29,9 @@
 ;; Iterator for HI, SI, DI, some instructions can only work on these modes.
 (define_mode_iterator GPI_I16 [(HI "AARCH64_ISA_F16") SI DI])
 
+;; "Iterator" for just TI -- features like @pattern only work with iterators.
+(define_mode_iterator JUST_TI [TI])
+
 ;; Iterator for QI and HI modes
 (define_mode_iterator SHORT [QI HI])
 
-- 
2.20.1
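
As an aside on patch 2/8: the TImode path in aarch64_gen_compare_reg compares the low subwords first and then conditionally compares the high subwords, so the final flags read "not equal" if either half differs. The C sketch below models only that logic. The type `struct u128` and the function `ti_not_equal` are hypothetical names, and the sketch says nothing about how the flags are encoded.

```c
#include <stdbool.h>
#include <stdint.h>

/* A 128-bit value split into two 64-bit subwords (hypothetical type).  */
struct u128
{
  uint64_t lo;
  uint64_t hi;
};

/* Return true if X and Y differ in either half.  */
static bool
ti_not_equal (struct u128 x, struct u128 y)
{
  /* First compare: the low halves.  */
  bool equal = (x.lo == y.lo);

  /* Conditional compare: the high halves are only examined when the
     low halves matched; otherwise the "not equal" outcome stands.  */
  if (equal)
    equal = (x.hi == y.hi);

  return !equal;
}
```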


[-- Attachment #4: 0003-aarch64-Tidy-aarch64_split_compare_and_swap.patch --]
[-- Type: application/octet-stream, Size: 4393 bytes --]

From 5f3c1052ca4632872bceeb3724fdbdb9ecc619c2 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:33 +0000
Subject: [PATCH 3/8] aarch64: Tidy aarch64_split_compare_and_swap

With aarch64_track_speculation, we had extra code to do exactly what the
!strong_zero_p path already did.  The rest is reducing code duplication.

	* config/aarch64/aarch64.c (aarch64_split_compare_and_swap): Disable
	strong_zero_p for aarch64_track_speculation; unify some code paths;
	use aarch64_gen_compare_reg instead of open-coding.

From-SVN: r275966
---
 gcc/config/aarch64/aarch64.c | 50 ++++++++++--------------------------
 1 file changed, 14 insertions(+), 36 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ae9f567cac7..d5515f859e5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15573,13 +15573,11 @@ aarch64_split_compare_and_swap (rtx operands[])
   /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  */
   gcc_assert (epilogue_completed);
 
-  rtx rval, mem, oldval, newval, scratch;
+  rtx rval, mem, oldval, newval, scratch, x, model_rtx;
   machine_mode mode;
   bool is_weak;
   rtx_code_label *label1, *label2;
-  rtx x, cond;
   enum memmodel model;
-  rtx model_rtx;
 
   rval = operands[0];
   mem = operands[1];
@@ -15600,7 +15598,8 @@ aarch64_split_compare_and_swap (rtx operands[])
 	CBNZ	scratch, .label1
     .label2:
 	CMP	rval, 0.  */
-  bool strong_zero_p = !is_weak && oldval == const0_rtx && mode != TImode;
+  bool strong_zero_p = (!is_weak && !aarch64_track_speculation &&
+			oldval == const0_rtx && mode != TImode);
 
   label1 = NULL;
   if (!is_weak)
@@ -15613,35 +15612,20 @@ aarch64_split_compare_and_swap (rtx operands[])
   /* The initial load can be relaxed for a __sync operation since a final
      barrier will be emitted to stop code hoisting.  */
   if (is_mm_sync (model))
-    aarch64_emit_load_exclusive (mode, rval, mem,
-				 GEN_INT (MEMMODEL_RELAXED));
+    aarch64_emit_load_exclusive (mode, rval, mem, GEN_INT (MEMMODEL_RELAXED));
   else
     aarch64_emit_load_exclusive (mode, rval, mem, model_rtx);
 
   if (strong_zero_p)
-    {
-      if (aarch64_track_speculation)
-	{
-	  /* Emit an explicit compare instruction, so that we can correctly
-	     track the condition codes.  */
-	  rtx cc_reg = aarch64_gen_compare_reg (NE, rval, const0_rtx);
-	  x = gen_rtx_NE (GET_MODE (cc_reg), cc_reg, const0_rtx);
-	}
-      else
-	x = gen_rtx_NE (VOIDmode, rval, const0_rtx);
-
-      x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-				gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
-      aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
-    }
+    x = gen_rtx_NE (VOIDmode, rval, const0_rtx);
   else
     {
-      cond = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
-      x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
-      x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-				gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
-      aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+      rtx cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
+      x = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
     }
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			    gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
+  aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
 
   aarch64_emit_store_exclusive (mode, scratch, mem, newval, model_rtx);
 
@@ -15662,22 +15646,16 @@ aarch64_split_compare_and_swap (rtx operands[])
       aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
     }
   else
-    {
-      cond = gen_rtx_REG (CCmode, CC_REGNUM);
-      x = gen_rtx_COMPARE (CCmode, scratch, const0_rtx);
-      emit_insn (gen_rtx_SET (cond, x));
-    }
+    aarch64_gen_compare_reg (NE, scratch, const0_rtx);
 
   emit_label (label2);
+
   /* If we used a CBNZ in the exchange loop emit an explicit compare with RVAL
      to set the condition flags.  If this is not used it will be removed by
      later passes.  */
   if (strong_zero_p)
-    {
-      cond = gen_rtx_REG (CCmode, CC_REGNUM);
-      x = gen_rtx_COMPARE (CCmode, rval, const0_rtx);
-      emit_insn (gen_rtx_SET (cond, x));
-    }
+    aarch64_gen_compare_reg (NE, rval, const0_rtx);
+
   /* Emit any final barrier needed for a __sync operation.  */
   if (is_mm_sync (model))
     aarch64_emit_post_barrier (model);
-- 
2.20.1
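
As an aside on patch 3/8: the control flow that aarch64_split_compare_and_swap emits for a strong compare-and-swap is a retry loop. A value mismatch exits (the branch to label2), while a failed store-exclusive loops back (the branch to label1). The same shape can be sketched in portable C11 by building a strong CAS on top of a weak one. The helper name `strong_cas` is hypothetical, and this is an analogy to the generated code rather than a transcription of it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Strong compare-and-swap built from a weak one.
   Return true if *P was changed from EXPECTED to DESIRED.  */
static bool
strong_cas (atomic_int *p, int expected, int desired)
{
  for (;;)
    {
      int seen = expected;
      if (atomic_compare_exchange_weak (p, &seen, desired))
	return true;		/* The store succeeded.  */
      if (seen != expected)
	return false;		/* Genuine mismatch: leave the loop.  */
      /* Spurious failure: the value matched but the store did not
	 land, so try again.  */
    }
}
```

A weak CAS corresponds to the is_weak case, where no retry label is created and a failed store is simply reported to the caller.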


[-- Attachment #5: 0004-aarch64-Add-out-of-line-functions-for-LSE-atomics.patch --]
[-- Type: application/octet-stream, Size: 12042 bytes --]

From e12f5f09ae40201a3a95179c223a93becd9ee67f Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:38 +0000
Subject: [PATCH 4/8] aarch64: Add out-of-line functions for LSE atomics

This is the libgcc part of the interface -- providing the functions.
Rationale is provided at the top of libgcc/config/aarch64/lse.S.

	* config/aarch64/lse-init.c: New file.
	* config/aarch64/lse.S: New file.
	* config/aarch64/t-lse: New file.
	* config.host: Add t-lse to all aarch64 tuples.

From-SVN: r275967
---
 libgcc/config.host               |   4 +
 libgcc/config/aarch64/lse-init.c |  45 ++++++
 libgcc/config/aarch64/lse.S      | 235 +++++++++++++++++++++++++++++++
 libgcc/config/aarch64/t-lse      |  44 ++++++
 4 files changed, 328 insertions(+)
 create mode 100644 libgcc/config/aarch64/lse-init.c
 create mode 100644 libgcc/config/aarch64/lse.S
 create mode 100644 libgcc/config/aarch64/t-lse

diff --git a/libgcc/config.host b/libgcc/config.host
index 0f15fda3612..18e306b48a5 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -347,23 +347,27 @@ aarch64*-*-elf | aarch64*-*-rtems*)
 	extra_parts="$extra_parts crtbegin.o crtend.o crti.o crtn.o"
 	extra_parts="$extra_parts crtfastmath.o"
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	md_unwind_header=aarch64/aarch64-unwind.h
 	;;
 aarch64*-*-freebsd*)
 	extra_parts="$extra_parts crtfastmath.o"
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	md_unwind_header=aarch64/freebsd-unwind.h
 	;;
 aarch64*-*-fuchsia*)
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp"
 	;;
 aarch64*-*-linux*)
 	extra_parts="$extra_parts crtfastmath.o"
 	md_unwind_header=aarch64/linux-unwind.h
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	;;
 alpha*-*-linux*)
diff --git a/libgcc/config/aarch64/lse-init.c b/libgcc/config/aarch64/lse-init.c
new file mode 100644
index 00000000000..33d29147479
--- /dev/null
+++ b/libgcc/config/aarch64/lse-init.c
@@ -0,0 +1,45 @@
+/* Out-of-line LSE atomics for AArch64 architecture, Init.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   Contributed by Linaro Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/* Define the symbol gating the LSE implementations.  */
+_Bool __aarch64_have_lse_atomics
+  __attribute__((visibility("hidden"), nocommon));
+
+/* Disable initialization of __aarch64_have_lse_atomics during bootstrap.  */
+#ifndef inhibit_libc
+# include <sys/auxv.h>
+
+/* Disable initialization if the system headers are too old.  */
+# if defined(AT_HWCAP) && defined(HWCAP_ATOMICS)
+
+static void __attribute__((constructor))
+init_have_lse_atomics (void)
+{
+  unsigned long hwcap = getauxval (AT_HWCAP);
+  __aarch64_have_lse_atomics = (hwcap & HWCAP_ATOMICS) != 0;
+}
+
+# endif /* HWCAP */
+#endif /* inhibit_libc */
diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
new file mode 100644
index 00000000000..a5f6673596c
--- /dev/null
+++ b/libgcc/config/aarch64/lse.S
@@ -0,0 +1,235 @@
+/* Out-of-line LSE atomics for AArch64 architecture.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   Contributed by Linaro Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/*
+ * The problem that we are trying to solve is operating system deployment
+ * of ARMv8.1-Atomics, also known as Large System Extensions (LSE).
+ *
+ * There are a number of potential solutions for this problem which have
+ * been proposed and rejected for various reasons.  To recap:
+ *
+ * (1) Multiple builds.  The dynamic linker will examine /lib64/atomics/
+ * if HWCAP_ATOMICS is set, allowing entire libraries to be overwritten.
+ * However, not all Linux distributions are happy with multiple builds,
+ * and anyway it has no effect on main applications.
+ *
+ * (2) IFUNC.  We could put these functions into libgcc_s.so, and have
+ * a single copy of each function for all DSOs.  However, ARM is concerned
+ * that the branch-to-indirect-branch that is implied by using a PLT,
+ * as required by IFUNC, is too much overhead for smaller cpus.
+ *
+ * (3) Statically predicted direct branches.  This is the approach that
+ * is taken here.  These functions are linked into every DSO that uses them.
+ * All of the symbols are hidden, so that the functions are called via a
+ * direct branch.  The choice of LSE vs non-LSE is done via one byte load
+ * followed by a well-predicted direct branch.  The functions are compiled
+ * separately to minimize code size.
+ */
+
+/* Tell the assembler to accept LSE instructions.  */
+	.arch armv8-a+lse
+
+/* Declare the symbol gating the LSE implementations.  */
+	.hidden	__aarch64_have_lse_atomics
+
+/* Turn size and memory model defines into mnemonic fragments.  */
+#if SIZE == 1
+# define S     b
+# define UXT   uxtb
+#elif SIZE == 2
+# define S     h
+# define UXT   uxth
+#elif SIZE == 4 || SIZE == 8 || SIZE == 16
+# define S
+# define UXT   mov
+#else
+# error
+#endif
+
+#if MODEL == 1
+# define SUFF  _relax
+# define A
+# define L
+#elif MODEL == 2
+# define SUFF  _acq
+# define A     a
+# define L
+#elif MODEL == 3
+# define SUFF  _rel
+# define A
+# define L     l
+#elif MODEL == 4
+# define SUFF  _acq_rel
+# define A     a
+# define L     l
+#else
+# error
+#endif
+
+/* Concatenate symbols.  */
+#define glue2_(A, B)		A ## B
+#define glue2(A, B)		glue2_(A, B)
+#define glue3_(A, B, C)		A ## B ## C
+#define glue3(A, B, C)		glue3_(A, B, C)
+#define glue4_(A, B, C, D)	A ## B ## C ## D
+#define glue4(A, B, C, D)	glue4_(A, B, C, D)
+
+/* Select the size of a register, given a regno.  */
+#define x(N)			glue2(x, N)
+#define w(N)			glue2(w, N)
+#if SIZE < 8
+# define s(N)			w(N)
+#else
+# define s(N)			x(N)
+#endif
+
+#define NAME(BASE)		glue4(__aarch64_, BASE, SIZE, SUFF)
+#define LDXR			glue4(ld, A, xr, S)
+#define STXR			glue4(st, L, xr, S)
+
+/* Temporary registers used.  Other than these, only the return value
+   register (x0) and the flags are modified.  */
+#define tmp0	16
+#define tmp1	17
+#define tmp2	15
+
+/* Start and end a function.  */
+.macro	STARTFN name
+	.text
+	.balign	16
+	.globl	\name
+	.hidden	\name
+	.type	\name, %function
+	.cfi_startproc
+\name:
+.endm
+
+.macro	ENDFN name
+	.cfi_endproc
+	.size	\name, . - \name
+.endm
+
+/* Branch to LABEL if LSE is disabled.  */
+.macro	JUMP_IF_NOT_LSE label
+	adrp	x(tmp0), __aarch64_have_lse_atomics
+	ldrb	w(tmp0), [x(tmp0), :lo12:__aarch64_have_lse_atomics]
+	cbz	w(tmp0), \label
+.endm
+
+#ifdef L_cas
+
+STARTFN	NAME(cas)
+	JUMP_IF_NOT_LSE	8f
+
+#if SIZE < 16
+#define CAS	glue4(cas, A, L, S)
+
+	CAS		s(0), s(1), [x2]
+	ret
+
+8:	UXT		s(tmp0), s(0)
+0:	LDXR		s(0), [x2]
+	cmp		s(0), s(tmp0)
+	bne		1f
+	STXR		w(tmp1), s(1), [x2]
+	cbnz		w(tmp1), 0b
+1:	ret
+
+#else
+#define LDXP	glue3(ld, A, xp)
+#define STXP	glue3(st, L, xp)
+#define CASP	glue3(casp, A, L)
+
+	CASP		x0, x1, x2, x3, [x4]
+	ret
+
+8:	mov		x(tmp0), x0
+	mov		x(tmp1), x1
+0:	LDXP		x0, x1, [x4]
+	cmp		x0, x(tmp0)
+	ccmp		x1, x(tmp1), #0, eq
+	bne		1f
+	STXP		w(tmp2), x(tmp0), x(tmp1), [x4]
+	cbnz		w(tmp2), 0b
+1:	ret
+
+#endif
+
+ENDFN	NAME(cas)
+#endif
+
+#ifdef L_swp
+#define SWP	glue4(swp, A, L, S)
+
+STARTFN	NAME(swp)
+	JUMP_IF_NOT_LSE	8f
+
+	SWP		s(0), s(0), [x1]
+	ret
+
+8:	mov		s(tmp0), s(0)
+0:	LDXR		s(0), [x1]
+	STXR		w(tmp1), s(tmp0), [x1]
+	cbnz		w(tmp1), 0b
+	ret
+
+ENDFN	NAME(swp)
+#endif
+
+#if defined(L_ldadd) || defined(L_ldclr) \
+    || defined(L_ldeor) || defined(L_ldset)
+
+#ifdef L_ldadd
+#define LDNM	ldadd
+#define OP	add
+#elif defined(L_ldclr)
+#define LDNM	ldclr
+#define OP	bic
+#elif defined(L_ldeor)
+#define LDNM	ldeor
+#define OP	eor
+#elif defined(L_ldset)
+#define LDNM	ldset
+#define OP	orr
+#else
+#error
+#endif
+#define LDOP	glue4(LDNM, A, L, S)
+
+STARTFN	NAME(LDNM)
+	JUMP_IF_NOT_LSE	8f
+
+	LDOP		s(0), s(0), [x1]
+	ret
+
+8:	mov		s(tmp0), s(0)
+0:	LDXR		s(0), [x1]
+	OP		s(tmp1), s(0), s(tmp0)
+	STXR		w(tmp1), s(tmp1), [x1]
+	cbnz		w(tmp1), 0b
+	ret
+
+ENDFN	NAME(LDNM)
+#endif
diff --git a/libgcc/config/aarch64/t-lse b/libgcc/config/aarch64/t-lse
new file mode 100644
index 00000000000..fe3868dacbf
--- /dev/null
+++ b/libgcc/config/aarch64/t-lse
@@ -0,0 +1,44 @@
+# Out-of-line LSE atomics for AArch64 architecture.
+# Copyright (C) 2019 Free Software Foundation, Inc.
+# Contributed by Linaro Ltd.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Compare-and-swap has 5 sizes and 4 memory models.
+S0 := $(foreach s, 1 2 4 8 16, $(addsuffix _$(s), cas))
+O0 := $(foreach m, 1 2 3 4, $(addsuffix _$(m)$(objext), $(S0)))
+
+# Swap and load-and-operate have 4 sizes and 4 memory models.
+S1 := $(foreach s, 1 2 4 8, $(addsuffix _$(s), swp ldadd ldclr ldeor ldset))
+O1 := $(foreach m, 1 2 3 4, $(addsuffix _$(m)$(objext), $(S1)))
+
+LSE_OBJS := $(O0) $(O1)
+
+libgcc-objects += $(LSE_OBJS) lse-init$(objext)
+
+empty      =
+space      = $(empty) $(empty)
+PAT_SPLIT  = $(subst _,$(space),$(*F))
+PAT_BASE   = $(word 1,$(PAT_SPLIT))
+PAT_N      = $(word 2,$(PAT_SPLIT))
+PAT_M      = $(word 3,$(PAT_SPLIT))
+
+lse-init$(objext): $(srcdir)/config/aarch64/lse-init.c
+	$(gcc_compile) -c $<
+
+$(LSE_OBJS): $(srcdir)/config/aarch64/lse.S
+	$(gcc_compile) -DL_$(PAT_BASE) -DSIZE=$(PAT_N) -DMODEL=$(PAT_M) -c $<
-- 
2.20.1
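
Note on the helper naming above: t-lse builds one object per
(operation, size, memory model) triple, and NAME(BASE) in lse.S pastes
"__aarch64_", the base name, SIZE and SUFF together.  The following is
a minimal C sketch of that scheme, not code from the patch; the
functions helper_name and count_helpers are hypothetical names used
only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Suffixes selected by MODEL == 1..4 in lse.S.  */
static const char *const model_suffix[4] = {
  "_relax", "_acq", "_rel", "_acq_rel"
};

/* Format one helper name the way NAME(BASE) does in lse.S:
   "__aarch64_" BASE SIZE SUFF.  MODEL is 1-based, as in t-lse.
   (helper_name is a hypothetical illustration, not a GCC function.)  */
static int
helper_name (char *buf, size_t len, const char *base, int size, int model)
{
  assert (model >= 1 && model <= 4);
  return snprintf (buf, len, "__aarch64_%s%d%s",
		   base, size, model_suffix[model - 1]);
}

/* Count the objects listed in LSE_OBJS: cas has 5 sizes, the other
   five operations have 4 sizes, and each has 4 memory models.  */
static int
count_helpers (void)
{
  static const int cas_sizes[] = { 1, 2, 4, 8, 16 };
  static const int op_sizes[] = { 1, 2, 4, 8 };
  static const char *const ops[] = {
    "swp", "ldadd", "ldclr", "ldeor", "ldset"
  };
  int n = 0;

  for (size_t s = 0; s < sizeof cas_sizes / sizeof cas_sizes[0]; s++)
    for (int m = 1; m <= 4; m++)
      n++;

  for (size_t o = 0; o < sizeof ops / sizeof ops[0]; o++)
    for (size_t s = 0; s < sizeof op_sizes / sizeof op_sizes[0]; s++)
      for (int m = 1; m <= 4; m++)
	n++;

  return n;
}
```

Under these assumptions the 16-byte acquire-release compare-and-swap
helper is named __aarch64_cas16_acq_rel, and LSE_OBJS holds
5*4 + 5*4*4 = 100 objects in addition to lse-init.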


[-- Attachment #6: 0005-aarch64-Implement-moutline-atomics.patch --]
[-- Type: application/octet-stream, Size: 20911 bytes --]

From 6db8dfc27d0824af94fbc712615dcdcdac865f19 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:43 +0000
Subject: [PATCH 5/8] aarch64: Implement -moutline-atomics

	* config/aarch64/aarch64.opt (-moutline-atomics): New.
	* config/aarch64/aarch64.c (aarch64_atomic_ool_func): New.
	(aarch64_ool_cas_names, aarch64_ool_swp_names): New.
	(aarch64_ool_ldadd_names, aarch64_ool_ldset_names): New.
	(aarch64_ool_ldclr_names, aarch64_ool_ldeor_names): New.
	(aarch64_expand_compare_and_swap): Honor TARGET_OUTLINE_ATOMICS.
	* config/aarch64/atomics.md (atomic_exchange<ALLI>): Likewise.
	(atomic_<atomic_op><ALLI>): Likewise.
	(atomic_fetch_<atomic_op><ALLI>): Likewise.
	(atomic_<atomic_op>_fetch<ALLI>): Likewise.
	* doc/invoke.texi: Document -moutline-atomics.
testsuite/
	* gcc.target/aarch64/atomic-op-acq_rel.c: Use -mno-outline-atomics.
	* gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Likewise.
	* gcc.target/aarch64/atomic-op-acquire.c: Likewise.
	* gcc.target/aarch64/atomic-op-char.c: Likewise.
	* gcc.target/aarch64/atomic-op-consume.c: Likewise.
	* gcc.target/aarch64/atomic-op-imm.c: Likewise.
	* gcc.target/aarch64/atomic-op-int.c: Likewise.
	* gcc.target/aarch64/atomic-op-long.c: Likewise.
	* gcc.target/aarch64/atomic-op-relaxed.c: Likewise.
	* gcc.target/aarch64/atomic-op-release.c: Likewise.
	* gcc.target/aarch64/atomic-op-seq_cst.c: Likewise.
	* gcc.target/aarch64/atomic-op-short.c: Likewise.
	* gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: Likewise.
	* gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: Likewise.
	* gcc.target/aarch64/sync-comp-swap.c: Likewise.
	* gcc.target/aarch64/sync-op-acquire.c: Likewise.
	* gcc.target/aarch64/sync-op-full.c: Likewise.

From-SVN: r275968
---
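Note on aarch64_atomic_ool_func: the lookup is a plain 5x4 table indexed
by access size and by a collapsed memory model (consume is treated as
acquire, seq_cst as acq_rel).  The sketch below mirrors that mapping
outside of GCC.  It is only an illustration: the sketch_* names and the
sketch_memmodel enum are hypothetical stand-ins for machine_mode and
memmodel, and the size is passed in bytes instead of as a mode.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for GCC's memmodel enum.  */
enum sketch_memmodel
{
  SK_RELAXED, SK_CONSUME, SK_ACQUIRE, SK_RELEASE, SK_ACQ_REL, SK_SEQ_CST
};

struct sketch_ool_names
{
  const char *str[5][4];
};

/* Same table-building macros as in the patch.  */
#define DEF0(B, N) \
  { "__aarch64_" #B #N "_relax", \
    "__aarch64_" #B #N "_acq", \
    "__aarch64_" #B #N "_rel", \
    "__aarch64_" #B #N "_acq_rel" }
#define DEF4(B)  DEF0(B, 1), DEF0(B, 2), DEF0(B, 4), DEF0(B, 8), \
		 { NULL, NULL, NULL, NULL }
#define DEF5(B)  DEF0(B, 1), DEF0(B, 2), DEF0(B, 4), DEF0(B, 8), DEF0(B, 16)

static const struct sketch_ool_names sketch_cas_names = { { DEF5(cas) } };
static const struct sketch_ool_names sketch_swp_names = { { DEF4(swp) } };

/* Return the helper name for SIZE_BYTES and MODEL, or NULL when the
   table has no entry (GCC itself uses gcc_unreachable for bad input).  */
static const char *
sketch_ool_name (int size_bytes, enum sketch_memmodel model,
		 const struct sketch_ool_names *names)
{
  int mode_idx, model_idx;

  assert (names != NULL);

  switch (size_bytes)
    {
    case 1: mode_idx = 0; break;	/* QImode */
    case 2: mode_idx = 1; break;	/* HImode */
    case 4: mode_idx = 2; break;	/* SImode */
    case 8: mode_idx = 3; break;	/* DImode */
    case 16: mode_idx = 4; break;	/* TImode */
    default: return NULL;
    }

  switch (model)
    {
    case SK_RELAXED: model_idx = 0; break;
    case SK_CONSUME:
    case SK_ACQUIRE: model_idx = 1; break;
    case SK_RELEASE: model_idx = 2; break;
    case SK_ACQ_REL:
    case SK_SEQ_CST: model_idx = 3; break;
    default: return NULL;
    }

  return names->str[mode_idx][model_idx];
}
```

Only the cas table has a 16-byte row; the DEF4 tables carry a NULL row
there, which is why the TImode case is reachable only through
compare-and-swap.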
 gcc/config/aarch64/aarch64-protos.h           | 13 +++
 gcc/config/aarch64/aarch64.c                  | 87 +++++++++++++++++
 gcc/config/aarch64/aarch64.opt                |  3 +
 gcc/config/aarch64/atomics.md                 | 94 +++++++++++++++++--
 gcc/doc/invoke.texi                           | 16 +++-
 .../atomic-comp-swap-release-acquire.c        |  2 +-
 .../gcc.target/aarch64/atomic-op-acq_rel.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-acquire.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-char.c       |  2 +-
 .../gcc.target/aarch64/atomic-op-consume.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-imm.c        |  2 +-
 .../gcc.target/aarch64/atomic-op-int.c        |  2 +-
 .../gcc.target/aarch64/atomic-op-long.c       |  2 +-
 .../gcc.target/aarch64/atomic-op-relaxed.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-release.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-seq_cst.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-short.c      |  2 +-
 .../aarch64/atomic_cmp_exchange_zero_reg_1.c  |  2 +-
 .../atomic_cmp_exchange_zero_strong_1.c       |  2 +-
 .../gcc.target/aarch64/sync-comp-swap.c       |  2 +-
 .../gcc.target/aarch64/sync-op-acquire.c      |  2 +-
 .../gcc.target/aarch64/sync-op-full.c         |  2 +-
 22 files changed, 221 insertions(+), 26 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index c083cad5327..b9bfb281275 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -644,4 +644,17 @@ poly_uint64 aarch64_regmode_natural_size (machine_mode);
 
 bool aarch64_high_bits_all_ones_p (HOST_WIDE_INT);
 
+struct atomic_ool_names
+{
+    const char *str[5][4];
+};
+
+rtx aarch64_atomic_ool_func(machine_mode mode, rtx model_rtx,
+			    const atomic_ool_names *names);
+extern const atomic_ool_names aarch64_ool_swp_names;
+extern const atomic_ool_names aarch64_ool_ldadd_names;
+extern const atomic_ool_names aarch64_ool_ldset_names;
+extern const atomic_ool_names aarch64_ool_ldclr_names;
+extern const atomic_ool_names aarch64_ool_ldeor_names;
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d5515f859e5..f81c2947f16 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15482,6 +15482,82 @@ aarch64_emit_unlikely_jump (rtx insn)
   add_reg_br_prob_note (jump, profile_probability::very_unlikely ());
 }
 
+/* We store the names of the various atomic helpers in a 5x4 array.
+   Return the libcall function given MODE, MODEL and NAMES.  */
+
+rtx
+aarch64_atomic_ool_func(machine_mode mode, rtx model_rtx,
+			const atomic_ool_names *names)
+{
+  memmodel model = memmodel_base (INTVAL (model_rtx));
+  int mode_idx, model_idx;
+
+  switch (mode)
+    {
+    case E_QImode:
+      mode_idx = 0;
+      break;
+    case E_HImode:
+      mode_idx = 1;
+      break;
+    case E_SImode:
+      mode_idx = 2;
+      break;
+    case E_DImode:
+      mode_idx = 3;
+      break;
+    case E_TImode:
+      mode_idx = 4;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  switch (model)
+    {
+    case MEMMODEL_RELAXED:
+      model_idx = 0;
+      break;
+    case MEMMODEL_CONSUME:
+    case MEMMODEL_ACQUIRE:
+      model_idx = 1;
+      break;
+    case MEMMODEL_RELEASE:
+      model_idx = 2;
+      break;
+    case MEMMODEL_ACQ_REL:
+    case MEMMODEL_SEQ_CST:
+      model_idx = 3;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  return init_one_libfunc_visibility (names->str[mode_idx][model_idx],
+				      VISIBILITY_HIDDEN);
+}
+
+#define DEF0(B, N) \
+  { "__aarch64_" #B #N "_relax", \
+    "__aarch64_" #B #N "_acq", \
+    "__aarch64_" #B #N "_rel", \
+    "__aarch64_" #B #N "_acq_rel" }
+
+#define DEF4(B)  DEF0(B, 1), DEF0(B, 2), DEF0(B, 4), DEF0(B, 8), \
+		 { NULL, NULL, NULL, NULL }
+#define DEF5(B)  DEF0(B, 1), DEF0(B, 2), DEF0(B, 4), DEF0(B, 8), DEF0(B, 16)
+
+static const atomic_ool_names aarch64_ool_cas_names = { { DEF5(cas) } };
+const atomic_ool_names aarch64_ool_swp_names = { { DEF4(swp) } };
+const atomic_ool_names aarch64_ool_ldadd_names = { { DEF4(ldadd) } };
+const atomic_ool_names aarch64_ool_ldset_names = { { DEF4(ldset) } };
+const atomic_ool_names aarch64_ool_ldclr_names = { { DEF4(ldclr) } };
+const atomic_ool_names aarch64_ool_ldeor_names = { { DEF4(ldeor) } };
+
+#undef DEF0
+#undef DEF4
+#undef DEF5
+
 /* Expand a compare and swap pattern.  */
 
 void
@@ -15528,6 +15604,17 @@ aarch64_expand_compare_and_swap (rtx operands[])
 						   newval, mod_s));
       cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
     }
+  else if (TARGET_OUTLINE_ATOMICS)
+    {
+      /* Oldval must satisfy compare afterward.  */
+      if (!aarch64_plus_operand (oldval, mode))
+	oldval = force_reg (mode, oldval);
+      rtx func = aarch64_atomic_ool_func (mode, mod_s, &aarch64_ool_cas_names);
+      rval = emit_library_call_value (func, NULL_RTX, LCT_NORMAL, r_mode,
+				      oldval, mode, newval, mode,
+				      XEXP (mem, 0), Pmode);
+      cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
+    }
   else
     {
       /* The oldval predicate varies by mode.  Test it and force to reg.  */
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 3c6d1cc90ad..f474a28eb92 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -255,3 +255,6 @@ user-land code.
 TargetVariable
 long aarch64_stack_protector_guard_offset = 0
 
+moutline-atomics
+Target Report Mask(OUTLINE_ATOMICS) Save
+Generate local calls to out-of-line atomic operations.
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 09d2a63c620..cabcc58f1a0 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -186,16 +186,27 @@
   (match_operand:SI 3 "const_int_operand" "")]
   ""
   {
-    rtx (*gen) (rtx, rtx, rtx, rtx);
-
     /* Use an atomic SWP when available.  */
     if (TARGET_LSE)
-      gen = gen_aarch64_atomic_exchange<mode>_lse;
+      {
+	emit_insn (gen_aarch64_atomic_exchange<mode>_lse
+		   (operands[0], operands[1], operands[2], operands[3]));
+      }
+    else if (TARGET_OUTLINE_ATOMICS)
+      {
+	machine_mode mode = <MODE>mode;
+	rtx func = aarch64_atomic_ool_func (mode, operands[3],
+					    &aarch64_ool_swp_names);
+	rtx rval = emit_library_call_value (func, operands[0], LCT_NORMAL,
+					    mode, operands[2], mode,
+					    XEXP (operands[1], 0), Pmode);
+        emit_move_insn (operands[0], rval);
+      }
     else
-      gen = gen_aarch64_atomic_exchange<mode>;
-
-    emit_insn (gen (operands[0], operands[1], operands[2], operands[3]));
-
+      {
+	emit_insn (gen_aarch64_atomic_exchange<mode>
+		   (operands[0], operands[1], operands[2], operands[3]));
+      }
     DONE;
   }
 )
@@ -280,6 +291,39 @@
 	  }
 	operands[1] = force_reg (<MODE>mode, operands[1]);
       }
+    else if (TARGET_OUTLINE_ATOMICS)
+      {
+        const atomic_ool_names *names;
+	switch (<CODE>)
+	  {
+	  case MINUS:
+	    operands[1] = expand_simple_unop (<MODE>mode, NEG, operands[1],
+					      NULL, 1);
+	    /* fallthru */
+	  case PLUS:
+	    names = &aarch64_ool_ldadd_names;
+	    break;
+	  case IOR:
+	    names = &aarch64_ool_ldset_names;
+	    break;
+	  case XOR:
+	    names = &aarch64_ool_ldeor_names;
+	    break;
+	  case AND:
+	    operands[1] = expand_simple_unop (<MODE>mode, NOT, operands[1],
+					      NULL, 1);
+	    names = &aarch64_ool_ldclr_names;
+	    break;
+	  default:
+	    gcc_unreachable ();
+	  }
+        machine_mode mode = <MODE>mode;
+	rtx func = aarch64_atomic_ool_func (mode, operands[2], names);
+	emit_library_call_value (func, NULL_RTX, LCT_NORMAL, mode,
+				 operands[1], mode,
+				 XEXP (operands[0], 0), Pmode);
+        DONE;
+      }
     else
       gen = gen_aarch64_atomic_<atomic_optab><mode>;
 
@@ -405,6 +449,40 @@
 	}
       operands[2] = force_reg (<MODE>mode, operands[2]);
     }
+  else if (TARGET_OUTLINE_ATOMICS)
+    {
+      const atomic_ool_names *names;
+      switch (<CODE>)
+	{
+	case MINUS:
+	  operands[2] = expand_simple_unop (<MODE>mode, NEG, operands[2],
+					    NULL, 1);
+	  /* fallthru */
+	case PLUS:
+	  names = &aarch64_ool_ldadd_names;
+	  break;
+	case IOR:
+	  names = &aarch64_ool_ldset_names;
+	  break;
+	case XOR:
+	  names = &aarch64_ool_ldeor_names;
+	  break;
+	case AND:
+	  operands[2] = expand_simple_unop (<MODE>mode, NOT, operands[2],
+					    NULL, 1);
+	  names = &aarch64_ool_ldclr_names;
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+      machine_mode mode = <MODE>mode;
+      rtx func = aarch64_atomic_ool_func (mode, operands[3], names);
+      rtx rval = emit_library_call_value (func, operands[0], LCT_NORMAL, mode,
+					  operands[2], mode,
+					  XEXP (operands[1], 0), Pmode);
+      emit_move_insn (operands[0], rval);
+      DONE;
+    }
   else
     gen = gen_aarch64_atomic_fetch_<atomic_optab><mode>;
 
@@ -494,7 +572,7 @@
 {
   /* Use an atomic load-operate instruction when possible.  In this case
      we will re-compute the result from the original mem value. */
-  if (TARGET_LSE)
+  if (TARGET_LSE || TARGET_OUTLINE_ATOMICS)
     {
       rtx tmp = gen_reg_rtx (<MODE>mode);
       operands[2] = force_reg (<MODE>mode, operands[2]);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0f6247caf51..792b768fceb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -637,7 +637,8 @@ Objective-C and Objective-C++ Dialects}.
 -march=@var{name}  -mcpu=@var{name}  -mtune=@var{name}  @gol
 -moverride=@var{string}  -mverbose-cost-dump @gol
 -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{sysreg} @gol
--mstack-protector-guard-offset=@var{offset} -mtrack-speculation }
+-mstack-protector-guard-offset=@var{offset} -mtrack-speculation @gol
+-moutline-atomics }
 
 @emph{Adapteva Epiphany Options}
 @gccoptlist{-mhalf-reg-file  -mprefer-short-insn-regs @gol
@@ -15782,6 +15783,19 @@ be used by the compiler when expanding calls to
 @code{__builtin_speculation_safe_copy} to permit a more efficient code
 sequence to be generated.
 
+@item -moutline-atomics
+@itemx -mno-outline-atomics
+Enable or disable calls to out-of-line helpers to implement atomic operations.
+These helpers will, at runtime, determine if the LSE instructions from
+ARMv8.1-A can be used; if not, they will use the load/store-exclusive
+instructions that are present in the base ARMv8.0 ISA.
+
+This option is only applicable when compiling for the base ARMv8.0
+instruction set.  If using a later revision, e.g. @option{-march=armv8.1-a}
+or @option{-march=armv8-a+lse}, the ARMv8.1-Atomics instructions will be
+used directly.  The same applies when using @option{-mcpu=} when the
+selected cpu supports the @samp{lse} feature.
+
 @item -march=@var{name}
 @opindex march
 Specify the name of the target architecture and, optionally, one or
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
index 49ca5d0d09c..a828a72aa75 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf -mno-outline-atomics" } */
 
 #include "atomic-comp-swap-release-acquire.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c
index 74f26348e42..6823ce381b2 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-acq_rel.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c
index 66c1b1efe20..87937de378a 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-acquire.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c
index c09d0434ecf..60955e57da3 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-char.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
index 5783ab84f5c..16cb11aeeaf 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-consume.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c
index 18b8f0b04e9..bcab4e481e3 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 int v = 0;
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c
index 8520f0839ba..040e4a8d168 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-int.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c
index d011f8c5ce2..fc88b92cd3e 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 long v = 0;
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c
index ed96bfdb978..503d62b0280 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-relaxed.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c
index fc4be17de89..efe14aea7e4 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-release.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c
index 613000fe490..09973bf82ba 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-seq_cst.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c
index e82c8118ece..e1dcebb0f89 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-short.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c
index f2a21ddf2e1..29246979bfb 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv8-a+nolse" } */
+/* { dg-options "-O2 -march=armv8-a+nolse -mno-outline-atomics" } */
 /* { dg-skip-if "" { *-*-* } { "-mcpu=*" } { "" } } */
 
 int
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c
index 8d2ae67dfbe..6daf9b08f5a 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv8-a+nolse" } */
+/* { dg-options "-O2 -march=armv8-a+nolse -mno-outline-atomics" } */
 /* { dg-skip-if "" { *-*-* } { "-mcpu=*" } { "" } } */
 
 int
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
index e571b2f13b3..f56415f3354 100644
--- a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
+++ b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf -mno-outline-atomics" } */
 
 #include "sync-comp-swap.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
index 357bf1be3b2..39b3144aa36 100644
--- a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "sync-op-acquire.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-full.c b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
index c6ba1629965..6b8b2043f40 100644
--- a/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "sync-op-full.x"
 
-- 
2.20.1
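
Note on the atomics.md expanders above: there are no out-of-line
helpers for subtract or and, so the MINUS case negates its operand and
calls ldadd, and the AND case inverts its operand and calls ldclr.  The
<op>_fetch patterns then recompute the result from the value the helper
returns.  A minimal C11 sketch of those identities follows; the model_*
and sketch_* functions are hypothetical and use <stdatomic.h> in place
of the real helpers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Model of an ldadd helper: stores old + arg, returns old.  */
static uint32_t
model_ldadd (_Atomic uint32_t *mem, uint32_t arg)
{
  return atomic_fetch_add (mem, arg);
}

/* Model of an ldclr helper: stores old & ~arg (bit clear), returns old.  */
static uint32_t
model_ldclr (_Atomic uint32_t *mem, uint32_t arg)
{
  return atomic_fetch_and (mem, ~arg);
}

/* MINUS case: negate the operand, then use ldadd.  */
static uint32_t
sketch_fetch_sub (_Atomic uint32_t *mem, uint32_t val)
{
  return model_ldadd (mem, (uint32_t) 0 - val);
}

/* AND case: invert the operand, then use ldclr.  */
static uint32_t
sketch_fetch_and (_Atomic uint32_t *mem, uint32_t val)
{
  return model_ldclr (mem, ~val);
}

/* <op>_fetch case: redo the operation on the returned old value.  */
static uint32_t
sketch_add_fetch (_Atomic uint32_t *mem, uint32_t val)
{
  return model_ldadd (mem, val) + val;
}
```

The arithmetic is unsigned, so the negation wraps modulo 2^32 and the
identity old + (-val) == old - val holds for every operand.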


[-- Attachment #7: 0006-Fix-shrinkwrapping-interactions-with-atomics-PR92692.patch --]
[-- Type: application/octet-stream, Size: 1529 bytes --]

From f8f2936f09484522044939e438d1b5a5df35eb46 Mon Sep 17 00:00:00 2001
From: Wilco Dijkstra <wdijkstr@arm.com>
Date: Fri, 17 Jan 2020 13:17:21 +0000
Subject: [PATCH 6/8] Fix shrinkwrapping interactions with atomics (PR92692)

The separate shrinkwrapping pass may insert stores in the middle
of atomics loops which can cause issues on some implementations.
Avoid this by delaying splitting atomics patterns until after
prolog/epilog generation.

gcc/
	PR target/92692
	* config/aarch64/aarch64.c (aarch64_split_compare_and_swap):
	Add assert to ensure prolog has been emitted.
	(aarch64_split_atomic_op): Likewise.
	* config/aarch64/atomics.md (aarch64_compare_and_swap<mode>):
	Use epilogue_completed rather than reload_completed.
	(aarch64_atomic_exchange<mode>): Likewise.
	(aarch64_atomic_<atomic_optab><mode>): Likewise.
	(atomic_nand<mode>): Likewise.
	(aarch64_atomic_fetch_<atomic_optab><mode>): Likewise.
	(atomic_fetch_nand<mode>): Likewise.
	(aarch64_atomic_<atomic_optab>_fetch<mode>): Likewise.
	(atomic_nand_fetch<mode>): Likewise.
---
 gcc/config/aarch64/atomics.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index cabcc58f1a0..1458bc00095 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -104,7 +104,7 @@
    (clobber (match_scratch:SI 7 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_compare_and_swap (operands);
-- 
2.20.1


[-- Attachment #8: 0007-aarch64-Fix-store-exclusive-in-load-operate-LSE-help.patch --]
[-- Type: application/octet-stream, Size: 881 bytes --]

From e817b86e534823ed51d49ef0b3cd7d1b02f83d83 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 25 Sep 2019 21:48:41 +0000
Subject: [PATCH 7/8] aarch64: Fix store-exclusive in load-operate LSE helpers

	PR target/91834
	* config/aarch64/lse.S (LDNM): Ensure STXR output does not
	overlap the inputs.

From-SVN: r276133
---
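Note on the loop this patch touches: the load-operate helpers test
__aarch64_have_lse_atomics and either issue a single LSE instruction or
fall back to a load-exclusive / operate / store-exclusive retry loop.
The C11 sketch below models that shape for a 4-byte relaxed ldset.  It
is an approximation under stated assumptions: sketch_have_lse and
sketch_ldset4 are hypothetical names, and a weak compare-exchange stands
in for the LDXR/STXR pair.  The point relevant to this fix is that the
store's success status is kept separate from the value being stored.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the hidden __aarch64_have_lse_atomics byte.  */
static bool sketch_have_lse;

/* Model of a 4-byte relaxed ldset helper: stores old | arg, returns old.  */
static uint32_t
sketch_ldset4 (_Atomic uint32_t *mem, uint32_t arg)
{
  if (sketch_have_lse)
    /* Fast path: one atomic read-modify-write.  */
    return atomic_fetch_or_explicit (mem, arg, memory_order_relaxed);

  /* Fallback path: retry until the store succeeds.  */
  uint32_t old = atomic_load_explicit (mem, memory_order_relaxed);
  uint32_t desired;	/* plays the role of tmp1 */
  bool stored;		/* plays the role of tmp2, the store status */

  do
    {
      desired = old | arg;
      /* On failure OLD is refreshed with the current memory value.  */
      stored = atomic_compare_exchange_weak_explicit (mem, &old, desired,
						     memory_order_relaxed,
						     memory_order_relaxed);
    }
  while (!stored);

  return old;
}
```

In the assembly, writing the status into the same register as the data
being stored is what the change from w(tmp1) to w(tmp2) avoids; the
sketch keeps the two in distinct variables for the same reason.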
 libgcc/config/aarch64/lse.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
index a5f6673596c..c7979382ad7 100644
--- a/libgcc/config/aarch64/lse.S
+++ b/libgcc/config/aarch64/lse.S
@@ -227,8 +227,8 @@ STARTFN	NAME(LDNM)
 8:	mov		s(tmp0), s(0)
 0:	LDXR		s(0), [x1]
 	OP		s(tmp1), s(0), s(tmp0)
-	STXR		w(tmp1), s(tmp1), [x1]
-	cbnz		w(tmp1), 0b
+	STXR		w(tmp2), s(tmp1), [x1]
+	cbnz		w(tmp2), 0b
 	ret
 
 ENDFN	NAME(LDNM)
-- 
2.20.1


[-- Attachment #9: 0008-aarch64-Configure-for-sys-auxv.h-in-libgcc-for-lse-i.patch --]
[-- Type: application/octet-stream, Size: 7719 bytes --]

From aba70574696501513eeb862bb53da8509461b783 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 25 Sep 2019 22:51:55 +0000
Subject: [PATCH 8/8] aarch64: Configure for sys/auxv.h in libgcc for
 lse-init.c

	PR target/91833
	* config/aarch64/lse-init.c: Include auto-target.h.  Disable
	initialization if !HAVE_SYS_AUXV_H.
	* configure.ac (AC_CHECK_HEADERS): Add sys/auxv.h.
	* config.in, configure: Rebuild.

From-SVN: r276134
---
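Note on the guard added to lse-init.c: the initializer needs
<sys/auxv.h> to read the hardware capability bits, so it has to be
compiled out when that header is missing.  The sketch below shows the
general shape of such a probe.  It is not the contents of lse-init.c:
the sketch_* names are hypothetical, the __has_include test stands in
for the configure-time HAVE_SYS_AUXV_H check, and the fallback bit
value (1 << 8) is an assumption used only when the system headers do
not define HWCAP_ATOMICS.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the configure check: only use getauxval on AArch64
   Linux when the header is actually present.  */
#if defined(__linux__) && defined(__aarch64__) && defined(__has_include)
# if __has_include(<sys/auxv.h>)
#  include <sys/auxv.h>
#  define SKETCH_HAVE_SYS_AUXV_H 1
# endif
#endif

/* Prefer the system definition; otherwise assume bit 8.  */
#ifdef HWCAP_ATOMICS
# define SKETCH_HWCAP_ATOMICS HWCAP_ATOMICS
#else
# define SKETCH_HWCAP_ATOMICS (1UL << 8)
#endif

/* Pure predicate: does this capability word advertise LSE atomics?  */
static bool
sketch_hwcap_has_lse (unsigned long hwcap)
{
  return (hwcap & SKETCH_HWCAP_ATOMICS) != 0;
}

/* Probe the running system, or report "no LSE" when the probe is
   unavailable, which keeps the load/store-exclusive path in use.  */
static bool
sketch_detect_lse (void)
{
#ifdef SKETCH_HAVE_SYS_AUXV_H
  return sketch_hwcap_has_lse (getauxval (AT_HWCAP));
#else
  return false;
#endif
}
```

Defaulting to false when the header is unavailable is the conservative
choice: the helpers then always take the load/store-exclusive path,
which every ARMv8.0 implementation supports.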
 libgcc/config.in                 |  8 ++++++++
 libgcc/config/aarch64/lse-init.c |  4 +++-
 libgcc/configure                 | 26 +++++++++++++++++++-------
 libgcc/configure.ac              |  2 +-
 4 files changed, 31 insertions(+), 9 deletions(-)
 mode change 100644 => 100755 libgcc/configure

diff --git a/libgcc/config.in b/libgcc/config.in
index d634af9d949..59a3d8daf52 100644
--- a/libgcc/config.in
+++ b/libgcc/config.in
@@ -43,6 +43,9 @@
 /* Define to 1 if you have the <string.h> header file. */
 #undef HAVE_STRING_H
 
+/* Define to 1 if you have the <sys/auxv.h> header file. */
+#undef HAVE_SYS_AUXV_H
+
 /* Define to 1 if you have the <sys/stat.h> header file. */
 #undef HAVE_SYS_STAT_H
 
@@ -82,6 +85,11 @@
 /* Define to 1 if the target use emutls for thread-local storage. */
 #undef USE_EMUTLS
 
+/* Enable large inode numbers on Mac OS X 10.5.  */
+#ifndef _DARWIN_USE_64_BIT_INODE
+# define _DARWIN_USE_64_BIT_INODE 1
+#endif
+
 /* Number of bits in a file offset, on hosts where this is settable. */
 #undef _FILE_OFFSET_BITS
 
diff --git a/libgcc/config/aarch64/lse-init.c b/libgcc/config/aarch64/lse-init.c
index 33d29147479..1a8f4c55213 100644
--- a/libgcc/config/aarch64/lse-init.c
+++ b/libgcc/config/aarch64/lse-init.c
@@ -23,12 +23,14 @@ a copy of the GCC Runtime Library Exception along with this program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 <http://www.gnu.org/licenses/>.  */
 
+#include "auto-target.h"
+
 /* Define the symbol gating the LSE implementations.  */
 _Bool __aarch64_have_lse_atomics
   __attribute__((visibility("hidden"), nocommon));
 
 /* Disable initialization of __aarch64_have_lse_atomics during bootstrap.  */
-#ifndef inhibit_libc
+#if !defined(inhibit_libc) && defined(HAVE_SYS_AUXV_H)
 # include <sys/auxv.h>
 
 /* Disable initialization if the system headers are too old.  */
diff --git a/libgcc/configure b/libgcc/configure
old mode 100644
new mode 100755
index 36dbbc1f699..cf42c057352
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -674,6 +674,7 @@ infodir
 docdir
 oldincludedir
 includedir
+runstatedir
 localstatedir
 sharedstatedir
 sysconfdir
@@ -763,6 +764,7 @@ datadir='${datarootdir}'
 sysconfdir='${prefix}/etc'
 sharedstatedir='${prefix}/com'
 localstatedir='${prefix}/var'
+runstatedir='${localstatedir}/run'
 includedir='${prefix}/include'
 oldincludedir='/usr/include'
 docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
@@ -1015,6 +1017,15 @@ do
   | -silent | --silent | --silen | --sile | --sil)
     silent=yes ;;
 
+  -runstatedir | --runstatedir | --runstatedi | --runstated \
+  | --runstate | --runstat | --runsta | --runst | --runs \
+  | --run | --ru | --r)
+    ac_prev=runstatedir ;;
+  -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
+  | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
+  | --run=* | --ru=* | --r=*)
+    runstatedir=$ac_optarg ;;
+
   -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
     ac_prev=sbindir ;;
   -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
@@ -1152,7 +1163,7 @@ fi
 for ac_var in	exec_prefix prefix bindir sbindir libexecdir datarootdir \
 		datadir sysconfdir sharedstatedir localstatedir includedir \
 		oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
-		libdir localedir mandir
+		libdir localedir mandir runstatedir
 do
   eval ac_val=\$$ac_var
   # Remove trailing slashes.
@@ -1305,6 +1316,7 @@ Fine tuning of the installation directories:
   --sysconfdir=DIR        read-only single-machine data [PREFIX/etc]
   --sharedstatedir=DIR    modifiable architecture-independent data [PREFIX/com]
   --localstatedir=DIR     modifiable single-machine data [PREFIX/var]
+  --runstatedir=DIR       modifiable per-process data [LOCALSTATEDIR/run]
   --libdir=DIR            object code libraries [EPREFIX/lib]
   --includedir=DIR        C header files [PREFIX/include]
   --oldincludedir=DIR     C header files for non-gcc [/usr/include]
@@ -4170,7 +4182,7 @@ else
     We can't simply define LARGE_OFF_T to be 9223372036854775807,
     since some C++ compilers masquerading as C compilers
     incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
 		       && LARGE_OFF_T % 2147483647 == 1)
 		      ? 1 : -1];
@@ -4216,7 +4228,7 @@ else
     We can't simply define LARGE_OFF_T to be 9223372036854775807,
     since some C++ compilers masquerading as C compilers
     incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
 		       && LARGE_OFF_T % 2147483647 == 1)
 		      ? 1 : -1];
@@ -4240,7 +4252,7 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
     We can't simply define LARGE_OFF_T to be 9223372036854775807,
     since some C++ compilers masquerading as C compilers
     incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
 		       && LARGE_OFF_T % 2147483647 == 1)
 		      ? 1 : -1];
@@ -4285,7 +4297,7 @@ else
     We can't simply define LARGE_OFF_T to be 9223372036854775807,
     since some C++ compilers masquerading as C compilers
     incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
 		       && LARGE_OFF_T % 2147483647 == 1)
 		      ? 1 : -1];
@@ -4309,7 +4321,7 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
     We can't simply define LARGE_OFF_T to be 9223372036854775807,
     since some C++ compilers masquerading as C compilers
     incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
 		       && LARGE_OFF_T % 2147483647 == 1)
 		      ? 1 : -1];
@@ -4421,7 +4433,7 @@ as_fn_arith $ac_cv_sizeof_long_double \* 8 && long_double_type_size=$as_val
 
 for ac_header in inttypes.h stdint.h stdlib.h ftw.h \
 	unistd.h sys/stat.h sys/types.h \
-	string.h strings.h memory.h
+	string.h strings.h memory.h sys/auxv.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_preproc "$LINENO" "$ac_header" "$as_ac_Header"
diff --git a/libgcc/configure.ac b/libgcc/configure.ac
index 8e96cafdf8b..0762d77a801 100644
--- a/libgcc/configure.ac
+++ b/libgcc/configure.ac
@@ -207,7 +207,7 @@ AC_SUBST(long_double_type_size)
 
 AC_CHECK_HEADERS(inttypes.h stdint.h stdlib.h ftw.h \
 	unistd.h sys/stat.h sys/types.h \
-	string.h strings.h memory.h)
+	string.h strings.h memory.h sys/auxv.h)
 AC_HEADER_STDC
 
 # Check for decimal float support.
-- 
2.20.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
  2020-03-25  0:24 Pop, Sebastian
@ 2020-03-31 15:47 ` Pop, Sebastian
       [not found]   ` <DB7PR08MB300296EE8E27166D95152D1393C90@DB7PR08MB3002.eurprd08.prod.outlook.com>
  2020-04-01 22:13 ` Christophe Lyon
  1 sibling, 1 reply; 12+ messages in thread
From: Pop, Sebastian @ 2020-03-31 15:47 UTC (permalink / raw)
  To: Kyrill Tkachov, gcc-patches; +Cc: richard.henderson, Wilco Dijkstra

Ping, can we have the -moutline-atomics patches committed to the gcc-9 branch?

Thanks,
Sebastian 

On 3/24/20, 7:24 PM, "Pop, Sebastian" <spop@amazon.com> wrote:

    Hi Kyrill,
    
    Thanks for pointing out the two missing bug fixes.
    Please see attached all the back-ported patches.
    All the patches from trunk applied cleanly with no conflicts (except for the ChangeLog files) to the gcc-9 branch.
    An up to date gcc-9 branch on which I applied the attached patches has passed bootstrap on aarch64-linux (Graviton2 with 64 N1 cores) and make check with no extra fails.
    Kyrill, could you please commit the attached patches to the gcc-9 branch?
    
    As we still don't have a copyright assignment on file, would it be possible for ARM to finish the backport to the gcc-8 branch of these patches and the atomics cleanup patches mentioned below?
    
    I did a `git log config/aarch64/atomics.md` and there is a follow-up patch to the atomics cleanup patches:
    
    commit e21679a8bb17aac603b8704891e60ac502200629
    Author: Jakub Jelinek <jakub@redhat.com>
    Date:   Wed Nov 21 17:41:03 2018 +0100
    
        re PR target/87839 (ICE in final_scan_insn_1, at final.c:3070)
    
                PR target/87839
                * config/aarch64/atomics.md (@aarch64_compare_and_swap<mode>): Use
                rIJ constraint for aarch64_plus_operand rather than rn.
    
                * gcc.target/aarch64/pr87839.c: New test.
    
        From-SVN: r266346
    
    That is fixing code modified in this cleanup patch:
    
    commit d400fda3a8c3330f77eb9d51874f5482d3819a9f
    Author: Richard Henderson <richard.henderson@linaro.org>
    Date:   Wed Oct 31 09:42:39 2018 +0000
    
        aarch64: Improve cas generation
    
    
    Thanks,
    Sebastian
    
    
    On 3/11/20, 5:11 AM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
    
        Hi Sebastian,
        
        On 3/9/20 9:47 PM, Pop, Sebastian wrote:
        > Hi,
        >
        > Please see attached the patches to add -moutline-atomics to the gcc-9 branch.
        > Tested on graviton2 aarch64-linux with bootstrap and
        > `make check` passes with no new fails.
        > Tested `make check` on glibc built with gcc-9 with and without "-moutline-atomics"
        > and CFLAGS=" -O2 -g -fno-stack-protector -U_FORTIFY_SOURCE".
        >
        > Ok to commit to gcc-9 branch?
        
        Since this feature enables backwards-compatible deployment of LSE
        atomics, I'd support that.
        
        That is okay with me in principle after GCC 9.3 is released (the branch
        is currently frozen).
        
        However, there have been a few follow-up patches to fix some bugs
        revealed by testing.
        
        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91833
        
        and
        
        https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91834
        
        come to mind.
        
        Can you please make sure the fixes for those are included as well?
        
        
        >
        > Does this mechanical `git am *.patch` require a copyright assignment?
        > I am still working with my employer on getting the FSF assignment signed.
        >
        > Thanks,
        > Sebastian
        >
        > PS: For gcc-8 backports there are 5 cleanup and improvement patches
        > that are needed for -moutline-atomics patches to apply cleanly.
        > Should these patches be back-ported at the same time as the flag patches,
        > or should I update the patches to apply to the older code base?
        
        Hmm... normally I'd be for them. In this case I'd want to make sure that
        there aren't any fallout fixes that we're missing.
        
        Did these patches have any bug reports against them?
        
        Thanks,
        
        Kyrill
        
        
        > Here is the list of the extra patches:
        >
        >  From 77f33f44baf24c22848197aa80962c003dd7b3e2 Mon Sep 17 00:00:00 2001
        > From: Richard Henderson <richard.henderson@linaro.org>
        > Date: Wed, 31 Oct 2018 09:29:29 +0000
        > Subject: [PATCH] aarch64: Simplify LSE cas generation
        >
        > The cas insn is a single insn, and if expanded properly need not
        > be split after reload.  Use the proper inputs for the insn.
        >
        >          * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
        >          Force oldval into the rval register for TARGET_LSE; emit the compare
        >          during initial expansion so that it may be deleted if unused.
        >          (aarch64_gen_atomic_cas): Remove.
        >          * config/aarch64/atomics.md (@aarch64_compare_and_swap<SHORT>_lse):
        >          Change =&r to +r for operand 0; use match_dup for operand 2;
        >          remove is_weak and mod_f operands as unused.  Drop the split
        >          and merge with...
        >          (@aarch64_atomic_cas<SHORT>): ... this pattern's output; remove.
        >          (@aarch64_compare_and_swap<GPI>_lse): Similarly.
        >          (@aarch64_atomic_cas<GPI>): Similarly.
        >
        > From-SVN: r265656
        >
        >  From d400fda3a8c3330f77eb9d51874f5482d3819a9f Mon Sep 17 00:00:00 2001
        > From: Richard Henderson <richard.henderson@linaro.org>
        > Date: Wed, 31 Oct 2018 09:42:39 +0000
        > Subject: [PATCH] aarch64: Improve cas generation
        >
        > Do not zero-extend the input to the cas for subword operations;
        > instead, use the appropriate zero-extending compare insns.
        > Correct the predicates and constraints for immediate expected operand.
        >
        >          * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New.
        >          (aarch64_split_compare_and_swap): Use it.
        >          (aarch64_expand_compare_and_swap): Likewise.  Remove convert_modes;
        >          test oldval against the proper predicate.
        >          * config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI>):
        >          Use nonmemory_operand for expected.
        >          (cas_short_expected_pred): New.
        >          (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not "rI" to match.
        >          (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for expected.
        >          * config/aarch64/predicates.md (aarch64_plushi_immediate): New.
        >          (aarch64_plushi_operand): New.
        >
        > From-SVN: r265657
        >
        >  From 8f5603d363a4e0453d2c38c7103aeb0bdca85c4e Mon Sep 17 00:00:00 2001
        > From: Richard Henderson <richard.henderson@linaro.org>
        > Date: Wed, 31 Oct 2018 09:47:21 +0000
        > Subject: [PATCH] aarch64: Improve swp generation
        >
        > Allow zero as an input; fix constraints; avoid unnecessary split.
        >
        >          * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove.
        >          (aarch64_gen_atomic_ldop): Don't call it.
        >          * config/aarch64/atomics.md (atomic_exchange<ALLI>):
        >          Use aarch64_reg_or_zero.
        >          (aarch64_atomic_exchange<ALLI>): Likewise.
        >          (aarch64_atomic_exchange<ALLI>_lse): Remove split; remove & from
        >          operand 0; use aarch64_reg_or_zero for input; merge ...
        >          (@aarch64_atomic_swp<ALLI>): ... this and remove.
        >
        > From-SVN: r265659
        >
        >  From 7803ec5ee2a547043fb6708a08ddb1361ba91202 Mon Sep 17 00:00:00 2001
        > From: Richard Henderson <richard.henderson@linaro.org>
        > Date: Wed, 31 Oct 2018 09:58:48 +0000
        > Subject: [PATCH] aarch64: Improve atomic-op lse generation
        >
        > Fix constraints; avoid unnecessary split.  Drop the use of the atomic_op
        > iterator in favor of the ATOMIC_LDOP iterator; this is simpler and more
        > logical for ldclr aka bic.
        >
        >          * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
        >          (aarch64_atomic_ldop_supported_p): Remove.
        >          (aarch64_gen_atomic_ldop): Remove.
        >          * config/aarch64/atomic.md (atomic_<atomic_optab><ALLI>):
        >          Fully expand LSE operations here.
        >          (atomic_fetch_<atomic_optab><ALLI>): Likewise.
        >          (atomic_<atomic_optab>_fetch<ALLI>): Likewise.
        >          (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): Drop atomic_op iterator
        >          and use ATOMIC_LDOP instead; use register_operand for the input;
        >          drop the split and emit insns directly.
        >          (aarch64_atomic_fetch_<ATOMIC_LDOP><ALLI>_lse): Likewise.
        >          (aarch64_atomic_<atomic_op>_fetch<ALLI>_lse): Remove.
        >          (@aarch64_atomic_load<ATOMIC_LDOP><ALLI>): Remove.
        >
        > From-SVN: r265660
        >
        >  From 53de1ea800db54b47290d578c43892799b66c8dc Mon Sep 17 00:00:00 2001
        > From: Richard Henderson <richard.henderson@linaro.org>
        > Date: Wed, 31 Oct 2018 23:11:22 +0000
        > Subject: [PATCH] aarch64: Remove early clobber from ATOMIC_LDOP scratch
        >
        >          * config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
        >          The scratch register need not be early-clobber.  Document the reason
        >          why we cannot use ST<OP>.
        >
        > From-SVN: r265703
        >
        >
        >
        >
        >
        > On 2/27/20, 12:06 PM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
        >
        >      Hi Sebastian,
        >
        >      On 2/27/20 4:53 PM, Pop, Sebastian wrote:
        >      >
        >      > Hi,
        >      >
        >      > is somebody already working on backporting -moutline-atomics to gcc
        >      > 8.x and 9.x branches?
        >      >
        >      I'm not aware of such work going on.
        >
        >      Thanks,
        >
        >      Kyrill
        >
        >      > Thanks,
        >      >
        >      > Sebastian
        >      >
        >
        >
        
    
    


^ permalink raw reply	[flat|nested] 12+ messages in thread

[parent not found: <DB7PR08MB300296EE8E27166D95152D1393C90@DB7PR08MB3002.eurprd08.prod.outlook.com>]

* RE: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
       [not found]   ` <DB7PR08MB300296EE8E27166D95152D1393C90@DB7PR08MB3002.eurprd08.prod.outlook.com>
@ 2020-04-01 14:26     ` Kyrylo Tkachov
  2020-04-01 14:32       ` Pop, Sebastian
  0 siblings, 1 reply; 12+ messages in thread
From: Kyrylo Tkachov @ 2020-04-01 14:26 UTC (permalink / raw)
  To: Pop, Sebastian; +Cc: Wilco Dijkstra, richard.henderson, gcc-patches

Adding gcc-patches as I had somehow deleted it from the addresses...

> -----Original Message-----
> From: Kyrylo Tkachov
> Sent: 01 April 2020 15:23
> To: Pop, Sebastian <spop@amazon.com>
> Cc: Wilco Dijkstra <Wilco.Dijkstra@arm.com>; richard.henderson@linaro.org
> Subject: RE: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
> 
> Hi Sebastian,
> 
> > -----Original Message-----
> > From: Gcc-patches <gcc-patches-bounces@gcc.gnu.org> On Behalf Of Pop,
> > Sebastian via Gcc-patches
> > Sent: 31 March 2020 16:47
> > To: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>;
> > gcc-patches@gcc.gnu.org
> > Cc: Wilco Dijkstra <Wilco.Dijkstra@arm.com>;
> > richard.henderson@linaro.org
> > Subject: Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and
> > 8.x
> >
> > Ping, can we have the -moutline-atomics patches committed to the gcc-9
> > branch?
> 
> Thanks for testing the patches.
> 
> >
> > Thanks,
> > Sebastian
> >
> > On 3/24/20, 7:24 PM, "Pop, Sebastian" <spop@amazon.com> wrote:
> >
> >     Hi Kyrill,
> >
> >     Thanks for pointing out the two missing bug fixes.
> >     Please see attached all the back-ported patches.
> >     All the patches from trunk applied cleanly with no conflicts
> > (except for the ChangeLog files) to the gcc-9 branch.
> >     An up to date gcc-9 branch on which I applied the attached patches
> > has passed bootstrap on aarch64-linux (Graviton2 with 64 N1 cores) and
> > make check with no extra fails.
> >     Kyrill, could you please commit the attached patches to the gcc-9 branch?
> 
> This series also needs Jakub's recent fix:
> https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542952.html
> I've tested this together with the rest and committed the whole series to the
> gcc-9 branch.
> 
> >
> >     As we still don't have a copyright assignment on file, would it be
> > possible for ARM to finish the backport to the gcc-8 branch of these
> > patches and the atomics cleanup patches mentioned below?
> 
> I can help with that, but any help with testing the patch set would be
> appreciated.
> Thanks,
> Kyrill
> 
> >
> >     I did a `git log config/aarch64/atomics.md` and there is a
> > follow-up patch to the atomics cleanup patches:
> >
> >     commit e21679a8bb17aac603b8704891e60ac502200629
> >     Author: Jakub Jelinek <jakub@redhat.com>
> >     Date:   Wed Nov 21 17:41:03 2018 +0100
> >
> >         re PR target/87839 (ICE in final_scan_insn_1, at final.c:3070)
> >
> >                 PR target/87839
> >                 * config/aarch64/atomics.md
> > (@aarch64_compare_and_swap<mode>): Use
> >                 rIJ constraint for aarch64_plus_operand rather than rn.
> >
> >                 * gcc.target/aarch64/pr87839.c: New test.
> >
> >         From-SVN: r266346
> >
> >     That is fixing code modified in this cleanup patch:
> >
> >     commit d400fda3a8c3330f77eb9d51874f5482d3819a9f
> >     Author: Richard Henderson <richard.henderson@linaro.org>
> >     Date:   Wed Oct 31 09:42:39 2018 +0000
> >
> >         aarch64: Improve cas generation
> >
> >
> >     Thanks,
> >     Sebastian
> >
> >
> >     On 3/11/20, 5:11 AM, "Kyrill Tkachov"
> > <kyrylo.tkachov@foss.arm.com>
> > wrote:
> >
> >         Hi Sebastian,
> >
> >         On 3/9/20 9:47 PM, Pop, Sebastian wrote:
> >         > Hi,
> >         >
> >         > Please see attached the patches to add -moutline-atomics to
> > the gcc-9 branch.
> >         > Tested on graviton2 aarch64-linux with bootstrap and
> >         > `make check` passes with no new fails.
> >         > Tested `make check` on glibc built with gcc-9 with and
> > without "-moutline-atomics"
> >         > and CFLAGS=" -O2 -g -fno-stack-protector -U_FORTIFY_SOURCE".
> >         >
> >         > Ok to commit to gcc-9 branch?
> >
> >         Since this feature enables backwards-compatible deployment of LSE
> >         atomics, I'd support that.
> >
> >         That is okay with me in principle after GCC 9.3 is released (the branch
> >         is currently frozen).
> >
> >         However, there have been a few follow-up patches to fix some bugs
> >         revealed by testing.
> >
> >         https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91833
> >
> >         and
> >
> >         https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91834
> >
> >         come to mind.
> >
> >         Can you please make sure the fixes for those are included as well?
> >
> >
> >         >
> >         > Does this mechanical `git am *.patch` require a copyright assignment?
> >         > I am still working with my employer on getting the FSF
> > assignment signed.
> >         >
> >         > Thanks,
> >         > Sebastian
> >         >
> >         > PS: For gcc-8 backports there are 5 cleanup and improvement
> patches
> >         > that are needed for -moutline-atomics patches to apply cleanly.
> >         > Should these patches be back-ported at the same time as the
> > flag patches,
> >         > or should I update the patches to apply to the older code base?
> >
> >         Hmm... normally I'd be for them. In this case I'd want to make sure
> that
> >         there aren't any fallout fixes that we're missing.
> >
> >         Did these patches have any bug reports against them?
> >
> >         Thanks,
> >
> >         Kyrill
> >
> >
> >         > Here is the list of the extra patches:
> >         >
> >         >  From 77f33f44baf24c22848197aa80962c003dd7b3e2 Mon Sep 17
> > 00:00:00 2001
> >         > From: Richard Henderson <richard.henderson@linaro.org>
> >         > Date: Wed, 31 Oct 2018 09:29:29 +0000
> >         > Subject: [PATCH] aarch64: Simplify LSE cas generation
> >         >
> >         > The cas insn is a single insn, and if expanded properly need not
> >         > be split after reload.  Use the proper inputs for the insn.
> >         >
> >         >          * config/aarch64/aarch64.c
> > (aarch64_expand_compare_and_swap):
> >         >          Force oldval into the rval register for TARGET_LSE; emit the
> > compare
> >         >          during initial expansion so that it may be deleted if unused.
> >         >          (aarch64_gen_atomic_cas): Remove.
> >         >          * config/aarch64/atomics.md
> > (@aarch64_compare_and_swap<SHORT>_lse):
> >         >          Change =&r to +r for operand 0; use match_dup for operand 2;
> >         >          remove is_weak and mod_f operands as unused.  Drop the split
> >         >          and merge with...
> >         >          (@aarch64_atomic_cas<SHORT>): ... this pattern's output;
> > remove.
> >         >          (@aarch64_compare_and_swap<GPI>_lse): Similarly.
> >         >          (@aarch64_atomic_cas<GPI>): Similarly.
> >         >
> >         > From-SVN: r265656
> >         >
> >         >  From d400fda3a8c3330f77eb9d51874f5482d3819a9f Mon Sep 17
> > 00:00:00 2001
> >         > From: Richard Henderson <richard.henderson@linaro.org>
> >         > Date: Wed, 31 Oct 2018 09:42:39 +0000
> >         > Subject: [PATCH] aarch64: Improve cas generation
> >         >
> >         > Do not zero-extend the input to the cas for subword operations;
> >         > instead, use the appropriate zero-extending compare insns.
> >         > Correct the predicates and constraints for immediate
> > expected operand.
> >         >
> >         >          * config/aarch64/aarch64.c
> > (aarch64_gen_compare_reg_maybe_ze): New.
> >         >          (aarch64_split_compare_and_swap): Use it.
> >         >          (aarch64_expand_compare_and_swap): Likewise.  Remove
> > convert_modes;
> >         >          test oldval against the proper predicate.
> >         >          * config/aarch64/atomics.md
> > (@atomic_compare_and_swap<ALLI>):
> >         >          Use nonmemory_operand for expected.
> >         >          (cas_short_expected_pred): New.
> >         >          (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not
> > "rI" to match.
> >         >          (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for
> > expected.
> >         >          * config/aarch64/predicates.md (aarch64_plushi_immediate):
> > New.
> >         >          (aarch64_plushi_operand): New.
> >         >
> >         > From-SVN: r265657
> >         >
> >         >  From 8f5603d363a4e0453d2c38c7103aeb0bdca85c4e Mon Sep 17
> > 00:00:00 2001
> >         > From: Richard Henderson <richard.henderson@linaro.org>
> >         > Date: Wed, 31 Oct 2018 09:47:21 +0000
> >         > Subject: [PATCH] aarch64: Improve swp generation
> >         >
> >         > Allow zero as an input; fix constraints; avoid unnecessary split.
> >         >
> >         >          * config/aarch64/aarch64.c (aarch64_emit_atomic_swap):
> > Remove.
> >         >          (aarch64_gen_atomic_ldop): Don't call it.
> >         >          * config/aarch64/atomics.md (atomic_exchange<ALLI>):
> >         >          Use aarch64_reg_or_zero.
> >         >          (aarch64_atomic_exchange<ALLI>): Likewise.
> >         >          (aarch64_atomic_exchange<ALLI>_lse): Remove split; remove &
> > from
> >         >          operand 0; use aarch64_reg_or_zero for input; merge ...
> >         >          (@aarch64_atomic_swp<ALLI>): ... this and remove.
> >         >
> >         > From-SVN: r265659
> >         >
> >         >  From 7803ec5ee2a547043fb6708a08ddb1361ba91202 Mon Sep 17
> > 00:00:00 2001
> >         > From: Richard Henderson <richard.henderson@linaro.org>
> >         > Date: Wed, 31 Oct 2018 09:58:48 +0000
> >         > Subject: [PATCH] aarch64: Improve atomic-op lse generation
> >         >
> >         > Fix constraints; avoid unnecessary split.  Drop the use of the
> atomic_op
> >         > iterator in favor of the ATOMIC_LDOP iterator; this is
> > simpler and more
> >         > logical for ldclr aka bic.
> >         >
> >         >          * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
> >         >          (aarch64_atomic_ldop_supported_p): Remove.
> >         >          (aarch64_gen_atomic_ldop): Remove.
> >         >          * config/aarch64/atomic.md (atomic_<atomic_optab><ALLI>):
> >         >          Fully expand LSE operations here.
> >         >          (atomic_fetch_<atomic_optab><ALLI>): Likewise.
> >         >          (atomic_<atomic_optab>_fetch<ALLI>): Likewise.
> >         >          (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): Drop atomic_op
> > iterator
> >         >          and use ATOMIC_LDOP instead; use register_operand for the
> > input;
> >         >          drop the split and emit insns directly.
> >         >          (aarch64_atomic_fetch_<ATOMIC_LDOP><ALLI>_lse): Likewise.
> >         >          (aarch64_atomic_<atomic_op>_fetch<ALLI>_lse): Remove.
> >         >          (@aarch64_atomic_load<ATOMIC_LDOP><ALLI>): Remove.
> >         >
> >         > From-SVN: r265660
> >         >
> >         >  From 53de1ea800db54b47290d578c43892799b66c8dc Mon Sep 17
> > 00:00:00 2001
> >         > From: Richard Henderson <richard.henderson@linaro.org>
> >         > Date: Wed, 31 Oct 2018 23:11:22 +0000
> >         > Subject: [PATCH] aarch64: Remove early clobber from
> > ATOMIC_LDOP scratch
> >         >
> >         >          * config/aarch64/atomics.md
> > (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
> >         >          The scratch register need not be early-clobber.  Document the
> > reason
> >         >          why we cannot use ST<OP>.
> >         >
> >         > From-SVN: r265703
> >         >
> >         >
> >         >
> >         >
> >         >
> >         > On 2/27/20, 12:06 PM, "Kyrill Tkachov"
> > <kyrylo.tkachov@foss.arm.com>
> > wrote:
> >         >
> >         >      Hi Sebastian,
> >         >
> >         >      On 2/27/20 4:53 PM, Pop, Sebastian wrote:
> >         >      >
> >         >      > Hi,
> >         >      >
> >         >      > is somebody already working on backporting -moutline-atomics
> to
> > gcc
> >         >      > 8.x and 9.x branches?
> >         >      >
> >         >      I'm not aware of such work going on.
> >         >
> >         >      Thanks,
> >         >
> >         >      Kyrill
> >         >
> >         >      > Thanks,
> >         >      >
> >         >      > Sebastian
> >         >      >
> >         >
> >         >
> >
> >
> >


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
  2020-04-01 14:26     ` Kyrylo Tkachov
@ 2020-04-01 14:32       ` Pop, Sebastian
  2020-04-01 14:35         ` Kyrylo Tkachov
  2020-04-01 14:35         ` Jakub Jelinek
  0 siblings, 2 replies; 12+ messages in thread
From: Pop, Sebastian @ 2020-04-01 14:32 UTC (permalink / raw)
  To: Kyrylo Tkachov; +Cc: Wilco Dijkstra, richard.henderson, gcc-patches

Thanks Kyrill!  I will be happy to test the gcc-8 back-port of the patches.

We would also need to back-port the patches to gcc-7.
I hope it is ok to commit the changes to the gcc-7 branch even if it is not a maintained branch.

Sebastian

On 4/1/20, 9:27 AM, "Kyrylo Tkachov" <Kyrylo.Tkachov@arm.com> wrote:

    Adding gcc-patches as I had somehow deleted it from the addresses...
    
    > -----Original Message-----
    > From: Kyrylo Tkachov
    > Sent: 01 April 2020 15:23
    > To: Pop, Sebastian <spop@amazon.com>
    > Cc: Wilco Dijkstra <Wilco.Dijkstra@arm.com>; richard.henderson@linaro.org
    > Subject: RE: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
    >
    > Hi Sebastian,
    >
    > > -----Original Message-----
    > > From: Gcc-patches <gcc-patches-bounces@gcc.gnu.org> On Behalf Of Pop,
    > > Sebastian via Gcc-patches
    > > Sent: 31 March 2020 16:47
    > > To: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>;
    > > gcc-patches@gcc.gnu.org
    > > Cc: Wilco Dijkstra <Wilco.Dijkstra@arm.com>;
    > > richard.henderson@linaro.org
    > > Subject: Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and
    > > 8.x
    > >
    > > Ping, can we have the -moutline-atomics patches committed to the gcc-9
    > > branch?
    >
    > Thanks for testing the patches.
    >
    > >
    > > Thanks,
    > > Sebastian
    > >
    > > On 3/24/20, 7:24 PM, "Pop, Sebastian" <spop@amazon.com> wrote:
    > >
    > >     Hi Kyrill,
    > >
    > >     Thanks for pointing out the two missing bug fixes.
    > >     Please see attached all the back-ported patches.
    > >     All the patches from trunk applied cleanly with no conflicts
    > > (except for the ChangeLog files) to the gcc-9 branch.
    > >     An up to date gcc-9 branch on which I applied the attached patches
    > > has passed bootstrap on aarch64-linux (Graviton2 with 64 N1 cores) and
    > > make check with no extra fails.
    > >     Kyrill, could you please commit the attached patches to the gcc-9 branch?
    >
    > This series also needs Jakub's recent fix:
    > https://gcc.gnu.org/pipermail/gcc-patches/2020-March/542952.html
    > I've tested this together with the rest and committed the whole series to the
    > gcc-9 branch.
    >
    > >
    > >     As we still don't have a copyright assignment on file, would it be
    > > possible for ARM to finish the backport to the gcc-8 branch of these
    > > patches and the atomics cleanup patches mentioned below?
    >
    > I can help with that, but any help with testing the patch set would be
    > appreciated.
    > Thanks,
    > Kyrill
    >
    > >
    > >     I did a `git log config/aarch64/atomics.md` and there is a
    > > follow-up patch to the atomics cleanup patches:
    > >
    > >     commit e21679a8bb17aac603b8704891e60ac502200629
    > >     Author: Jakub Jelinek <jakub@redhat.com>
    > >     Date:   Wed Nov 21 17:41:03 2018 +0100
    > >
    > >         re PR target/87839 (ICE in final_scan_insn_1, at final.c:3070)
    > >
    > >                 PR target/87839
    > >                 * config/aarch64/atomics.md
    > > (@aarch64_compare_and_swap<mode>): Use
    > >                 rIJ constraint for aarch64_plus_operand rather than rn.
    > >
    > >                 * gcc.target/aarch64/pr87839.c: New test.
    > >
    > >         From-SVN: r266346
    > >
    > >     That is fixing code modified in this cleanup patch:
    > >
    > >     commit d400fda3a8c3330f77eb9d51874f5482d3819a9f
    > >     Author: Richard Henderson <richard.henderson@linaro.org>
    > >     Date:   Wed Oct 31 09:42:39 2018 +0000
    > >
    > >         aarch64: Improve cas generation
    > >
    > >
    > >     Thanks,
    > >     Sebastian
    > >
    > >
    > >     On 3/11/20, 5:11 AM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
    > >
    > >
    > >         Hi Sebastian,
    > >
    > >         On 3/9/20 9:47 PM, Pop, Sebastian wrote:
    > >         > Hi,
    > >         >
    > >         > Please see attached the patches to add -moutline-atomics to the gcc-9 branch.
    > >         > Tested on graviton2 aarch64-linux with bootstrap and
    > >         > `make check` passes with no new fails.
    > >         > Tested `make check` on glibc built with gcc-9 with and without "-moutline-atomics"
    > >         > and CFLAGS=" -O2 -g -fno-stack-protector -U_FORTIFY_SOURCE".
    > >         >
    > >         > Ok to commit to gcc-9 branch?
    > >
    > >         Since this feature enables backwards-compatible deployment of LSE
    > >         atomics, I'd support that.
    > >
    > >         That is okay with me in principle after GCC 9.3 is released (the branch
    > >         is currently frozen).
    > >
    > >         However, there have been a few follow-up patches to fix some bugs
    > >         revealed by testing.
    > >
    > >         https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91833
    > >
    > >         and
    > >
    > >         https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91834
    > >
    > >         come to mind.
    > >
    > >         Can you please make sure the fixes for those are included as well?
    > >
    > >
    > >         >
    > >         > Does this mechanical `git am *.patch` require a copyright assignment?
    > >         > I am still working with my employer on getting the FSF assignment signed.
    > >         >
    > >         > Thanks,
    > >         > Sebastian
    > >         >
    > >         > PS: For gcc-8 backports there are 5 cleanup and improvement patches
    > >         > that are needed for -moutline-atomics patches to apply cleanly.
    > >         > Should these patches be back-ported in the same time as the flag patches,
    > >         > or should I update the patches to apply to the older code base?
    > >
    > >         Hmm... normally I'd be for them. In this case I'd want to make sure that
    > >         there aren't any fallout fixes that we're missing.
    > >
    > >         Did these patches have any bug reports against them?
    > >
    > >         Thanks,
    > >
    > >         Kyrill
    > >
    > >
    > >         > Here is the list of the extra patches:
    > >         >
    > >         >  From 77f33f44baf24c22848197aa80962c003dd7b3e2 Mon Sep 17
    > > 00:00:00 2001
    > >         > From: Richard Henderson <richard.henderson@linaro.org>
    > >         > Date: Wed, 31 Oct 2018 09:29:29 +0000
    > >         > Subject: [PATCH] aarch64: Simplify LSE cas generation
    > >         >
    > >         > The cas insn is a single insn, and if expanded properly need not
    > >         > be split after reload.  Use the proper inputs for the insn.
    > >         >
    > >         >          * config/aarch64/aarch64.c
    > > (aarch64_expand_compare_and_swap):
    > >         >          Force oldval into the rval register for TARGET_LSE; emit the
    > > compare
    > >         >          during initial expansion so that it may be deleted if unused.
    > >         >          (aarch64_gen_atomic_cas): Remove.
    > >         >          * config/aarch64/atomics.md
    > > (@aarch64_compare_and_swap<SHORT>_lse):
    > >         >          Change =&r to +r for operand 0; use match_dup for operand 2;
    > >         >          remove is_weak and mod_f operands as unused.  Drop the split
    > >         >          and merge with...
    > >         >          (@aarch64_atomic_cas<SHORT>): ... this pattern's output;
    > > remove.
    > >         >          (@aarch64_compare_and_swap<GPI>_lse): Similarly.
    > >         >          (@aarch64_atomic_cas<GPI>): Similarly.
    > >         >
    > >         > From-SVN: r265656
    > >         >
    > >         >  From d400fda3a8c3330f77eb9d51874f5482d3819a9f Mon Sep 17
    > > 00:00:00 2001
    > >         > From: Richard Henderson <richard.henderson@linaro.org>
    > >         > Date: Wed, 31 Oct 2018 09:42:39 +0000
    > >         > Subject: [PATCH] aarch64: Improve cas generation
    > >         >
    > >         > Do not zero-extend the input to the cas for subword operations;
    > >         > instead, use the appropriate zero-extending compare insns.
    > >         > Correct the predicates and constraints for immediate
    > > expected operand.
    > >         >
    > >         >          * config/aarch64/aarch64.c
    > > (aarch64_gen_compare_reg_maybe_ze): New.
    > >         >          (aarch64_split_compare_and_swap): Use it.
    > >         >          (aarch64_expand_compare_and_swap): Likewise.  Remove
    > > convert_modes;
    > >         >          test oldval against the proper predicate.
    > >         >          * config/aarch64/atomics.md
    > > (@atomic_compare_and_swap<ALLI>):
    > >         >          Use nonmemory_operand for expected.
    > >         >          (cas_short_expected_pred): New.
    > >         >          (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not
    > > "rI" to match.
    > >         >          (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for
    > > expected.
    > >         >          * config/aarch64/predicates.md (aarch64_plushi_immediate):
    > > New.
    > >         >          (aarch64_plushi_operand): New.
    > >         >
    > >         > From-SVN: r265657
    > >         >
    > >         >  From 8f5603d363a4e0453d2c38c7103aeb0bdca85c4e Mon Sep 17
    > > 00:00:00 2001
    > >         > From: Richard Henderson <richard.henderson@linaro.org>
    > >         > Date: Wed, 31 Oct 2018 09:47:21 +0000
    > >         > Subject: [PATCH] aarch64: Improve swp generation
    > >         >
    > >         > Allow zero as an input; fix constraints; avoid unnecessary split.
    > >         >
    > >         >          * config/aarch64/aarch64.c (aarch64_emit_atomic_swap):
    > > Remove.
    > >         >          (aarch64_gen_atomic_ldop): Don't call it.
    > >         >          * config/aarch64/atomics.md (atomic_exchange<ALLI>):
    > >         >          Use aarch64_reg_or_zero.
    > >         >          (aarch64_atomic_exchange<ALLI>): Likewise.
    > >         >          (aarch64_atomic_exchange<ALLI>_lse): Remove split; remove &
    > > from
    > >         >          operand 0; use aarch64_reg_or_zero for input; merge ...
    > >         >          (@aarch64_atomic_swp<ALLI>): ... this and remove.
    > >         >
    > >         > From-SVN: r265659
    > >         >
    > >         >  From 7803ec5ee2a547043fb6708a08ddb1361ba91202 Mon Sep 17
    > > 00:00:00 2001
    > >         > From: Richard Henderson <richard.henderson@linaro.org>
    > >         > Date: Wed, 31 Oct 2018 09:58:48 +0000
    > >         > Subject: [PATCH] aarch64: Improve atomic-op lse generation
    > >         >
    > >         > Fix constraints; avoid unnecessary split.  Drop the use of the
    > atomic_op
    > >         > iterator in favor of the ATOMIC_LDOP iterator; this is
    > > simplier and more
    > >         > logical for ldclr aka bic.
    > >         >
    > >         >          * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
    > >         >          (aarch64_atomic_ldop_supported_p): Remove.
    > >         >          (aarch64_gen_atomic_ldop): Remove.
    > >         >          * config/aarch64/atomic.md (atomic_<atomic_optab><ALLI>):
    > >         >          Fully expand LSE operations here.
    > >         >          (atomic_fetch_<atomic_optab><ALLI>): Likewise.
    > >         >          (atomic_<atomic_optab>_fetch<ALLI>): Likewise.
    > >         >          (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): Drop atomic_op
    > > iterator
    > >         >          and use ATOMIC_LDOP instead; use register_operand for the
    > > input;
    > >         >          drop the split and emit insns directly.
    > >         >          (aarch64_atomic_fetch_<ATOMIC_LDOP><ALLI>_lse): Likewise.
    > >         >          (aarch64_atomic_<atomic_op>_fetch<ALLI>_lse): Remove.
    > >         >          (@aarch64_atomic_load<ATOMIC_LDOP><ALLI>): Remove.
    > >         >
    > >         > From-SVN: r265660
    > >         >
    > >         >  From 53de1ea800db54b47290d578c43892799b66c8dc Mon Sep 17
    > > 00:00:00 2001
    > >         > From: Richard Henderson <richard.henderson@linaro.org>
    > >         > Date: Wed, 31 Oct 2018 23:11:22 +0000
    > >         > Subject: [PATCH] aarch64: Remove early clobber from
    > > ATOMIC_LDOP scratch
    > >         >
    > >         >          * config/aarch64/atomics.md
    > > (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
    > >         >          The scratch register need not be early-clobber.  Document the
    > > reason
    > >         >          why we cannot use ST<OP>.
    > >         >
    > >         > From-SVN: r265703
    > >         >
    > >         >
    > >         >
    > >         >
    > >         >
    > >         > On 2/27/20, 12:06 PM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
    > >         >
    > >         >      Hi Sebastian,
    > >         >
    > >         >      On 2/27/20 4:53 PM, Pop, Sebastian wrote:
    > >         >      >
    > >         >      > Hi,
    > >         >      >
    > >         >      > is somebody already working on backporting -moutline-atomics to gcc
    > >         >      > 8.x and 9.x branches?
    > >         >      >
    > >         >      I'm not aware of such work going on.
    > >         >
    > >         >      Thanks,
    > >         >
    > >         >      Kyrill
    > >         >
    > >         >      > Thanks,
    > >         >      >
    > >         >      > Sebastian
    > >         >      >
    > >         >
    > >         >
    > >
    > >
    > >
    
    


^ permalink raw reply	[flat|nested] 12+ messages in thread

* RE: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
  2020-04-01 14:32       ` Pop, Sebastian
@ 2020-04-01 14:35         ` Kyrylo Tkachov
  2020-04-01 14:35         ` Jakub Jelinek
  1 sibling, 0 replies; 12+ messages in thread
From: Kyrylo Tkachov @ 2020-04-01 14:35 UTC (permalink / raw)
  To: Pop, Sebastian; +Cc: Wilco Dijkstra, richard.henderson, gcc-patches

Hi Sebastian,

> -----Original Message-----
> From: Pop, Sebastian <spop@amazon.com>
> Sent: 01 April 2020 15:32
> To: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> Cc: Wilco Dijkstra <Wilco.Dijkstra@arm.com>; richard.henderson@linaro.org;
> gcc-patches@gcc.gnu.org
> Subject: Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
> 
> Thanks Kyrill!  I will be happy to test the gcc-8 back-port of the patches.
> 
> We would also need to back-port the patches to gcc-7.
> I hope it is ok to commit the changes to the gcc-7 branch even if it is not a
> maintained branch.

I don't think that will work. Given that the branch is not maintained there won't be any more point releases off of it.
Of course, if you have your own gcc-7-based vendor branch that's another matter...
Thanks,
Kyrill

> 
> Sebastian


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
  2020-04-01 14:32       ` Pop, Sebastian
  2020-04-01 14:35         ` Kyrylo Tkachov
@ 2020-04-01 14:35         ` Jakub Jelinek
  2020-04-01 14:40           ` Pop, Sebastian
  1 sibling, 1 reply; 12+ messages in thread
From: Jakub Jelinek @ 2020-04-01 14:35 UTC (permalink / raw)
  To: Pop, Sebastian
  Cc: Kyrylo Tkachov, gcc-patches, richard.henderson, Wilco Dijkstra

On Wed, Apr 01, 2020 at 02:32:03PM +0000, Pop, Sebastian via Gcc-patches wrote:
> Thanks Kyrill!  I will be happy to test the gcc-8 back-port of the patches.

Note, I have another fix, PR94435, that I've already bootstrapped and am
regtesting ATM, that will need to be included in any backports too (if acked
for trunk).
> 
> We would also need to back-port the patches to gcc-7.
> I hope it is ok to commit the changes to the gcc-7 branch even if it is not a maintained branch.

No, that is not ok, the branch is closed and shouldn't have any changes.
You can create some vendor or devel or user branch for it though, merge
there the releases/gcc-7 and add whatever backports you want.
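
As a rough illustration of that suggestion, here is a minimal sketch of the workflow in a throwaway repository. It assumes a local stand-in for releases/gcc-7 and a single hypothetical patch; the file name atomics.txt, the "trunk" stand-in branch, and the commit messages are invented for the example, and only the branch name gcc-7-aarch64-outline-atomics comes from this thread. It branches from the release branch rather than merging it in, which gives the same starting point for a new branch.

```shell
#!/bin/sh
# Sketch only: everything happens in a temporary repository, not in a real
# GCC checkout. File names and patch contents are hypothetical.
set -eu

work=$(mktemp -d)
cd "$work"
git init -q upstream
cd upstream
git config user.email "dev@example.org"
git config user.name "Example Dev"

# Stand-in for the closed release branch.
echo "base" > atomics.txt
git add atomics.txt
git commit -q -m "gcc-7 release state"
git branch -m releases/gcc-7

# Stand-in for a trunk change to backport, exported as a patch file.
git checkout -q -b trunk
echo "outline atomics" >> atomics.txt
git commit -q -am "aarch64: example backport candidate"
git format-patch -q -1 -o "$work/patches"

# Leave releases/gcc-7 untouched: start a separate branch from it and
# apply the backports there with git am.
git checkout -q -b gcc-7-aarch64-outline-atomics releases/gcc-7
git am -q "$work"/patches/*.patch

# List the commits the new branch carries on top of the release branch.
git log --oneline releases/gcc-7..gcc-7-aarch64-outline-atomics
```

In a real checkout the patches would be the back-ported series rather than a generated one, and where the branch gets pushed depends on the vendor/devel/user branch conventions in use.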

	Jakub


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
  2020-04-01 14:35         ` Jakub Jelinek
@ 2020-04-01 14:40           ` Pop, Sebastian
  0 siblings, 0 replies; 12+ messages in thread
From: Pop, Sebastian @ 2020-04-01 14:40 UTC (permalink / raw)
  To: Jakub Jelinek
  Cc: Kyrylo Tkachov, gcc-patches, richard.henderson, Wilco Dijkstra

Thanks, Jakub and Kyrill, for pointing that out.
We will create a new branch called gcc-7-aarch64-outline-atomics or so with the back-ported patches.

Sebastian



^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
  2020-03-25  0:24 Pop, Sebastian
  2020-03-31 15:47 ` Pop, Sebastian
@ 2020-04-01 22:13 ` Christophe Lyon
  2020-04-02  2:34   ` Pop, Sebastian
  1 sibling, 1 reply; 12+ messages in thread
From: Christophe Lyon @ 2020-04-01 22:13 UTC (permalink / raw)
  To: Pop, Sebastian
  Cc: Kyrill Tkachov, gcc-patches, Wilco Dijkstra, richard.henderson

On Wed, 25 Mar 2020 at 01:24, Pop, Sebastian via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi Kyrill,
>
> Thanks for pointing out the two missing bug fixes.
> Please see attached all the back-ported patches.
> All the patches from trunk applied cleanly with no conflicts (except for the ChangeLog files) to the gcc-9 branch.
> An up to date gcc-9 branch on which I applied the attached patches has passed bootstrap on aarch64-linux (Graviton2 with 64 N1 cores) and make check with no extra fails.
> Kyrill, could you please commit the attached patches to the gcc-9 branch?
>

Hi,

I'm seeing a GCC build failure after "aarch64: Implement TImode
compare-and-swap"
was backported to gcc-9 (commit 53c1356515ac1357c341b594326967ac4677d891)

The build log has:
0x14a1660 gen_split_100(rtx_insn*, rtx_def**)
        /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/aarch64/atomics.md:110
0xa81076 try_split(rtx_def*, rtx_insn*, int)
        /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/emit-rtl.c:3851
0xda2b0d split_insn
        /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.c:2901
0xda7057 split_all_insns()
        /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.c:3005
0xda7118 execute
        /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.c:3957
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
make[4]: *** [Makefile:659: tsan_interface_atomic.lo] Error 1

Maybe that problem is fixed by a patch later in the series? (I have
validations running after every patch on the release branches, so it
may take a while until I have the results for the end of the series)

Thanks,

Christophe

> As we still don't have a copyright assignment on file, would it be possible for ARM to finish the backport to the gcc-8 branch of these patches and the atomics cleanup patches mentioned below?
>
> I did a `git log config/aarch64/atomics.md` and there is a follow-up patch to the atomics cleanup patches:
>
> commit e21679a8bb17aac603b8704891e60ac502200629
> Author: Jakub Jelinek <jakub@redhat.com>
> Date:   Wed Nov 21 17:41:03 2018 +0100
>
>     re PR target/87839 (ICE in final_scan_insn_1, at final.c:3070)
>
>             PR target/87839
>             * config/aarch64/atomics.md (@aarch64_compare_and_swap<mode>): Use
>             rIJ constraint for aarch64_plus_operand rather than rn.
>
>             * gcc.target/aarch64/pr87839.c: New test.
>
>     From-SVN: r266346
>
> That is fixing code modified in this cleanup patch:
>
> commit d400fda3a8c3330f77eb9d51874f5482d3819a9f
> Author: Richard Henderson <richard.henderson@linaro.org>
> Date:   Wed Oct 31 09:42:39 2018 +0000
>
>     aarch64: Improve cas generation
>
>
> Thanks,
> Sebastian
>
>
> On 3/11/20, 5:11 AM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
>
>     Hi Sebastian,
>
>     On 3/9/20 9:47 PM, Pop, Sebastian wrote:
>     > Hi,
>     >
>     > Please see attached the patches to add -moutline-atomics to the gcc-9 branch.
>     > Tested on graviton2 aarch64-linux with bootstrap and
>     > `make check` passes with no new fails.
>     > Tested `make check` on glibc built with gcc-9 with and without "-moutline-atomics"
>     > and CFLAGS=" -O2 -g -fno-stack-protector -U_FORTIFY_SOURCE".
>     >
>     > Ok to commit to gcc-9 branch?
>
>     Since this feature enables backwards-compatible deployment of LSE
>     atomics, I'd support that.
>
>     That is okay with me in principle after GCC 9.3 is released (the branch
>     is currently frozen).
>
>     However, there have been a few follow-up patches to fix some bugs
>     revealed by testing.
>
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91833
>
>     and
>
>     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91834
>
>     come to mind.
>
>     Can you please make sure the fixes for those are included as well?
>
>
>     >
>     > Does this mechanical `git am *.patch` require a copyright assignment?
>     > I am still working with my employer on getting the FSF assignment signed.
>     >
>     > Thanks,
>     > Sebastian
>     >
>     > PS: For gcc-8 backports there are 5 cleanup and improvement patches
>     > that are needed for -moutline-atomics patches to apply cleanly.
>     > Should these patches be back-ported at the same time as the flag patches,
>     > or should I update the patches to apply to the older code base?
>
>     Hmm... normally I'd be for them. In this case I'd want to make sure that
>     there aren't any fallout fixes that we're missing.
>
>     Did these patches have any bug reports against them?
>
>     Thanks,
>
>     Kyrill
>
>
>     > Here is the list of the extra patches:
>     >
>     >  From 77f33f44baf24c22848197aa80962c003dd7b3e2 Mon Sep 17 00:00:00 2001
>     > From: Richard Henderson <richard.henderson@linaro.org>
>     > Date: Wed, 31 Oct 2018 09:29:29 +0000
>     > Subject: [PATCH] aarch64: Simplify LSE cas generation
>     >
>     > The cas insn is a single insn, and if expanded properly need not
>     > be split after reload.  Use the proper inputs for the insn.
>     >
>     >          * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
>     >          Force oldval into the rval register for TARGET_LSE; emit the compare
>     >          during initial expansion so that it may be deleted if unused.
>     >          (aarch64_gen_atomic_cas): Remove.
>     >          * config/aarch64/atomics.md (@aarch64_compare_and_swap<SHORT>_lse):
>     >          Change =&r to +r for operand 0; use match_dup for operand 2;
>     >          remove is_weak and mod_f operands as unused.  Drop the split
>     >          and merge with...
>     >          (@aarch64_atomic_cas<SHORT>): ... this pattern's output; remove.
>     >          (@aarch64_compare_and_swap<GPI>_lse): Similarly.
>     >          (@aarch64_atomic_cas<GPI>): Similarly.
>     >
>     > From-SVN: r265656
>     >
>     >  From d400fda3a8c3330f77eb9d51874f5482d3819a9f Mon Sep 17 00:00:00 2001
>     > From: Richard Henderson <richard.henderson@linaro.org>
>     > Date: Wed, 31 Oct 2018 09:42:39 +0000
>     > Subject: [PATCH] aarch64: Improve cas generation
>     >
>     > Do not zero-extend the input to the cas for subword operations;
>     > instead, use the appropriate zero-extending compare insns.
>     > Correct the predicates and constraints for immediate expected operand.
>     >
>     >          * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New.
>     >          (aarch64_split_compare_and_swap): Use it.
>     >          (aarch64_expand_compare_and_swap): Likewise.  Remove convert_modes;
>     >          test oldval against the proper predicate.
>     >          * config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI>):
>     >          Use nonmemory_operand for expected.
>     >          (cas_short_expected_pred): New.
>     >          (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not "rI" to match.
>     >          (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for expected.
>     >          * config/aarch64/predicates.md (aarch64_plushi_immediate): New.
>     >          (aarch64_plushi_operand): New.
>     >
>     > From-SVN: r265657
>     >
>     >  From 8f5603d363a4e0453d2c38c7103aeb0bdca85c4e Mon Sep 17 00:00:00 2001
>     > From: Richard Henderson <richard.henderson@linaro.org>
>     > Date: Wed, 31 Oct 2018 09:47:21 +0000
>     > Subject: [PATCH] aarch64: Improve swp generation
>     >
>     > Allow zero as an input; fix constraints; avoid unnecessary split.
>     >
>     >          * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove.
>     >          (aarch64_gen_atomic_ldop): Don't call it.
>     >          * config/aarch64/atomics.md (atomic_exchange<ALLI>):
>     >          Use aarch64_reg_or_zero.
>     >          (aarch64_atomic_exchange<ALLI>): Likewise.
>     >          (aarch64_atomic_exchange<ALLI>_lse): Remove split; remove & from
>     >          operand 0; use aarch64_reg_or_zero for input; merge ...
>     >          (@aarch64_atomic_swp<ALLI>): ... this and remove.
>     >
>     > From-SVN: r265659
>     >
>     >  From 7803ec5ee2a547043fb6708a08ddb1361ba91202 Mon Sep 17 00:00:00 2001
>     > From: Richard Henderson <richard.henderson@linaro.org>
>     > Date: Wed, 31 Oct 2018 09:58:48 +0000
>     > Subject: [PATCH] aarch64: Improve atomic-op lse generation
>     >
>     > Fix constraints; avoid unnecessary split.  Drop the use of the atomic_op
>     > iterator in favor of the ATOMIC_LDOP iterator; this is simpler and more
>     > logical for ldclr aka bic.
>     >
>     >          * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
>     >          (aarch64_atomic_ldop_supported_p): Remove.
>     >          (aarch64_gen_atomic_ldop): Remove.
>     >          * config/aarch64/atomics.md (atomic_<atomic_optab><ALLI>):
>     >          Fully expand LSE operations here.
>     >          (atomic_fetch_<atomic_optab><ALLI>): Likewise.
>     >          (atomic_<atomic_optab>_fetch<ALLI>): Likewise.
>     >          (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): Drop atomic_op iterator
>     >          and use ATOMIC_LDOP instead; use register_operand for the input;
>     >          drop the split and emit insns directly.
>     >          (aarch64_atomic_fetch_<ATOMIC_LDOP><ALLI>_lse): Likewise.
>     >          (aarch64_atomic_<atomic_op>_fetch<ALLI>_lse): Remove.
>     >          (@aarch64_atomic_load<ATOMIC_LDOP><ALLI>): Remove.
>     >
>     > From-SVN: r265660
>     >
>     >  From 53de1ea800db54b47290d578c43892799b66c8dc Mon Sep 17 00:00:00 2001
>     > From: Richard Henderson <richard.henderson@linaro.org>
>     > Date: Wed, 31 Oct 2018 23:11:22 +0000
>     > Subject: [PATCH] aarch64: Remove early clobber from ATOMIC_LDOP scratch
>     >
>     >          * config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
>     >          The scratch register need not be early-clobber.  Document the reason
>     >          why we cannot use ST<OP>.
>     >
>     > From-SVN: r265703
>     >
>     >
>     >
>     >
>     >
>     > On 2/27/20, 12:06 PM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
>     >
>     >      Hi Sebastian,
>     >
>     >      On 2/27/20 4:53 PM, Pop, Sebastian wrote:
>     >      >
>     >      > Hi,
>     >      >
>     >      > is somebody already working on backporting -moutline-atomics to gcc
>     >      > 8.x and 9.x branches?
>     >      >
>     >      I'm not aware of such work going on.
>     >
>     >      Thanks,
>     >
>     >      Kyrill
>     >
>     >      > Thanks,
>     >      >
>     >      > Sebastian
>     >      >
>     >
>     >
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
  2020-04-01 22:13 ` Christophe Lyon
@ 2020-04-02  2:34   ` Pop, Sebastian
  2020-04-02  7:34     ` Christophe Lyon
  0 siblings, 1 reply; 12+ messages in thread
From: Pop, Sebastian @ 2020-04-02  2:34 UTC (permalink / raw)
  To: Christophe Lyon
  Cc: Kyrill Tkachov, gcc-patches, Wilco Dijkstra, richard.henderson

I have also seen this error in tsan.
The fix is https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ea376dd471a3b006bc48945c1d9a29408ab17a04
"Fix shrinkwrapping interactions with atomics (PR92692)".
This fix got committed as the last patch in the series.

Sebastian
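
For readers following the thread, here is a minimal sketch of the kind of
source that goes through the atomics patterns discussed above. It is not
taken from the patches or from tsan; the helper names `try_update` and `bump`
are hypothetical. The assumption is that on AArch64, building this with
`-moutline-atomics` makes GCC call out-of-line libgcc helpers (named along the
lines of `__aarch64_cas8_acq_rel`) that choose between LSE and
load/store-exclusive sequences at run time, instead of expanding the
operations inline.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not from the thread: store DESIRED into *P only if
   *P currently holds EXPECTED.  Returns true when the swap happened.
   This is a 64-bit compare-and-swap with sequentially consistent ordering.  */
bool
try_update (_Atomic uint64_t *p, uint64_t expected, uint64_t desired)
{
  return atomic_compare_exchange_strong (p, &expected, desired);
}

/* Hypothetical helper, not from the thread: atomically add 1 to *P and
   return the value *P held before the addition.  */
uint64_t
bump (_Atomic uint64_t *p)
{
  return atomic_fetch_add (p, 1);
}
```

Comparing the assembly from `gcc -O2 -S` with and without `-moutline-atomics`
on an aarch64 target is one way to check which form a given compiler build
emits.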

On 4/1/20, 5:13 PM, "Christophe Lyon" <christophe.lyon@linaro.org> wrote:

    On Wed, 25 Mar 2020 at 01:24, Pop, Sebastian via Gcc-patches
    <gcc-patches@gcc.gnu.org> wrote:
    >
    > Hi Kyrill,
    >
    > Thanks for pointing out the two missing bug fixes.
    > Please see attached all the back-ported patches.
    > All the patches from trunk applied cleanly with no conflicts (except for the ChangeLog files) to the gcc-9 branch.
    > An up to date gcc-9 branch on which I applied the attached patches has passed bootstrap on aarch64-linux (Graviton2 with 64 N1 cores) and make check with no extra fails.
    > Kyrill, could you please commit the attached patches to the gcc-9 branch?
    >
    
    Hi,
    
    I'm seeing a GCC build failure after "aarch64: Implement TImode
    compare-and-swap"
    was backported to gcc-9 (commit 53c1356515ac1357c341b594326967ac4677d891)
    
    The build log has:
    0x14a1660 gen_split_100(rtx_insn*, rtx_def**)
            /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/aarch64/atomics.md:110
    0xa81076 try_split(rtx_def*, rtx_insn*, int)
            /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/emit-rtl.c:3851
    0xda2b0d split_insn
            /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.c:2901
    0xda7057 split_all_insns()
            /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.c:3005
    0xda7118 execute
            /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.c:3957
    Please submit a full bug report,
    with preprocessed source if appropriate.
    Please include the complete backtrace with any bug report.
    See <https://gcc.gnu.org/bugs/> for instructions.
    make[4]: *** [Makefile:659: tsan_interface_atomic.lo] Error 1
    
    Maybe that problem is fixed by a patch later in the series? (I have
    validations running after every patch on the release branches, so it
    may take a while until I have the results for the end of the series)
    
    Thanks,
    
    Christophe
    
    > As we still don't have a copyright assignment on file, would it be possible for ARM to finish the backport to the gcc-8 branch of these patches and the atomics cleanup patches mentioned below?
    >
    > I did a `git log config/aarch64/atomics.md` and there is a follow-up patch to the atomics cleanup patches:
    >
    > commit e21679a8bb17aac603b8704891e60ac502200629
    > Author: Jakub Jelinek <jakub@redhat.com>
    > Date:   Wed Nov 21 17:41:03 2018 +0100
    >
    >     re PR target/87839 (ICE in final_scan_insn_1, at final.c:3070)
    >
    >             PR target/87839
    >             * config/aarch64/atomics.md (@aarch64_compare_and_swap<mode>): Use
    >             rIJ constraint for aarch64_plus_operand rather than rn.
    >
    >             * gcc.target/aarch64/pr87839.c: New test.
    >
    >     From-SVN: r266346
    >
    > That is fixing code modified in this cleanup patch:
    >
    > commit d400fda3a8c3330f77eb9d51874f5482d3819a9f
    > Author: Richard Henderson <richard.henderson@linaro.org>
    > Date:   Wed Oct 31 09:42:39 2018 +0000
    >
    >     aarch64: Improve cas generation
    >
    >
    > Thanks,
    > Sebastian
    >
    >
    > On 3/11/20, 5:11 AM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
    >
    >     Hi Sebastian,
    >
    >     On 3/9/20 9:47 PM, Pop, Sebastian wrote:
    >     > Hi,
    >     >
    >     > Please see attached the patches to add -moutline-atomics to the gcc-9 branch.
    >     > Tested on graviton2 aarch64-linux with bootstrap and
    >     > `make check` passes with no new fails.
    >     > Tested `make check` on glibc built with gcc-9 with and without "-moutline-atomics"
    >     > and CFLAGS=" -O2 -g -fno-stack-protector -U_FORTIFY_SOURCE".
    >     >
    >     > Ok to commit to gcc-9 branch?
    >
    >     Since this feature enables backwards-compatible deployment of LSE
    >     atomics, I'd support that.
    >
    >     That is okay with me in principle after GCC 9.3 is released (the branch
    >     is currently frozen).
    >
    >     However, there have been a few follow-up patches to fix some bugs
    >     revealed by testing.
    >
    >     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91833
    >
    >     and
    >
    >     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91834
    >
    >     come to mind.
    >
    >     Can you please make sure the fixes for those are included as well?
    >
    >
    >     >
    >     > Does this mechanical `git am *.patch` require a copyright assignment?
    >     > I am still working with my employer on getting the FSF assignment signed.
    >     >
    >     > Thanks,
    >     > Sebastian
    >     >
    >     > PS: For gcc-8 backports there are 5 cleanup and improvement patches
    >     > that are needed for -moutline-atomics patches to apply cleanly.
    >     > Should these patches be back-ported at the same time as the flag patches,
    >     > or should I update the patches to apply to the older code base?
    >
    >     Hmm... normally I'd be for them. In this case I'd want to make sure that
    >     there aren't any fallout fixes that we're missing.
    >
    >     Did these patches have any bug reports against them?
    >
    >     Thanks,
    >
    >     Kyrill
    >
    >
    >     > Here is the list of the extra patches:
    >     >
    >     >  From 77f33f44baf24c22848197aa80962c003dd7b3e2 Mon Sep 17 00:00:00 2001
    >     > From: Richard Henderson <richard.henderson@linaro.org>
    >     > Date: Wed, 31 Oct 2018 09:29:29 +0000
    >     > Subject: [PATCH] aarch64: Simplify LSE cas generation
    >     >
    >     > The cas insn is a single insn, and if expanded properly need not
    >     > be split after reload.  Use the proper inputs for the insn.
    >     >
    >     >          * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
    >     >          Force oldval into the rval register for TARGET_LSE; emit the compare
    >     >          during initial expansion so that it may be deleted if unused.
    >     >          (aarch64_gen_atomic_cas): Remove.
    >     >          * config/aarch64/atomics.md (@aarch64_compare_and_swap<SHORT>_lse):
    >     >          Change =&r to +r for operand 0; use match_dup for operand 2;
    >     >          remove is_weak and mod_f operands as unused.  Drop the split
    >     >          and merge with...
    >     >          (@aarch64_atomic_cas<SHORT>): ... this pattern's output; remove.
    >     >          (@aarch64_compare_and_swap<GPI>_lse): Similarly.
    >     >          (@aarch64_atomic_cas<GPI>): Similarly.
    >     >
    >     > From-SVN: r265656
    >     >
    >     >  From d400fda3a8c3330f77eb9d51874f5482d3819a9f Mon Sep 17 00:00:00 2001
    >     > From: Richard Henderson <richard.henderson@linaro.org>
    >     > Date: Wed, 31 Oct 2018 09:42:39 +0000
    >     > Subject: [PATCH] aarch64: Improve cas generation
    >     >
    >     > Do not zero-extend the input to the cas for subword operations;
    >     > instead, use the appropriate zero-extending compare insns.
    >     > Correct the predicates and constraints for immediate expected operand.
    >     >
    >     >          * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New.
    >     >          (aarch64_split_compare_and_swap): Use it.
    >     >          (aarch64_expand_compare_and_swap): Likewise.  Remove convert_modes;
    >     >          test oldval against the proper predicate.
    >     >          * config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI>):
    >     >          Use nonmemory_operand for expected.
    >     >          (cas_short_expected_pred): New.
    >     >          (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not "rI" to match.
    >     >          (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for expected.
    >     >          * config/aarch64/predicates.md (aarch64_plushi_immediate): New.
    >     >          (aarch64_plushi_operand): New.
    >     >
    >     > From-SVN: r265657
    >     >
    >     >  From 8f5603d363a4e0453d2c38c7103aeb0bdca85c4e Mon Sep 17 00:00:00 2001
    >     > From: Richard Henderson <richard.henderson@linaro.org>
    >     > Date: Wed, 31 Oct 2018 09:47:21 +0000
    >     > Subject: [PATCH] aarch64: Improve swp generation
    >     >
    >     > Allow zero as an input; fix constraints; avoid unnecessary split.
    >     >
    >     >          * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove.
    >     >          (aarch64_gen_atomic_ldop): Don't call it.
    >     >          * config/aarch64/atomics.md (atomic_exchange<ALLI>):
    >     >          Use aarch64_reg_or_zero.
    >     >          (aarch64_atomic_exchange<ALLI>): Likewise.
    >     >          (aarch64_atomic_exchange<ALLI>_lse): Remove split; remove & from
    >     >          operand 0; use aarch64_reg_or_zero for input; merge ...
    >     >          (@aarch64_atomic_swp<ALLI>): ... this and remove.
    >     >
    >     > From-SVN: r265659
    >     >
    >     >  From 7803ec5ee2a547043fb6708a08ddb1361ba91202 Mon Sep 17 00:00:00 2001
    >     > From: Richard Henderson <richard.henderson@linaro.org>
    >     > Date: Wed, 31 Oct 2018 09:58:48 +0000
    >     > Subject: [PATCH] aarch64: Improve atomic-op lse generation
    >     >
    >     > Fix constraints; avoid unnecessary split.  Drop the use of the atomic_op
    >     > iterator in favor of the ATOMIC_LDOP iterator; this is simpler and more
    >     > logical for ldclr aka bic.
    >     >
    >     >          * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
    >     >          (aarch64_atomic_ldop_supported_p): Remove.
    >     >          (aarch64_gen_atomic_ldop): Remove.
    >     >          * config/aarch64/atomics.md (atomic_<atomic_optab><ALLI>):
    >     >          Fully expand LSE operations here.
    >     >          (atomic_fetch_<atomic_optab><ALLI>): Likewise.
    >     >          (atomic_<atomic_optab>_fetch<ALLI>): Likewise.
    >     >          (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): Drop atomic_op iterator
    >     >          and use ATOMIC_LDOP instead; use register_operand for the input;
    >     >          drop the split and emit insns directly.
    >     >          (aarch64_atomic_fetch_<ATOMIC_LDOP><ALLI>_lse): Likewise.
    >     >          (aarch64_atomic_<atomic_op>_fetch<ALLI>_lse): Remove.
    >     >          (@aarch64_atomic_load<ATOMIC_LDOP><ALLI>): Remove.
    >     >
    >     > From-SVN: r265660
    >     >
    >     >  From 53de1ea800db54b47290d578c43892799b66c8dc Mon Sep 17 00:00:00 2001
    >     > From: Richard Henderson <richard.henderson@linaro.org>
    >     > Date: Wed, 31 Oct 2018 23:11:22 +0000
    >     > Subject: [PATCH] aarch64: Remove early clobber from ATOMIC_LDOP scratch
    >     >
    >     >          * config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
    >     >          The scratch register need not be early-clobber.  Document the reason
    >     >          why we cannot use ST<OP>.
    >     >
    >     > From-SVN: r265703
    >     >
    >     >
    >     >
    >     >
    >     >
    >     > On 2/27/20, 12:06 PM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
    >     >
    >     >      Hi Sebastian,
    >     >
    >     >      On 2/27/20 4:53 PM, Pop, Sebastian wrote:
    >     >      >
    >     >      > Hi,
    >     >      >
    >     >      > is somebody already working on backporting -moutline-atomics to gcc
    >     >      > 8.x and 9.x branches?
    >     >      >
    >     >      I'm not aware of such work going on.
    >     >
    >     >      Thanks,
    >     >
    >     >      Kyrill
    >     >
    >     >      > Thanks,
    >     >      >
    >     >      > Sebastian
    >     >      >
    >     >
    >     >
    >
    >
    


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
  2020-04-02  2:34   ` Pop, Sebastian
@ 2020-04-02  7:34     ` Christophe Lyon
  0 siblings, 0 replies; 12+ messages in thread
From: Christophe Lyon @ 2020-04-02  7:34 UTC (permalink / raw)
  To: Pop, Sebastian
  Cc: Kyrill Tkachov, gcc-patches, Wilco Dijkstra, richard.henderson

On Thu, 2 Apr 2020 at 04:34, Pop, Sebastian <spop@amazon.com> wrote:
>
> I have also seen this error in tsan.
> The fix is https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=ea376dd471a3b006bc48945c1d9a29408ab17a04
> "Fix shrinkwrapping interactions with atomics (PR92692)".
> This fix got committed as the last patch in the series.
>

Indeed, it's now OK, thanks!

> Sebastian
>
> On 4/1/20, 5:13 PM, "Christophe Lyon" <christophe.lyon@linaro.org> wrote:
>
>     On Wed, 25 Mar 2020 at 01:24, Pop, Sebastian via Gcc-patches
>     <gcc-patches@gcc.gnu.org> wrote:
>     >
>     > Hi Kyrill,
>     >
>     > Thanks for pointing out the two missing bug fixes.
>     > Please see attached all the back-ported patches.
>     > All the patches from trunk applied cleanly with no conflicts (except for the ChangeLog files) to the gcc-9 branch.
>     > An up to date gcc-9 branch on which I applied the attached patches has passed bootstrap on aarch64-linux (Graviton2 with 64 N1 cores) and make check with no extra fails.
>     > Kyrill, could you please commit the attached patches to the gcc-9 branch?
>     >
>
>     Hi,
>
>     I'm seeing a GCC build failure after "aarch64: Implement TImode
>     compare-and-swap"
>     was backported to gcc-9 (commit 53c1356515ac1357c341b594326967ac4677d891)
>
>     The build log has:
>     0x14a1660 gen_split_100(rtx_insn*, rtx_def**)
>             /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/config/aarch64/atomics.md:110
>     0xa81076 try_split(rtx_def*, rtx_insn*, int)
>             /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/emit-rtl.c:3851
>     0xda2b0d split_insn
>             /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.c:2901
>     0xda7057 split_all_insns()
>             /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.c:3005
>     0xda7118 execute
>             /tmp/6477245_1.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/recog.c:3957
>     Please submit a full bug report,
>     with preprocessed source if appropriate.
>     Please include the complete backtrace with any bug report.
>     See <https://gcc.gnu.org/bugs/> for instructions.
>     make[4]: *** [Makefile:659: tsan_interface_atomic.lo] Error 1
>
>     Maybe that problem is fixed by a patch later in the series? (I have
>     validations running after every patch on the release branches, so it
>     may take a while until I have the results for the end of the series)
>
>     Thanks,
>
>     Christophe
>
>     > As we still don't have a copyright assignment on file, would it be possible for ARM to finish the backport to the gcc-8 branch of these patches and the atomics cleanup patches mentioned below?
>     >
>     > I did a `git log config/aarch64/atomics.md` and there is a follow-up patch to the atomics cleanup patches:
>     >
>     > commit e21679a8bb17aac603b8704891e60ac502200629
>     > Author: Jakub Jelinek <jakub@redhat.com>
>     > Date:   Wed Nov 21 17:41:03 2018 +0100
>     >
>     >     re PR target/87839 (ICE in final_scan_insn_1, at final.c:3070)
>     >
>     >             PR target/87839
>     >             * config/aarch64/atomics.md (@aarch64_compare_and_swap<mode>): Use
>     >             rIJ constraint for aarch64_plus_operand rather than rn.
>     >
>     >             * gcc.target/aarch64/pr87839.c: New test.
>     >
>     >     From-SVN: r266346
>     >
>     > That is fixing code modified in this cleanup patch:
>     >
>     > commit d400fda3a8c3330f77eb9d51874f5482d3819a9f
>     > Author: Richard Henderson <richard.henderson@linaro.org>
>     > Date:   Wed Oct 31 09:42:39 2018 +0000
>     >
>     >     aarch64: Improve cas generation
>     >
>     >
>     > Thanks,
>     > Sebastian
>     >
>     >
>     > On 3/11/20, 5:11 AM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
>     >
>     >     Hi Sebastian,
>     >
>     >     On 3/9/20 9:47 PM, Pop, Sebastian wrote:
>     >     > Hi,
>     >     >
>     >     > Please see attached the patches to add -moutline-atomics to the gcc-9 branch.
>     >     > Tested on graviton2 aarch64-linux with bootstrap and
>     >     > `make check` passes with no new fails.
>     >     > Tested `make check` on glibc built with gcc-9 with and without "-moutline-atomics"
>     >     > and CFLAGS=" -O2 -g -fno-stack-protector -U_FORTIFY_SOURCE".
>     >     >
>     >     > Ok to commit to gcc-9 branch?
>     >
>     >     Since this feature enables backwards-compatible deployment of LSE
>     >     atomics, I'd support that.
>     >
>     >     That is okay with me in principle after GCC 9.3 is released (the branch
>     >     is currently frozen).
>     >
>     >     However, there have been a few follow-up patches to fix some bugs
>     >     revealed by testing.
>     >
>     >     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91833
>     >
>     >     and
>     >
>     >     https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91834
>     >
>     >     come to mind.
>     >
>     >     Can you please make sure the fixes for those are included as well?
>     >
>     >
>     >     >
>     >     > Does this mechanical `git am *.patch` require a copyright assignment?
>     >     > I am still working with my employer on getting the FSF assignment signed.
>     >     >
>     >     > Thanks,
>     >     > Sebastian
>     >     >
>     >     > PS: For gcc-8 backports there are 5 cleanup and improvement patches
>     >     > that are needed for -moutline-atomics patches to apply cleanly.
>     >     > Should these patches be back-ported at the same time as the flag patches,
>     >     > or should I update the patches to apply to the older code base?
>     >
>     >     Hmm... normally I'd be for them. In this case I'd want to make sure that
>     >     there aren't any fallout fixes that we're missing.
>     >
>     >     Did these patches have any bug reports against them?
>     >
>     >     Thanks,
>     >
>     >     Kyrill
>     >
>     >
>     >     > Here is the list of the extra patches:
>     >     >
>     >     >  From 77f33f44baf24c22848197aa80962c003dd7b3e2 Mon Sep 17 00:00:00 2001
>     >     > From: Richard Henderson <richard.henderson@linaro.org>
>     >     > Date: Wed, 31 Oct 2018 09:29:29 +0000
>     >     > Subject: [PATCH] aarch64: Simplify LSE cas generation
>     >     >
>     >     > The cas insn is a single insn, and if expanded properly need not
>     >     > be split after reload.  Use the proper inputs for the insn.
>     >     >
>     >     >          * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
>     >     >          Force oldval into the rval register for TARGET_LSE; emit the compare
>     >     >          during initial expansion so that it may be deleted if unused.
>     >     >          (aarch64_gen_atomic_cas): Remove.
>     >     >          * config/aarch64/atomics.md (@aarch64_compare_and_swap<SHORT>_lse):
>     >     >          Change =&r to +r for operand 0; use match_dup for operand 2;
>     >     >          remove is_weak and mod_f operands as unused.  Drop the split
>     >     >          and merge with...
>     >     >          (@aarch64_atomic_cas<SHORT>): ... this pattern's output; remove.
>     >     >          (@aarch64_compare_and_swap<GPI>_lse): Similarly.
>     >     >          (@aarch64_atomic_cas<GPI>): Similarly.
>     >     >
>     >     > From-SVN: r265656
>     >     >
>     >     >  From d400fda3a8c3330f77eb9d51874f5482d3819a9f Mon Sep 17 00:00:00 2001
>     >     > From: Richard Henderson <richard.henderson@linaro.org>
>     >     > Date: Wed, 31 Oct 2018 09:42:39 +0000
>     >     > Subject: [PATCH] aarch64: Improve cas generation
>     >     >
>     >     > Do not zero-extend the input to the cas for subword operations;
>     >     > instead, use the appropriate zero-extending compare insns.
>     >     > Correct the predicates and constraints for immediate expected operand.
>     >     >
>     >     >          * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New.
>     >     >          (aarch64_split_compare_and_swap): Use it.
>     >     >          (aarch64_expand_compare_and_swap): Likewise.  Remove convert_modes;
>     >     >          test oldval against the proper predicate.
>     >     >          * config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI>):
>     >     >          Use nonmemory_operand for expected.
>     >     >          (cas_short_expected_pred): New.
>     >     >          (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not "rI" to match.
>     >     >          (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for expected.
>     >     >          * config/aarch64/predicates.md (aarch64_plushi_immediate): New.
>     >     >          (aarch64_plushi_operand): New.
>     >     >
>     >     > From-SVN: r265657
>     >     >
>     >     >  From 8f5603d363a4e0453d2c38c7103aeb0bdca85c4e Mon Sep 17 00:00:00 2001
>     >     > From: Richard Henderson <richard.henderson@linaro.org>
>     >     > Date: Wed, 31 Oct 2018 09:47:21 +0000
>     >     > Subject: [PATCH] aarch64: Improve swp generation
>     >     >
>     >     > Allow zero as an input; fix constraints; avoid unnecessary split.
>     >     >
>     >     >          * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove.
>     >     >          (aarch64_gen_atomic_ldop): Don't call it.
>     >     >          * config/aarch64/atomics.md (atomic_exchange<ALLI>):
>     >     >          Use aarch64_reg_or_zero.
>     >     >          (aarch64_atomic_exchange<ALLI>): Likewise.
>     >     >          (aarch64_atomic_exchange<ALLI>_lse): Remove split; remove & from
>     >     >          operand 0; use aarch64_reg_or_zero for input; merge ...
>     >     >          (@aarch64_atomic_swp<ALLI>): ... this and remove.
>     >     >
>     >     > From-SVN: r265659
>     >     >
>     >     >  From 7803ec5ee2a547043fb6708a08ddb1361ba91202 Mon Sep 17 00:00:00 2001
>     >     > From: Richard Henderson <richard.henderson@linaro.org>
>     >     > Date: Wed, 31 Oct 2018 09:58:48 +0000
>     >     > Subject: [PATCH] aarch64: Improve atomic-op lse generation
>     >     >
>     >     > Fix constraints; avoid unnecessary split.  Drop the use of the atomic_op
>     >     > iterator in favor of the ATOMIC_LDOP iterator; this is simpler and more
>     >     > logical for ldclr aka bic.
>     >     >
>     >     >          * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
>     >     >          (aarch64_atomic_ldop_supported_p): Remove.
>     >     >          (aarch64_gen_atomic_ldop): Remove.
>     >     >          * config/aarch64/atomics.md (atomic_<atomic_optab><ALLI>):
>     >     >          Fully expand LSE operations here.
>     >     >          (atomic_fetch_<atomic_optab><ALLI>): Likewise.
>     >     >          (atomic_<atomic_optab>_fetch<ALLI>): Likewise.
>     >     >          (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): Drop atomic_op iterator
>     >     >          and use ATOMIC_LDOP instead; use register_operand for the input;
>     >     >          drop the split and emit insns directly.
>     >     >          (aarch64_atomic_fetch_<ATOMIC_LDOP><ALLI>_lse): Likewise.
>     >     >          (aarch64_atomic_<atomic_op>_fetch<ALLI>_lse): Remove.
>     >     >          (@aarch64_atomic_load<ATOMIC_LDOP><ALLI>): Remove.
>     >     >
>     >     > From-SVN: r265660
>     >     >
>     >     >  From 53de1ea800db54b47290d578c43892799b66c8dc Mon Sep 17 00:00:00 2001
>     >     > From: Richard Henderson <richard.henderson@linaro.org>
>     >     > Date: Wed, 31 Oct 2018 23:11:22 +0000
>     >     > Subject: [PATCH] aarch64: Remove early clobber from ATOMIC_LDOP scratch
>     >     >
>     >     >          * config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
>     >     >          The scratch register need not be early-clobber.  Document the reason
>     >     >          why we cannot use ST<OP>.
>     >     >
>     >     > From-SVN: r265703
>     >     >
>     >     >
>     >     >
>     >     >
>     >     >
>     >     > On 2/27/20, 12:06 PM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
>     >     >
>     >     >      Hi Sebastian,
>     >     >
>     >     >      On 2/27/20 4:53 PM, Pop, Sebastian wrote:
>     >     >      >
>     >     >      > Hi,
>     >     >      >
>     >     >      > is somebody already working on backporting -moutline-atomics to gcc
>     >     >      > 8.x and 9.x branches?
>     >     >      >
>     >     >      I'm not aware of such work going on.
>     >     >
>     >     >      Thanks,
>     >     >
>     >     >      Kyrill
>     >     >
>     >     >      > Thanks,
>     >     >      >
>     >     >      > Sebastian
>     >     >      >
>     >     >
>     >     >
>     >
>     >
>
>
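
For readers following the list of prerequisite patches quoted above, here is
a minimal sketch of the kind of source-level operations whose expansion those
patches touch: a subword compare-and-swap (cas), an exchange (swp), and a
fetch-and with an inverted mask (which I believe GCC maps to ldclr when LSE
is available). The function names are hypothetical and chosen only for
illustration; the __atomic builtins are the standard GCC ones. Which
instructions or out-of-line helpers are actually emitted depends on the
compiler version and on flags such as -march and -moutline-atomics, so this
is not a statement of what any particular build generates.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: strong compare-and-swap on a 16-bit (subword) value.
   Returns true if *ptr was equal to expected and has been set to desired. */
bool cas_u16(uint16_t *ptr, uint16_t expected, uint16_t desired)
{
    return __atomic_compare_exchange_n(ptr, &expected, desired,
                                       false /* strong */,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
}

/* Hypothetical helper: atomic exchange, returning the previous value.
   Passing 0 as the new value is the "zero as an input" case. */
uint32_t swap_u32(uint32_t *ptr, uint32_t value)
{
    return __atomic_exchange_n(ptr, value, __ATOMIC_SEQ_CST);
}

/* Hypothetical helper: atomically clear the bits set in mask and return
   the previous value, i.e. a fetch-and with the inverted mask. */
uint64_t fetch_clear_u64(uint64_t *ptr, uint64_t mask)
{
    return __atomic_fetch_and(ptr, ~mask, __ATOMIC_SEQ_CST);
}
```

Compiling this file with and without -moutline-atomics and comparing the
output of -S is one way to see the difference the backport makes for a given
toolchain.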


end of thread, other threads:[~2020-04-02  7:34 UTC | newest]

Thread overview: 12+ messages
     [not found] <DEDBE6CA-4163-4B27-BC3B-D993407813E5@amazon.com>
     [not found] ` <fc24deee-e649-5651-17d3-353ec43f0d81@foss.arm.com>
2020-03-09 21:47   ` [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x Pop, Sebastian
2020-03-11 10:10     ` Kyrill Tkachov
2020-03-25  0:24 Pop, Sebastian
2020-03-31 15:47 ` Pop, Sebastian
     [not found]   ` <DB7PR08MB300296EE8E27166D95152D1393C90@DB7PR08MB3002.eurprd08.prod.outlook.com>
2020-04-01 14:26     ` Kyrylo Tkachov
2020-04-01 14:32       ` Pop, Sebastian
2020-04-01 14:35         ` Kyrylo Tkachov
2020-04-01 14:35         ` Jakub Jelinek
2020-04-01 14:40           ` Pop, Sebastian
2020-04-01 22:13 ` Christophe Lyon
2020-04-02  2:34   ` Pop, Sebastian
2020-04-02  7:34     ` Christophe Lyon
