public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* Re: [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x
@ 2020-03-25  0:24 Pop, Sebastian
  2020-03-31 15:47 ` Pop, Sebastian
  2020-04-01 22:13 ` Christophe Lyon
  0 siblings, 2 replies; 12+ messages in thread
From: Pop, Sebastian @ 2020-03-25  0:24 UTC (permalink / raw)
  To: Kyrill Tkachov, gcc-patches; +Cc: richard.henderson, Wilco Dijkstra

[-- Attachment #1: Type: text/plain, Size: 9502 bytes --]

Hi Kyrill,

Thanks for pointing out the two missing bug fixes.
Please see attached all the back-ported patches.
All the patches from trunk applied cleanly with no conflicts (except for the ChangeLog files) to the gcc-9 branch.
An up to date gcc-9 branch on which I applied the attached patches has passed bootstrap on aarch64-linux (Graviton2 with 64 N1 cores) and make check with no extra fails.
Kyrill, could you please commit the attached patches to the gcc-9 branch?

As we still don't have a copyright assignment on file, would it be possible for ARM to finish the backport to the gcc-8 branch of these patches and the atomics cleanup patches mentioned below?

I did a `git log config/aarch64/atomics.md` and there is a follow-up patch to the atomics cleanup patches:

commit e21679a8bb17aac603b8704891e60ac502200629
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Wed Nov 21 17:41:03 2018 +0100

    re PR target/87839 (ICE in final_scan_insn_1, at final.c:3070)

            PR target/87839
            * config/aarch64/atomics.md (@aarch64_compare_and_swap<mode>): Use
            rIJ constraint for aarch64_plus_operand rather than rn.

            * gcc.target/aarch64/pr87839.c: New test.

    From-SVN: r266346

That is fixing code modified in this cleanup patch:

commit d400fda3a8c3330f77eb9d51874f5482d3819a9f
Author: Richard Henderson <richard.henderson@linaro.org>
Date:   Wed Oct 31 09:42:39 2018 +0000

    aarch64: Improve cas generation


Thanks,
Sebastian


On 3/11/20, 5:11 AM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:

    CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm the sender and know the content is safe.
    
    
    
    Hi Sebastian,
    
    On 3/9/20 9:47 PM, Pop, Sebastian wrote:
    > Hi,
    >
    > Please see attached the patches to add -moutline-atomics to the gcc-9 branch.
    > Tested on graviton2 aarch64-linux with bootstrap and
    > `make check` passes with no new fails.
    > Tested `make check` on glibc built with gcc-9 with and without "-moutline-atomics"
    > and CFLAGS=" -O2 -g -fno-stack-protector -U_FORTIFY_SOURCE".
    >
    > Ok to commit to gcc-9 branch?
    
    Since this feature enables backwards-compatible deployment of LSE
    atomics, I'd support that.
    
    That is okay with me in principle after GCC 9.3 is released (the branch
    is currently frozen).
    
    However, there have been a few follow-up patches to fix some bugs
    revealed by testing.
    
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91833
    
    and
    
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91834
    
    come to mind.
    
    Can you please make sure the fixes for those are included as well?
    
    
    >
    > Does this mechanical `git am *.patch` require a copyright assignment?
    > I am still working with my employer on getting the FSF assignment signed.
    >
    > Thanks,
    > Sebastian
    >
    > PS: For gcc-8 backports there are 5 cleanup and improvement patches
    > that are needed for -moutline-atomics patches to apply cleanly.
    > Should these patches be back-ported in the same time as the flag patches,
    > or should I update the patches to apply to the older code base?
    
    Hmm... normally I'd be for them. In this case I'd want to make sure that
    there aren't any fallout fixes that we're missing.
    
    Did these patches have any bug reports against them?
    
    Thanks,
    
    Kyrill
    
    
    > Here is the list of the extra patches:
    >
    >  From 77f33f44baf24c22848197aa80962c003dd7b3e2 Mon Sep 17 00:00:00 2001
    > From: Richard Henderson <richard.henderson@linaro.org>
    > Date: Wed, 31 Oct 2018 09:29:29 +0000
    > Subject: [PATCH] aarch64: Simplify LSE cas generation
    >
    > The cas insn is a single insn, and if expanded properly need not
    > be split after reload.  Use the proper inputs for the insn.
    >
    >          * config/aarch64/aarch64.c (aarch64_expand_compare_and_swap):
    >          Force oldval into the rval register for TARGET_LSE; emit the compare
    >          during initial expansion so that it may be deleted if unused.
    >          (aarch64_gen_atomic_cas): Remove.
    >          * config/aarch64/atomics.md (@aarch64_compare_and_swap<SHORT>_lse):
    >          Change =&r to +r for operand 0; use match_dup for operand 2;
    >          remove is_weak and mod_f operands as unused.  Drop the split
    >          and merge with...
    >          (@aarch64_atomic_cas<SHORT>): ... this pattern's output; remove.
    >          (@aarch64_compare_and_swap<GPI>_lse): Similarly.
    >          (@aarch64_atomic_cas<GPI>): Similarly.
    >
    > From-SVN: r265656
    >
    >  From d400fda3a8c3330f77eb9d51874f5482d3819a9f Mon Sep 17 00:00:00 2001
    > From: Richard Henderson <richard.henderson@linaro.org>
    > Date: Wed, 31 Oct 2018 09:42:39 +0000
    > Subject: [PATCH] aarch64: Improve cas generation
    >
    > Do not zero-extend the input to the cas for subword operations;
    > instead, use the appropriate zero-extending compare insns.
    > Correct the predicates and constraints for immediate expected operand.
    >
    >          * config/aarch64/aarch64.c (aarch64_gen_compare_reg_maybe_ze): New.
    >          (aarch64_split_compare_and_swap): Use it.
    >          (aarch64_expand_compare_and_swap): Likewise.  Remove convert_modes;
    >          test oldval against the proper predicate.
    >          * config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI>):
    >          Use nonmemory_operand for expected.
    >          (cas_short_expected_pred): New.
    >          (@aarch64_compare_and_swap<SHORT>): Use it; use "rn" not "rI" to match.
    >          (@aarch64_compare_and_swap<GPI>): Use "rn" not "rI" for expected.
    >          * config/aarch64/predicates.md (aarch64_plushi_immediate): New.
    >          (aarch64_plushi_operand): New.
    >
    > From-SVN: r265657
    >
    >  From 8f5603d363a4e0453d2c38c7103aeb0bdca85c4e Mon Sep 17 00:00:00 2001
    > From: Richard Henderson <richard.henderson@linaro.org>
    > Date: Wed, 31 Oct 2018 09:47:21 +0000
    > Subject: [PATCH] aarch64: Improve swp generation
    >
    > Allow zero as an input; fix constraints; avoid unnecessary split.
    >
    >          * config/aarch64/aarch64.c (aarch64_emit_atomic_swap): Remove.
    >          (aarch64_gen_atomic_ldop): Don't call it.
    >          * config/aarch64/atomics.md (atomic_exchange<ALLI>):
    >          Use aarch64_reg_or_zero.
    >          (aarch64_atomic_exchange<ALLI>): Likewise.
    >          (aarch64_atomic_exchange<ALLI>_lse): Remove split; remove & from
    >          operand 0; use aarch64_reg_or_zero for input; merge ...
    >          (@aarch64_atomic_swp<ALLI>): ... this and remove.
    >
    > From-SVN: r265659
    >
    >  From 7803ec5ee2a547043fb6708a08ddb1361ba91202 Mon Sep 17 00:00:00 2001
    > From: Richard Henderson <richard.henderson@linaro.org>
    > Date: Wed, 31 Oct 2018 09:58:48 +0000
    > Subject: [PATCH] aarch64: Improve atomic-op lse generation
    >
    > Fix constraints; avoid unnecessary split.  Drop the use of the atomic_op
    > iterator in favor of the ATOMIC_LDOP iterator; this is simplier and more
    > logical for ldclr aka bic.
    >
    >          * config/aarch64/aarch64.c (aarch64_emit_bic): Remove.
    >          (aarch64_atomic_ldop_supported_p): Remove.
    >          (aarch64_gen_atomic_ldop): Remove.
    >          * config/aarch64/atomic.md (atomic_<atomic_optab><ALLI>):
    >          Fully expand LSE operations here.
    >          (atomic_fetch_<atomic_optab><ALLI>): Likewise.
    >          (atomic_<atomic_optab>_fetch<ALLI>): Likewise.
    >          (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse): Drop atomic_op iterator
    >          and use ATOMIC_LDOP instead; use register_operand for the input;
    >          drop the split and emit insns directly.
    >          (aarch64_atomic_fetch_<ATOMIC_LDOP><ALLI>_lse): Likewise.
    >          (aarch64_atomic_<atomic_op>_fetch<ALLI>_lse): Remove.
    >          (@aarch64_atomic_load<ATOMIC_LDOP><ALLI>): Remove.
    >
    > From-SVN: r265660
    >
    >  From 53de1ea800db54b47290d578c43892799b66c8dc Mon Sep 17 00:00:00 2001
    > From: Richard Henderson <richard.henderson@linaro.org>
    > Date: Wed, 31 Oct 2018 23:11:22 +0000
    > Subject: [PATCH] aarch64: Remove early clobber from ATOMIC_LDOP scratch
    >
    >          * config/aarch64/atomics.md (aarch64_atomic_<ATOMIC_LDOP><ALLI>_lse):
    >          The scratch register need not be early-clobber.  Document the reason
    >          why we cannot use ST<OP>.
    >
    > From-SVN: r265703
    >
    >
    >
    >
    >
    > On 2/27/20, 12:06 PM, "Kyrill Tkachov" <kyrylo.tkachov@foss.arm.com> wrote:
    >
    >      Hi Sebastian,
    >
    >      On 2/27/20 4:53 PM, Pop, Sebastian wrote:
    >      >
    >      > Hi,
    >      >
    >      > is somebody already working on backporting -moutline-atomics to gcc
    >      > 8.x and 9.x branches?
    >      >
    >      I'm not aware of such work going on.
    >
    >      Thanks,
    >
    >      Kyrill
    >
    >      > Thanks,
    >      >
    >      > Sebastian
    >      >
    >
    >
    


[-- Attachment #2: 0001-aarch64-Extend-R-for-integer-registers.patch --]
[-- Type: application/octet-stream, Size: 1766 bytes --]

From 8f806d9464ae450a209d94e98dac91a808674364 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:24 +0000
Subject: [PATCH 1/8] aarch64: Extend %R for integer registers

	* config/aarch64/aarch64.c (aarch64_print_operand): Allow integer
	registers with %R.

From-SVN: r275964
---
 gcc/config/aarch64/aarch64.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 2b09f317978..23846b2339d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7542,7 +7542,7 @@ sizetochar (int size)
      'S/T/U/V':		Print a FP/SIMD register name for a register list.
 			The register printed is the FP/SIMD register name
 			of X + 0/1/2/3 for S/T/U/V.
-     'R':		Print a scalar FP/SIMD register name + 1.
+     'R':		Print a scalar Integer/FP/SIMD register name + 1.
      'X':		Print bottom 16 bits of integer constant in hex.
      'w/x':		Print a general register name or the zero register
 			(32-bit or 64-bit).
@@ -7734,12 +7734,13 @@ aarch64_print_operand (FILE *f, rtx x, int code)
       break;
 
     case 'R':
-      if (!REG_P (x) || !FP_REGNUM_P (REGNO (x)))
-	{
-	  output_operand_lossage ("incompatible floating point / vector register operand for '%%%c'", code);
-	  return;
-	}
-      asm_fprintf (f, "q%d", REGNO (x) - V0_REGNUM + 1);
+      if (REG_P (x) && FP_REGNUM_P (REGNO (x)))
+	asm_fprintf (f, "q%d", REGNO (x) - V0_REGNUM + 1);
+      else if (REG_P (x) && GP_REGNUM_P (REGNO (x)))
+	asm_fprintf (f, "x%d", REGNO (x) - R0_REGNUM + 1);
+      else
+	output_operand_lossage ("incompatible register operand for '%%%c'",
+				code);
       break;
 
     case 'X':
-- 
2.20.1


[-- Attachment #3: 0002-aarch64-Implement-TImode-compare-and-swap.patch --]
[-- Type: application/octet-stream, Size: 9163 bytes --]

From 6d9363c6a96016f460cb17e18e17d29000f1db52 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:29 +0000
Subject: [PATCH 2/8] aarch64: Implement TImode compare-and-swap

This pattern will only be used with the __sync functions, because
we do not yet have a bare TImode atomic load.

	* config/aarch64/aarch64.c (aarch64_gen_compare_reg): Add support
	for NE comparison of TImode values.
	(aarch64_emit_load_exclusive): Add support for TImode.
	(aarch64_emit_store_exclusive): Likewise.
	(aarch64_split_compare_and_swap): Disable strong_zero_p for TImode.
	* config/aarch64/atomics.md (@atomic_compare_and_swap<ALLI_TI>):
	Change iterator from ALLI to ALLI_TI.
	(@atomic_compare_and_swap<JUST_TI>): New.
	(@atomic_compare_and_swap<JUST_TI>_lse): New.
	(aarch64_load_exclusive_pair): New.
	(aarch64_store_exclusive_pair): New.
	* config/aarch64/iterators.md (JUST_TI): New.

From-SVN: r275965
---
 gcc/config/aarch64/aarch64.c    | 47 ++++++++++++++---
 gcc/config/aarch64/atomics.md   | 91 +++++++++++++++++++++++++++++++--
 gcc/config/aarch64/iterators.md |  3 ++
 3 files changed, 130 insertions(+), 11 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 23846b2339d..ae9f567cac7 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1879,10 +1879,33 @@ emit_set_insn (rtx x, rtx y)
 rtx
 aarch64_gen_compare_reg (RTX_CODE code, rtx x, rtx y)
 {
-  machine_mode mode = SELECT_CC_MODE (code, x, y);
-  rtx cc_reg = gen_rtx_REG (mode, CC_REGNUM);
+  machine_mode cmp_mode = GET_MODE (x);
+  machine_mode cc_mode;
+  rtx cc_reg;
 
-  emit_set_insn (cc_reg, gen_rtx_COMPARE (mode, x, y));
+  if (cmp_mode == TImode)
+    {
+      gcc_assert (code == NE);
+
+      cc_mode = CCmode;
+      cc_reg = gen_rtx_REG (cc_mode, CC_REGNUM);
+
+      rtx x_lo = operand_subword (x, 0, 0, TImode);
+      rtx y_lo = operand_subword (y, 0, 0, TImode);
+      emit_set_insn (cc_reg, gen_rtx_COMPARE (cc_mode, x_lo, y_lo));
+
+      rtx x_hi = operand_subword (x, 1, 0, TImode);
+      rtx y_hi = operand_subword (y, 1, 0, TImode);
+      emit_insn (gen_ccmpdi (cc_reg, cc_reg, x_hi, y_hi,
+			     gen_rtx_EQ (cc_mode, cc_reg, const0_rtx),
+			     GEN_INT (AARCH64_EQ)));
+    }
+  else
+    {
+      cc_mode = SELECT_CC_MODE (code, x, y);
+      cc_reg = gen_rtx_REG (cc_mode, CC_REGNUM);
+      emit_set_insn (cc_reg, gen_rtx_COMPARE (cc_mode, x, y));
+    }
   return cc_reg;
 }
 
@@ -15428,16 +15451,26 @@ static void
 aarch64_emit_load_exclusive (machine_mode mode, rtx rval,
 			     rtx mem, rtx model_rtx)
 {
-  emit_insn (gen_aarch64_load_exclusive (mode, rval, mem, model_rtx));
+  if (mode == TImode)
+    emit_insn (gen_aarch64_load_exclusive_pair (gen_lowpart (DImode, rval),
+						gen_highpart (DImode, rval),
+						mem, model_rtx));
+  else
+    emit_insn (gen_aarch64_load_exclusive (mode, rval, mem, model_rtx));
 }
 
 /* Emit store exclusive.  */
 
 static void
 aarch64_emit_store_exclusive (machine_mode mode, rtx bval,
-			      rtx rval, rtx mem, rtx model_rtx)
+			      rtx mem, rtx rval, rtx model_rtx)
 {
-  emit_insn (gen_aarch64_store_exclusive (mode, bval, rval, mem, model_rtx));
+  if (mode == TImode)
+    emit_insn (gen_aarch64_store_exclusive_pair
+	       (bval, mem, operand_subword (rval, 0, 0, TImode),
+		operand_subword (rval, 1, 0, TImode), model_rtx));
+  else
+    emit_insn (gen_aarch64_store_exclusive (mode, bval, mem, rval, model_rtx));
 }
 
 /* Mark the previous jump instruction as unlikely.  */
@@ -15567,7 +15600,7 @@ aarch64_split_compare_and_swap (rtx operands[])
 	CBNZ	scratch, .label1
     .label2:
 	CMP	rval, 0.  */
-  bool strong_zero_p = !is_weak && oldval == const0_rtx;
+  bool strong_zero_p = !is_weak && oldval == const0_rtx && mode != TImode;
 
   label1 = NULL;
   if (!is_weak)
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 0f357662ac3..09d2a63c620 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -22,10 +22,10 @@
 
 (define_expand "@atomic_compare_and_swap<mode>"
   [(match_operand:SI 0 "register_operand" "")			;; bool out
-   (match_operand:ALLI 1 "register_operand" "")			;; val out
-   (match_operand:ALLI 2 "aarch64_sync_memory_operand" "")	;; memory
-   (match_operand:ALLI 3 "nonmemory_operand" "")		;; expected
-   (match_operand:ALLI 4 "aarch64_reg_or_zero" "")		;; desired
+   (match_operand:ALLI_TI 1 "register_operand" "")		;; val out
+   (match_operand:ALLI_TI 2 "aarch64_sync_memory_operand" "")	;; memory
+   (match_operand:ALLI_TI 3 "nonmemory_operand" "")		;; expected
+   (match_operand:ALLI_TI 4 "aarch64_reg_or_zero" "")		;; desired
    (match_operand:SI 5 "const_int_operand")			;; is_weak
    (match_operand:SI 6 "const_int_operand")			;; mod_s
    (match_operand:SI 7 "const_int_operand")]			;; mod_f
@@ -88,6 +88,30 @@
   }
 )
 
+(define_insn_and_split "@aarch64_compare_and_swap<mode>"
+  [(set (reg:CC CC_REGNUM)					;; bool out
+    (unspec_volatile:CC [(const_int 0)] UNSPECV_ATOMIC_CMPSW))
+   (set (match_operand:JUST_TI 0 "register_operand" "=&r")	;; val out
+    (match_operand:JUST_TI 1 "aarch64_sync_memory_operand" "+Q")) ;; memory
+   (set (match_dup 1)
+    (unspec_volatile:JUST_TI
+      [(match_operand:JUST_TI 2 "aarch64_reg_or_zero" "rZ")	;; expect
+       (match_operand:JUST_TI 3 "aarch64_reg_or_zero" "rZ")	;; desired
+       (match_operand:SI 4 "const_int_operand")			;; is_weak
+       (match_operand:SI 5 "const_int_operand")			;; mod_s
+       (match_operand:SI 6 "const_int_operand")]		;; mod_f
+      UNSPECV_ATOMIC_CMPSW))
+   (clobber (match_scratch:SI 7 "=&r"))]
+  ""
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+  {
+    aarch64_split_compare_and_swap (operands);
+    DONE;
+  }
+)
+
 (define_insn "@aarch64_compare_and_swap<mode>_lse"
   [(set (match_operand:SI 0 "register_operand" "+r")		;; val out
     (zero_extend:SI
@@ -133,6 +157,28 @@
     return "casal<atomic_sfx>\t%<w>0, %<w>2, %1";
 })
 
+(define_insn "@aarch64_compare_and_swap<mode>_lse"
+  [(set (match_operand:JUST_TI 0 "register_operand" "+r")	;; val out
+    (match_operand:JUST_TI 1 "aarch64_sync_memory_operand" "+Q")) ;; memory
+   (set (match_dup 1)
+    (unspec_volatile:JUST_TI
+      [(match_dup 0)						;; expect
+       (match_operand:JUST_TI 2 "register_operand" "r")		;; desired
+       (match_operand:SI 3 "const_int_operand")]		;; mod_s
+      UNSPECV_ATOMIC_CMPSW))]
+  "TARGET_LSE"
+{
+  enum memmodel model = memmodel_from_int (INTVAL (operands[3]));
+  if (is_mm_relaxed (model))
+    return "casp\t%0, %R0, %2, %R2, %1";
+  else if (is_mm_acquire (model) || is_mm_consume (model))
+    return "caspa\t%0, %R0, %2, %R2, %1";
+  else if (is_mm_release (model))
+    return "caspl\t%0, %R0, %2, %R2, %1";
+  else
+    return "caspal\t%0, %R0, %2, %R2, %1";
+})
+
 (define_expand "atomic_exchange<mode>"
  [(match_operand:ALLI 0 "register_operand" "")
   (match_operand:ALLI 1 "aarch64_sync_memory_operand" "")
@@ -581,6 +627,24 @@
   }
 )
 
+(define_insn "aarch64_load_exclusive_pair"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+	(unspec_volatile:DI
+	  [(match_operand:TI 2 "aarch64_sync_memory_operand" "Q")
+	   (match_operand:SI 3 "const_int_operand")]
+	  UNSPECV_LX))
+   (set (match_operand:DI 1 "register_operand" "=r")
+	(unspec_volatile:DI [(match_dup 2) (match_dup 3)] UNSPECV_LX))]
+  ""
+  {
+    enum memmodel model = memmodel_from_int (INTVAL (operands[3]));
+    if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_release (model))
+      return "ldxp\t%0, %1, %2";
+    else
+      return "ldaxp\t%0, %1, %2";
+  }
+)
+
 (define_insn "@aarch64_store_exclusive<mode>"
   [(set (match_operand:SI 0 "register_operand" "=&r")
     (unspec_volatile:SI [(const_int 0)] UNSPECV_SX))
@@ -599,6 +663,25 @@
   }
 )
 
+(define_insn "aarch64_store_exclusive_pair"
+  [(set (match_operand:SI 0 "register_operand" "=&r")
+	(unspec_volatile:SI [(const_int 0)] UNSPECV_SX))
+   (set (match_operand:TI 1 "aarch64_sync_memory_operand" "=Q")
+	(unspec_volatile:TI
+	  [(match_operand:DI 2 "aarch64_reg_or_zero" "rZ")
+	   (match_operand:DI 3 "aarch64_reg_or_zero" "rZ")
+	   (match_operand:SI 4 "const_int_operand")]
+	  UNSPECV_SX))]
+  ""
+  {
+    enum memmodel model = memmodel_from_int (INTVAL (operands[3]));
+    if (is_mm_relaxed (model) || is_mm_consume (model) || is_mm_acquire (model))
+      return "stxp\t%w0, %x2, %x3, %1";
+    else
+      return "stlxp\t%w0, %x2, %x3, %1";
+  }
+)
+
 (define_expand "mem_thread_fence"
   [(match_operand:SI 0 "const_int_operand" "")]
   ""
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 6caeeac8086..3bc49ea0238 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -29,6 +29,9 @@
 ;; Iterator for HI, SI, DI, some instructions can only work on these modes.
 (define_mode_iterator GPI_I16 [(HI "AARCH64_ISA_F16") SI DI])
 
+;; "Iterator" for just TI -- features like @pattern only work with iterators.
+(define_mode_iterator JUST_TI [TI])
+
 ;; Iterator for QI and HI modes
 (define_mode_iterator SHORT [QI HI])
 
-- 
2.20.1


[-- Attachment #4: 0003-aarch64-Tidy-aarch64_split_compare_and_swap.patch --]
[-- Type: application/octet-stream, Size: 4393 bytes --]

From 5f3c1052ca4632872bceeb3724fdbdb9ecc619c2 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:33 +0000
Subject: [PATCH 3/8] aarch64: Tidy aarch64_split_compare_and_swap

With aarch64_track_speculation, we had extra code to do exactly what the
!strong_zero_p path already did.  The rest is reducing code duplication.

	* config/aarch64/aarch64 (aarch64_split_compare_and_swap): Disable
	strong_zero_p for aarch64_track_speculation; unify some code paths;
	use aarch64_gen_compare_reg instead of open-coding.

From-SVN: r275966
---
 gcc/config/aarch64/aarch64.c | 50 ++++++++++--------------------------
 1 file changed, 14 insertions(+), 36 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ae9f567cac7..d5515f859e5 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15573,13 +15573,11 @@ aarch64_split_compare_and_swap (rtx operands[])
   /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  */
   gcc_assert (epilogue_completed);
 
-  rtx rval, mem, oldval, newval, scratch;
+  rtx rval, mem, oldval, newval, scratch, x, model_rtx;
   machine_mode mode;
   bool is_weak;
   rtx_code_label *label1, *label2;
-  rtx x, cond;
   enum memmodel model;
-  rtx model_rtx;
 
   rval = operands[0];
   mem = operands[1];
@@ -15600,7 +15598,8 @@ aarch64_split_compare_and_swap (rtx operands[])
 	CBNZ	scratch, .label1
     .label2:
 	CMP	rval, 0.  */
-  bool strong_zero_p = !is_weak && oldval == const0_rtx && mode != TImode;
+  bool strong_zero_p = (!is_weak && !aarch64_track_speculation &&
+			oldval == const0_rtx && mode != TImode);
 
   label1 = NULL;
   if (!is_weak)
@@ -15613,35 +15612,20 @@ aarch64_split_compare_and_swap (rtx operands[])
   /* The initial load can be relaxed for a __sync operation since a final
      barrier will be emitted to stop code hoisting.  */
   if (is_mm_sync (model))
-    aarch64_emit_load_exclusive (mode, rval, mem,
-				 GEN_INT (MEMMODEL_RELAXED));
+    aarch64_emit_load_exclusive (mode, rval, mem, GEN_INT (MEMMODEL_RELAXED));
   else
     aarch64_emit_load_exclusive (mode, rval, mem, model_rtx);
 
   if (strong_zero_p)
-    {
-      if (aarch64_track_speculation)
-	{
-	  /* Emit an explicit compare instruction, so that we can correctly
-	     track the condition codes.  */
-	  rtx cc_reg = aarch64_gen_compare_reg (NE, rval, const0_rtx);
-	  x = gen_rtx_NE (GET_MODE (cc_reg), cc_reg, const0_rtx);
-	}
-      else
-	x = gen_rtx_NE (VOIDmode, rval, const0_rtx);
-
-      x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-				gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
-      aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
-    }
+    x = gen_rtx_NE (VOIDmode, rval, const0_rtx);
   else
     {
-      cond = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
-      x = gen_rtx_NE (VOIDmode, cond, const0_rtx);
-      x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
-				gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
-      aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
+      rtx cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
+      x = gen_rtx_NE (VOIDmode, cc_reg, const0_rtx);
     }
+  x = gen_rtx_IF_THEN_ELSE (VOIDmode, x,
+			    gen_rtx_LABEL_REF (Pmode, label2), pc_rtx);
+  aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
 
   aarch64_emit_store_exclusive (mode, scratch, mem, newval, model_rtx);
 
@@ -15662,22 +15646,16 @@ aarch64_split_compare_and_swap (rtx operands[])
       aarch64_emit_unlikely_jump (gen_rtx_SET (pc_rtx, x));
     }
   else
-    {
-      cond = gen_rtx_REG (CCmode, CC_REGNUM);
-      x = gen_rtx_COMPARE (CCmode, scratch, const0_rtx);
-      emit_insn (gen_rtx_SET (cond, x));
-    }
+    aarch64_gen_compare_reg (NE, scratch, const0_rtx);
 
   emit_label (label2);
+
   /* If we used a CBNZ in the exchange loop emit an explicit compare with RVAL
      to set the condition flags.  If this is not used it will be removed by
      later passes.  */
   if (strong_zero_p)
-    {
-      cond = gen_rtx_REG (CCmode, CC_REGNUM);
-      x = gen_rtx_COMPARE (CCmode, rval, const0_rtx);
-      emit_insn (gen_rtx_SET (cond, x));
-    }
+    aarch64_gen_compare_reg (NE, rval, const0_rtx);
+
   /* Emit any final barrier needed for a __sync operation.  */
   if (is_mm_sync (model))
     aarch64_emit_post_barrier (model);
-- 
2.20.1


[-- Attachment #5: 0004-aarch64-Add-out-of-line-functions-for-LSE-atomics.patch --]
[-- Type: application/octet-stream, Size: 12042 bytes --]

From e12f5f09ae40201a3a95179c223a93becd9ee67f Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:38 +0000
Subject: [PATCH 4/8] aarch64: Add out-of-line functions for LSE atomics

This is the libgcc part of the interface -- providing the functions.
Rationale is provided at the top of libgcc/config/aarch64/lse.S.

	* config/aarch64/lse-init.c: New file.
	* config/aarch64/lse.S: New file.
	* config/aarch64/t-lse: New file.
	* config.host: Add t-lse to all aarch64 tuples.

From-SVN: r275967
---
 libgcc/config.host               |   4 +
 libgcc/config/aarch64/lse-init.c |  45 ++++++
 libgcc/config/aarch64/lse.S      | 235 +++++++++++++++++++++++++++++++
 libgcc/config/aarch64/t-lse      |  44 ++++++
 4 files changed, 328 insertions(+)
 create mode 100644 libgcc/config/aarch64/lse-init.c
 create mode 100644 libgcc/config/aarch64/lse.S
 create mode 100644 libgcc/config/aarch64/t-lse

diff --git a/libgcc/config.host b/libgcc/config.host
index 0f15fda3612..18e306b48a5 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -347,23 +347,27 @@ aarch64*-*-elf | aarch64*-*-rtems*)
 	extra_parts="$extra_parts crtbegin.o crtend.o crti.o crtn.o"
 	extra_parts="$extra_parts crtfastmath.o"
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	md_unwind_header=aarch64/aarch64-unwind.h
 	;;
 aarch64*-*-freebsd*)
 	extra_parts="$extra_parts crtfastmath.o"
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	md_unwind_header=aarch64/freebsd-unwind.h
 	;;
 aarch64*-*-fuchsia*)
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp"
 	;;
 aarch64*-*-linux*)
 	extra_parts="$extra_parts crtfastmath.o"
 	md_unwind_header=aarch64/linux-unwind.h
 	tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+	tmake_file="${tmake_file} ${cpu_type}/t-lse t-slibgcc-libgcc"
 	tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
 	;;
 alpha*-*-linux*)
diff --git a/libgcc/config/aarch64/lse-init.c b/libgcc/config/aarch64/lse-init.c
new file mode 100644
index 00000000000..33d29147479
--- /dev/null
+++ b/libgcc/config/aarch64/lse-init.c
@@ -0,0 +1,45 @@
+/* Out-of-line LSE atomics for AArch64 architecture, Init.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   Contributed by Linaro Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/* Define the symbol gating the LSE implementations.  */
+_Bool __aarch64_have_lse_atomics
+  __attribute__((visibility("hidden"), nocommon));
+
+/* Disable initialization of __aarch64_have_lse_atomics during bootstrap.  */
+#ifndef inhibit_libc
+# include <sys/auxv.h>
+
+/* Disable initialization if the system headers are too old.  */
+# if defined(AT_HWCAP) && defined(HWCAP_ATOMICS)
+
+static void __attribute__((constructor))
+init_have_lse_atomics (void)
+{
+  unsigned long hwcap = getauxval (AT_HWCAP);
+  __aarch64_have_lse_atomics = (hwcap & HWCAP_ATOMICS) != 0;
+}
+
+# endif /* HWCAP */
+#endif /* inhibit_libc */
diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
new file mode 100644
index 00000000000..a5f6673596c
--- /dev/null
+++ b/libgcc/config/aarch64/lse.S
@@ -0,0 +1,235 @@
+/* Out-of-line LSE atomics for AArch64 architecture.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   Contributed by Linaro Ltd.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+/*
+ * The problem that we are trying to solve is operating system deployment
+ * of ARMv8.1-Atomics, also known as Large System Exensions (LSE).
+ *
+ * There are a number of potential solutions for this problem which have
+ * been proposed and rejected for various reasons.  To recap:
+ *
+ * (1) Multiple builds.  The dynamic linker will examine /lib64/atomics/
+ * if HWCAP_ATOMICS is set, allowing entire libraries to be overwritten.
+ * However, not all Linux distributions are happy with multiple builds,
+ * and anyway it has no effect on main applications.
+ *
+ * (2) IFUNC.  We could put these functions into libgcc_s.so, and have
+ * a single copy of each function for all DSOs.  However, ARM is concerned
+ * that the branch-to-indirect-branch that is implied by using a PLT,
+ * as required by IFUNC, is too much overhead for smaller cpus.
+ *
+ * (3) Statically predicted direct branches.  This is the approach that
+ * is taken here.  These functions are linked into every DSO that uses them.
+ * All of the symbols are hidden, so that the functions are called via a
+ * direct branch.  The choice of LSE vs non-LSE is done via one byte load
+ * followed by a well-predicted direct branch.  The functions are compiled
+ * separately to minimize code size.
+ */
+
+/* Tell the assembler to accept LSE instructions.  */
+	.arch armv8-a+lse
+
+/* Declare the symbol gating the LSE implementations.  */
+	.hidden	__aarch64_have_lse_atomics
+
+/* Turn size and memory model defines into mnemonic fragments.  */
+#if SIZE == 1
+# define S     b
+# define UXT   uxtb
+#elif SIZE == 2
+# define S     h
+# define UXT   uxth
+#elif SIZE == 4 || SIZE == 8 || SIZE == 16
+# define S
+# define UXT   mov
+#else
+# error
+#endif
+
+#if MODEL == 1
+# define SUFF  _relax
+# define A
+# define L
+#elif MODEL == 2
+# define SUFF  _acq
+# define A     a
+# define L
+#elif MODEL == 3
+# define SUFF  _rel
+# define A
+# define L     l
+#elif MODEL == 4
+# define SUFF  _acq_rel
+# define A     a
+# define L     l
+#else
+# error
+#endif
+
+/* Concatenate symbols.  */
+#define glue2_(A, B)		A ## B
+#define glue2(A, B)		glue2_(A, B)
+#define glue3_(A, B, C)		A ## B ## C
+#define glue3(A, B, C)		glue3_(A, B, C)
+#define glue4_(A, B, C, D)	A ## B ## C ## D
+#define glue4(A, B, C, D)	glue4_(A, B, C, D)
+
+/* Select the size of a register, given a regno.  */
+#define x(N)			glue2(x, N)
+#define w(N)			glue2(w, N)
+#if SIZE < 8
+# define s(N)			w(N)
+#else
+# define s(N)			x(N)
+#endif
+
+#define NAME(BASE)		glue4(__aarch64_, BASE, SIZE, SUFF)
+#define LDXR			glue4(ld, A, xr, S)
+#define STXR			glue4(st, L, xr, S)
+
+/* Temporary registers used.  Other than these, only the return value
+   register (x0) and the flags are modified.  */
+#define tmp0	16
+#define tmp1	17
+#define tmp2	15
+
+/* Start and end a function.  */
+.macro	STARTFN name
+	.text
+	.balign	16
+	.globl	\name
+	.hidden	\name
+	.type	\name, %function
+	.cfi_startproc
+\name:
+.endm
+
+.macro	ENDFN name
+	.cfi_endproc
+	.size	\name, . - \name
+.endm
+
+/* Branch to LABEL if LSE is disabled.  */
+.macro	JUMP_IF_NOT_LSE label
+	adrp	x(tmp0), __aarch64_have_lse_atomics
+	ldrb	w(tmp0), [x(tmp0), :lo12:__aarch64_have_lse_atomics]
+	cbz	w(tmp0), \label
+.endm
+
+#ifdef L_cas
+
+STARTFN	NAME(cas)
+	JUMP_IF_NOT_LSE	8f
+
+#if SIZE < 16
+#define CAS	glue4(cas, A, L, S)
+
+	CAS		s(0), s(1), [x2]
+	ret
+
+8:	UXT		s(tmp0), s(0)
+0:	LDXR		s(0), [x2]
+	cmp		s(0), s(tmp0)
+	bne		1f
+	STXR		w(tmp1), s(1), [x2]
+	cbnz		w(tmp1), 0b
+1:	ret
+
+#else
+#define LDXP	glue3(ld, A, xp)
+#define STXP	glue3(st, L, xp)
+#define CASP	glue3(casp, A, L)
+
+	CASP		x0, x1, x2, x3, [x4]
+	ret
+
+8:	mov		x(tmp0), x0
+	mov		x(tmp1), x1
+0:	LDXP		x0, x1, [x4]
+	cmp		x0, x(tmp0)
+	ccmp		x1, x(tmp1), #0, eq
+	bne		1f
+	STXP		w(tmp2), x(tmp0), x(tmp1), [x4]
+	cbnz		w(tmp2), 0b
+1:	ret
+
+#endif
+
+ENDFN	NAME(cas)
+#endif
+
+#ifdef L_swp
+#define SWP	glue4(swp, A, L, S)
+
+STARTFN	NAME(swp)
+	JUMP_IF_NOT_LSE	8f
+
+	SWP		s(0), s(0), [x1]
+	ret
+
+8:	mov		s(tmp0), s(0)
+0:	LDXR		s(0), [x1]
+	STXR		w(tmp1), s(tmp0), [x1]
+	cbnz		w(tmp1), 0b
+	ret
+
+ENDFN	NAME(swp)
+#endif
+
+#if defined(L_ldadd) || defined(L_ldclr) \
+    || defined(L_ldeor) || defined(L_ldset)
+
+#ifdef L_ldadd
+#define LDNM	ldadd
+#define OP	add
+#elif defined(L_ldclr)
+#define LDNM	ldclr
+#define OP	bic
+#elif defined(L_ldeor)
+#define LDNM	ldeor
+#define OP	eor
+#elif defined(L_ldset)
+#define LDNM	ldset
+#define OP	orr
+#else
+#error
+#endif
+#define LDOP	glue4(LDNM, A, L, S)
+
+STARTFN	NAME(LDNM)
+	JUMP_IF_NOT_LSE	8f
+
+	LDOP		s(0), s(0), [x1]
+	ret
+
+8:	mov		s(tmp0), s(0)
+0:	LDXR		s(0), [x1]
+	OP		s(tmp1), s(0), s(tmp0)
+	STXR		w(tmp1), s(tmp1), [x1]
+	cbnz		w(tmp1), 0b
+	ret
+
+ENDFN	NAME(LDNM)
+#endif
diff --git a/libgcc/config/aarch64/t-lse b/libgcc/config/aarch64/t-lse
new file mode 100644
index 00000000000..fe3868dacbf
--- /dev/null
+++ b/libgcc/config/aarch64/t-lse
@@ -0,0 +1,44 @@
+# Out-of-line LSE atomics for AArch64 architecture.
+# Copyright (C) 2019 Free Software Foundation, Inc.
+# Contributed by Linaro Ltd.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it
+# under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# <http://www.gnu.org/licenses/>.
+
+# Compare-and-swap has 5 sizes and 4 memory models.
+S0 := $(foreach s, 1 2 4 8 16, $(addsuffix _$(s), cas))
+O0 := $(foreach m, 1 2 3 4, $(addsuffix _$(m)$(objext), $(S0)))
+
+# Swap, Load-and-operate have 4 sizes and 4 memory models
+S1 := $(foreach s, 1 2 4 8, $(addsuffix _$(s), swp ldadd ldclr ldeor ldset))
+O1 := $(foreach m, 1 2 3 4, $(addsuffix _$(m)$(objext), $(S1)))
+
+LSE_OBJS := $(O0) $(O1)
+
+libgcc-objects += $(LSE_OBJS) lse-init$(objext)
+
+empty      =
+space      = $(empty) $(empty)
+PAT_SPLIT  = $(subst _,$(space),$(*F))
+PAT_BASE   = $(word 1,$(PAT_SPLIT))
+PAT_N      = $(word 2,$(PAT_SPLIT))
+PAT_M      = $(word 3,$(PAT_SPLIT))
+
+lse-init$(objext): $(srcdir)/config/aarch64/lse-init.c
+	$(gcc_compile) -c $<
+
+$(LSE_OBJS): $(srcdir)/config/aarch64/lse.S
+	$(gcc_compile) -DL_$(PAT_BASE) -DSIZE=$(PAT_N) -DMODEL=$(PAT_M) -c $<
-- 
2.20.1


[-- Attachment #6: 0005-aarch64-Implement-moutline-atomics.patch --]
[-- Type: application/octet-stream, Size: 20911 bytes --]

From 6db8dfc27d0824af94fbc712615dcdcdac865f19 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Thu, 19 Sep 2019 14:36:43 +0000
Subject: [PATCH 5/8] aarch64: Implement -moutline-atomics

	* config/aarch64/aarch64.opt (-moutline-atomics): New.
	* config/aarch64/aarch64.c (aarch64_atomic_ool_func): New.
	(aarch64_ool_cas_names, aarch64_ool_swp_names): New.
	(aarch64_ool_ldadd_names, aarch64_ool_ldset_names): New.
	(aarch64_ool_ldclr_names, aarch64_ool_ldeor_names): New.
	(aarch64_expand_compare_and_swap): Honor TARGET_OUTLINE_ATOMICS.
	* config/aarch64/atomics.md (atomic_exchange<ALLI>): Likewise.
	(atomic_<atomic_op><ALLI>): Likewise.
	(atomic_fetch_<atomic_op><ALLI>): Likewise.
	(atomic_<atomic_op>_fetch<ALLI>): Likewise.
	* doc/invoke.texi: Document -moutline-atomics.
testsuite/
	* gcc.target/aarch64/atomic-op-acq_rel.c: Use -mno-outline-atomics.
	* gcc.target/aarch64/atomic-comp-swap-release-acquire.c: Likewise.
	* gcc.target/aarch64/atomic-op-acquire.c: Likewise.
	* gcc.target/aarch64/atomic-op-char.c: Likewise.
	* gcc.target/aarch64/atomic-op-consume.c: Likewise.
	* gcc.target/aarch64/atomic-op-imm.c: Likewise.
	* gcc.target/aarch64/atomic-op-int.c: Likewise.
	* gcc.target/aarch64/atomic-op-long.c: Likewise.
	* gcc.target/aarch64/atomic-op-relaxed.c: Likewise.
	* gcc.target/aarch64/atomic-op-release.c: Likewise.
	* gcc.target/aarch64/atomic-op-seq_cst.c: Likewise.
	* gcc.target/aarch64/atomic-op-short.c: Likewise.
	* gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c: Likewise.
	* gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c: Likewise.
	* gcc.target/aarch64/sync-comp-swap.c: Likewise.
	* gcc.target/aarch64/sync-op-acquire.c: Likewise.
	* gcc.target/aarch64/sync-op-full.c: Likewise.

From-SVN: r275968
---
 gcc/config/aarch64/aarch64-protos.h           | 13 +++
 gcc/config/aarch64/aarch64.c                  | 87 +++++++++++++++++
 gcc/config/aarch64/aarch64.opt                |  3 +
 gcc/config/aarch64/atomics.md                 | 94 +++++++++++++++++--
 gcc/doc/invoke.texi                           | 16 +++-
 .../atomic-comp-swap-release-acquire.c        |  2 +-
 .../gcc.target/aarch64/atomic-op-acq_rel.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-acquire.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-char.c       |  2 +-
 .../gcc.target/aarch64/atomic-op-consume.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-imm.c        |  2 +-
 .../gcc.target/aarch64/atomic-op-int.c        |  2 +-
 .../gcc.target/aarch64/atomic-op-long.c       |  2 +-
 .../gcc.target/aarch64/atomic-op-relaxed.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-release.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-seq_cst.c    |  2 +-
 .../gcc.target/aarch64/atomic-op-short.c      |  2 +-
 .../aarch64/atomic_cmp_exchange_zero_reg_1.c  |  2 +-
 .../atomic_cmp_exchange_zero_strong_1.c       |  2 +-
 .../gcc.target/aarch64/sync-comp-swap.c       |  2 +-
 .../gcc.target/aarch64/sync-op-acquire.c      |  2 +-
 .../gcc.target/aarch64/sync-op-full.c         |  2 +-
 22 files changed, 221 insertions(+), 26 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index c083cad5327..b9bfb281275 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -644,4 +644,17 @@ poly_uint64 aarch64_regmode_natural_size (machine_mode);
 
 bool aarch64_high_bits_all_ones_p (HOST_WIDE_INT);
 
+struct atomic_ool_names
+{
+    const char *str[5][4];
+};
+
+rtx aarch64_atomic_ool_func(machine_mode mode, rtx model_rtx,
+			    const atomic_ool_names *names);
+extern const atomic_ool_names aarch64_ool_swp_names;
+extern const atomic_ool_names aarch64_ool_ldadd_names;
+extern const atomic_ool_names aarch64_ool_ldset_names;
+extern const atomic_ool_names aarch64_ool_ldclr_names;
+extern const atomic_ool_names aarch64_ool_ldeor_names;
+
 #endif /* GCC_AARCH64_PROTOS_H */
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index d5515f859e5..f81c2947f16 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15482,6 +15482,82 @@ aarch64_emit_unlikely_jump (rtx insn)
   add_reg_br_prob_note (jump, profile_probability::very_unlikely ());
 }
 
+/* We store the names of the various atomic helpers in a 5x4 array.
+   Return the libcall function given MODE, MODEL and NAMES.  */
+
+rtx
+aarch64_atomic_ool_func(machine_mode mode, rtx model_rtx,
+			const atomic_ool_names *names)
+{
+  memmodel model = memmodel_base (INTVAL (model_rtx));
+  int mode_idx, model_idx;
+
+  switch (mode)
+    {
+    case E_QImode:
+      mode_idx = 0;
+      break;
+    case E_HImode:
+      mode_idx = 1;
+      break;
+    case E_SImode:
+      mode_idx = 2;
+      break;
+    case E_DImode:
+      mode_idx = 3;
+      break;
+    case E_TImode:
+      mode_idx = 4;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  switch (model)
+    {
+    case MEMMODEL_RELAXED:
+      model_idx = 0;
+      break;
+    case MEMMODEL_CONSUME:
+    case MEMMODEL_ACQUIRE:
+      model_idx = 1;
+      break;
+    case MEMMODEL_RELEASE:
+      model_idx = 2;
+      break;
+    case MEMMODEL_ACQ_REL:
+    case MEMMODEL_SEQ_CST:
+      model_idx = 3;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  return init_one_libfunc_visibility (names->str[mode_idx][model_idx],
+				      VISIBILITY_HIDDEN);
+}
+
+#define DEF0(B, N) \
+  { "__aarch64_" #B #N "_relax", \
+    "__aarch64_" #B #N "_acq", \
+    "__aarch64_" #B #N "_rel", \
+    "__aarch64_" #B #N "_acq_rel" }
+
+#define DEF4(B)  DEF0(B, 1), DEF0(B, 2), DEF0(B, 4), DEF0(B, 8), \
+		 { NULL, NULL, NULL, NULL }
+#define DEF5(B)  DEF0(B, 1), DEF0(B, 2), DEF0(B, 4), DEF0(B, 8), DEF0(B, 16)
+
+static const atomic_ool_names aarch64_ool_cas_names = { { DEF5(cas) } };
+const atomic_ool_names aarch64_ool_swp_names = { { DEF4(swp) } };
+const atomic_ool_names aarch64_ool_ldadd_names = { { DEF4(ldadd) } };
+const atomic_ool_names aarch64_ool_ldset_names = { { DEF4(ldset) } };
+const atomic_ool_names aarch64_ool_ldclr_names = { { DEF4(ldclr) } };
+const atomic_ool_names aarch64_ool_ldeor_names = { { DEF4(ldeor) } };
+
+#undef DEF0
+#undef DEF4
+#undef DEF5
+
 /* Expand a compare and swap pattern.  */
 
 void
@@ -15528,6 +15604,17 @@ aarch64_expand_compare_and_swap (rtx operands[])
 						   newval, mod_s));
       cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
     }
+  else if (TARGET_OUTLINE_ATOMICS)
+    {
+      /* Oldval must satisfy compare afterward.  */
+      if (!aarch64_plus_operand (oldval, mode))
+	oldval = force_reg (mode, oldval);
+      rtx func = aarch64_atomic_ool_func (mode, mod_s, &aarch64_ool_cas_names);
+      rval = emit_library_call_value (func, NULL_RTX, LCT_NORMAL, r_mode,
+				      oldval, mode, newval, mode,
+				      XEXP (mem, 0), Pmode);
+      cc_reg = aarch64_gen_compare_reg_maybe_ze (NE, rval, oldval, mode);
+    }
   else
     {
       /* The oldval predicate varies by mode.  Test it and force to reg.  */
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 3c6d1cc90ad..f474a28eb92 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -255,3 +255,6 @@ user-land code.
 TargetVariable
 long aarch64_stack_protector_guard_offset = 0
 
+moutline-atomics
+Target Report Mask(OUTLINE_ATOMICS) Save
+Generate local calls to out-of-line atomic operations.
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 09d2a63c620..cabcc58f1a0 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -186,16 +186,27 @@
   (match_operand:SI 3 "const_int_operand" "")]
   ""
   {
-    rtx (*gen) (rtx, rtx, rtx, rtx);
-
     /* Use an atomic SWP when available.  */
     if (TARGET_LSE)
-      gen = gen_aarch64_atomic_exchange<mode>_lse;
+      {
+	emit_insn (gen_aarch64_atomic_exchange<mode>_lse
+		   (operands[0], operands[1], operands[2], operands[3]));
+      }
+    else if (TARGET_OUTLINE_ATOMICS)
+      {
+	machine_mode mode = <MODE>mode;
+	rtx func = aarch64_atomic_ool_func (mode, operands[3],
+					    &aarch64_ool_swp_names);
+	rtx rval = emit_library_call_value (func, operands[0], LCT_NORMAL,
+					    mode, operands[2], mode,
+					    XEXP (operands[1], 0), Pmode);
+        emit_move_insn (operands[0], rval);
+      }
     else
-      gen = gen_aarch64_atomic_exchange<mode>;
-
-    emit_insn (gen (operands[0], operands[1], operands[2], operands[3]));
-
+      {
+	emit_insn (gen_aarch64_atomic_exchange<mode>
+		   (operands[0], operands[1], operands[2], operands[3]));
+      }
     DONE;
   }
 )
@@ -280,6 +291,39 @@
 	  }
 	operands[1] = force_reg (<MODE>mode, operands[1]);
       }
+    else if (TARGET_OUTLINE_ATOMICS)
+      {
+        const atomic_ool_names *names;
+	switch (<CODE>)
+	  {
+	  case MINUS:
+	    operands[1] = expand_simple_unop (<MODE>mode, NEG, operands[1],
+					      NULL, 1);
+	    /* fallthru */
+	  case PLUS:
+	    names = &aarch64_ool_ldadd_names;
+	    break;
+	  case IOR:
+	    names = &aarch64_ool_ldset_names;
+	    break;
+	  case XOR:
+	    names = &aarch64_ool_ldeor_names;
+	    break;
+	  case AND:
+	    operands[1] = expand_simple_unop (<MODE>mode, NOT, operands[1],
+					      NULL, 1);
+	    names = &aarch64_ool_ldclr_names;
+	    break;
+	  default:
+	    gcc_unreachable ();
+	  }
+        machine_mode mode = <MODE>mode;
+	rtx func = aarch64_atomic_ool_func (mode, operands[2], names);
+	emit_library_call_value (func, NULL_RTX, LCT_NORMAL, mode,
+				 operands[1], mode,
+				 XEXP (operands[0], 0), Pmode);
+        DONE;
+      }
     else
       gen = gen_aarch64_atomic_<atomic_optab><mode>;
 
@@ -405,6 +449,40 @@
 	}
       operands[2] = force_reg (<MODE>mode, operands[2]);
     }
+  else if (TARGET_OUTLINE_ATOMICS)
+    {
+      const atomic_ool_names *names;
+      switch (<CODE>)
+	{
+	case MINUS:
+	  operands[2] = expand_simple_unop (<MODE>mode, NEG, operands[2],
+					    NULL, 1);
+	  /* fallthru */
+	case PLUS:
+	  names = &aarch64_ool_ldadd_names;
+	  break;
+	case IOR:
+	  names = &aarch64_ool_ldset_names;
+	  break;
+	case XOR:
+	  names = &aarch64_ool_ldeor_names;
+	  break;
+	case AND:
+	  operands[2] = expand_simple_unop (<MODE>mode, NOT, operands[2],
+					    NULL, 1);
+	  names = &aarch64_ool_ldclr_names;
+	  break;
+	default:
+	  gcc_unreachable ();
+	}
+      machine_mode mode = <MODE>mode;
+      rtx func = aarch64_atomic_ool_func (mode, operands[3], names);
+      rtx rval = emit_library_call_value (func, operands[0], LCT_NORMAL, mode,
+					  operands[2], mode,
+					  XEXP (operands[1], 0), Pmode);
+      emit_move_insn (operands[0], rval);
+      DONE;
+    }
   else
     gen = gen_aarch64_atomic_fetch_<atomic_optab><mode>;
 
@@ -494,7 +572,7 @@
 {
   /* Use an atomic load-operate instruction when possible.  In this case
      we will re-compute the result from the original mem value. */
-  if (TARGET_LSE)
+  if (TARGET_LSE || TARGET_OUTLINE_ATOMICS)
     {
       rtx tmp = gen_reg_rtx (<MODE>mode);
       operands[2] = force_reg (<MODE>mode, operands[2]);
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 0f6247caf51..792b768fceb 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -637,7 +637,8 @@ Objective-C and Objective-C++ Dialects}.
 -march=@var{name}  -mcpu=@var{name}  -mtune=@var{name}  @gol
 -moverride=@var{string}  -mverbose-cost-dump @gol
 -mstack-protector-guard=@var{guard} -mstack-protector-guard-reg=@var{sysreg} @gol
--mstack-protector-guard-offset=@var{offset} -mtrack-speculation }
+-mstack-protector-guard-offset=@var{offset} -mtrack-speculation @gol
+-moutline-atomics }
 
 @emph{Adapteva Epiphany Options}
 @gccoptlist{-mhalf-reg-file  -mprefer-short-insn-regs @gol
@@ -15782,6 +15783,19 @@ be used by the compiler when expanding calls to
 @code{__builtin_speculation_safe_copy} to permit a more efficient code
 sequence to be generated.
 
+@item -moutline-atomics
+@itemx -mno-outline-atomics
+Enable or disable calls to out-of-line helpers to implement atomic operations.
+These helpers will, at runtime, determine if the LSE instructions from
+ARMv8.1-A can be used; if not, they will use the load/store-exclusive
+instructions that are present in the base ARMv8.0 ISA.
+
+This option is only applicable when compiling for the base ARMv8.0
+instruction set.  If using a later revision, e.g. @option{-march=armv8.1-a}
+or @option{-march=armv8-a+lse}, the ARMv8.1-Atomics instructions will be
+used directly.  The same applies when using @option{-mcpu=} when the
+selected cpu supports the @samp{lse} feature.
+
 @item -march=@var{name}
 @opindex march
 Specify the name of the target architecture and, optionally, one or
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
index 49ca5d0d09c..a828a72aa75 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-comp-swap-release-acquire.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf -mno-outline-atomics" } */
 
 #include "atomic-comp-swap-release-acquire.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c
index 74f26348e42..6823ce381b2 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-acq_rel.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-acq_rel.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c
index 66c1b1efe20..87937de378a 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-acquire.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-acquire.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c
index c09d0434ecf..60955e57da3 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-char.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-char.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
index 5783ab84f5c..16cb11aeeaf 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-consume.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-consume.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c
index 18b8f0b04e9..bcab4e481e3 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-imm.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 int v = 0;
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c
index 8520f0839ba..040e4a8d168 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-int.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-int.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c
index d011f8c5ce2..fc88b92cd3e 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-long.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 long v = 0;
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c
index ed96bfdb978..503d62b0280 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-relaxed.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-relaxed.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c
index fc4be17de89..efe14aea7e4 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-release.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-release.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c
index 613000fe490..09973bf82ba 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-seq_cst.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-seq_cst.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c b/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c
index e82c8118ece..e1dcebb0f89 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic-op-short.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "atomic-op-short.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c
index f2a21ddf2e1..29246979bfb 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv8-a+nolse" } */
+/* { dg-options "-O2 -march=armv8-a+nolse -mno-outline-atomics" } */
 /* { dg-skip-if "" { *-*-* } { "-mcpu=*" } { "" } } */
 
 int
diff --git a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c
index 8d2ae67dfbe..6daf9b08f5a 100644
--- a/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_strong_1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -march=armv8-a+nolse" } */
+/* { dg-options "-O2 -march=armv8-a+nolse -mno-outline-atomics" } */
 /* { dg-skip-if "" { *-*-* } { "-mcpu=*" } { "" } } */
 
 int
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
index e571b2f13b3..f56415f3354 100644
--- a/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
+++ b/gcc/testsuite/gcc.target/aarch64/sync-comp-swap.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -fno-ipa-icf -mno-outline-atomics" } */
 
 #include "sync-comp-swap.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
index 357bf1be3b2..39b3144aa36 100644
--- a/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-acquire.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "sync-op-acquire.x"
 
diff --git a/gcc/testsuite/gcc.target/aarch64/sync-op-full.c b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
index c6ba1629965..6b8b2043f40 100644
--- a/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
+++ b/gcc/testsuite/gcc.target/aarch64/sync-op-full.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-march=armv8-a+nolse -O2" } */
+/* { dg-options "-march=armv8-a+nolse -O2 -mno-outline-atomics" } */
 
 #include "sync-op-full.x"
 
-- 
2.20.1


[-- Attachment #7: 0006-Fix-shrinkwrapping-interactions-with-atomics-PR92692.patch --]
[-- Type: application/octet-stream, Size: 1529 bytes --]

From f8f2936f09484522044939e438d1b5a5df35eb46 Mon Sep 17 00:00:00 2001
From: Wilco Dijkstra <wdijkstr@arm.com>
Date: Fri, 17 Jan 2020 13:17:21 +0000
Subject: [PATCH 6/8] Fix shrinkwrapping interactions with atomics (PR92692)

The separate shrinkwrapping pass may insert stores in the middle
of atomics loops which can cause issues on some implementations.
Avoid this by delaying splitting atomics patterns until after
prolog/epilog generation.

gcc/
	PR target/92692
	* config/aarch64/aarch64.c (aarch64_split_compare_and_swap)
	Add assert to ensure prolog has been emitted.
	(aarch64_split_atomic_op): Likewise.
	* config/aarch64/atomics.md (aarch64_compare_and_swap<mode>)
	Use epilogue_completed rather than reload_completed.
	(aarch64_atomic_exchange<mode>): Likewise.
	(aarch64_atomic_<atomic_optab><mode>): Likewise.
	(atomic_nand<mode>): Likewise.
	(aarch64_atomic_fetch_<atomic_optab><mode>): Likewise.
	(atomic_fetch_nand<mode>): Likewise.
	(aarch64_atomic_<atomic_optab>_fetch<mode>): Likewise.
	(atomic_nand_fetch<mode>): Likewise.
---
 gcc/config/aarch64/atomics.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index cabcc58f1a0..1458bc00095 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -104,7 +104,7 @@
    (clobber (match_scratch:SI 7 "=&r"))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
     aarch64_split_compare_and_swap (operands);
-- 
2.20.1


[-- Attachment #8: 0007-aarch64-Fix-store-exclusive-in-load-operate-LSE-help.patch --]
[-- Type: application/octet-stream, Size: 881 bytes --]

From e817b86e534823ed51d49ef0b3cd7d1b02f83d83 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 25 Sep 2019 21:48:41 +0000
Subject: [PATCH 7/8] aarch64: Fix store-exclusive in load-operate LSE helpers

	PR target/91834
	* config/aarch64/lse.S (LDNM): Ensure STXR output does not
	overlap the inputs.

From-SVN: r276133
---
 libgcc/config/aarch64/lse.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libgcc/config/aarch64/lse.S b/libgcc/config/aarch64/lse.S
index a5f6673596c..c7979382ad7 100644
--- a/libgcc/config/aarch64/lse.S
+++ b/libgcc/config/aarch64/lse.S
@@ -227,8 +227,8 @@ STARTFN	NAME(LDNM)
 8:	mov		s(tmp0), s(0)
 0:	LDXR		s(0), [x1]
 	OP		s(tmp1), s(0), s(tmp0)
-	STXR		w(tmp1), s(tmp1), [x1]
-	cbnz		w(tmp1), 0b
+	STXR		w(tmp2), s(tmp1), [x1]
+	cbnz		w(tmp2), 0b
 	ret
 
 ENDFN	NAME(LDNM)
-- 
2.20.1


[-- Attachment #9: 0008-aarch64-Configure-for-sys-auxv.h-in-libgcc-for-lse-i.patch --]
[-- Type: application/octet-stream, Size: 7719 bytes --]

From aba70574696501513eeb862bb53da8509461b783 Mon Sep 17 00:00:00 2001
From: Richard Henderson <richard.henderson@linaro.org>
Date: Wed, 25 Sep 2019 22:51:55 +0000
Subject: [PATCH 8/8] aarch64: Configure for sys/auxv.h in libgcc for
 lse-init.c

	PR target/91833
	* config/aarch64/lse-init.c: Include auto-target.h.  Disable
	initialization if !HAVE_SYS_AUXV_H.
	* configure.ac (AC_CHECK_HEADERS): Add sys/auxv.h.
	* config.in, configure: Rebuild.

From-SVN: r276134
---
 libgcc/config.in                 |  8 ++++++++
 libgcc/config/aarch64/lse-init.c |  4 +++-
 libgcc/configure                 | 26 +++++++++++++++++++-------
 libgcc/configure.ac              |  2 +-
 4 files changed, 31 insertions(+), 9 deletions(-)
 mode change 100644 => 100755 libgcc/configure

diff --git a/libgcc/config.in b/libgcc/config.in
index d634af9d949..59a3d8daf52 100644
--- a/libgcc/config.in
+++ b/libgcc/config.in
@@ -43,6 +43,9 @@
 /* Define to 1 if you have the <string.h> header file. */
 #undef HAVE_STRING_H
 
+/* Define to 1 if you have the <sys/auxv.h> header file. */
+#undef HAVE_SYS_AUXV_H
+
 /* Define to 1 if you have the <sys/stat.h> header file. */
 #undef HAVE_SYS_STAT_H
 
@@ -82,6 +85,11 @@
 /* Define to 1 if the target use emutls for thread-local storage. */
 #undef USE_EMUTLS
 
+/* Enable large inode numbers on Mac OS X 10.5.  */
+#ifndef _DARWIN_USE_64_BIT_INODE
+# define _DARWIN_USE_64_BIT_INODE 1
+#endif
+
 /* Number of bits in a file offset, on hosts where this is settable. */
 #undef _FILE_OFFSET_BITS
 
diff --git a/libgcc/config/aarch64/lse-init.c b/libgcc/config/aarch64/lse-init.c
index 33d29147479..1a8f4c55213 100644
--- a/libgcc/config/aarch64/lse-init.c
+++ b/libgcc/config/aarch64/lse-init.c
@@ -23,12 +23,14 @@ a copy of the GCC Runtime Library Exception along with this program;
 see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 <http://www.gnu.org/licenses/>.  */
 
+#include "auto-target.h"
+
 /* Define the symbol gating the LSE implementations.  */
 _Bool __aarch64_have_lse_atomics
   __attribute__((visibility("hidden"), nocommon));
 
 /* Disable initialization of __aarch64_have_lse_atomics during bootstrap.  */
-#ifndef inhibit_libc
+#if !defined(inhibit_libc) && defined(HAVE_SYS_AUXV_H)
 # include <sys/auxv.h>
 
 /* Disable initialization if the system headers are too old.  */
diff --git a/libgcc/configure b/libgcc/configure
old mode 100644
new mode 100755
index 36dbbc1f699..cf42c057352
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -674,6 +674,7 @@ infodir
 docdir
 oldincludedir
 includedir
+runstatedir
 localstatedir
 sharedstatedir
 sysconfdir
@@ -763,6 +764,7 @@ datadir='${datarootdir}'
 sysconfdir='${prefix}/etc'
 sharedstatedir='${prefix}/com'
 localstatedir='${prefix}/var'
+runstatedir='${localstatedir}/run'
 includedir='${prefix}/include'
 oldincludedir='/usr/include'
 docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
@@ -1015,6 +1017,15 @@ do
   | -silent | --silent | --silen | --sile | --sil)
     silent=yes ;;
 
+  -runstatedir | --runstatedir | --runstatedi | --runstated \
+  | --runstate | --runstat | --runsta | --runst | --runs \
+  | --run | --ru | --r)
+    ac_prev=runstatedir ;;
+  -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
+  | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
+  | --run=* | --ru=* | --r=*)
+    runstatedir=$ac_optarg ;;
+
   -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
     ac_prev=sbindir ;;
   -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
@@ -1152,7 +1163,7 @@ fi
 for ac_var in	exec_prefix prefix bindir sbindir libexecdir datarootdir \
 		datadir sysconfdir sharedstatedir localstatedir includedir \
 		oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
-		libdir localedir mandir
+		libdir localedir mandir runstatedir
 do
   eval ac_val=\$$ac_var
   # Remove trailing slashes.
@@ -1305,6 +1316,7 @@ Fine tuning of the installation directories:
   --sysconfdir=DIR        read-only single-machine data [PREFIX/etc]
   --sharedstatedir=DIR    modifiable architecture-independent data [PREFIX/com]
   --localstatedir=DIR     modifiable single-machine data [PREFIX/var]
+  --runstatedir=DIR       modifiable per-process data [LOCALSTATEDIR/run]
   --libdir=DIR            object code libraries [EPREFIX/lib]
   --includedir=DIR        C header files [PREFIX/include]
   --oldincludedir=DIR     C header files for non-gcc [/usr/include]
@@ -4170,7 +4182,7 @@ else
     We can't simply define LARGE_OFF_T to be 9223372036854775807,
     since some C++ compilers masquerading as C compilers
     incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
 		       && LARGE_OFF_T % 2147483647 == 1)
 		      ? 1 : -1];
@@ -4216,7 +4228,7 @@ else
     We can't simply define LARGE_OFF_T to be 9223372036854775807,
     since some C++ compilers masquerading as C compilers
     incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
 		       && LARGE_OFF_T % 2147483647 == 1)
 		      ? 1 : -1];
@@ -4240,7 +4252,7 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
     We can't simply define LARGE_OFF_T to be 9223372036854775807,
     since some C++ compilers masquerading as C compilers
     incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
 		       && LARGE_OFF_T % 2147483647 == 1)
 		      ? 1 : -1];
@@ -4285,7 +4297,7 @@ else
     We can't simply define LARGE_OFF_T to be 9223372036854775807,
     since some C++ compilers masquerading as C compilers
     incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
 		       && LARGE_OFF_T % 2147483647 == 1)
 		      ? 1 : -1];
@@ -4309,7 +4321,7 @@ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
     We can't simply define LARGE_OFF_T to be 9223372036854775807,
     since some C++ compilers masquerading as C compilers
     incorrectly reject 9223372036854775807.  */
-#define LARGE_OFF_T (((off_t) 1 << 62) - 1 + ((off_t) 1 << 62))
+#define LARGE_OFF_T ((((off_t) 1 << 31) << 31) - 1 + (((off_t) 1 << 31) << 31))
   int off_t_is_large[(LARGE_OFF_T % 2147483629 == 721
 		       && LARGE_OFF_T % 2147483647 == 1)
 		      ? 1 : -1];
@@ -4421,7 +4433,7 @@ as_fn_arith $ac_cv_sizeof_long_double \* 8 && long_double_type_size=$as_val
 
 for ac_header in inttypes.h stdint.h stdlib.h ftw.h \
 	unistd.h sys/stat.h sys/types.h \
-	string.h strings.h memory.h
+	string.h strings.h memory.h sys/auxv.h
 do :
   as_ac_Header=`$as_echo "ac_cv_header_$ac_header" | $as_tr_sh`
 ac_fn_c_check_header_preproc "$LINENO" "$ac_header" "$as_ac_Header"
diff --git a/libgcc/configure.ac b/libgcc/configure.ac
index 8e96cafdf8b..0762d77a801 100644
--- a/libgcc/configure.ac
+++ b/libgcc/configure.ac
@@ -207,7 +207,7 @@ AC_SUBST(long_double_type_size)
 
 AC_CHECK_HEADERS(inttypes.h stdint.h stdlib.h ftw.h \
 	unistd.h sys/stat.h sys/types.h \
-	string.h strings.h memory.h)
+	string.h strings.h memory.h sys/auxv.h)
 AC_HEADER_STDC
 
 # Check for decimal float support.
-- 
2.20.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2020-04-02  7:34 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <DEDBE6CA-4163-4B27-BC3B-D993407813E5@amazon.com>
     [not found] ` <fc24deee-e649-5651-17d3-353ec43f0d81@foss.arm.com>
2020-03-09 21:47   ` [AArch64] Backporting -moutline-atomics to gcc 9.x and 8.x Pop, Sebastian
2020-03-11 10:10     ` Kyrill Tkachov
2020-03-25  0:24 Pop, Sebastian
2020-03-31 15:47 ` Pop, Sebastian
     [not found]   ` <DB7PR08MB300296EE8E27166D95152D1393C90@DB7PR08MB3002.eurprd08.prod.outlook.com>
2020-04-01 14:26     ` Kyrylo Tkachov
2020-04-01 14:32       ` Pop, Sebastian
2020-04-01 14:35         ` Kyrylo Tkachov
2020-04-01 14:35         ` Jakub Jelinek
2020-04-01 14:40           ` Pop, Sebastian
2020-04-01 22:13 ` Christophe Lyon
2020-04-02  2:34   ` Pop, Sebastian
2020-04-02  7:34     ` Christophe Lyon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).