public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] aarch64: Fix TImode __sync_*_compare_and_exchange expansion with LSE [PR114310]
@ 2024-03-14  8:37 Jakub Jelinek
  2024-03-14 11:49 ` Richard Earnshaw (lists)
  0 siblings, 1 reply; 3+ messages in thread
From: Jakub Jelinek @ 2024-03-14  8:37 UTC (permalink / raw)
  To: Richard Earnshaw, Richard Sandiford, Kyrylo Tkachov; +Cc: gcc-patches

Hi!

The following testcase ICEs with LSE atomics.
The problem is that the @atomic_compare_and_swap<mode> expander uses
aarch64_reg_or_zero predicate for the desired operand, which is fine,
given that for most of the modes and even for TImode in some cases
it can handle zero immediate just fine, but the TImode
@aarch64_compare_and_swap<mode>_lse just uses register_operand for
that operand instead, again intentionally so, because the casp,
caspa, caspl and caspal instructions need to use a pair of consecutive
registers for the operand and xzr is just one register and we can't
just store zero into the link register to emulate pair of zeros.

So, the following patch fixes that by forcing the newval operand into
a register for the TImode LSE case.

Bootstrapped/regtested on aarch64-linux, ok for trunk?

2024-03-14  Jakub Jelinek  <jakub@redhat.com>

	PR target/114310
	* config/aarch64/aarch64.cc (aarch64_expand_compare_and_swap): For
	TImode force newval into a register.

	* gcc.dg/pr114310.c: New test.

--- gcc/config/aarch64/aarch64.cc.jj	2024-03-12 10:16:12.024101665 +0100
+++ gcc/config/aarch64/aarch64.cc	2024-03-13 18:55:39.147986554 +0100
@@ -24693,6 +24693,8 @@ aarch64_expand_compare_and_swap (rtx ope
         rval = copy_to_mode_reg (r_mode, oldval);
       else
 	emit_move_insn (rval, gen_lowpart (r_mode, oldval));
+      if (mode == TImode)
+	newval = force_reg (mode, newval);
 
       emit_insn (gen_aarch64_compare_and_swap_lse (mode, rval, mem,
 						   newval, mod_s));
--- gcc/testsuite/gcc.dg/pr114310.c.jj	2024-03-13 19:09:25.322597418 +0100
+++ gcc/testsuite/gcc.dg/pr114310.c	2024-03-13 19:08:50.802073314 +0100
@@ -0,0 +1,20 @@
+/* PR target/114310 */
+/* { dg-do run { target int128 } } */
+
+volatile __attribute__((aligned (sizeof (__int128_t)))) __int128_t v = 10;
+
+int
+main ()
+{
+#if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
+  if (__sync_val_compare_and_swap (&v, (__int128_t) 10, (__int128_t) 0) != 10)
+    __builtin_abort ();
+  if (__sync_val_compare_and_swap (&v, (__int128_t) 10, (__int128_t) 15) != 0)
+    __builtin_abort ();
+  if (__sync_val_compare_and_swap (&v, (__int128_t) 0, (__int128_t) 42) != 0)
+    __builtin_abort ();
+  if (__sync_val_compare_and_swap (&v, (__int128_t) 31, (__int128_t) 35) != 42)
+    __builtin_abort ();
+#endif
+  return 0;
+}

	Jakub


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] aarch64: Fix TImode __sync_*_compare_and_exchange expansion with LSE [PR114310]
  2024-03-14  8:37 [PATCH] aarch64: Fix TImode __sync_*_compare_and_exchange expansion with LSE [PR114310] Jakub Jelinek
@ 2024-03-14 11:49 ` Richard Earnshaw (lists)
  2024-03-14 13:02   ` Jakub Jelinek
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Earnshaw (lists) @ 2024-03-14 11:49 UTC (permalink / raw)
  To: Jakub Jelinek, Richard Sandiford, Kyrylo Tkachov; +Cc: gcc-patches



On 14/03/2024 08:37, Jakub Jelinek wrote:
> Hi!
> 
> The following testcase ICEs with LSE atomics.
> The problem is that the @atomic_compare_and_swap<mode> expander uses
> aarch64_reg_or_zero predicate for the desired operand, which is fine,
> given that for most of the modes and even for TImode in some cases
> it can handle zero immediate just fine, but the TImode
> @aarch64_compare_and_swap<mode>_lse just uses register_operand for
> that operand instead, again intentionally so, because the casp,
> caspa, caspl and caspal instructions need to use a pair of consecutive
> registers for the operand and xzr is just one register and we can't
> just store zero into the link register to emulate pair of zeros.
> 
> So, the following patch fixes that by forcing the newval operand into
> a register for the TImode LSE case.
> 
> Bootstrapped/regtested on aarch64-linux, ok for trunk?

An alternative fix would be to use a mode_attr to pick a different 
predicate for TImode.  But that's probably just a matter of taste; I'm 
not sure that one would be better than the other in reality.

OK (or with my suggestion if you prefer).

R.

> 
> 2024-03-14  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR target/114310
> 	* config/aarch64/aarch64.cc (aarch64_expand_compare_and_swap): For
> 	TImode force newval into a register.
> 
> 	* gcc.dg/pr114310.c: New test.
> 
> --- gcc/config/aarch64/aarch64.cc.jj	2024-03-12 10:16:12.024101665 +0100
> +++ gcc/config/aarch64/aarch64.cc	2024-03-13 18:55:39.147986554 +0100
> @@ -24693,6 +24693,8 @@ aarch64_expand_compare_and_swap (rtx ope
>           rval = copy_to_mode_reg (r_mode, oldval);
>         else
>   	emit_move_insn (rval, gen_lowpart (r_mode, oldval));
> +      if (mode == TImode)
> +	newval = force_reg (mode, newval);
>   
>         emit_insn (gen_aarch64_compare_and_swap_lse (mode, rval, mem,
>   						   newval, mod_s));
> --- gcc/testsuite/gcc.dg/pr114310.c.jj	2024-03-13 19:09:25.322597418 +0100
> +++ gcc/testsuite/gcc.dg/pr114310.c	2024-03-13 19:08:50.802073314 +0100
> @@ -0,0 +1,20 @@
> +/* PR target/114310 */
> +/* { dg-do run { target int128 } } */
> +
> +volatile __attribute__((aligned (sizeof (__int128_t)))) __int128_t v = 10;
> +
> +int
> +main ()
> +{
> +#if __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
> +  if (__sync_val_compare_and_swap (&v, (__int128_t) 10, (__int128_t) 0) != 10)
> +    __builtin_abort ();
> +  if (__sync_val_compare_and_swap (&v, (__int128_t) 10, (__int128_t) 15) != 0)
> +    __builtin_abort ();
> +  if (__sync_val_compare_and_swap (&v, (__int128_t) 0, (__int128_t) 42) != 0)
> +    __builtin_abort ();
> +  if (__sync_val_compare_and_swap (&v, (__int128_t) 31, (__int128_t) 35) != 42)
> +    __builtin_abort ();
> +#endif
> +  return 0;
> +}
> 
> 	Jakub
> 

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] aarch64: Fix TImode __sync_*_compare_and_exchange expansion with LSE [PR114310]
  2024-03-14 11:49 ` Richard Earnshaw (lists)
@ 2024-03-14 13:02   ` Jakub Jelinek
  0 siblings, 0 replies; 3+ messages in thread
From: Jakub Jelinek @ 2024-03-14 13:02 UTC (permalink / raw)
  To: Richard Earnshaw (lists); +Cc: Richard Sandiford, Kyrylo Tkachov, gcc-patches

On Thu, Mar 14, 2024 at 11:49:14AM +0000, Richard Earnshaw (lists) wrote:
> On 14/03/2024 08:37, Jakub Jelinek wrote:
> > Hi!
> > 
> > The following testcase ICEs with LSE atomics.
> > The problem is that the @atomic_compare_and_swap<mode> expander uses
> > aarch64_reg_or_zero predicate for the desired operand, which is fine,
> > given that for most of the modes and even for TImode in some cases
> > it can handle zero immediate just fine, but the TImode
> > @aarch64_compare_and_swap<mode>_lse just uses register_operand for
> > that operand instead, again intentionally so, because the casp,
> > caspa, caspl and caspal instructions need to use a pair of consecutive
> > registers for the operand and xzr is just one register and we can't
> > just store zero into the link register to emulate pair of zeros.
> > 
> > So, the following patch fixes that by forcing the newval operand into
> > a register for the TImode LSE case.
> > 
> > Bootstrapped/regtested on aarch64-linux, ok for trunk?
> 
> An alternative fix would be to use a mode_attr to pick a different predicate
> for TImode.  But that's probably just a matter of taste; I'm not sure that
> one would be better than the other in reality.

The reason I didn't do something like that is that for the non-LSE case,
@aarch64_compare_and_swap<mode> pattern actually handles const0_rtx desired
operand and so does the TARGET_OUTLINE_ATOMICS case, so if it was done through
a predicate, it would need to be a predicate that would be specific to this
case and would yield register_operand for TImode && TARGET_LSE and and
otherwise aarch64_reg_or_zero.  The patch seems to be simpler to me,
especially when aarch64_expand_compare_and_swap already does the forcing
stuff into registers at various other spots.

	Jakub


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2024-03-14 13:02 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-03-14  8:37 [PATCH] aarch64: Fix TImode __sync_*_compare_and_exchange expansion with LSE [PR114310] Jakub Jelinek
2024-03-14 11:49 ` Richard Earnshaw (lists)
2024-03-14 13:02   ` Jakub Jelinek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).