public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] lra: Replace subregs in bare uses & clobbers [PR108681]
@ 2023-02-07 10:29 Richard Sandiford
  2023-02-13  6:58 ` Jeff Law
  0 siblings, 1 reply; 3+ messages in thread
From: Richard Sandiford @ 2023-02-07 10:29 UTC (permalink / raw)
  To: gcc-patches

In this PR we had a write to one vector of a 4-vector tuple.
The vector had mode V1DI, and the target doesn't provide V1DI
moves, so this was converted into:

    (clobber (subreg:V1DI (reg/v:V4x1DI 92 [ b ]) 24))

followed by a DImode move.  (The clobber isn't really necessary
or helpful for a single word, but would be for wider moves.)

The subreg in the clobber survived until after RA:

    (clobber (subreg:V1DI (reg/v:V4x1DI 34 v2 [orig:92 b ] [92]) 24))

IMO this isn't well-formed.  If a subreg of a hard register simplifies
to a hard register, it should be replaced by the hard register.  If the
subreg doesn't simplify, then target-independent code can't be sure
which parts of the register are affected and which aren't.  A clobber
of such a subreg isn't useful and (again IMO) should just be removed.
Conversely, a use of such a subreg is effectively a use of the whole
inner register.

LRA has code to simplify subregs of hard registers, but it didn't
handle bare uses and clobbers.  The patch extends it to do that.

One question was whether the final_p argument to alter_subregs
should be true or false.  True is IMO dangerous, since it forces
replacements that might not be valid from a dataflow perspective,
and uses and clobbers only exist for dataflow.  As said above,
I think the correct way of handling a failed simplification would
be to delete clobbers and replace uses of subregs with uses of
the inner register.  But I didn't want to write untested code
to do that.

In the PR, the clobber caused an infinite loop in DCE, because
of a disagreement about what effect the clobber had.  But for
the reasons above, I think that was GIGO rather than a bug in
DF or DCE.

Tested on aarch64-linux-gnu & x86_64-linux-gnu.  OK to install?

Richard


gcc/
	PR rtl-optimization/108681
	* lra-spills.cc (lra_final_code_change): Extend subreg replacement
	code to handle bare uses and clobbers.

gcc/testsuite/
	PR rtl-optimization/108681
	* gcc.target/aarch64/pr108681.c: New test.
---
 gcc/lra-spills.cc                           |  3 +++
 gcc/testsuite/gcc.target/aarch64/pr108681.c | 15 +++++++++++++++
 2 files changed, 18 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr108681.c

diff --git a/gcc/lra-spills.cc b/gcc/lra-spills.cc
index a8d7e60acd3..4af85c49d43 100644
--- a/gcc/lra-spills.cc
+++ b/gcc/lra-spills.cc
@@ -860,6 +860,9 @@ lra_final_code_change (void)
 		lra_update_dup (id, i);
 		insn_change_p = true;
 	      }
+	  if ((GET_CODE (pat) == USE || GET_CODE (pat) == CLOBBER)
+	      && alter_subregs (&XEXP (pat, 0), false))
+	    insn_change_p = true;
 	  if (insn_change_p)
 	    lra_update_operator_dups (id);
 
diff --git a/gcc/testsuite/gcc.target/aarch64/pr108681.c b/gcc/testsuite/gcc.target/aarch64/pr108681.c
new file mode 100644
index 00000000000..2391eaac2f2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr108681.c
@@ -0,0 +1,15 @@
+/* { dg-options "-O" } */
+
+#pragma GCC aarch64 "arm_neon.h"
+typedef __Int64x1_t int64x1_t;
+void foo (int64x1x4_t);
+
+void
+bar (int64x1_t a)
+{
+  for (;;) {
+    int64x1x4_t b;
+    b.val[3] = a;
+    foo (b);
+  }
+}
-- 
2.25.1


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] lra: Replace subregs in bare uses & clobbers [PR108681]
  2023-02-07 10:29 [PATCH] lra: Replace subregs in bare uses & clobbers [PR108681] Richard Sandiford
@ 2023-02-13  6:58 ` Jeff Law
  2023-02-13 21:13   ` Richard Sandiford
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Law @ 2023-02-13  6:58 UTC (permalink / raw)
  To: gcc-patches, richard.sandiford



On 2/7/23 03:29, Richard Sandiford via Gcc-patches wrote:
> In this PR we had a write to one vector of a 4-vector tuple.
> The vector had mode V1DI, and the target doesn't provide V1DI
> moves, so this was converted into:
> 
>      (clobber (subreg:V1DI (reg/v:V4x1DI 92 [ b ]) 24))
> 
> followed by a DImode move.  (The clobber isn't really necessary
> or helpful for a single word, but would be for wider moves.)
> 
> The subreg in the clobber survived until after RA:
> 
>      (clobber (subreg:V1DI (reg/v:V4x1DI 34 v2 [orig:92 b ] [92]) 24))
Post-reload all (subregs (reg)) expressions are supposed to be 
simplified.  At least that's my recollection.  Though it looks like we 
don't force the simplification until final assembly output.

One might question under what circumstances simplifying (subreg (reg)) 
can legitimately fail.


> IMO this isn't well-formed.  If a subreg of a hard register simplifies
> to a hard register, it should be replaced by the hard register.  If the
> subreg doesn't simplify, then target-independent code can't be sure
> which parts of the register are affected and which aren't.  A clobber
> of such a subreg isn't useful and (again IMO) should just be removed.
> Conversely, a use of such a subreg is effectively a use of the whole
> inner register.
Agreed.

I'm not even sure that naked USE/CLOBBERS have any value post-reload 
except for the use of the return register(s) and those inserted by 
reorg.  But changing that at this stage seems inadvisable.


> 
> LRA has code to simplify subregs of hard registers, but it didn't
> handle bare uses and clobbers.  The patch extends it to do that.
> 
> One question was whether the final_p argument to alter_subregs
> should be true or false.  True is IMO dangerous, since it forces
> replacements that might not be valid from a dataflow perspective,
> and uses and clobbers only exist for dataflow.  As said above,
> I think the correct way of handling a failed simplification would
> be to delete clobbers and replace uses of subregs with uses of
> the inner register.  But I didn't want to write untested code
> to do that.
I'd go with "false" here after reviewing the code.



> 
> In the PR, the clobber caused an infinite loop in DCE, because
> of a disagreement about what effect the clobber had.  But for
> the reasons above, I think that was GIGO rather than a bug in
> DF or DCE.
> 
> Tested on aarch64-linux-gnu & x86_64-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> gcc/
> 	PR rtl-optimization/108681
> 	* lra-spills.cc (lra_final_code_change): Extend subreg replacement
> 	code to handle bare uses and clobbers.
> 
> gcc/testsuite/
> 	PR rtl-optimization/108681
> 	* gcc.target/aarch64/pr108681.c: New test.
OK
jeff

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] lra: Replace subregs in bare uses & clobbers [PR108681]
  2023-02-13  6:58 ` Jeff Law
@ 2023-02-13 21:13   ` Richard Sandiford
  0 siblings, 0 replies; 3+ messages in thread
From: Richard Sandiford @ 2023-02-13 21:13 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

Jeff Law <jeffreyalaw@gmail.com> writes:
> On 2/7/23 03:29, Richard Sandiford via Gcc-patches wrote:
>> In this PR we had a write to one vector of a 4-vector tuple.
>> The vector had mode V1DI, and the target doesn't provide V1DI
>> moves, so this was converted into:
>> 
>>      (clobber (subreg:V1DI (reg/v:V4x1DI 92 [ b ]) 24))
>> 
>> followed by a DImode move.  (The clobber isn't really necessary
>> or helpful for a single word, but would be for wider moves.)
>> 
>> The subreg in the clobber survived until after RA:
>> 
>>      (clobber (subreg:V1DI (reg/v:V4x1DI 34 v2 [orig:92 b ] [92]) 24))
> Post-reload all (subregs (reg)) expressions are supposed to be 
> simplified.  At least that's my recollection.  Though it looks like we 
> don't force the simplification until final assembly output.
>
> One might question under what circumstances simplifying (subreg (reg)) 
> can legitimately fail.

My memory's hazy, but I think e500 had instances of this.  e500's long
gone though, so maybe it's a non-issue now.

>> IMO this isn't well-formed.  If a subreg of a hard register simplifies
>> to a hard register, it should be replaced by the hard register.  If the
>> subreg doesn't simplify, then target-independent code can't be sure
>> which parts of the register are affected and which aren't.  A clobber
>> of such a subreg isn't useful and (again IMO) should just be removed.
>> Conversely, a use of such a subreg is effectively a use of the whole
>> inner register.
> Agreed.
>
> I'm not even sure that naked USE/CLOBBERS have any value post-reload 
> except for the use of the return register(s) and those inserted by 
> reorg.  But changing that at this stage seems inadvisable.

Yeah, not sure either about USEs.  I think the CLOBBERs can still be
useful as a way of avoiding partially-uninitialised registers becoming
too upwards-exposed.  E.g. when a 4-register hardreg is used and only
one register is set, the CLOBBER prevents the other 3 registers being
live on entry, or at least being kept live after some earlier unrelated
use.  That should give things like regrename more freedom.

Thanks for the review, now pushed.

Richard

>> LRA has code to simplify subregs of hard registers, but it didn't
>> handle bare uses and clobbers.  The patch extends it to do that.
>> 
>> One question was whether the final_p argument to alter_subregs
>> should be true or false.  True is IMO dangerous, since it forces
>> replacements that might not be valid from a dataflow perspective,
>> and uses and clobbers only exist for dataflow.  As said above,
>> I think the correct way of handling a failed simplification would
>> be to delete clobbers and replace uses of subregs with uses of
>> the inner register.  But I didn't want to write untested code
>> to do that.
> I'd go with "false" here after reviewing the code.
>
>
>
>> 
>> In the PR, the clobber caused an infinite loop in DCE, because
>> of a disagreement about what effect the clobber had.  But for
>> the reasons above, I think that was GIGO rather than a bug in
>> DF or DCE.
>> 
>> Tested on aarch64-linux-gnu & x86_64-linux-gnu.  OK to install?
>> 
>> Richard
>> 
>> 
>> gcc/
>> 	PR rtl-optimization/108681
>> 	* lra-spills.cc (lra_final_code_change): Extend subreg replacement
>> 	code to handle bare uses and clobbers.
>> 
>> gcc/testsuite/
>> 	PR rtl-optimization/108681
>> 	* gcc.target/aarch64/pr108681.c: New test.
> OK
> jeff

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2023-02-13 21:13 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-02-07 10:29 [PATCH] lra: Replace subregs in bare uses & clobbers [PR108681] Richard Sandiford
2023-02-13  6:58 ` Jeff Law
2023-02-13 21:13   ` Richard Sandiford

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).