public inbox for gcc-patches@gcc.gnu.org
* [PATCH] [PR100106] Reject unaligned subregs when strict alignment is required
@ 2022-05-05  6:52 Alexandre Oliva
  2022-05-05  7:59 ` Richard Sandiford
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Alexandre Oliva @ 2022-05-05  6:52 UTC (permalink / raw)
  To: gcc-patches; +Cc: ebotcazou, vmakarov, segher, dje.gcc


The testcase for pr100106, compiled with optimization for 32-bit
powerpc -mcpu=604 with -mstrict-align, expands the initialization of a
union from a float _Complex value into a load from an SCmode
constant pool entry, aligned to 4 bytes, into a DImode pseudo, which
requires 8-byte alignment.

The patch that introduced the testcase modified simplify_subreg to
avoid changing the MEM to outermode, but simplify_gen_subreg still
creates a SUBREG or a MEM that would require stricter alignment than
MEM's, and lra_constraints appears to get confused by that, repeatedly
creating unsatisfiable reloads for the SUBREG until it exceeds the
insn count.
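
As a rough illustration of the condition the patch below adds to
validate_subreg, here is a standalone toy model.  It is only a sketch:
the function name and parameters are made up for this example and are
not GCC interfaces; alignments are given in bits, as MEM_ALIGN and
GET_MODE_ALIGNMENT report them.

```c
#include <stdbool.h>

/* Toy model, NOT GCC code: returns true when a SUBREG should be
   rejected because its inner expression is a MEM whose known alignment
   is weaker than what the outer mode requires on a strict-alignment
   target.  All alignments are in bits.  */
bool
toy_reject_subreg (bool inner_is_mem, bool strict_alignment,
                   unsigned mem_align_bits, unsigned outer_mode_align_bits)
{
  return inner_is_mem
         && strict_alignment
         && mem_align_bits < outer_mode_align_bits;
}
```

With the numbers from this PR (a 32-bit aligned SCmode MEM, a DImode
outer mode needing 64 bits), the model rejects the SUBREG only when
strict alignment is in effect.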

Avoiding the unaligned SUBREG, expand splits the DImode dest into
SUBREGs and loads each SImode word of the constant pool with the
proper alignment.
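
The effect of the word-wise expansion can be sketched in plain C.  This
is only an analogy for what expand ends up doing at the RTL level, under
the assumption of 4-byte words; the function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Toy analogy, NOT GCC code: build a 64-bit value from storage that is
   only guaranteed to be 4-byte aligned by loading two 32-bit words,
   instead of issuing a single 8-byte access.  The words are reassembled
   in memory order, so the result does not depend on endianness.  */
uint64_t
toy_load_di_as_two_si (const void *p)
{
  uint32_t words[2];
  uint64_t result;

  /* First SImode-sized word.  */
  memcpy (&words[0], p, sizeof words[0]);
  /* Second SImode-sized word, 4 bytes further on.  */
  memcpy (&words[1], (const unsigned char *) p + sizeof words[0],
          sizeof words[1]);

  memcpy (&result, words, sizeof result);
  return result;
}
```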


At the time of posting this patch, it occurred to me that maybe the test
should allow paradoxical subregs of mems, or even that non-paradoxical
subregs of mems should be allowed to change to a mode with stricter
alignment, and the register allocator should deal with that somehow.
WDYT?


Regstrapped on x86_64-linux-gnu and ppc64le-linux-gnu, also tested
targeting ppc- and ppc64-vx7r2.  Ok to install?


for  gcc/ChangeLog

	PR target/100106
	* emit-rtl.cc (validate_subreg): Reject a SUBREG of a MEM that
	requires stricter alignment than MEM's.

for  gcc/testsuite/ChangeLog

	PR target/100106
	* gcc.target/powerpc/pr100106-sa.c: New.
---
 gcc/emit-rtl.cc                                |    3 +++
 gcc/testsuite/gcc.target/powerpc/pr100106-sa.c |    4 ++++
 2 files changed, 7 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr100106-sa.c

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 1e02ae254d012..642e47eada0d7 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -982,6 +982,9 @@ validate_subreg (machine_mode omode, machine_mode imode,
 
       return subreg_offset_representable_p (regno, imode, offset, omode);
     }
+  else if (reg && MEM_P (reg)
+	   && STRICT_ALIGNMENT && MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (omode))
+    return false;
 
   /* The outer size must be ordered wrt the register size, otherwise
      we wouldn't know at compile time how many registers the outer
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100106-sa.c b/gcc/testsuite/gcc.target/powerpc/pr100106-sa.c
new file mode 100644
index 0000000000000..6cc29595c8b25
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100106-sa.c
@@ -0,0 +1,4 @@
+/* { dg-do compile { target { ilp32 } } } */
+/* { dg-options "-mcpu=604 -O -mstrict-align" } */
+
+#include "../../gcc.c-torture/compile/pr100106.c"


-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist                       GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>


* Re: [PATCH] [PR100106] Reject unaligned subregs when strict alignment is required
  2022-05-05  6:52 [PATCH] [PR100106] Reject unaligned subregs when strict alignment is required Alexandre Oliva
@ 2022-05-05  7:59 ` Richard Sandiford
  2022-05-05 13:50   ` Segher Boessenkool
  2022-05-05 14:33 ` [PATCH] " Segher Boessenkool
  2022-05-06 18:04 ` [PATCH] " Vladimir Makarov
  2 siblings, 1 reply; 11+ messages in thread
From: Richard Sandiford @ 2022-05-05  7:59 UTC (permalink / raw)
  To: Alexandre Oliva via Gcc-patches; +Cc: Alexandre Oliva, dje.gcc, segher

Alexandre Oliva via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> The testcase for pr100106, compiled with optimization for 32-bit
> powerpc -mcpu=604 with -mstrict-align expands the initialization of a
> union from a float _Complex value into a load from an SCmode
> constant pool entry, aligned to 4 bytes, into a DImode pseudo,
> requiring 8-byte alignment.
>
> The patch that introduced the testcase modified simplify_subreg to
> avoid changing the MEM to outermode, but simplify_gen_subreg still
> creates a SUBREG or a MEM that would require stricter alignment than
> MEM's, and lra_constraints appears to get confused by that, repeatedly
> creating unsatisfiable reloads for the SUBREG until it exceeds the
> insn count.
>
> Avoiding the unaligned SUBREG, expand splits the DImode dest into
> SUBREGs and loads each SImode word of the constant pool with the
> proper alignment.
>
>
> At the time of posting this patch, it occurred to me that maybe the test
> should allow paradoxical subregs of mems, or even that non-paradoxical
> subregs of mems should be allowed to change to a mode with stricter
> alignment, and the register allocator should deal with that somehow.
> WDYT?
>
>
> Regstrapped on x86_64-linux-gnu and ppc64le-linux-gnu, also tested
> targeting ppc- and ppc64-vx7r2.  Ok to install?
>
>
> for  gcc/ChangeLog
>
> 	PR target/100106
> 	* emit-rtl.c (validate_subreg): Reject a SUBREG of a MEM that
> 	requires stricter alignment than MEM's.

I know this is the best being the enemy of the good, but given
that we're at the start of stage 1, would it be feasible to try
to get rid of (subreg (mem)) altogether for GCC 13?  We could do
it target-by-target, with a target macro (yes, macro :-)) that opts
in to keeping the existing behaviour.  (subreg (mem)) would then be
unconditionally invalid when the macro isn't defined.  (Even in
debug expressions, since those ought to narrow to a mem anyway.)

Thanks,
Richard



* Re: [PATCH] [PR100106] Reject unaligned subregs when strict alignment is required
  2022-05-05  7:59 ` Richard Sandiford
@ 2022-05-05 13:50   ` Segher Boessenkool
  2022-05-06 10:57     ` [PATCH v2 2/2] " Alexandre Oliva
  0 siblings, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2022-05-05 13:50 UTC (permalink / raw)
  To: Alexandre Oliva via Gcc-patches, Alexandre Oliva, dje.gcc,
	richard.sandiford

On Thu, May 05, 2022 at 08:59:21AM +0100, Richard Sandiford wrote:
> Alexandre Oliva via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> I know this is the best being the enemy of the good, but given
> that we're at the start of stage 1, would it be feasible to try
> to get rid of (subreg (mem)) altogether for GCC 13?

Yes please!

> We could do
> it target-by-target, with a target macro (yes, macro :-)) that opts
> in to keeping the existing behaviour.  (subreg (mem)) would then be
> unconditionally invalid when the macro isn't defined.  (Even in
> debug expressions, since those ought to narrow to a mem anyway.)

Or we can simply threaten to drop all unconverted targets.  That way at
least there is a *chance* (a slim chance, but still) that the conversion
will ever be finished.

Paradoxical subregs of memory are already not allowed on targets with
instruction scheduling, btw.
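
For readers less familiar with the term, a minimal sketch of that rule
follows.  It is a simplified model, not the actual operand predicates:
the function names are invented here, and mode sizes are passed as plain
bit counts.

```c
#include <stdbool.h>

/* Toy model, NOT GCC code: a subreg is paradoxical when the outer mode
   is wider than the inner one.  */
bool
toy_paradoxical_p (unsigned outer_bits, unsigned inner_bits)
{
  return outer_bits > inner_bits;
}

/* Toy model of the rule stated above: with instruction scheduling
   enabled for the target, a paradoxical subreg of memory is not an
   acceptable operand.  */
bool
toy_subreg_operand_ok (bool insn_scheduling, bool inner_is_mem,
                       unsigned outer_bits, unsigned inner_bits)
{
  if (insn_scheduling && inner_is_mem
      && toy_paradoxical_p (outer_bits, inner_bits))
    return false;
  return true;
}
```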


Segher


* Re: [PATCH] [PR100106] Reject unaligned subregs when strict alignment is required
  2022-05-05  6:52 [PATCH] [PR100106] Reject unaligned subregs when strict alignment is required Alexandre Oliva
  2022-05-05  7:59 ` Richard Sandiford
@ 2022-05-05 14:33 ` Segher Boessenkool
  2022-05-06  2:41   ` [PATCH v2] " Alexandre Oliva
  2022-05-06 18:04 ` [PATCH] " Vladimir Makarov
  2 siblings, 1 reply; 11+ messages in thread
From: Segher Boessenkool @ 2022-05-05 14:33 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, ebotcazou, vmakarov, dje.gcc

On Thu, May 05, 2022 at 03:52:01AM -0300, Alexandre Oliva wrote:
> The testcase for pr100106, compiled with optimization for 32-bit
> powerpc -mcpu=604 with -mstrict-align expands the initialization of a
> union from a float _Complex value into a load from an SCmode
> constant pool entry, aligned to 4 bytes, into a DImode pseudo,
> requiring 8-byte alignment.

> +  else if (reg && MEM_P (reg)
> +	   && STRICT_ALIGNMENT && MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (omode))
> +    return false;

Please fix the line breaks?  Either do a break before every &&, or put
as many things as possible on one line?

Note that you should never have paradoxical subregs of mem on rs6000 or
any other target with INSN_SCHEDULING.

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100106-sa.c
> @@ -0,0 +1,4 @@
> +/* { dg-do compile { target { ilp32 } } } */
> +/* { dg-options "-mcpu=604 -O -mstrict-align" } */
> +
> +#include "../../gcc.c-torture/compile/pr100106.c"

It is better to copy the 11 lines of code.

Please add a comment saying what the ilp32 is for (namely, the -mcpu=
will barf without it).  The testcase is okay with those changes, thanks!


Segher


* Re: [PATCH v2] [PR100106] Reject unaligned subregs when strict alignment is required
  2022-05-05 14:33 ` [PATCH] " Segher Boessenkool
@ 2022-05-06  2:41   ` Alexandre Oliva
  2022-07-09 17:14     ` Jeff Law
  2023-05-24  5:39     ` Alexandre Oliva
  0 siblings, 2 replies; 11+ messages in thread
From: Alexandre Oliva @ 2022-05-06  2:41 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, ebotcazou, vmakarov, dje.gcc

On May  5, 2022, Segher Boessenkool <segher@kernel.crashing.org> wrote:

> On Thu, May 05, 2022 at 03:52:01AM -0300, Alexandre Oliva wrote:
>> +  else if (reg && MEM_P (reg)
>> +	   && STRICT_ALIGNMENT && MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (omode))
>> +    return false;

> Please fix the line breaks?  Either do a break before every &&, or put
> as many things as possible on one line?

I was going for conceptual grouping of alignment-related subexprs,
but I don't care enough to fight for it.

> Note that you should never have paradoxical subregs of mem on rs6000 or
> any other target with INSN_SCHEDULING.

Great, that alleviates some of my concerns about overreaching in this patch.

>> +#include "../../gcc.c-torture/compile/pr100106.c"

> It is better to copy the 11 lines of code.

'k

> Please comment what the ilp32 is for (namely, the -mcpu= will barf
> without it)..

Ack

> The testcase is okay with those changes, thanks!

Thanks.  Here's the revised patch.

I'm now testing on several platforms a follow-up patch that introduces
TARGET_ALLOW_SUBREG_OF_MEM.


[PR100106] Reject unaligned subregs when strict alignment is required

From: Alexandre Oliva <oliva@adacore.com>

The testcase for pr100106, compiled with optimization for 32-bit
powerpc -mcpu=604 with -mstrict-align expands the initialization of a
union from a float _Complex value into a load from an SCmode
constant pool entry, aligned to 4 bytes, into a DImode pseudo,
requiring 8-byte alignment.

The patch that introduced the testcase modified simplify_subreg to
avoid changing the MEM to outermode, but simplify_gen_subreg still
creates a SUBREG or a MEM that would require stricter alignment than
MEM's, and lra_constraints appears to get confused by that, repeatedly
creating unsatisfiable reloads for the SUBREG until it exceeds the
insn count.

Avoiding the unaligned SUBREG, expand splits the DImode dest into
SUBREGs and loads each SImode word of the constant pool with the
proper alignment.


for  gcc/ChangeLog

	PR target/100106
	* emit-rtl.cc (validate_subreg): Reject a SUBREG of a MEM that
	requires stricter alignment than MEM's.

for  gcc/testsuite/ChangeLog

	PR target/100106
	* gcc.target/powerpc/pr100106-sa.c: New.
---
 gcc/emit-rtl.cc                                |    4 ++++
 gcc/testsuite/gcc.target/powerpc/pr100106-sa.c |   15 +++++++++++++++
 2 files changed, 19 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr100106-sa.c

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 1e02ae254d012..9c03e27894fff 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -982,6 +982,10 @@ validate_subreg (machine_mode omode, machine_mode imode,
 
       return subreg_offset_representable_p (regno, imode, offset, omode);
     }
+  /* Do not allow SUBREG with stricter alignment than the inner MEM.  */
+  else if (reg && MEM_P (reg) && STRICT_ALIGNMENT
+	   && MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (omode))
+    return false;
 
   /* The outer size must be ordered wrt the register size, otherwise
      we wouldn't know at compile time how many registers the outer
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100106-sa.c b/gcc/testsuite/gcc.target/powerpc/pr100106-sa.c
new file mode 100644
index 0000000000000..87634efa8d0b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100106-sa.c
@@ -0,0 +1,15 @@
+/* Require ilp32 because -mcpu=604 won't do 64 bits.  */
+/* { dg-do compile { target { ilp32 } } } */
+/* { dg-options "-mcpu=604 -O -mstrict-align" } */
+
+union a {
+  float _Complex b;
+  long long c;
+};
+
+void g(union a);
+
+void e() {
+  union a f = {1.0f};
+  g(f);
+}




* Re: [PATCH v2 2/2] [PR100106] Reject unaligned subregs when strict alignment is required
  2022-05-05 13:50   ` Segher Boessenkool
@ 2022-05-06 10:57     ` Alexandre Oliva
  2022-05-09  8:09       ` Richard Sandiford
  0 siblings, 1 reply; 11+ messages in thread
From: Alexandre Oliva @ 2022-05-06 10:57 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Alexandre Oliva via Gcc-patches, dje.gcc, richard.sandiford

On May  5, 2022, Segher Boessenkool <segher@kernel.crashing.org> wrote:

> On Thu, May 05, 2022 at 08:59:21AM +0100, Richard Sandiford wrote:
>> Alexandre Oliva via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> I know this is the best being the enemy of the good, but given
>> that we're at the start of stage 1, would it be feasible to try
>> to get rid of (subreg (mem)) altogether for GCC 13?

> Yes please!

I'm not sure this is what you two had in mind, but the news I have is
not great.  With this patch, x86_64 has some regressions in vector
testcases (*), and ppc64le doesn't bootstrap (tsan_interface_atomic.o
ends up with a nil SET_DEST in split_all_insns).  aarch64 is still
building stage2.

I'm not sure this is enough.  IIRC register allocation modifies in place
pseudos that can't be assigned to hard registers, turning them into
MEMs.  If that's so, SUBREGs of such pseudos will silently become
SUBREGs of MEMs, and I don't know that they are validated again and, if
so, what happens to those that fail validation.
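
To make the concern concrete, here is a toy model of how an in-place
rewrite can slip past a creation-time check.  Everything in it (the
types, the function names) is invented for illustration, and it says
nothing about what the register allocator actually does beyond the
recollection above.

```c
#include <stdbool.h>

/* Toy model, NOT GCC code.  */
enum toy_code { TOY_REG, TOY_MEM };
struct toy_rtx { enum toy_code code; };
struct toy_subreg { struct toy_rtx *inner; };

/* Creation-time validation: refuse a SUBREG whose inner is a MEM.  */
bool
toy_validate_subreg (const struct toy_rtx *inner)
{
  return inner->code != TOY_MEM;
}

/* Build a subreg only if it validates; returns false on rejection.  */
bool
toy_make_subreg (struct toy_subreg *out, struct toy_rtx *inner)
{
  if (!toy_validate_subreg (inner))
    return false;
  out->inner = inner;
  return true;
}

/* Models a spill that rewrites the pseudo object itself into a MEM.
   Every subreg already pointing at it now wraps a MEM, and nothing
   re-runs the validation.  */
void
toy_spill_in_place (struct toy_rtx *pseudo)
{
  pseudo->code = TOY_MEM;
}
```

A subreg built while the inner object was still a register passes the
check, yet after toy_spill_in_place it refers to a MEM that the same
check would have refused.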

I kind of feel that this is more than I can tackle ATM, so I'd
appreciate it if someone else would take this up and drive this transition.


Disallow SUBREG of MEM

Introduce TARGET_ALLOW_SUBREG_OF_MEM, defaulting to 0.

Reject SUBREG of MEM regardless of alignment, unless the macro is
defined to nonzero.
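
The intended policy can be sketched outside of GCC as follows.  This is
a simplified standalone model: the function name is hypothetical, the
inner expression is assumed to be a MEM already, and the explicit
default definition of the macro is added here only so that the snippet
compiles on its own.

```c
#include <stdbool.h>

/* In this sketch the macro defaults to 0 when the target does not
   define it.  */
#ifndef TARGET_ALLOW_SUBREG_OF_MEM
#define TARGET_ALLOW_SUBREG_OF_MEM 0
#endif

/* Toy model, NOT GCC code: decide whether a SUBREG of a MEM is
   rejected.  Alignments are in bits.  */
bool
toy_reject_subreg_of_mem_v2 (bool strict_alignment, unsigned mem_align_bits,
                             unsigned outer_align_bits)
{
  bool misaligned = strict_alignment && mem_align_bits < outer_align_bits;

#if TARGET_ALLOW_SUBREG_OF_MEM
  /* Opted-in targets keep the alignment-only restriction.  */
  return misaligned;
#else
  /* Everyone else rejects SUBREG of MEM unconditionally.  */
  (void) misaligned;
  return true;
#endif
}
```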


for  gcc/ChangeLog

	PR target/100106
	* emit-rtl.cc (validate_subreg) [!TARGET_ALLOW_SUBREG_OF_MEM]:
	Reject SUBREG of MEM.
---
 gcc/emit-rtl.cc |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc
index 9c03e27894fff..f055179b3b8a6 100644
--- a/gcc/emit-rtl.cc
+++ b/gcc/emit-rtl.cc
@@ -983,8 +983,12 @@ validate_subreg (machine_mode omode, machine_mode imode,
       return subreg_offset_representable_p (regno, imode, offset, omode);
     }
   /* Do not allow SUBREG with stricter alignment than the inner MEM.  */
-  else if (reg && MEM_P (reg) && STRICT_ALIGNMENT
-	   && MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (omode))
+  else if (reg && MEM_P (reg)
+#if TARGET_ALLOW_SUBREG_OF_MEM /* ??? Reject them all eventually.  */
+	   && STRICT_ALIGNMENT
+	   && MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (omode)
+#endif
+	   )
     return false;
 
   /* The outer size must be ordered wrt the register size, otherwise



(*) here are the x86_64 regressions introduced by the patch:

+ FAIL: gcc.target/i386/avx-2.c (internal compiler error: in gen_rtx_SUBREG, at emit-rtl.cc:1030)
+ FAIL: gcc.target/i386/avx-2.c (test for excess errors)
+ FAIL: gcc.target/i386/sse-14.c (internal compiler error: in gen_rtx_SUBREG, at emit-rtl.cc:1030)
+ FAIL: gcc.target/i386/sse-14.c (test for excess errors)
+ FAIL: gcc.target/i386/sse-22.c (internal compiler error: in gen_rtx_SUBREG, at emit-rtl.cc:1030)
+ FAIL: gcc.target/i386/sse-22.c (test for excess errors)
+ FAIL: gcc.target/i386/sse-22a.c (internal compiler error: in gen_rtx_SUBREG, at emit-rtl.cc:1030)
+ FAIL: gcc.target/i386/sse-22a.c (test for excess errors)



* Re: [PATCH] [PR100106] Reject unaligned subregs when strict alignment is required
  2022-05-05  6:52 [PATCH] [PR100106] Reject unaligned subregs when strict alignment is required Alexandre Oliva
  2022-05-05  7:59 ` Richard Sandiford
  2022-05-05 14:33 ` [PATCH] " Segher Boessenkool
@ 2022-05-06 18:04 ` Vladimir Makarov
  2 siblings, 0 replies; 11+ messages in thread
From: Vladimir Makarov @ 2022-05-06 18:04 UTC (permalink / raw)
  To: Alexandre Oliva, gcc-patches; +Cc: ebotcazou, segher, dje.gcc


On 2022-05-05 02:52, Alexandre Oliva wrote:
>
> Regstrapped on x86_64-linux-gnu and ppc64le-linux-gnu, also tested
> targeting ppc- and ppc64-vx7r2.  Ok to install?
>
I am OK with the modified version of the patch.  It looks reasonable to
me and I support committing it.

But I don't think I can approve the patch formally, as emit-rtl.cc is
outside my jurisdiction and validate_subreg is used in many places besides RA.

Sorry, Alex, some global reviewer should do this.




* Re: [PATCH v2 2/2] [PR100106] Reject unaligned subregs when strict alignment is required
  2022-05-06 10:57     ` [PATCH v2 2/2] " Alexandre Oliva
@ 2022-05-09  8:09       ` Richard Sandiford
  0 siblings, 0 replies; 11+ messages in thread
From: Richard Sandiford @ 2022-05-09  8:09 UTC (permalink / raw)
  To: Alexandre Oliva
  Cc: Segher Boessenkool, Alexandre Oliva via Gcc-patches, dje.gcc

Alexandre Oliva <oliva@adacore.com> writes:
> On May  5, 2022, Segher Boessenkool <segher@kernel.crashing.org> wrote:
>
>> On Thu, May 05, 2022 at 08:59:21AM +0100, Richard Sandiford wrote:
>>> Alexandre Oliva via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>>> I know this is the best being the enemy of the good, but given
>>> that we're at the start of stage 1, would it be feasible to try
>>> to get rid of (subreg (mem)) altogether for GCC 13?
>
>> Yes please!
>
> I'm not sure this is what you two had in mind, but the news I have is
> not great.  With this patch, x86_64 has some regressions in vector
> testcases (*), and ppc64le doesn't bootstrap (tsan_interface_atomic.o
> ends up with a nil SET_DEST in split all insns).  aarch64 is still
> building stage2.
>
> I'm not sure this is enough.  IIRC register allocation modifies in place
> pseudos that can't be assigned to hard registers, turning them into
> MEMs.  If that's so, SUBREGs of such pseudos will silently become
> SUBREGs of MEMs, and I don't know that they are validated again and, if
> so, what happens to those that fail validation.

Yeah, the changes would be a bit more invasive than this.  They would
touch more than just emit-rtl.cc.

> I kind of feel that this is more than I can tackle ATM, so I'd
> appreciate if someone else would take this up and drive this transition.

OK, I'll have a go if there's time.

Thanks,
Richard



* Re: [PATCH v2] [PR100106] Reject unaligned subregs when strict alignment is required
  2022-05-06  2:41   ` [PATCH v2] " Alexandre Oliva
@ 2022-07-09 17:14     ` Jeff Law
  2023-05-24  5:39     ` Alexandre Oliva
  1 sibling, 0 replies; 11+ messages in thread
From: Jeff Law @ 2022-07-09 17:14 UTC (permalink / raw)
  To: gcc-patches



On 5/5/2022 8:41 PM, Alexandre Oliva via Gcc-patches wrote:
> On May  5, 2022, Segher Boessenkool <segher@kernel.crashing.org> wrote:
>
>> On Thu, May 05, 2022 at 03:52:01AM -0300, Alexandre Oliva wrote:
>>> +  else if (reg && MEM_P (reg)
>>> +	   && STRICT_ALIGNMENT && MEM_ALIGN (reg) < GET_MODE_ALIGNMENT (omode))
>>> +    return false;
>> Please fix the line breaks?  Either do a break before every &&, or put
>> as many things as possible on one line?
> I was going for conceptual grouping of alignment-related subexprs,
> but I don't care enough to fight for it.
>
>> Note that you should never have paradoxical subregs of mem on rs6000 or
>> any other target with INSN_SCHEDULING.
> Great, that alleviates some of my concerns about overreaching in this patch.
>
>>> +#include "../../gcc.c-torture/compile/pr100106.c"
>> It is better to copy the 11 lines of code.
> 'k
>
>> Please comment what the ilp32 is for (namely, the -mcpu= will barf
>> without it)..
> Ack
>
>> The testcase is okay with those changes, thanks!
> Thanks.  Here's the revised patch.
>
> I'm now testing on several platforms a follow-up patch that introduces
> TARGET_ALLOW_SUBREG_OF_MEM.
>
>
> [PR100106] Reject unaligned subregs when strict alignment is required
>
> From: Alexandre Oliva <oliva@adacore.com>
>
> The testcase for pr100106, compiled with optimization for 32-bit
> powerpc -mcpu=604 with -mstrict-align expands the initialization of a
> union from a float _Complex value into a load from an SCmode
> constant pool entry, aligned to 4 bytes, into a DImode pseudo,
> requiring 8-byte alignment.
>
> The patch that introduced the testcase modified simplify_subreg to
> avoid changing the MEM to outermode, but simplify_gen_subreg still
> creates a SUBREG or a MEM that would require stricter alignment than
> MEM's, and lra_constraints appears to get confused by that, repeatedly
> creating unsatisfiable reloads for the SUBREG until it exceeds the
> insn count.
>
> Avoiding the unaligned SUBREG, expand splits the DImode dest into
> SUBREGs and loads each SImode word of the constant pool with the
> proper alignment.
>
>
> for  gcc/ChangeLog
>
> 	PR target/100106
> 	* emit-rtl.cc (validate_subreg): Reject a SUBREG of a MEM that
> 	requires stricter alignment than MEM's.
>
> for  gcc/testsuite/ChangeLog
>
> 	PR target/100106
> 	* gcc.target/powerpc/pr100106-sa.c: New.
OK.
jeff



* Re: [PATCH v2] [PR100106] Reject unaligned subregs when strict alignment is required
  2022-05-06  2:41   ` [PATCH v2] " Alexandre Oliva
  2022-07-09 17:14     ` Jeff Law
@ 2023-05-24  5:39     ` Alexandre Oliva
  2023-05-24  9:04       ` Richard Biener
  1 sibling, 1 reply; 11+ messages in thread
From: Alexandre Oliva @ 2023-05-24  5:39 UTC (permalink / raw)
  To: gcc-patches; +Cc: ebotcazou

On May  5, 2022, Alexandre Oliva <oliva@adacore.com> wrote:

> for  gcc/ChangeLog

> 	PR target/100106
> 	* emit-rtl.cc (validate_subreg): Reject a SUBREG of a MEM that
> 	requires stricter alignment than MEM's.

> for  gcc/testsuite/ChangeLog

> 	PR target/100106
> 	* gcc.target/powerpc/pr100106-sa.c: New.

Ping?
https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594166.html

The testcase variant was approved, but the reformatted patch is still
pending review, despite support from Vlad Makarov for the original one;
the suggested separate followup patch, mentioned in the linked email,
turned out to be far more involved than anticipated, and needs further
work, but it's independent from this self-contained fix.



* Re: [PATCH v2] [PR100106] Reject unaligned subregs when strict alignment is required
  2023-05-24  5:39     ` Alexandre Oliva
@ 2023-05-24  9:04       ` Richard Biener
  0 siblings, 0 replies; 11+ messages in thread
From: Richard Biener @ 2023-05-24  9:04 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: gcc-patches, ebotcazou

On Wed, May 24, 2023 at 7:40 AM Alexandre Oliva via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> On May  5, 2022, Alexandre Oliva <oliva@adacore.com> wrote:
>
> > for  gcc/ChangeLog
>
> >       PR target/100106
> >       * emit-rtl.cc (validate_subreg): Reject a SUBREG of a MEM that
> >       requires stricter alignment than MEM's.
>
> > for  gcc/testsuite/ChangeLog
>
> >       PR target/100106
> >       * gcc.target/powerpc/pr100106-sa.c: New.
>
> Ping?
> https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594166.html
>
> The testcase variant was approved, but the reformatted patch is still
> pending review, despite support from Vlad Makarov for the original one;
> the suggested separate followup patch, mentioned in the linked email,
> turned out to be far more involved than anticipated, and needs further
> work, but it's independent from this self-contained fix.

OK.

