public inbox for gcc-patches@gcc.gnu.org
* [rs6000 PATCH] PR target/105991: Recognize PLUS and XOR forms of rldimi.
@ 2022-06-17  5:13 Roger Sayle
  2022-06-20 22:10 ` Segher Boessenkool
  0 siblings, 1 reply; 4+ messages in thread
From: Roger Sayle @ 2022-06-17  5:13 UTC
  To: gcc-patches; +Cc: 'Marek Polacek'

[-- Attachment #1: Type: text/plain, Size: 1424 bytes --]


This patch addresses PR target/105991, where a change to prefer representing
shifts and adds at the tree-level as multiplications causes problems for
the rldimi patterns in the powerpc backend.  The issue is that rs6000.md
models this pattern using IOR, and some variants that have the equivalent
PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
This is fixed in this patch by adding a define_insn_and_split to locally
canonicalize the PLUS and XOR forms to the backend's preferred IOR form.
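
As an illustration of why the three forms are interchangeable here, the
following is a minimal C sketch (the function names are made up for this
example and do not appear in the patch).  When one operand has only its
low 32 bits set and the other has its low 32 bits clear, no bit position
is set in both, so there are no carries and IOR, PLUS and XOR all give
the same result:

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of VALUE, as in "value &= 0xffffffff".  */
static uint64_t low_part (uint64_t value)
{
  return value & UINT64_C (0xffffffff);
}

/* VALUE moved into the high half; the low 32 bits of the result are zero.  */
static uint64_t high_part (uint64_t value)
{
  return value << 32;
}

/* The same insert operation written with IOR, PLUS and XOR.  */
static uint64_t combine_ior (uint64_t v)
{
  return low_part (v) | high_part (v);
}

static uint64_t combine_plus (uint64_t v)
{
  return low_part (v) + high_part (v);
}

static uint64_t combine_xor (uint64_t v)
{
  return low_part (v) ^ high_part (v);
}

/* Abort if the three forms disagree for V.  */
static void check_forms (uint64_t v)
{
  /* The two parts share no set bits, so no carry can occur.  */
  assert ((low_part (v) & high_part (v)) == 0);
  assert (combine_plus (v) == combine_ior (v));
  assert (combine_xor (v) == combine_ior (v));
}
```

This only demonstrates the arithmetic identity; which of the three codes
actually shows up in the RTL depends on the expander and optimizers.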

An alternative fix might be for the RTL optimizers to define a canonical
form for these plus_xor_ior equivalent expressions, but the logical
choice might be plus (which may appear in an addressing mode), and such
a change may require a number of tweaks to update various backends
(i.e.  a more intrusive change than the one proposed here).

Many thanks to Marek Polacek for bootstrapping and regression testing
this change without problems.  Hopefully the new testcase is portable
across powerpc's effective-targets.  Ok for mainline?


2022-06-17  Roger Sayle  <roger@nextmovesoftware.com>
	    Marek Polacek  <polacek@redhat.com>

gcc/ChangeLog
	PR target/105991
	* config/rs6000/rs6000.md (plus_xor): New code iterator.
	(*rotl<mode>3_insert_3_<code>): New define_insn_and_split.

gcc/testsuite/ChangeLog
	PR target/105991
	* gcc.target/powerpc/pr105991.c: New test case.


Thanks in advance,
Roger
--


[-- Attachment #2: patchpp.txt --]
[-- Type: text/plain, Size: 1477 bytes --]

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index c55ee7e..695ec33 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4188,6 +4188,23 @@
 }
   [(set_attr "type" "insert")])
 
+; Canonicalize the PLUS and XOR forms to IOR for rotl<mode>3_insert_3
+(define_code_iterator plus_xor [plus xor])
+
+(define_insn_and_split "*rotl<mode>3_insert_3_<code>"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+	(plus_xor:GPR
+	  (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
+		   (match_operand:GPR 4 "const_int_operand" "n"))
+	  (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
+		      (match_operand:SI 2 "const_int_operand" "n"))))]
+  "INTVAL (operands[2]) == exact_log2 (UINTVAL (operands[4]) + 1)"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+	(ior:GPR (and:GPR (match_dup 3) (match_dup 4))
+		 (ashift:GPR (match_dup 1) (match_dup 2))))])
+
 (define_code_iterator plus_ior_xor [plus ior xor])
 
 (define_split
diff --git a/gcc/testsuite/gcc.target/powerpc/pr105991.c b/gcc/testsuite/gcc.target/powerpc/pr105991.c
new file mode 100644
index 0000000..e853e53
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr105991.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+unsigned long long
+foo (unsigned long long value)
+{
+  value &= 0xffffffff;
+  value |= value << 32;
+  return value;
+}
+/* { dg-final { scan-assembler "rldimi" } } */
+


* Re: [rs6000 PATCH] PR target/105991: Recognize PLUS and XOR forms of rldimi.
  2022-06-17  5:13 [rs6000 PATCH] PR target/105991: Recognize PLUS and XOR forms of rldimi Roger Sayle
@ 2022-06-20 22:10 ` Segher Boessenkool
  2022-06-21  2:03   ` Kewen.Lin
  0 siblings, 1 reply; 4+ messages in thread
From: Segher Boessenkool @ 2022-06-20 22:10 UTC
  To: Roger Sayle; +Cc: gcc-patches, 'Marek Polacek'

Hi!

On Fri, Jun 17, 2022 at 07:13:37AM +0200, Roger Sayle wrote:
> This patch addresses PR target/105991 where a change to prefer representing
> shifts and adds at the tree-level as multiplications, causes problems for
> the rldimi patterns in the powerpc backend.

That is because it is now converted to different RTL at expand time.  The
generic expand code then does some premature optimisation on it, which makes
us end up with the addition instead of data manipulation insns.  Oh well.

> The issue is that rs6000.md
> models this pattern using IOR, and some variants that have the equivalent
> PLUS or XOR in the RTL fail to match some *rotl<mode>4_insert patterns.
> This is fixed in this patch by adding a define_insn_and_split to locally
> canonicalize the PLUS and XOR forms to the backend's preferred IOR form.

Okay.

> An alternative fix might be for the RTL optimizers to define a canonical
> form for these plus_xor_ior equivalent expressions, but the logical
> choice might be plus (which may appear in an addressing mode), and such
> a change may require a number of tweaks to update various backends
> (i.e.  a more intrusive change than the one proposed here).

This does not make sense in an address at all, thankfully :-)

The only sane canonicalisation for this is something like VEC_DUPLICATE
but for submodes of integer modes, instead of the component mode of a
vector mode.  I don't feel this is worth trying to handle in general
though.

> Many thanks for Marek Polacek for bootstrapping and regression testing
> this change without problems.

You have an account on the cfarm, it is quick and easy to test there :-)
I recommend gcc135, a 32 core p9, with oodles of disk space :-)

> +; Canonicalize the PLUS and XOR forms to IOR for rotl<mode>3_insert_3
> +(define_code_iterator plus_xor [plus xor])
> +
> +(define_insn_and_split "*rotl<mode>3_insert_3_<code>"
> +  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
> +	(plus_xor:GPR
> +	  (and:GPR (match_operand:GPR 3 "gpc_reg_operand" "0")
> +		   (match_operand:GPR 4 "const_int_operand" "n"))
> +	  (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
> +		      (match_operand:SI 2 "const_int_operand" "n"))))]
> +  "INTVAL (operands[2]) == exact_log2 (UINTVAL (operands[4]) + 1)"

exact_log2 returns -1 if its argument is not a power of two.  Please
test it is > 0 explicitly here: I don't think this splitter will work
correctly otherwise.  There shouldn't really be a shift by 0 ever of
course, but it isn't invalid RTL.
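
To make the concern concrete, here is a small C sketch.  The helper
exact_log2_sketch is a hypothetical stand-in written for this example
(it is not GCC's implementation), and the two condition functions are
assumptions about how the check could be phrased, not the text that was
eventually committed:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for exact_log2: log2 of X if X is a power of
   two, otherwise -1.  */
static int exact_log2_sketch (uint64_t x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* The condition as posted: shift == exact_log2 (mask + 1).  */
static int cond_original (int64_t shift, uint64_t mask)
{
  return shift == exact_log2_sketch (mask + 1);
}

/* The same condition with the explicit "> 0" test asked for above.  */
static int cond_guarded (int64_t shift, uint64_t mask)
{
  int n = exact_log2_sketch (mask + 1);
  return n > 0 && shift == n;
}

/* A few example inputs showing where the two conditions differ.  */
static void check_examples (void)
{
  /* The case from the PR: mask 0xffffffff, shift 32.  */
  assert (cond_original (32, UINT64_C (0xffffffff)));
  assert (cond_guarded (32, UINT64_C (0xffffffff)));
  /* Mask 0 gives exact_log2 (1) == 0, so a shift by 0 would match.  */
  assert (cond_original (0, 0));
  assert (!cond_guarded (0, 0));
}
```

With the guard, a mask of 0 combined with a shift by 0 no longer
satisfies the condition, and a -1 result from the helper can never
compare equal to the shift count.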

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr105991.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +unsigned long long
> +foo (unsigned long long value)
> +{
> +  value &= 0xffffffff;
> +  value |= value << 32;
> +  return value;
> +}
> +/* { dg-final { scan-assembler "rldimi" } } */

Write
/* { dg-final { scan-assembler {\mrldimi\M} } } */
please.


Okay for trunk with those changes.  Thanks!


Segher


* Re: [rs6000 PATCH] PR target/105991: Recognize PLUS and XOR forms of rldimi.
  2022-06-20 22:10 ` Segher Boessenkool
@ 2022-06-21  2:03   ` Kewen.Lin
  2022-06-21  7:34     ` Segher Boessenkool
  0 siblings, 1 reply; 4+ messages in thread
From: Kewen.Lin @ 2022-06-21  2:03 UTC
  To: Segher Boessenkool, Roger Sayle; +Cc: 'Marek Polacek', gcc-patches

on 2022/6/21 06:10, Segher Boessenkool wrote:
> [...]
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr105991.c
>> @@ -0,0 +1,11 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2" } */
>> +unsigned long long
>> +foo (unsigned long long value)
>> +{
>> +  value &= 0xffffffff;
>> +  value |= value << 32;
>> +  return value;
>> +}
>> +/* { dg-final { scan-assembler "rldimi" } } */
> 
> Write
> /* { dg-final { scan-assembler {\mrldimi\M} } } */
> please.
> 

This case also needs effective-target keyword lp64,
that is /* { dg-require-effective-target lp64 } */

since with -m32, it gets:
  mr 3,4

with -m32 -mpowerpc64, it gets:
  rldicl 3,4,0,32


BR,
Kewen


* Re: [rs6000 PATCH] PR target/105991: Recognize PLUS and XOR forms of rldimi.
  2022-06-21  2:03   ` Kewen.Lin
@ 2022-06-21  7:34     ` Segher Boessenkool
  0 siblings, 0 replies; 4+ messages in thread
From: Segher Boessenkool @ 2022-06-21  7:34 UTC
  To: Kewen.Lin; +Cc: Roger Sayle, 'Marek Polacek', gcc-patches

On Tue, Jun 21, 2022 at 10:03:18AM +0800, Kewen.Lin wrote:
> This case also needs effective-target keyword lp64,
> that is /* { dg-require-effective-target lp64 } */

Good point.  Yes.

It would be nice to have just has_arch_ppc64 really.

> since with -m32, it gets:
>   mr 3,4
> 
> with -m32 -mpowerpc64, it gets:
>   rldicl 3,4,0,32

Yes, and that is not lp64 -- both longs and pointers are 32 bits when
you have -m32.

You get different code because parameter passing is different.  The
usual way to sidestep this is to have the data in memory instead:

unsigned long long x;
void
goo (void)
{
  unsigned long long value = x;
  value &= 0xffffffff;
  value |= value << 32;
  x = value;
}

but then the compiler tries to be smart and generates code like
	addis 10,2,.LANCHOR0+4@toc@ha
	lwz 10,.LANCHOR0+4@toc@l(10)
	sldi 9,10,32
	add 9,9,10
	addis 10,2,.LANCHOR0@toc@ha
	std 9,.LANCHOR0@toc@l(10)
	blr
for -m64, and
	lis 9,x@ha
	la 10,x@l(9)
	lwz 10,4(10)
	stw 10,x@l(9)
	blr
for just -m32, but
	lis 10,x@ha
	la 9,x@l(10)
	la 10,x@l(10)
	ld 9,0(9)
	rldicl 8,9,0,32
	sldi 9,9,32
	add 9,9,8
	std 9,0(10)
	blr
for -m32 -mpowerpc64 (note it has not managed to do the splitter here;
it gets
Failed to match this instruction:
(set (reg:DI 128)
    (plus:DI (ashift:DI (reg/v:DI 117 [ value ])
            (const_int 32 [0x20]))
        (zero_extend:DI (subreg:SI (reg/v:DI 117 [ value ]) 4))))
and then
Failed to match this instruction:
(set (reg:DI 128)
    (plus:DI (and:DI (reg/v:DI 117 [ value ])
            (const_int 4294967295 [0xffffffff]))
        (ashift:DI (reg/v:DI 117 [ value ])
            (const_int 32 [0x20]))))
but that is not enough).

So let's just do lp64, at least for now :-)
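
Putting the review comments from this thread together, the test case
might end up looking roughly like the sketch below.  This is assembled
from the suggestions above (the lp64 requirement and the \m...\M word
anchors in the scan) and is not necessarily the file that was committed:

```c
/* { dg-do compile } */
/* { dg-require-effective-target lp64 } */
/* { dg-options "-O2" } */
unsigned long long
foo (unsigned long long value)
{
  value &= 0xffffffff;
  value |= value << 32;
  return value;
}
/* { dg-final { scan-assembler {\mrldimi\M} } } */
```

The function body is unchanged from the original submission; only the
dg- directives differ.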


Segher

