public inbox for gcc-patches@gcc.gnu.org
* [rs6000 PATCH] Improve constant integer multiply using rldimi.
@ 2022-06-26 20:56 Roger Sayle
  2022-06-27  9:04 ` Kewen.Lin
  2022-06-27 14:38 ` Segher Boessenkool
  0 siblings, 2 replies; 3+ messages in thread
From: Roger Sayle @ 2022-06-26 20:56 UTC
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1987 bytes --]

 

This patch tweaks the code generated on POWER for integer multiplications
by a constant, by making use of rldimi instructions.  Much like x86's
lea instruction, rldimi can be used to implement a shift and add pair
in some circumstances.  For rldimi this is when the shifted operand
is known to have no bits in common with the added operand.
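
As a minimal illustration of that condition (a sketch only, not part of the
patch; all function names here are hypothetical), the following C models the
two forms.  When t has no bits set at or above the shift count, the shifted
copy and t occupy disjoint bit positions, so the add produces no carries and
is equivalent to an OR, which is what an insert instruction computes.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-and-add form of t * ((1 << sh) + 1); assumes 0 < sh < 32.  */
static uint32_t shift_add(uint32_t t, unsigned sh)
{
  return (t << sh) + t;
}

/* Insert form: OR the shifted copy into t.  */
static uint32_t shift_insert(uint32_t t, unsigned sh)
{
  return (t << sh) | t;
}

/* Hypothetical helper: returns 1 if both forms agree for every t
   below 1 << sh, i.e. whenever t has no bits at or above bit sh.  */
static int forms_agree_below(unsigned sh)
{
  for (uint32_t t = 0; t < (UINT32_C(1) << sh); t++)
    if (shift_add(t, sh) != shift_insert(t, sh))
      return 0;
  return 1;
}

/* Example check for the shift count of 13 implied by 0x2001.  */
static void check_shift_13(void)
{
  assert(forms_agree_below(13));
  assert(shift_add(42, 13) == 42u * 0x2001u);
}
```

The two forms diverge once the operands overlap: for t = 0x2001 and a shift
of 13, bit 13 is set in both t and t << 13, so the add carries and the OR
does not.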

 

Hence for the new testcase below:

int foo(int x)
{
  int t = x & 42;
  return t * 0x2001;
}

when compiled with -O2, GCC currently generates:

        andi. 3,3,0x2a
        slwi 9,3,13
        add 3,9,3
        extsw 3,3
        blr

with this patch, we now generate:

        andi. 3,3,0x2a
        rlwimi 3,3,13,0,31-13
        extsw 3,3
        blr
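
For readers less familiar with the instruction, here is a rough C model of
the rlwimi in the listing above (a sketch under my reading of the Power ISA
semantics: rotate the source left, then insert it into the target under a
mask, with mask bits numbered from the most significant bit).  The helper
names and foo_model are hypothetical and only serve to show that the new
sequence computes the same value as the multiply.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits.  */
static uint32_t rotl32(uint32_t v, unsigned n)
{
  n &= 31;
  return n ? (v << n) | (v >> (32 - n)) : v;
}

/* Ones in IBM bit positions mb..me, where bit 0 is the most
   significant bit; this sketch assumes mb <= me <= 31.  */
static uint32_t mask32(unsigned mb, unsigned me)
{
  uint32_t from_mb = UINT32_MAX >> mb;           /* bits mb..31 */
  uint32_t through_me = UINT32_MAX << (31 - me); /* bits 0..me */
  return from_mb & through_me;
}

/* Model of rlwimi RA,RS,SH,MB,ME: rotated RS under the mask,
   previous RA contents elsewhere.  */
static uint32_t rlwimi(uint32_t ra, uint32_t rs, unsigned sh,
                       unsigned mb, unsigned me)
{
  uint32_t m = mask32(mb, me);
  return (rotl32(rs, sh) & m) | (ra & ~m);
}

/* Model of "andi. 3,3,0x2a; rlwimi 3,3,13,0,31-13"; the final
   extsw is represented by returning a signed 32-bit value.  */
static int32_t foo_model(int32_t x)
{
  uint32_t t = (uint32_t) x & 42u;
  return (int32_t) rlwimi(t, t, 13, 0, 31 - 13);
}

/* Example check against the C testcase for one input.  */
static void check_foo_model(void)
{
  assert(foo_model(-1) == 42 * 0x2001);
}
```

With MB = 0 and ME = 31-13 the mask is 0xFFFFE000, so the low 13 bits of
register 3 are kept and the rotated copy fills the bits above them.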

 

It turns out this optimization already exists in the form of a combine
splitter in rs6000.md, but the constraints on combine splitters,
requiring three or four input instructions (and generating one or two
output instructions), mean it doesn't get applied as often as it could.
This patch converts the define_split into a define_insn_and_split to
catch more cases (such as the one above).

The one bit that's tricky/controversial is the use of RTL's
nonzero_bits, which is accurate during the combine pass when this
pattern is first recognized, but not as advanced (not kept up to
date) when this pattern is eventually split.  To support this,
I've used a "|| reload_completed" idiom.  Does this approach seem
reasonable? [I've another patch for x86 that uses the same idiom].
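
To make the shape of that condition concrete, here is a small C model of it
(a sketch only; insert_condition is a hypothetical name, and in GCC the mask
comes from nonzero_bits, which returns the bits that may be nonzero, while
reload_completed is a global flag rather than a parameter).  It mirrors the
structure of the condition in the attached patch: the shift count must be in
range, and either the possibly-nonzero bits of the added operand all lie
below the shift count, or reload has completed and the earlier match is
trusted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the insn condition.  maybe_nonzero
   models nonzero_bits (operands[3], mode); shift models
   INTVAL (operands[2]).  */
static bool insert_condition(uint64_t maybe_nonzero, int64_t shift,
                             bool reload_completed)
{
  /* Guard the shift count so that 1 << shift is well defined.  */
  if (shift <= 0 || shift >= 64)
    return false;

  /* Either we can prove the operands share no bits, or we are past
     reload and rely on the pattern having been matched earlier.  */
  return maybe_nonzero < (UINT64_C(1) << shift) || reload_completed;
}

/* Example check using the testcase's mask (42) and shift (13).  */
static void check_insert_condition(void)
{
  assert(insert_condition(42, 13, false));
}
```

The model also shows what the idiom gives up: after reload the condition
accepts any operand, so correctness depends on the pattern only ever having
been created while the nonzero_bits test held.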

 

This patch has been tested on powerpc64le-unknown-linux-gnu with
make bootstrap and make -k check with no new failures.
Ok for mainline?


2022-06-26  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
* config/rs6000/rs6000.md (*rotl<mode>3_insert_3b_<code>): New
define_insn_and_split created from existing define_split.

gcc/testsuite/ChangeLog
* gcc.target/powerpc/rldimi-3.c: New test case.


Thanks in advance,
Roger

--

 


[-- Attachment #2: patchis2.txt --]
[-- Type: text/plain, Size: 1615 bytes --]

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 090dbcf..b8aada32 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -4209,13 +4209,19 @@
 
 (define_code_iterator plus_ior_xor [plus ior xor])
 
-(define_split
-  [(set (match_operand:GPR 0 "gpc_reg_operand")
-	(plus_ior_xor:GPR (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand")
-				      (match_operand:SI 2 "const_int_operand"))
-			  (match_operand:GPR 3 "gpc_reg_operand")))]
-  "nonzero_bits (operands[3], <MODE>mode)
-   < HOST_WIDE_INT_1U << INTVAL (operands[2])"
+(define_insn_and_split "*rotl<mode>3_insert_3b_<code>"
+  [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
+	(plus_ior_xor:GPR
+	  (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
+		      (match_operand:SI 2 "const_int_operand" "n"))
+	  (match_operand:GPR 3 "gpc_reg_operand" "0")))]
+  "INTVAL (operands[2]) > 0
+   && INTVAL (operands[2]) < 64
+   && ((nonzero_bits (operands[3], <MODE>mode)
+	< HOST_WIDE_INT_1U << INTVAL (operands[2]))
+       || reload_completed)"
+  "#"
+  "&& 1"
   [(set (match_dup 0)
 	(ior:GPR (and:GPR (match_dup 3)
 			  (match_dup 4))
diff --git a/gcc/testsuite/gcc.target/powerpc/rldimi-3.c b/gcc/testsuite/gcc.target/powerpc/rldimi-3.c
new file mode 100644
index 0000000..80ff86e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/rldimi-3.c
@@ -0,0 +1,9 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-O2" } */
+
+int foo(int x)
+{
+  int t = x & 42;
+  return t * 0x2001;
+}
+/* { dg-final { scan-assembler {\mrldimi\M} } } */
