public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH-1 v2, rs6000] Replace shift and ior insns with one rotate and mask insn for bswap pattern [PR93453]
@ 2022-06-07  8:07 HAO CHEN GUI
  2022-06-07 15:59 ` Segher Boessenkool
  2022-06-07 16:01 ` Segher Boessenkool
  0 siblings, 2 replies; 4+ messages in thread
From: HAO CHEN GUI @ 2022-06-07  8:07 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Kewen.Lin, Peter Bergner

Hi,
  This patch replaces shift and ior insns with one rotate and mask
insn for the split patterns which are for DI byte swap on Power6. The
test cases shows the optimization.

  Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

ChangeLog
2022-06-07 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	* config/rs6000/rs6000.md (define_split for bswapdi load): Merge shift
	and ior insns to one rotate and mask insn.
	(define_split for bswapdi register): Likewise.

gcc/testsuite/
	* gcc.target/powerpc/pr93453-1.c: New.


patch.diff
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index bf85baa5370..83800df12aa 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -2828,8 +2828,8 @@ (define_split
       emit_insn (gen_bswapsi2 (dest_32, word2));
     }

-  emit_insn (gen_ashldi3 (op3, op3, GEN_INT (32)));
-  emit_insn (gen_iordi3 (dest, dest, op3));
+  emit_insn (gen_rotldi3_insert_3 (dest, op3, GEN_INT (32), dest,
+				   GEN_INT (0xffffffff)));
   DONE;
 })

@@ -2914,10 +2914,10 @@ (define_split
   rtx op3_si  = simplify_gen_subreg (SImode, op3, DImode, lo_off);

   emit_insn (gen_lshrdi3 (op2, src, GEN_INT (32)));
-  emit_insn (gen_bswapsi2 (dest_si, src_si));
-  emit_insn (gen_bswapsi2 (op3_si, op2_si));
-  emit_insn (gen_ashldi3 (dest, dest, GEN_INT (32)));
-  emit_insn (gen_iordi3 (dest, dest, op3));
+  emit_insn (gen_bswapsi2 (op3_si, src_si));
+  emit_insn (gen_bswapsi2 (dest_si, op2_si));
+  emit_insn (gen_rotldi3_insert_3 (dest, op3, GEN_INT (32), dest,
+				   GEN_INT (0xffffffff)));
   DONE;
 })

diff --git a/gcc/testsuite/gcc.target/powerpc/pr93453-1.c b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
new file mode 100644
index 00000000000..4271886561f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile { target lp64 } } */
+/* { dg-options "-mdejagnu-cpu=power6 -O2" } */
+
+unsigned long load_byte_reverse (unsigned long *in)
+{
+   return __builtin_bswap64 (*in);
+}
+
+unsigned long byte_reverse (unsigned long in)
+{
+   return __builtin_bswap64 (in);
+}
+
+/* { dg-final { scan-assembler-times {\mrldimi\M} 2 } } */

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH-1 v2, rs6000] Replace shift and ior insns with one rotate and mask insn for bswap pattern [PR93453]
  2022-06-07  8:07 [PATCH-1 v2, rs6000] Replace shift and ior insns with one rotate and mask insn for bswap pattern [PR93453] HAO CHEN GUI
@ 2022-06-07 15:59 ` Segher Boessenkool
  2022-06-08  7:57   ` HAO CHEN GUI
  2022-06-07 16:01 ` Segher Boessenkool
  1 sibling, 1 reply; 4+ messages in thread
From: Segher Boessenkool @ 2022-06-07 15:59 UTC (permalink / raw)
  To: HAO CHEN GUI; +Cc: gcc-patches, David, Kewen.Lin, Peter Bergner

Hi!

On Tue, Jun 07, 2022 at 04:07:58PM +0800, HAO CHEN GUI wrote:
>   This patch replaces shift and ior insns with one rotate and mask
> insn for the split patterns which are for DI byte swap on Power6. The
> test cases shows the optimization.

Nice :-)

> -  emit_insn (gen_ashldi3 (op3, op3, GEN_INT (32)));
> -  emit_insn (gen_iordi3 (dest, dest, op3));
> +  emit_insn (gen_rotldi3_insert_3 (dest, op3, GEN_INT (32), dest,
> +				   GEN_INT (0xffffffff)));

You could make some define_expand to make this easier to use.  But not
sure what to call it.  The goal would be to make this easier to read and
use, not to make it harder :-)  Something with duplicate-si-to-di or
such?  Is that pattern somewhere else already, maybe vectors, maybe some
other target even?

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile { target lp64 } } */
> +/* { dg-options "-mdejagnu-cpu=power6 -O2" } */

It doesn't require -m64, only -mpowerpc64.  You can use has_arch_ppc64
to test for the latter.

Okay for trunk, even without that improvement.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH-1 v2, rs6000] Replace shift and ior insns with one rotate and mask insn for bswap pattern [PR93453]
  2022-06-07  8:07 [PATCH-1 v2, rs6000] Replace shift and ior insns with one rotate and mask insn for bswap pattern [PR93453] HAO CHEN GUI
  2022-06-07 15:59 ` Segher Boessenkool
@ 2022-06-07 16:01 ` Segher Boessenkool
  1 sibling, 0 replies; 4+ messages in thread
From: Segher Boessenkool @ 2022-06-07 16:01 UTC (permalink / raw)
  To: HAO CHEN GUI; +Cc: gcc-patches, David, Kewen.Lin, Peter Bergner

On Tue, Jun 07, 2022 at 04:07:58PM +0800, HAO CHEN GUI wrote:
>   Bootstrapped and tested on ppc64 Linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.

One further thing: please make sure you have tested on -m32 as well,
especially for integer stuff like this, it is easy to accidentally get
the conditions on a define_* and one of its uses out of synch.  But it
looks to be okay in this case :-)


Segher

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCH-1 v2, rs6000] Replace shift and ior insns with one rotate and mask insn for bswap pattern [PR93453]
  2022-06-07 15:59 ` Segher Boessenkool
@ 2022-06-08  7:57   ` HAO CHEN GUI
  0 siblings, 0 replies; 4+ messages in thread
From: HAO CHEN GUI @ 2022-06-08  7:57 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, David, Kewen.Lin, Peter Bergner

Hi,

On 7/6/2022 下午 11:59, Segher Boessenkool wrote:
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr93453-1.c
>> @@ -0,0 +1,14 @@
>> +/* { dg-do compile { target lp64 } } */
>> +/* { dg-options "-mdejagnu-cpu=power6 -O2" } */
> It doesn't require -m64, only -mpowerpc64.  You can use has_arch_ppc64
> to test for the latter.

Tested it with 'target has_arch_ppc64', it works on both -m32 and -m64.

Thanks.
Gui Haochen

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2022-06-08  7:57 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-07  8:07 [PATCH-1 v2, rs6000] Replace shift and ior insns with one rotate and mask insn for bswap pattern [PR93453] HAO CHEN GUI
2022-06-07 15:59 ` Segher Boessenkool
2022-06-08  7:57   ` HAO CHEN GUI
2022-06-07 16:01 ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).