public inbox for gcc-patches@gcc.gnu.org
* [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]
@ 2021-12-17  1:55 HAO CHEN GUI
  2022-01-10  3:16 ` Ping^1 " HAO CHEN GUI
  0 siblings, 1 reply; 9+ messages in thread
From: HAO CHEN GUI @ 2021-12-17  1:55 UTC
  To: gcc-patches; +Cc: Segher Boessenkool, David, Bill Schmidt

Hi,
   This patch defines a new split pattern for TI to V1TI move. The pattern concatenates two subreg:DI of
a TI to a V2DI. With the pattern, the subreg pass can do register split for TI when there is a TI to V1TI
move. The patch optimizes one unnecessary "mr" out on P9. The new test case illustrates it.
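
As a rough illustration (not part of the patch), the data movement the split performs can be modelled in portable C. The helper names di_piece and concat_pieces are hypothetical, and the sketch works on the memory image of an unsigned __int128 rather than on registers, so it says nothing about which instructions are emitted or how elements are numbered on BE versus LE.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the 8-byte piece of a 128-bit value that
   starts at BYTE_OFFSET (0 or 8) in its memory image.  This only imitates
   what (subreg:DI (reg:TI) 0) and (subreg:DI (reg:TI) 8) select.  */
uint64_t
di_piece (unsigned __int128 ti, unsigned byte_offset)
{
  assert (byte_offset == 0 || byte_offset == 8);
  uint64_t di;
  memcpy (&di, (const unsigned char *) &ti + byte_offset, sizeof di);
  return di;
}

/* Hypothetical helper: put two 8-byte pieces side by side, FIRST at byte 0
   and SECOND at byte 8.  This stands in for concatenating two DI values
   into a V2DI and then viewing the result as a single 128-bit element.  */
unsigned __int128
concat_pieces (uint64_t first, uint64_t second)
{
  unsigned __int128 v;
  memcpy (&v, &first, sizeof first);
  memcpy ((unsigned char *) &v + 8, &second, sizeof second);
  return v;
}
```

Splitting a value with di_piece at offsets 0 and 8 and passing both pieces to concat_pieces gives back the original 128-bit value on either endianness, which is the property the splitter relies on.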

   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk?
Any recommendations? Thanks a lot.

ChangeLog
2021-12-13 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	* config/rs6000/vsx.md (split pattern for TI to V1TI move): Defined.

gcc/testsuite/
	* gcc.target/powerpc/pr103124.c: New testcase.


patch.diff
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bf033e31c1c..52968eb4609 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -6589,3 +6589,19 @@ (define_insn "xxeval"
    [(set_attr "type" "vecperm")
     (set_attr "prefixed" "yes")])

+;; Construct V1TI by vsx_concat_v2di
+(define_split
+  [(set (match_operand:V1TI 0 "vsx_register_operand")
+	(subreg:V1TI
+	  (match_operand:TI 1 "int_reg_operand") 0 ))]
+  "TARGET_P9_VECTOR && !reload_completed"
+  [(const_int 0)]
+{
+  rtx tmp1 = simplify_gen_subreg (DImode, operands[1], TImode, 0);
+  rtx tmp2 = simplify_gen_subreg (DImode, operands[1], TImode, 8);
+  rtx tmp3 = gen_reg_rtx (V2DImode);
+  emit_insn (gen_vsx_concat_v2di (tmp3, tmp1, tmp2));
+  rtx tmp4 = simplify_gen_subreg (V1TImode, tmp3, V2DImode, 0);
+  emit_move_insn (operands[0], tmp4);
+  DONE;
+})
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103124.c b/gcc/testsuite/gcc.target/powerpc/pr103124.c
new file mode 100644
index 00000000000..e9072d19b8e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103124.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
+/* { dg-final { scan-assembler-not "\mmr\M" } } */
+
+vector __int128 add (long long a)
+{
+  vector __int128 b;
+  b = (vector __int128) {a};
+  return b;
+}


* Ping^1 [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]
  2021-12-17  1:55 [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124] HAO CHEN GUI
@ 2022-01-10  3:16 ` HAO CHEN GUI
  2022-01-10 23:09   ` David Edelsohn
  0 siblings, 1 reply; 9+ messages in thread
From: HAO CHEN GUI @ 2022-01-10  3:16 UTC
  To: gcc-patches; +Cc: Segher Boessenkool, David, Bill Schmidt

Hi,

    Gentle ping this:
	https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587051.html

Thanks


* Re: Ping^1 [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]
  2022-01-10  3:16 ` Ping^1 " HAO CHEN GUI
@ 2022-01-10 23:09   ` David Edelsohn
  2022-01-11  1:12     ` Segher Boessenkool
  0 siblings, 1 reply; 9+ messages in thread
From: David Edelsohn @ 2022-01-10 23:09 UTC
  To: HAO CHEN GUI, Segher Boessenkool; +Cc: gcc-patches, Bill Schmidt

On Sun, Jan 9, 2022 at 10:16 PM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
>
> Hi,
>
>     Gentle ping this:
>         https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587051.html
>
> Thanks
>
> On 17/12/2021 9:55 AM, HAO CHEN GUI wrote:
> > Hi,
> >    This patch defines a new split pattern for TI to V1TI move. The pattern concatenates two subreg:DI of
> > a TI to a V2DI. With the pattern, the subreg pass can do register split for TI when there is a TI to V1TI
> > move. The patch optimizes one unnecessary "mr" out on P9. The new test case illustrates it.
> >
> >    Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk?
> > Any recommendations? Thanks a lot.
> >
> > ChangeLog
> > 2021-12-13 Haochen Gui <guihaoc@linux.ibm.com>
> >
> > gcc/
> >       * config/rs6000/vsx.md (split pattern for TI to V1TI move): Defined.
> >
> > gcc/testsuite/
> >       * gcc.target/powerpc/pr103124.c: New testcase.
> >
> >
> > patch.diff
> > diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> > index bf033e31c1c..52968eb4609 100644
> > --- a/gcc/config/rs6000/vsx.md
> > +++ b/gcc/config/rs6000/vsx.md
> > @@ -6589,3 +6589,19 @@ (define_insn "xxeval"
> >     [(set_attr "type" "vecperm")
> >      (set_attr "prefixed" "yes")])
> >
> > +;; Construct V1TI by vsx_concat_v2di
> > +(define_split
> > +  [(set (match_operand:V1TI 0 "vsx_register_operand")
> > +     (subreg:V1TI
> > +       (match_operand:TI 1 "int_reg_operand") 0 ))]
> > +  "TARGET_P9_VECTOR && !reload_completed"
> > +  [(const_int 0)]
> > +{
> > +  rtx tmp1 = simplify_gen_subreg (DImode, operands[1], TImode, 0);
> > +  rtx tmp2 = simplify_gen_subreg (DImode, operands[1], TImode, 8);
> > +  rtx tmp3 = gen_reg_rtx (V2DImode);
> > +  emit_insn (gen_vsx_concat_v2di (tmp3, tmp1, tmp2));
> > +  rtx tmp4 = simplify_gen_subreg (V1TImode, tmp3, V2DImode, 0);
> > +  emit_move_insn (operands[0], tmp4);
> > +  DONE;
> > +})
> > diff --git a/gcc/testsuite/gcc.target/powerpc/pr103124.c b/gcc/testsuite/gcc.target/powerpc/pr103124.c
> > new file mode 100644
> > index 00000000000..e9072d19b8e
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr103124.c
> > @@ -0,0 +1,12 @@
> > +/* { dg-do compile } */
> > +/* { dg-require-effective-target powerpc_p9vector_ok } */
> > +/* { dg-require-effective-target int128 } */
> > +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
> > +/* { dg-final { scan-assembler-not "\mmr\M" } } */

Segher probably would prefer {\mmr\M} .

> > +
> > +vector __int128 add (long long a)
> > +{
> > +  vector __int128 b;
> > +  b = (vector __int128) {a};
> > +  return b;
> > +}

This is okay.

Thanks, David

* Re: Ping^1 [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]
  2022-01-10 23:09   ` David Edelsohn
@ 2022-01-11  1:12     ` Segher Boessenkool
  2022-01-11  2:45       ` HAO CHEN GUI
  0 siblings, 1 reply; 9+ messages in thread
From: Segher Boessenkool @ 2022-01-11  1:12 UTC
  To: David Edelsohn; +Cc: HAO CHEN GUI, gcc-patches, Bill Schmidt

On Mon, Jan 10, 2022 at 06:09:01PM -0500, David Edelsohn wrote:
> On Sun, Jan 9, 2022 at 10:16 PM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
> > > +/* { dg-final { scan-assembler-not "\mmr\M" } } */
> 
> Segher probably would prefer {\mmr\M} .

Because that one works, and the one with double quotes doesn't, yes :-)

It is a scan-assembler-not so the testcase likely won't fail, but it is
checking the wrong thing.  In double-quoted strings "\m" means the same
as "m", and "\M" means the same as "M" (neither escape has any special
meaning).  If you want the regex escapes in such a string, you need to
escape the escapes, so write "\\m" and "\\M".  It is much simpler to not
have backslash substitution on the strings at all, so to use {\m} etc.
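
A minimal sketch of that substitution rule, written in C so it can be run outside DejaGnu. The function tcl_dquote_subst is hypothetical and is a simplified model only: it handles the two cases relevant here (a doubled backslash, and a backslash before an ordinary character) and ignores the other Tcl escapes such as \n or \xHH.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical model of backslash substitution inside a
   double-quoted Tcl word.  A backslash followed by another character is
   dropped and that character is copied literally, so "\\" becomes one
   backslash and "\m" becomes plain "m".  Escapes that Tcl treats
   specially (\n, \t, \xHH, ...) are NOT modelled.  */
void
tcl_dquote_subst (const char *in, char *out, size_t outsize)
{
  assert (in != NULL && out != NULL && outsize > 0);
  size_t n = 0;
  for (size_t i = 0; in[i] != '\0' && n + 1 < outsize; i++)
    {
      /* Skip the backslash itself; the next character is kept as-is.  */
      if (in[i] == '\\' && in[i + 1] != '\0')
        i++;
      out[n++] = in[i];
    }
  out[n] = '\0';
}
```

Given the characters \mmr\M as input, the model produces mmrM, which is the pattern the regex engine would see from the double-quoted form. Given \\mmr\\M it produces \mmr\M, the same text that the brace-quoted {\mmr\M} passes through unchanged.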


Segher

* Re: Ping^1 [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]
  2022-01-11  1:12     ` Segher Boessenkool
@ 2022-01-11  2:45       ` HAO CHEN GUI
  0 siblings, 0 replies; 9+ messages in thread
From: HAO CHEN GUI @ 2022-01-11  2:45 UTC
  To: Segher Boessenkool, David Edelsohn; +Cc: gcc-patches, Bill Schmidt

Segher and David,

   Thanks for your explanation. I got it. The "\m" itself is a constraint escape.

Gui Haochen

On 11/1/2022 9:12 AM, Segher Boessenkool wrote:
> On Mon, Jan 10, 2022 at 06:09:01PM -0500, David Edelsohn wrote:
>> On Sun, Jan 9, 2022 at 10:16 PM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
>>>> +/* { dg-final { scan-assembler-not "\mmr\M" } } */
>>
>> Segher probably would prefer {\mmr\M} .
> 
> Because that one works, and the one with double quotes doesn't, yes :-)
> 
> It is a scan-assembler-not so the testcase likely won't fail, but it is
> checking the wrong thing.  In double-quoted strings "\m" means the same
> as "m", and "\M" means the same as "M" (neither escape has any special
> meaning).  If you want the regex escapes in such a string, you need to
> escape the escapes, so write "\\m" and "\\M".  It is much simpler to not
> have backslash substitution on the strings at all, so to use {\m} etc.
> 
> 
> Segher

* Re: [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]
  2021-12-13 22:59   ` Segher Boessenkool
@ 2021-12-14  1:41     ` HAO CHEN GUI
  0 siblings, 0 replies; 9+ messages in thread
From: HAO CHEN GUI @ 2021-12-14  1:41 UTC
  To: Segher Boessenkool, David Edelsohn; +Cc: gcc-patches, Bill Schmidt

Hi Segher,
  Thanks for your advice. Please see my comments.

On 14/12/2021 6:59 AM, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Dec 13, 2021 at 05:22:06PM -0500, David Edelsohn wrote:
>> On Sun, Dec 12, 2021 at 10:00 PM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
>>> --- a/gcc/config/rs6000/vsx.md
>>> +++ b/gcc/config/rs6000/vsx.md
>>> @@ -6589,3 +6589,19 @@ (define_insn "xxeval"
>>>     [(set_attr "type" "vecperm")
>>>      (set_attr "prefixed" "yes")])
>>>
>>> +;; split TI to V1TI move
> 
> Please comment that this splitter tries to generate mtvsrdd insns, and
> don't say the obvious things :-)
>

OK, I will modify it.

>>> +(define_split
>>> +  [(set (match_operand:V1TI 0 "vsx_register_operand")
>>> +       (subreg:V1TI
>>> +         (match_operand:TI 1 "int_reg_operand") 0 ))]
>>> +  "TARGET_P9_VECTOR && !reload_completed"
> 
> Why the "!reload_completed"?  Is this generated after reload as well,
> and that is bad for some reason?
> 
>>> +  [(const_int 0)]
>>> +{
>>> +  rtx tmp1 = simplify_gen_subreg (DImode, operands[1], TImode, 0);
>>> +  rtx tmp2 = simplify_gen_subreg (DImode, operands[1], TImode, 8);
>>> +  rtx tmp3 = gen_reg_rtx (V2DImode);
>>> +  emit_insn (gen_vsx_concat_v2di (tmp3, tmp1, tmp2));
>>> +  rtx tmp4 = simplify_gen_subreg (V1TImode, tmp3, V2DImode, 0);
>>> +  emit_move_insn (operands[0], tmp4);
>>> +  DONE;
>>> +})
> 
> Ah, it is bad because it generates a pseudo.
> 
> So either you just make it work when everything is hard regs, or you do
> this *and comment it*.
> 
> The first option is not very easy to do.  You need to make sure you can
> do those subregs (and get GPRs!), and you need to use a hard reg instead
> of the new pseudo (you can use operand 0 for this here though, it can
> never be the same as operand 1 :-) (but only do this if this *is* after
> reload)).
> 
> But, it sounds like you actually saw problems when allowing it after
> reload, so it sounds like it would actually be useful to do it then?

The purpose of this split pattern is to build the V1TI from two DI subregs of the TI,
so that the subsequent subreg pass can recognize the TI in the insn as splittable. As there
is no subreg pass after reload, I want the split to be done only before reload. Also, as
you mentioned, my patch generates a pseudo, so it cannot work after reload. That's
why I set the "!reload_completed" condition.

> 
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103124.c
>>> @@ -0,0 +1,11 @@
>>> +/* { dg-do compile { target { powerpc*-*-* && lp64 } } */
>>
>> Please don't include the "powerpc" target selector in the
>> gcc.target/powerpc directory.  Just use lp64.
> 
> Or actually, don't use anything, and do a  dg-require int128  instead.
> 

Thanks, I will take it.

> 
> Segher
> 

* Re: [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]
  2021-12-13 22:22 ` David Edelsohn
@ 2021-12-13 22:59   ` Segher Boessenkool
  2021-12-14  1:41     ` HAO CHEN GUI
  0 siblings, 1 reply; 9+ messages in thread
From: Segher Boessenkool @ 2021-12-13 22:59 UTC
  To: David Edelsohn; +Cc: HAO CHEN GUI, gcc-patches, Bill Schmidt

Hi!

On Mon, Dec 13, 2021 at 05:22:06PM -0500, David Edelsohn wrote:
> On Sun, Dec 12, 2021 at 10:00 PM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
> > --- a/gcc/config/rs6000/vsx.md
> > +++ b/gcc/config/rs6000/vsx.md
> > @@ -6589,3 +6589,19 @@ (define_insn "xxeval"
> >     [(set_attr "type" "vecperm")
> >      (set_attr "prefixed" "yes")])
> >
> > +;; split TI to V1TI move

Please comment that this splitter tries to generate mtvsrdd insns, and
don't say the obvious things :-)

> > +(define_split
> > +  [(set (match_operand:V1TI 0 "vsx_register_operand")
> > +       (subreg:V1TI
> > +         (match_operand:TI 1 "int_reg_operand") 0 ))]
> > +  "TARGET_P9_VECTOR && !reload_completed"

Why the "!reload_completed"?  Is this generated after reload as well,
and that is bad for some reason?

> > +  [(const_int 0)]
> > +{
> > +  rtx tmp1 = simplify_gen_subreg (DImode, operands[1], TImode, 0);
> > +  rtx tmp2 = simplify_gen_subreg (DImode, operands[1], TImode, 8);
> > +  rtx tmp3 = gen_reg_rtx (V2DImode);
> > +  emit_insn (gen_vsx_concat_v2di (tmp3, tmp1, tmp2));
> > +  rtx tmp4 = simplify_gen_subreg (V1TImode, tmp3, V2DImode, 0);
> > +  emit_move_insn (operands[0], tmp4);
> > +  DONE;
> > +})

Ah, it is bad because it generates a pseudo.

So either you just make it work when everything is hard regs, or you do
this *and comment it*.

The first option is not very easy to do.  You need to make sure you can
do those subregs (and get GPRs!), and you need to use a hard reg instead
of the new pseudo (you can use operand 0 for this here though, it can
never be the same as operand 1 :-) (but only do this if this *is* after
reload)).

But, it sounds like you actually saw problems when allowing it after
reload, so it sounds like it would actually be useful to do it then?

> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr103124.c
> > @@ -0,0 +1,11 @@
> > +/* { dg-do compile { target { powerpc*-*-* && lp64 } } */
> 
> Please don't include the "powerpc" target selector in the
> gcc.target/powerpc directory.  Just use lp64.

Or actually, don't use anything, and do a  dg-require int128  instead.


Segher

* Re: [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]
  2021-12-13  3:00 HAO CHEN GUI
@ 2021-12-13 22:22 ` David Edelsohn
  2021-12-13 22:59   ` Segher Boessenkool
  0 siblings, 1 reply; 9+ messages in thread
From: David Edelsohn @ 2021-12-13 22:22 UTC
  To: HAO CHEN GUI; +Cc: gcc-patches, Segher Boessenkool, Bill Schmidt

On Sun, Dec 12, 2021 at 10:00 PM HAO CHEN GUI <guihaoc@linux.ibm.com> wrote:
>
> Hi,
>    This patch defines a new split pattern for TI to V1TI move. The pattern concatenates two subreg:DI of
> a TI to a V2DI, then moves the V2DI to V1TI. With the pattern, the subreg pass can do register split for
> TI when there is a TI to V1TI move. The patch optimizes one unnecessary "mr" out on P9. The new
> test case illustrates it.
>
>    Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk?
> Any recommendations? Thanks a lot.
>
> ChangeLog
> 2021-12-13 Haochen Gui <guihaoc@linux.ibm.com>
>
> gcc/
>         * config/rs6000/vsx.md (split pattern for TI to V1TI move): Defined.
>
> gcc/testsuite/
>         * gcc.target/powerpc/pr103124.c: New testcase.
>
>
> patch.diff
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index bf033e31c1c..7bca7780735 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -6589,3 +6589,19 @@ (define_insn "xxeval"
>     [(set_attr "type" "vecperm")
>      (set_attr "prefixed" "yes")])
>
> +;; split TI to V1TI move
> +(define_split
> +  [(set (match_operand:V1TI 0 "vsx_register_operand")
> +       (subreg:V1TI
> +         (match_operand:TI 1 "int_reg_operand") 0 ))]
> +  "TARGET_P9_VECTOR && !reload_completed"
> +  [(const_int 0)]
> +{
> +  rtx tmp1 = simplify_gen_subreg (DImode, operands[1], TImode, 0);
> +  rtx tmp2 = simplify_gen_subreg (DImode, operands[1], TImode, 8);
> +  rtx tmp3 = gen_reg_rtx (V2DImode);
> +  emit_insn (gen_vsx_concat_v2di (tmp3, tmp1, tmp2));
> +  rtx tmp4 = simplify_gen_subreg (V1TImode, tmp3, V2DImode, 0);
> +  emit_move_insn (operands[0], tmp4);
> +  DONE;
> +})
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103124.c b/gcc/testsuite/gcc.target/powerpc/pr103124.c
> new file mode 100644
> index 00000000000..724492dbcd2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr103124.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile { target { powerpc*-*-* && lp64 } } */

Please don't include the "powerpc" target selector in the
gcc.target/powerpc directory.  Just use lp64.

> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
> +/* { dg-final { scan-assembler-not "\mmr\M" } } */
> +
> +vector __int128 add (long long a)
> +{
> +  vector __int128 b;
> +  b = (vector __int128) {a};
> +  return b;
> +}

Okay with that change.

Thanks, David

* [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124]
@ 2021-12-13  3:00 HAO CHEN GUI
  2021-12-13 22:22 ` David Edelsohn
  0 siblings, 1 reply; 9+ messages in thread
From: HAO CHEN GUI @ 2021-12-13  3:00 UTC
  To: gcc-patches; +Cc: Segher Boessenkool, David, Bill Schmidt

Hi,
   This patch defines a new split pattern for TI to V1TI move. The pattern concatenates two subreg:DI of
a TI to a V2DI, then moves the V2DI to V1TI. With the pattern, the subreg pass can do register split for
TI when there is a TI to V1TI move. The patch optimizes one unnecessary "mr" out on P9. The new
test case illustrates it.

   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk?
Any recommendations? Thanks a lot.

ChangeLog
2021-12-13 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	* config/rs6000/vsx.md (split pattern for TI to V1TI move): Defined.

gcc/testsuite/
	* gcc.target/powerpc/pr103124.c: New testcase.


patch.diff
diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
index bf033e31c1c..7bca7780735 100644
--- a/gcc/config/rs6000/vsx.md
+++ b/gcc/config/rs6000/vsx.md
@@ -6589,3 +6589,19 @@ (define_insn "xxeval"
    [(set_attr "type" "vecperm")
     (set_attr "prefixed" "yes")])

+;; split TI to V1TI move
+(define_split
+  [(set (match_operand:V1TI 0 "vsx_register_operand")
+	(subreg:V1TI
+	  (match_operand:TI 1 "int_reg_operand") 0 ))]
+  "TARGET_P9_VECTOR && !reload_completed"
+  [(const_int 0)]
+{
+  rtx tmp1 = simplify_gen_subreg (DImode, operands[1], TImode, 0);
+  rtx tmp2 = simplify_gen_subreg (DImode, operands[1], TImode, 8);
+  rtx tmp3 = gen_reg_rtx (V2DImode);
+  emit_insn (gen_vsx_concat_v2di (tmp3, tmp1, tmp2));
+  rtx tmp4 = simplify_gen_subreg (V1TImode, tmp3, V2DImode, 0);
+  emit_move_insn (operands[0], tmp4);
+  DONE;
+})
diff --git a/gcc/testsuite/gcc.target/powerpc/pr103124.c b/gcc/testsuite/gcc.target/powerpc/pr103124.c
new file mode 100644
index 00000000000..724492dbcd2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr103124.c
@@ -0,0 +1,11 @@
+/* { dg-do compile { target { powerpc*-*-* && lp64 } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
+/* { dg-final { scan-assembler-not "\mmr\M" } } */
+
+vector __int128 add (long long a)
+{
+  vector __int128 b;
+  b = (vector __int128) {a};
+  return b;
+}

Thread overview: 9+ messages
2021-12-17  1:55 [PATCH, rs6000] new split pattern for TI to V1TI move [PR103124] HAO CHEN GUI
2022-01-10  3:16 ` Ping^1 " HAO CHEN GUI
2022-01-10 23:09   ` David Edelsohn
2022-01-11  1:12     ` Segher Boessenkool
2022-01-11  2:45       ` HAO CHEN GUI
  -- strict thread matches above, loose matches on Subject: below --
2021-12-13  3:00 HAO CHEN GUI
2021-12-13 22:22 ` David Edelsohn
2021-12-13 22:59   ` Segher Boessenkool
2021-12-14  1:41     ` HAO CHEN GUI
