public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] rs6000: Use subreg for QI/HI vector init
@ 2020-12-02  9:44 Kewen.Lin
  2020-12-02 22:32 ` will schmidt
  2020-12-14 18:51 ` Segher Boessenkool
  0 siblings, 2 replies; 8+ messages in thread
From: Kewen.Lin @ 2020-12-02  9:44 UTC (permalink / raw)
  To: GCC Patches; +Cc: Segher Boessenkool, David Edelsohn, Bill Schmidt

[-- Attachment #1: Type: text/plain, Size: 734 bytes --]

Hi,

This patch is to use paradoxical subreg instead of
zero_extend for promoting QI/HI to SI/DI when we
want to construct one vector with these modes.
Since we do the gpr->vsx movement and vector merge
or pack later, the high part is useless and safe to
use paradoxical subreg.  It can avoid useless rlwinms
generated for signed cases.

Bootstrapped/regtested on powerpc64le-linux-gnu P9.

Is it ok for trunk?

BR,
Kewen
------
gcc/ChangeLog:

	* config/rs6000/rs6000.c (rs6000_expand_vector_init): Use
	paradoxical subreg instead of zero_extend for QI/HI promotion
	when doing QI/HI vector init.

gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/pr96933-1.c: Adjusted to check no rlwinm.
	* gcc.target/powerpc/pr96933-2.c: Likewise.

[-- Attachment #2: subreg.diff --]
[-- Type: text/plain, Size: 1841 bytes --]

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index f33fca3982a..9c084b055b8 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6793,17 +6793,8 @@ rs6000_expand_vector_init (rtx target, rtx vals)
       /* Force the values into word_mode registers.  */
       for (i = 0; i < n_elts; i++)
 	{
-	  rtx tmp = force_reg (GET_MODE_INNER (mode), XVECEXP (vals, 0, i));
-	  if (TARGET_POWERPC64)
-	    {
-	      op[i] = gen_reg_rtx (DImode);
-	      emit_insn (gen_zero_extendqidi2 (op[i], tmp));
-	    }
-	  else
-	    {
-	      op[i] = gen_reg_rtx (SImode);
-	      emit_insn (gen_zero_extendqisi2 (op[i], tmp));
-	    }
+	  rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
+	  op[i] = simplify_gen_subreg (Pmode, tmp, inner_mode, 0);
 	}
 
       /* Take unsigned char big endianness on 64bit as example for below
diff --git a/gcc/testsuite/gcc.target/powerpc/pr96933-1.c b/gcc/testsuite/gcc.target/powerpc/pr96933-1.c
index 3b63865b3b8..71d72084413 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr96933-1.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr96933-1.c
@@ -13,3 +13,4 @@
 /* { dg-final { scan-assembler-times {\mvpkudum\M} 12 } } */
 /* { dg-final { scan-assembler-not {\mstb\M} } } */
 /* { dg-final { scan-assembler-not {\msth\M} } } */
+/* { dg-final { scan-assembler-not {\mrlwinm\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr96933-2.c b/gcc/testsuite/gcc.target/powerpc/pr96933-2.c
index cef8fbd4f35..9fa15125d8d 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr96933-2.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr96933-2.c
@@ -13,3 +13,4 @@
 /* { dg-final { scan-assembler-times {\mxxpermdi\M} 4 } } */
 /* { dg-final { scan-assembler-not {\mstb\M} } } */
 /* { dg-final { scan-assembler-not {\msth\M} } } */
+/* { dg-final { scan-assembler-not {\mrlwinm\M} } } */

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] rs6000: Use subreg for QI/HI vector init
  2020-12-02  9:44 [PATCH] rs6000: Use subreg for QI/HI vector init Kewen.Lin
@ 2020-12-02 22:32 ` will schmidt
  2020-12-14 18:54   ` Segher Boessenkool
  2020-12-14 18:51 ` Segher Boessenkool
  1 sibling, 1 reply; 8+ messages in thread
From: will schmidt @ 2020-12-02 22:32 UTC (permalink / raw)
  To: Kewen.Lin, GCC Patches; +Cc: Bill Schmidt, David Edelsohn, Segher Boessenkool

On Wed, 2020-12-02 at 17:44 +0800, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> This patch is to use paradoxical subreg instead of
> zero_extend for promoting QI/HI to SI/DI when we
> want to construct one vector with these modes.
> Since we do the gpr->vsx movement and vector merge
> or pack later, the high part is useless and safe to
> use paradoxical subreg.  It can avoid useless rlwinms
> generated for signed cases.
> 
> Bootstrapped/regtested on powerpc64le-linux-gnu P9.
> 
> Is it ok for trunk?

Mostly cosmetic review.  comments sprinkled below.
thanks
-Will

> 
> BR,
> Kewen
> ------
> gcc/ChangeLog:
> 
> 	* config/rs6000/rs6000.c (rs6000_expand_vector_init): Use
> 	paradoxical subreg instead of zero_extend for QI/HI promotion
> 	when doing QI/HI vector init.

A bit long, but OK with me. :-)

> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/powerpc/pr96933-1.c: Adjusted to check no rlwinm.
> 	* gcc.target/powerpc/pr96933-2.c: Likewise.

Ok.  (I'd hope a few more extend instructions would be eliminated, but
this only covers the tests that explicitly looked/didn't look for them, so OK).



> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index f33fca3982a..9c084b055b8 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -6793,17 +6793,8 @@ rs6000_expand_vector_init (rtx target, rtx vals)

I note that the code changes that follow here are within the code block 

  if (TARGET_DIRECT_MOVE && (mode == V16QImode || mode == V8HImode))
    {

This is implied per the patch description, but not obvious from the context of the changes here.   (OK).


>        /* Force the values into word_mode registers.  */
>        for (i = 0; i < n_elts; i++)
>         {
> -         rtx tmp = force_reg (GET_MODE_INNER (mode), XVECEXP (vals, 0, i));
> -         if (TARGET_POWERPC64)
> -           {
> -             op[i] = gen_reg_rtx (DImode);
> -             emit_insn (gen_zero_extendqidi2 (op[i], tmp));
> -           }
> -         else
> -           {
> -             op[i] = gen_reg_rtx (SImode);
> -             emit_insn (gen_zero_extendqisi2 (op[i], tmp));
> -           }
> +         rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
> +         op[i] = simplify_gen_subreg (Pmode, tmp, inner_mode, 0);
>         }
> 
>        /* Take unsigned char big endianness on 64bit as example for below
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr96933-1.c b/gcc/testsuite/gcc.target/powerpc/pr96933-1.c
> index 3b63865b3b8..71d72084413 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr96933-1.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96933-1.c
> @@ -13,3 +13,4 @@
>  /* { dg-final { scan-assembler-times {\mvpkudum\M} 12 } } */
>  /* { dg-final { scan-assembler-not {\mstb\M} } } */
>  /* { dg-final { scan-assembler-not {\msth\M} } } */
> +/* { dg-final { scan-assembler-not {\mrlwinm\M} } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr96933-2.c b/gcc/testsuite/gcc.target/powerpc/pr96933-2.c
> index cef8fbd4f35..9fa15125d8d 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr96933-2.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr96933-2.c
> @@ -13,3 +13,4 @@
>  /* { dg-final { scan-assembler-times {\mxxpermdi\M} 4 } } */
>  /* { dg-final { scan-assembler-not {\mstb\M} } } */
>  /* { dg-final { scan-assembler-not {\msth\M} } } */
> +/* { dg-final { scan-assembler-not {\mrlwinm\M} } } */

Ok.
Thanks
-Will




^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] rs6000: Use subreg for QI/HI vector init
  2020-12-02  9:44 [PATCH] rs6000: Use subreg for QI/HI vector init Kewen.Lin
  2020-12-02 22:32 ` will schmidt
@ 2020-12-14 18:51 ` Segher Boessenkool
  2020-12-15  7:53   ` Kewen.Lin
  1 sibling, 1 reply; 8+ messages in thread
From: Segher Boessenkool @ 2020-12-14 18:51 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, David Edelsohn, Bill Schmidt

Hi!

On Wed, Dec 02, 2020 at 05:44:24PM +0800, Kewen.Lin wrote:
> This patch is to use paradoxical subreg instead of
> zero_extend for promoting QI/HI to SI/DI when we
> want to construct one vector with these modes.
> Since we do the gpr->vsx movement and vector merge
> or pack later, the high part is useless and safe to
> use paradoxical subreg.  It can avoid useless rlwinms
> generated for signed cases.

> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -6793,17 +6793,8 @@ rs6000_expand_vector_init (rtx target, rtx vals)
>        /* Force the values into word_mode registers.  */
>        for (i = 0; i < n_elts; i++)
>  	{
> -	  rtx tmp = force_reg (GET_MODE_INNER (mode), XVECEXP (vals, 0, i));
> -	  if (TARGET_POWERPC64)
> -	    {
> -	      op[i] = gen_reg_rtx (DImode);
> -	      emit_insn (gen_zero_extendqidi2 (op[i], tmp));
> -	    }
> -	  else
> -	    {
> -	      op[i] = gen_reg_rtx (SImode);
> -	      emit_insn (gen_zero_extendqisi2 (op[i], tmp));
> -	    }
> +	  rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
> +	  op[i] = simplify_gen_subreg (Pmode, tmp, inner_mode, 0);
>  	}

Pmode is defined based on TARGET_64BIT, not TARGET_POWERPC64.

But, can you not always use SImode here?

The rest of the patch is fine of course.


Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] rs6000: Use subreg for QI/HI vector init
  2020-12-02 22:32 ` will schmidt
@ 2020-12-14 18:54   ` Segher Boessenkool
  0 siblings, 0 replies; 8+ messages in thread
From: Segher Boessenkool @ 2020-12-14 18:54 UTC (permalink / raw)
  To: will schmidt; +Cc: Kewen.Lin, GCC Patches, Bill Schmidt, David Edelsohn

On Wed, Dec 02, 2020 at 04:32:47PM -0600, will schmidt wrote:
> On Wed, 2020-12-02 at 17:44 +0800, Kewen.Lin via Gcc-patches wrote:
> > 	* gcc.target/powerpc/pr96933-1.c: Adjusted to check no rlwinm.
> > 	* gcc.target/powerpc/pr96933-2.c: Likewise.
> 
> Ok.  (I'd hope a few more extend instructions would be eliminated, but
> this only covers the tests that explicitly looked/didn't look for them, so OK).

The patch changes only one very specific case.  To have a bigger effect,
some much more generic thing is needed (and that should *not* use
subregs, subregs are worse most of the time).


Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] rs6000: Use subreg for QI/HI vector init
  2020-12-14 18:51 ` Segher Boessenkool
@ 2020-12-15  7:53   ` Kewen.Lin
  2020-12-15 14:40     ` Segher Boessenkool
  0 siblings, 1 reply; 8+ messages in thread
From: Kewen.Lin @ 2020-12-15  7:53 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches, David Edelsohn, Bill Schmidt

Hi Segher,

Thanks for the review!

on 2020/12/15 上午2:51, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Dec 02, 2020 at 05:44:24PM +0800, Kewen.Lin wrote:
>> This patch is to use paradoxical subreg instead of
>> zero_extend for promoting QI/HI to SI/DI when we
>> want to construct one vector with these modes.
>> Since we do the gpr->vsx movement and vector merge
>> or pack later, the high part is useless and safe to
>> use paradoxical subreg.  It can avoid useless rlwinms
>> generated for signed cases.
> 
>> --- a/gcc/config/rs6000/rs6000.c
>> +++ b/gcc/config/rs6000/rs6000.c
>> @@ -6793,17 +6793,8 @@ rs6000_expand_vector_init (rtx target, rtx vals)
>>        /* Force the values into word_mode registers.  */
>>        for (i = 0; i < n_elts; i++)
>>  	{
>> -	  rtx tmp = force_reg (GET_MODE_INNER (mode), XVECEXP (vals, 0, i));
>> -	  if (TARGET_POWERPC64)
>> -	    {
>> -	      op[i] = gen_reg_rtx (DImode);
>> -	      emit_insn (gen_zero_extendqidi2 (op[i], tmp));
>> -	    }
>> -	  else
>> -	    {
>> -	      op[i] = gen_reg_rtx (SImode);
>> -	      emit_insn (gen_zero_extendqisi2 (op[i], tmp));
>> -	    }
>> +	  rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
>> +	  op[i] = simplify_gen_subreg (Pmode, tmp, inner_mode, 0);
>>  	}
> 
> Pmode is defined based on TARGET_64BIT, not TARGET_POWERPC64.
> 

Good point, you are right, is it ok to change this part with one
explicit mode based on TARGET_POWERPC64?

          rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
          machine_mode tmode = TARGET_POWERPC64 ? DImode : SImode;
          op[i] = simplify_gen_subreg (tmode, tmp, inner_mode, 0);

> But, can you not always use SImode here?
> 

Sorry that I didn't quite follow here.
It uses DImode when TARGET_POWERPC64, so not SImode always (when
!TARGET_POWERPC64).

BR,
Kewen

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] rs6000: Use subreg for QI/HI vector init
  2020-12-15  7:53   ` Kewen.Lin
@ 2020-12-15 14:40     ` Segher Boessenkool
  2020-12-16  8:19       ` Kewen.Lin
  0 siblings, 1 reply; 8+ messages in thread
From: Segher Boessenkool @ 2020-12-15 14:40 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, David Edelsohn, Bill Schmidt

Hi Ke Wen,

On Tue, Dec 15, 2020 at 03:53:29PM +0800, Kewen.Lin wrote:
> on 2020/12/15 上午2:51, Segher Boessenkool wrote:
> > On Wed, Dec 02, 2020 at 05:44:24PM +0800, Kewen.Lin wrote:
> >> --- a/gcc/config/rs6000/rs6000.c
> >> +++ b/gcc/config/rs6000/rs6000.c
> >> @@ -6793,17 +6793,8 @@ rs6000_expand_vector_init (rtx target, rtx vals)
> >>        /* Force the values into word_mode registers.  */
> >>        for (i = 0; i < n_elts; i++)
> >>  	{
> >> -	  rtx tmp = force_reg (GET_MODE_INNER (mode), XVECEXP (vals, 0, i));
> >> -	  if (TARGET_POWERPC64)
> >> -	    {
> >> -	      op[i] = gen_reg_rtx (DImode);
> >> -	      emit_insn (gen_zero_extendqidi2 (op[i], tmp));
> >> -	    }
> >> -	  else
> >> -	    {
> >> -	      op[i] = gen_reg_rtx (SImode);
> >> -	      emit_insn (gen_zero_extendqisi2 (op[i], tmp));
> >> -	    }
> >> +	  rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
> >> +	  op[i] = simplify_gen_subreg (Pmode, tmp, inner_mode, 0);
> >>  	}
> > 
> > Pmode is defined based on TARGET_64BIT, not TARGET_POWERPC64.
> 
> Good point, you are right, is it ok to change this part with one
> explicit mode based on TARGET_POWERPC64?
> 
>           rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
>           machine_mode tmode = TARGET_POWERPC64 ? DImode : SImode;
>           op[i] = simplify_gen_subreg (tmode, tmp, inner_mode, 0);

That looks fine, yes.

> > But, can you not always use SImode here?
> 
> Sorry that I didn't quite follow here.

I mean do

          rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
          op[i] = simplify_gen_subreg (SImode, tmp, inner_mode, 0);

If that works (also in 64-bit mode), that is preferred.  It might need
some more adjustment elsewhere, not sure if that is worth it.

It is okay for trunk with either of those changes.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] rs6000: Use subreg for QI/HI vector init
  2020-12-15 14:40     ` Segher Boessenkool
@ 2020-12-16  8:19       ` Kewen.Lin
  2020-12-16 10:21         ` Segher Boessenkool
  0 siblings, 1 reply; 8+ messages in thread
From: Kewen.Lin @ 2020-12-16  8:19 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: GCC Patches, David Edelsohn, Bill Schmidt

Hi Segher,

on 2020/12/15 下午10:40, Segher Boessenkool wrote:
> Hi Ke Wen,
> 
> On Tue, Dec 15, 2020 at 03:53:29PM +0800, Kewen.Lin wrote:
>> on 2020/12/15 上午2:51, Segher Boessenkool wrote:
>>> On Wed, Dec 02, 2020 at 05:44:24PM +0800, Kewen.Lin wrote:
>>>> --- a/gcc/config/rs6000/rs6000.c
>>>> +++ b/gcc/config/rs6000/rs6000.c
>>>> @@ -6793,17 +6793,8 @@ rs6000_expand_vector_init (rtx target, rtx vals)
>>>>        /* Force the values into word_mode registers.  */
>>>>        for (i = 0; i < n_elts; i++)
>>>>  	{
>>>> -	  rtx tmp = force_reg (GET_MODE_INNER (mode), XVECEXP (vals, 0, i));
>>>> -	  if (TARGET_POWERPC64)
>>>> -	    {
>>>> -	      op[i] = gen_reg_rtx (DImode);
>>>> -	      emit_insn (gen_zero_extendqidi2 (op[i], tmp));
>>>> -	    }
>>>> -	  else
>>>> -	    {
>>>> -	      op[i] = gen_reg_rtx (SImode);
>>>> -	      emit_insn (gen_zero_extendqisi2 (op[i], tmp));
>>>> -	    }
>>>> +	  rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
>>>> +	  op[i] = simplify_gen_subreg (Pmode, tmp, inner_mode, 0);
>>>>  	}
>>>
>>> Pmode is defined based on TARGET_64BIT, not TARGET_POWERPC64.
>>
>> Good point, you are right, is it ok to change this part with one
>> explicit mode based on TARGET_POWERPC64?
>>
>>           rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
>>           machine_mode tmode = TARGET_POWERPC64 ? DImode : SImode;
>>           op[i] = simplify_gen_subreg (tmode, tmp, inner_mode, 0);
> 
> That looks fine, yes.
> 
>>> But, can you not always use SImode here?
>>
>> Sorry that I didn't quite follow here.
> 
> I mean do
> 
>           rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
>           op[i] = simplify_gen_subreg (SImode, tmp, inner_mode, 0);
> 
> If that works (also in 64-bit mode), that is preferred.  It might need
> some more adjustment elsewhere, not sure if that is worth it.
> 

Thanks for the clarification!  Yes, as you said it doesn't work without
any adjustments, the gen_vsx_concat_v2di requires DImode operands, I
think it's not worthy to expand it just for this case.

> It is okay for trunk with either of those changes.  Thanks!
> 

Thanks!  Committed with the former way via r11-6109.

BR,
Kewen

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] rs6000: Use subreg for QI/HI vector init
  2020-12-16  8:19       ` Kewen.Lin
@ 2020-12-16 10:21         ` Segher Boessenkool
  0 siblings, 0 replies; 8+ messages in thread
From: Segher Boessenkool @ 2020-12-16 10:21 UTC (permalink / raw)
  To: Kewen.Lin; +Cc: GCC Patches, David Edelsohn, Bill Schmidt

On Wed, Dec 16, 2020 at 04:19:12PM +0800, Kewen.Lin wrote:
> on 2020/12/15 下午10:40, Segher Boessenkool wrote:
> > I mean do
> > 
> >           rtx tmp = force_reg (inner_mode, XVECEXP (vals, 0, i));
> >           op[i] = simplify_gen_subreg (SImode, tmp, inner_mode, 0);
> > 
> > If that works (also in 64-bit mode), that is preferred.  It might need
> > some more adjustment elsewhere, not sure if that is worth it.
> > 
> 
> Thanks for the clarification!  Yes, as you said it doesn't work without
> any adjustments, the gen_vsx_concat_v2di requires DImode operands, I
> think it's not worthy to expand it just for this case.

Yeah good call.  Thanks!


Segher

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2020-12-16 10:22 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-02  9:44 [PATCH] rs6000: Use subreg for QI/HI vector init Kewen.Lin
2020-12-02 22:32 ` will schmidt
2020-12-14 18:54   ` Segher Boessenkool
2020-12-14 18:51 ` Segher Boessenkool
2020-12-15  7:53   ` Kewen.Lin
2020-12-15 14:40     ` Segher Boessenkool
2020-12-16  8:19       ` Kewen.Lin
2020-12-16 10:21         ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).