public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694]
@ 2022-02-09  2:43 HAO CHEN GUI
  2022-02-14  6:39 ` Ping^1 [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694, PR93123] HAO CHEN GUI
  2022-02-14 21:36 ` [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694] Segher Boessenkool
  0 siblings, 2 replies; 6+ messages in thread
From: HAO CHEN GUI @ 2022-02-09  2:43 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Bill Schmidt

Hi,
  This patch removes TImode from mode iterator BOOL_128. Thus, bool operations (AND, IOR, XOR, NOT)
on TImode will be split to the relevant operations on word mode during expand (in optabs.c). Potential
optimizations can be implemented after the split. The former practice splits it after the reload
pass which is too later for some optimizations. The new test case illustrates it.

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk?
Any recommendations? Thanks a lot.

ChangeLog
2022-02-08 Haochen Gui <guihaoc@linux.ibm.com>

gcc/
	PR target/100694
	* config/rs6000/rs6000.md (BOOL_128): Remove TI.

gcc/testsuite/
	PR target/100694
	* gcc.target/powerpc/pr100694.c: New.
	* gcc.target/powerpc/pr92398.c: New.
	* gcc.target/powerpc/pr92398.h: Remove.
	* gcc.target/powerpc/pr92398.p9-.c: Remove.
	* gcc.target/powerpc/pr92398.p9+.c: Remove.

patch.diff
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6f74075f58d..2bc1b8f497a 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -750,8 +750,7 @@ (define_mode_attr SI_CONVERT_FP [(SF "TARGET_FCFIDS")
 				 (DF "TARGET_FCFID")])

 ;; Mode iterator for logical operations on 128-bit types
-(define_mode_iterator BOOL_128		[TI
-					 PTI
+(define_mode_iterator BOOL_128		[PTI
 					 (V16QI	"TARGET_ALTIVEC")
 					 (V8HI	"TARGET_ALTIVEC")
 					 (V4SI	"TARGET_ALTIVEC")
diff --git a/gcc/testsuite/gcc.target/powerpc/pr100694.c b/gcc/testsuite/gcc.target/powerpc/pr100694.c
new file mode 100644
index 00000000000..7b41d920140
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr100694.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times {\mstd\M} 2 } } */
+/* { dg-final { scan-assembler-not {\mli\M} } } */
+/* { dg-final { scan-assembler-not {\mor\M} } } */
+
+/* It just needs two std.  */
+void foo (unsigned __int128* res, unsigned long long hi, unsigned long long lo)
+{
+   unsigned __int128 i = hi;
+   i <<= 64;
+   i |= lo;
+   *res = i;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.c b/gcc/testsuite/gcc.target/powerpc/pr92398.c
new file mode 100644
index 00000000000..7d6201cc5bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr92398.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target int128 } */
+/* { dg-options "-O2" } */
+/* { dg-final { scan-assembler-times {\mnot\M} 2 } } */
+/* { dg-final { scan-assembler-times {\mstd\M} 2 } } */
+
+/* All platforms should generate the same instructions: not;not;std;std.  */
+void bar (__int128_t *dst, __int128_t src)
+{
+  *dst =  ~src;
+}
+
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.h b/gcc/testsuite/gcc.target/powerpc/pr92398.h
deleted file mode 100644
index 5a4a8bcab80..00000000000
--- a/gcc/testsuite/gcc.target/powerpc/pr92398.h
+++ /dev/null
@@ -1,17 +0,0 @@
-/* This test code is included into pr92398.p9-.c and pr92398.p9+.c.
-   The two files have the tests for the number of instructions generated for
-   P9- versus P9+.
-
-   store generates difference instructions as below:
-   P9+: mtvsrdd;xxlnot;stxv.
-   P8/P7/P6 LE: not;not;std;std.
-   P8 BE: mtvsrd;mtvsrd;xxpermdi;xxlnor;stxvd2x.
-   P7/P6 BE: std;std;addi;lxvd2x;xxlnor;stxvd2x.
-   P9+ and P9- LE are expected, P6/P7/P8 BE are unexpected.  */
-
-void
-bar (__int128_t *dst, __int128_t src)
-{
-  *dst =  ~src;
-}
-
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
deleted file mode 100644
index 72dd1d9a274..00000000000
--- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
+++ /dev/null
@@ -1,12 +0,0 @@
-/* { dg-do compile { target { lp64 && has_arch_pwr9 } } } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-O2 -mvsx" } */
-
-/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mxxlnor\M} 1 } } */
-/* { dg-final { scan-assembler-times {\mstxv\M} 1 } } */
-/* { dg-final { scan-assembler-not {\mld\M} } } */
-/* { dg-final { scan-assembler-not {\mnot\M} } } */
-
-/* Source code for the test in pr92398.h */
-#include "pr92398.h"
diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
deleted file mode 100644
index bd7fa98af51..00000000000
--- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
+++ /dev/null
@@ -1,10 +0,0 @@
-/* { dg-do compile { target { lp64 && {! has_arch_pwr9} } } } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
-/* { dg-options "-O2 -mvsx" } */
-
-/* { dg-final { scan-assembler-times {\mnot\M} 2 { xfail be } } } */
-/* { dg-final { scan-assembler-times {\mstd\M} 2 { xfail { { {! has_arch_pwr9} && has_arch_pwr8 } && be } } } } */
-
-/* Source code for the test in pr92398.h */
-#include "pr92398.h"
-

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Ping^1 [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694, PR93123]
  2022-02-09  2:43 [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694] HAO CHEN GUI
@ 2022-02-14  6:39 ` HAO CHEN GUI
  2022-02-14 21:36 ` [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694] Segher Boessenkool
  1 sibling, 0 replies; 6+ messages in thread
From: HAO CHEN GUI @ 2022-02-14  6:39 UTC (permalink / raw)
  To: gcc-patches; +Cc: Segher Boessenkool, David, Bill Schmidt

Hi,
  Gentle ping this:
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590057.html

Thanks

On 9/2/2022 上午 10:43, HAO CHEN GUI wrote:
> Hi,
>   This patch removes TImode from mode iterator BOOL_128. Thus, bool operations (AND, IOR, XOR, NOT)
> on TImode will be split to the relevant operations on word mode during expand (in optabs.c). Potential
> optimizations can be implemented after the split. The former practice splits it after the reload
> pass which is too later for some optimizations. The new test case illustrates it.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions. Is this okay for trunk?
> Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-02-08 Haochen Gui <guihaoc@linux.ibm.com>
> 
> gcc/
> 	PR target/100694
> 	* config/rs6000/rs6000.md (BOOL_128): Remove TI.
> 
> gcc/testsuite/
> 	PR target/100694
> 	* gcc.target/powerpc/pr100694.c: New.
> 	* gcc.target/powerpc/pr92398.c: New.
> 	* gcc.target/powerpc/pr92398.h: Remove.
> 	* gcc.target/powerpc/pr92398.p9-.c: Remove.
> 	* gcc.target/powerpc/pr92398.p9+.c: Remove.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 6f74075f58d..2bc1b8f497a 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -750,8 +750,7 @@ (define_mode_attr SI_CONVERT_FP [(SF "TARGET_FCFIDS")
>  				 (DF "TARGET_FCFID")])
> 
>  ;; Mode iterator for logical operations on 128-bit types
> -(define_mode_iterator BOOL_128		[TI
> -					 PTI
> +(define_mode_iterator BOOL_128		[PTI
>  					 (V16QI	"TARGET_ALTIVEC")
>  					 (V8HI	"TARGET_ALTIVEC")
>  					 (V4SI	"TARGET_ALTIVEC")
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100694.c b/gcc/testsuite/gcc.target/powerpc/pr100694.c
> new file mode 100644
> index 00000000000..7b41d920140
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100694.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-times {\mstd\M} 2 } } */
> +/* { dg-final { scan-assembler-not {\mli\M} } } */
> +/* { dg-final { scan-assembler-not {\mor\M} } } */
> +
> +/* It just needs two std.  */
> +void foo (unsigned __int128* res, unsigned long long hi, unsigned long long lo)
> +{
> +   unsigned __int128 i = hi;
> +   i <<= 64;
> +   i |= lo;
> +   *res = i;
> +}
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.c b/gcc/testsuite/gcc.target/powerpc/pr92398.c
> new file mode 100644
> index 00000000000..7d6201cc5bb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr92398.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-times {\mnot\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mstd\M} 2 } } */
> +
> +/* All platforms should generate the same instructions: not;not;std;std.  */
> +void bar (__int128_t *dst, __int128_t src)
> +{
> +  *dst =  ~src;
> +}
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.h b/gcc/testsuite/gcc.target/powerpc/pr92398.h
> deleted file mode 100644
> index 5a4a8bcab80..00000000000
> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.h
> +++ /dev/null
> @@ -1,17 +0,0 @@
> -/* This test code is included into pr92398.p9-.c and pr92398.p9+.c.
> -   The two files have the tests for the number of instructions generated for
> -   P9- versus P9+.
> -
> -   store generates difference instructions as below:
> -   P9+: mtvsrdd;xxlnot;stxv.
> -   P8/P7/P6 LE: not;not;std;std.
> -   P8 BE: mtvsrd;mtvsrd;xxpermdi;xxlnor;stxvd2x.
> -   P7/P6 BE: std;std;addi;lxvd2x;xxlnor;stxvd2x.
> -   P9+ and P9- LE are expected, P6/P7/P8 BE are unexpected.  */
> -
> -void
> -bar (__int128_t *dst, __int128_t src)
> -{
> -  *dst =  ~src;
> -}
> -
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
> deleted file mode 100644
> index 72dd1d9a274..00000000000
> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
> +++ /dev/null
> @@ -1,12 +0,0 @@
> -/* { dg-do compile { target { lp64 && has_arch_pwr9 } } } */
> -/* { dg-require-effective-target powerpc_vsx_ok } */
> -/* { dg-options "-O2 -mvsx" } */
> -
> -/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
> -/* { dg-final { scan-assembler-times {\mxxlnor\M} 1 } } */
> -/* { dg-final { scan-assembler-times {\mstxv\M} 1 } } */
> -/* { dg-final { scan-assembler-not {\mld\M} } } */
> -/* { dg-final { scan-assembler-not {\mnot\M} } } */
> -
> -/* Source code for the test in pr92398.h */
> -#include "pr92398.h"
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
> deleted file mode 100644
> index bd7fa98af51..00000000000
> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
> +++ /dev/null
> @@ -1,10 +0,0 @@
> -/* { dg-do compile { target { lp64 && {! has_arch_pwr9} } } } */
> -/* { dg-require-effective-target powerpc_vsx_ok } */
> -/* { dg-options "-O2 -mvsx" } */
> -
> -/* { dg-final { scan-assembler-times {\mnot\M} 2 { xfail be } } } */
> -/* { dg-final { scan-assembler-times {\mstd\M} 2 { xfail { { {! has_arch_pwr9} && has_arch_pwr8 } && be } } } } */
> -
> -/* Source code for the test in pr92398.h */
> -#include "pr92398.h"
> -

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694]
  2022-02-09  2:43 [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694] HAO CHEN GUI
  2022-02-14  6:39 ` Ping^1 [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694, PR93123] HAO CHEN GUI
@ 2022-02-14 21:36 ` Segher Boessenkool
  2022-02-15  3:01   ` HAO CHEN GUI
  1 sibling, 1 reply; 6+ messages in thread
From: Segher Boessenkool @ 2022-02-14 21:36 UTC (permalink / raw)
  To: HAO CHEN GUI; +Cc: gcc-patches, David, Bill Schmidt

Hi!

On Wed, Feb 09, 2022 at 10:43:17AM +0800, HAO CHEN GUI wrote:
>   This patch removes TImode from mode iterator BOOL_128. Thus, bool operations (AND, IOR, XOR, NOT)
> on TImode will be split to the relevant operations on word mode during expand (in optabs.c).

But we also want to allow TImode in VSRs.  This of course is a never-
ending story, no choice works very well here.

> Potential
> optimizations can be implemented after the split. The former practice splits it after the reload
> pass which is too later for some optimizations. The new test case illustrates it.

All that are arguments for expanding to split form, not for removing
TImode from the iterator.  And you leave PTImode, which *always* is in
GPRs!

It may be that only leaving the "V" modes there works well; that needs
testing though, more than just asaserting this.

Just doing it and handling the ICEs later is fine, but in stage 1.

(You'll also have to show it is *correct*, you need to prove (or show it
really likely :-) ) that after this change there are no TImode things
generated anywhere (anywhere!) that are no longer handled now).


Segher

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694]
  2022-02-14 21:36 ` [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694] Segher Boessenkool
@ 2022-02-15  3:01   ` HAO CHEN GUI
  2022-02-15 14:56     ` Segher Boessenkool
  0 siblings, 1 reply; 6+ messages in thread
From: HAO CHEN GUI @ 2022-02-15  3:01 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, David, Bill Schmidt

Segher,
  Thanks for your comments. Here are my comments and questions.Thanks.

On 15/2/2022 上午 5:36, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Feb 09, 2022 at 10:43:17AM +0800, HAO CHEN GUI wrote:
>>   This patch removes TImode from mode iterator BOOL_128. Thus, bool operations (AND, IOR, XOR, NOT)
>> on TImode will be split to the relevant operations on word mode during expand (in optabs.c).
> 
> But we also want to allow TImode in VSRs.  This of course is a never-
> ending story, no choice works very well here.
> 
>> Potential
>> optimizations can be implemented after the split. The former practice splits it after the reload
>> pass which is too later for some optimizations. The new test case illustrates it.
> 
> All that are arguments for expanding to split form, not for removing
> TImode from the iterator.  And you leave PTImode, which *always* is in
> GPRs!
From my understanding, PTImode has limitation that it needs to be assigned
with an even/odd register pair. So it can't be split before the reload pass.
Currently it is split after reload.>
> It may be that only leaving the "V" modes there works well; that needs
> testing though, more than just asaserting this.
> 
> Just doing it and handling the ICEs later is fine, but in stage 1.
> 
> (You'll also have to show it is *correct*, you need to prove (or show it
> really likely :-) ) that after this change there are no TImode things
> generated anywhere (anywhere!) that are no longer handled now).
> 
Yes, the TI may be generated after expand pass and causes ICEs. So how about
creating two mode iterators? One is for expand which doesn't include TImode,
another is for the split which include TImode and make TImode to be split
as early as possible?

> 
> Segher

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694]
  2022-02-15  3:01   ` HAO CHEN GUI
@ 2022-02-15 14:56     ` Segher Boessenkool
  2022-02-16  8:28       ` HAO CHEN GUI
  0 siblings, 1 reply; 6+ messages in thread
From: Segher Boessenkool @ 2022-02-15 14:56 UTC (permalink / raw)
  To: HAO CHEN GUI; +Cc: gcc-patches, David, Bill Schmidt

On Tue, Feb 15, 2022 at 11:01:03AM +0800, HAO CHEN GUI wrote:
Hi!

> On 15/2/2022 上午 5:36, Segher Boessenkool wrote:
> > On Wed, Feb 09, 2022 at 10:43:17AM +0800, HAO CHEN GUI wrote:
> > All that are arguments for expanding to split form, not for removing
> > TImode from the iterator.  And you leave PTImode, which *always* is in
> > GPRs!
> >From my understanding, PTImode has limitation that it needs to be assigned
> with an even/odd register pair. So it can't be split before the reload pass.

TImode is put in an even/odd pair always as well.  What is special about
PTImode here?

> Currently it is split after reload.>

This prevents almost all optimisations.  Splits after reload should be a
last resort thing.  They almost always cause bigger problems than what
they are meant to solve.  There aren't many splitters that *have* to run
after RA!

> > (You'll also have to show it is *correct*, you need to prove (or show it
> > really likely :-) ) that after this change there are no TImode things
> > generated anywhere (anywhere!) that are no longer handled now).
> > 
> Yes, the TI may be generated after expand pass and causes ICEs. So how about
> creating two mode iterators? One is for expand which doesn't include TImode,
> another is for the split which include TImode and make TImode to be split
> as early as possible?

You can also have the expanders fail for TImode?  That gives you a good
place to put in a code comment as well ;-)


Segher

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694]
  2022-02-15 14:56     ` Segher Boessenkool
@ 2022-02-16  8:28       ` HAO CHEN GUI
  0 siblings, 0 replies; 6+ messages in thread
From: HAO CHEN GUI @ 2022-02-16  8:28 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: gcc-patches, David, Bill Schmidt

Hi,

On 15/2/2022 下午 10:56, Segher Boessenkool wrote:
> On Tue, Feb 15, 2022 at 11:01:03AM +0800, HAO CHEN GUI wrote:
> Hi!
> 
>> On 15/2/2022 上午 5:36, Segher Boessenkool wrote:
>>> On Wed, Feb 09, 2022 at 10:43:17AM +0800, HAO CHEN GUI wrote:
>>> All that are arguments for expanding to split form, not for removing
>>> TImode from the iterator.  And you leave PTImode, which *always* is in
>>> GPRs!
>> >From my understanding, PTImode has limitation that it needs to be assigned
>> with an even/odd register pair. So it can't be split before the reload pass.
> 
> TImode is put in an even/odd pair always as well.  What is special about
> PTImode here?
TI is allowed in any GPRs. TI can be placed in r3/r4 or r4/r5 (both odd/even
and even/odd) while PTI can only be placed in r4/r5 (even/odd). So if we
split PTI before reload,the constraint is broken then PTI can be placed in
any GPRs, I think.
> 
>> Currently it is split after reload.>
> 
> This prevents almost all optimisations.  Splits after reload should be a
> last resort thing.  They almost always cause bigger problems than what
> they are meant to solve.  There aren't many splitters that *have* to run
> after RA!
> 
>>> (You'll also have to show it is *correct*, you need to prove (or show it
>>> really likely :-) ) that after this change there are no TImode things
>>> generated anywhere (anywhere!) that are no longer handled now).
>>>
>> Yes, the TI may be generated after expand pass and causes ICEs. So how about
>> creating two mode iterators? One is for expand which doesn't include TImode,
>> another is for the split which include TImode and make TImode to be split
>> as early as possible?
> 
> You can also have the expanders fail for TImode?  That gives you a good
> place to put in a code comment as well ;-)
> 
Yes, I will take it.
> 
> Segher

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2022-02-16  8:28 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-09  2:43 [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694] HAO CHEN GUI
2022-02-14  6:39 ` Ping^1 [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694, PR93123] HAO CHEN GUI
2022-02-14 21:36 ` [PATCH, rs6000] Remove TImode from mode iterator BOOL_128 [PR100694] Segher Boessenkool
2022-02-15  3:01   ` HAO CHEN GUI
2022-02-15 14:56     ` Segher Boessenkool
2022-02-16  8:28       ` HAO CHEN GUI

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).