public inbox for gcc-patches@gcc.gnu.org
* [PATCH] rs6000/test: Add emulated gather test case
@ 2021-11-25  3:20 Kewen.Lin
  2021-11-25  5:17 ` Hongtao Liu
  2021-11-26 16:24 ` Segher Boessenkool
  0 siblings, 2 replies; 5+ messages in thread
From: Kewen.Lin @ 2021-11-25  3:20 UTC
  To: GCC Patches
  Cc: Segher Boessenkool, David Edelsohn, Bill Schmidt, Richard Biener

Hi,

This patch adds a test case, similar to the one in i386,
to provide testing coverage for the 510.parest_r hotspots.

As evaluated, the emulated gather capability of the vectorizer
(r12-2733) can help to speed up SPEC2017 510.parest_r on
Power8/9/10 by 5% to 9% with option sets Ofast unroll and
Ofast lto.  But since rs6000 lacked unpacking support for
unsigned int before, it could not vectorize the hotspots
until r12-3134.
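
For readers unfamiliar with the term, the following is a minimal
hand-written sketch of what "emulated gather" amounts to for the kernel
in this patch, assuming two doubles per vector.  The function names
vmul_scalar and vmul_emulated are made up for illustration, and the code
the vectorizer actually generates will differ; the sketch only shows that
the indexed loads stay scalar and are assembled into a vector, while the
multiply and the reduction are done lane-wise.

```c
#include <stddef.h>

/* Scalar reference with the same shape as the loop in the test case.  */
double
vmul_scalar (const unsigned int *rowstart, const unsigned int *rowend,
	     const double *luval, const double *dst)
{
  double res = 0;
  for (const unsigned int *col = rowstart; col != rowend; ++col, ++luval)
    res += *luval * dst[*col];
  return res;
}

/* Hypothetical hand-written approximation of the vectorized loop,
   assuming a vectorization factor of 2.  */
double
vmul_emulated (const unsigned int *rowstart, const unsigned int *rowend,
	       const double *luval, const double *dst)
{
  double acc[2] = { 0.0, 0.0 };	/* One partial sum per lane.  */
  size_t n = (size_t) (rowend - rowstart);
  size_t i = 0;

  for (; i + 2 <= n; i += 2)
    {
      /* "Gather": two scalar indexed loads placed into one vector.  */
      double gathered[2] = { dst[rowstart[i]], dst[rowstart[i + 1]] };

      /* Lane-wise multiply and accumulate.  */
      acc[0] += luval[i] * gathered[0];
      acc[1] += luval[i + 1] * gathered[1];
    }

  /* Reduce the lanes, then handle a leftover element in scalar code.  */
  double res = acc[0] + acc[1];
  for (; i < n; ++i)
    res += luval[i] * dst[rowstart[i]];
  return res;
}
```

For example, with dst = {1, 2, 3, 4, 5, 6, 7, 8}, column indices
{7, 0, 3, 3, 5} and values {0.5, 2, 1, 4, 0.25}, both functions return
27.5.  Note that the lane-wise accumulation reorders the floating-point
additions, which is one reason such a reduction needs an option set like
-Ofast to be vectorized.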

While checking why r12-2733 didn't immediately show its impact
on SPEC2017 510.parest_r, even though the associated test case
could already be vectorized on rs6000 at that time, I realized
that the associated test case uses int as INDEXTYPE while the
hotspots actually use unsigned int.  So, unlike the one
in i386, this patch uses unsigned int as INDEXTYPE, since the
unpack support for unsigned int (r12-3134) also matters for
vectorizing the hotspots.  Not sure whether it's worth updating
the one in i386 as well?
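
To make the int versus unsigned int difference concrete: on a 64-bit
target, the 32-bit index has to be widened before it is used in the
address computation for dst[*col], and the two types need different
widenings (sign extension for int, zero extension for unsigned int).
The scalar sketch below only illustrates that difference.  The helper
names are hypothetical, and it makes no claim about how the rs6000
unpack patterns themselves are implemented.

```c
#include <stdint.h>

/* Widen a signed 32-bit index to 64 bits: sign extension.  */
int64_t
widen_signed (int32_t idx)
{
  return (int64_t) idx;
}

/* Widen an unsigned 32-bit index to 64 bits: zero extension.  */
uint64_t
widen_unsigned (uint32_t idx)
{
  return (uint64_t) idx;
}
```

The bit pattern 0x80000000 shows the difference: as an int (INT32_MIN)
it widens to 0xFFFFFFFF80000000, whereas as an unsigned int it widens to
0x0000000080000000.  A vectorizer that can only do the signed widening
therefore cannot handle a loop whose index type is unsigned int.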

Tested on powerpc64le-linux-gnu P9 and powerpc64-linux-gnu P8.

Is it ok for trunk?

BR,
Kewen
-----
gcc/testsuite/ChangeLog:

	* gcc.target/powerpc/vect-gather-1.c: New test.

diff --git a/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
new file mode 100644
index 00000000000..bf98045ab03
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vect-gather-1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* Profitable from Power8 since it supports efficient unaligned load.  */
+/* { dg-options "-Ofast -mdejagnu-cpu=power8 -fdump-tree-vect-details -fdump-tree-forwprop4" } */
+
+#ifndef INDEXTYPE
+#define INDEXTYPE unsigned int
+#endif
+double vmul(INDEXTYPE *rowstart, INDEXTYPE *rowend,
+	    double *luval, double *dst)
+{
+  double res = 0;
+  for (const INDEXTYPE * col = rowstart; col != rowend; ++col, ++luval)
+        res += *luval * dst[*col];
+  return res;
+}
+
+/* With gather emulation this should be profitable to vectorize from Power8.  */
+/* { dg-final { scan-tree-dump "loop vectorized" "vect" } } */
+/* The index vector loads and promotions should be scalar after forwprop.  */
+/* { dg-final { scan-tree-dump-not "vec_unpack" "forwprop4" } } */
--
2.25.1



* Re: [PATCH] rs6000/test: Add emulated gather test case
  2021-11-25  3:20 [PATCH] rs6000/test: Add emulated gather test case Kewen.Lin
@ 2021-11-25  5:17 ` Hongtao Liu
  2021-11-25  5:31   ` Kewen.Lin
  2021-11-26 16:24 ` Segher Boessenkool
  1 sibling, 1 reply; 5+ messages in thread
From: Hongtao Liu @ 2021-11-25  5:17 UTC
  To: Kewen.Lin; +Cc: GCC Patches, Bill Schmidt, David Edelsohn, Segher Boessenkool

On Thu, Nov 25, 2021 at 11:21 AM Kewen.Lin via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi,
>
> This patch adds a test case, similar to the one in i386,
> to provide testing coverage for the 510.parest_r hotspots.
>
> As evaluated, the emulated gather capability of the vectorizer
> (r12-2733) can help to speed up SPEC2017 510.parest_r on
> Power8/9/10 by 5% to 9% with option sets Ofast unroll and
> Ofast lto.  But since rs6000 lacked unpacking support for
> unsigned int before, it could not vectorize the hotspots
> until r12-3134.
>
> While checking why r12-2733 didn't immediately show its impact
> on SPEC2017 510.parest_r, even though the associated test case
> could already be vectorized on rs6000 at that time, I realized
> that the associated test case uses int as INDEXTYPE while the
> hotspots actually use unsigned int.  So, unlike the one
> in i386, this patch uses unsigned int as INDEXTYPE, since the
> unpack support for unsigned int (r12-3134) also matters for
> vectorizing the hotspots.  Not sure whether it's worth updating
> the one in i386 as well?
It looks like the same test case as the one added in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88531


-- 
BR,
Hongtao


* Re: [PATCH] rs6000/test: Add emulated gather test case
  2021-11-25  5:17 ` Hongtao Liu
@ 2021-11-25  5:31   ` Kewen.Lin
  0 siblings, 0 replies; 5+ messages in thread
From: Kewen.Lin @ 2021-11-25  5:31 UTC
  To: Hongtao Liu; +Cc: GCC Patches, Bill Schmidt, David Edelsohn, Segher Boessenkool

on 2021/11/25 1:17 PM, Hongtao Liu wrote:
> On Thu, Nov 25, 2021 at 11:21 AM Kewen.Lin via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> Hi,
>>
>> This patch adds a test case, similar to the one in i386,
>> to provide testing coverage for the 510.parest_r hotspots.
>>
>> As evaluated, the emulated gather capability of the vectorizer
>> (r12-2733) can help to speed up SPEC2017 510.parest_r on
>> Power8/9/10 by 5% to 9% with option sets Ofast unroll and
>> Ofast lto.  But since rs6000 lacked unpacking support for
>> unsigned int before, it could not vectorize the hotspots
>> until r12-3134.
>>
>> While checking why r12-2733 didn't immediately show its impact
>> on SPEC2017 510.parest_r, even though the associated test case
>> could already be vectorized on rs6000 at that time, I realized
>> that the associated test case uses int as INDEXTYPE while the
>> hotspots actually use unsigned int.  So, unlike the one
>> in i386, this patch uses unsigned int as INDEXTYPE, since the
>> unpack support for unsigned int (r12-3134) also matters for
>> vectorizing the hotspots.  Not sure whether it's worth updating
>> the one in i386 as well?
> It looks like the same testcase added in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88531

Thanks for the information!  Good to know that there are already
some test cases covering this.  :)

BR,
Kewen



* Re: [PATCH] rs6000/test: Add emulated gather test case
  2021-11-25  3:20 [PATCH] rs6000/test: Add emulated gather test case Kewen.Lin
  2021-11-25  5:17 ` Hongtao Liu
@ 2021-11-26 16:24 ` Segher Boessenkool
  2021-11-29  2:03   ` Kewen.Lin
  1 sibling, 1 reply; 5+ messages in thread
From: Segher Boessenkool @ 2021-11-26 16:24 UTC
  To: Kewen.Lin; +Cc: GCC Patches, David Edelsohn, Bill Schmidt, Richard Biener

Hi!

On Thu, Nov 25, 2021 at 11:20:57AM +0800, Kewen.Lin wrote:
> This patch adds a test case, similar to the one in i386,
> to provide testing coverage for the 510.parest_r hotspots.

> gcc/testsuite/ChangeLog:
> 	* gcc.target/powerpc/vect-gather-1.c: New test.

This is okay for trunk.  Thanks!


Segher


* Re: [PATCH] rs6000/test: Add emulated gather test case
  2021-11-26 16:24 ` Segher Boessenkool
@ 2021-11-29  2:03   ` Kewen.Lin
  0 siblings, 0 replies; 5+ messages in thread
From: Kewen.Lin @ 2021-11-29  2:03 UTC
  To: Segher Boessenkool; +Cc: GCC Patches, David Edelsohn, Bill Schmidt

on 2021/11/27 12:24 AM, Segher Boessenkool wrote:
> Hi!
> 
> On Thu, Nov 25, 2021 at 11:20:57AM +0800, Kewen.Lin wrote:
>> This patch adds a test case, similar to the one in i386,
>> to provide testing coverage for the 510.parest_r hotspots.
> 
>> gcc/testsuite/ChangeLog:
>> 	* gcc.target/powerpc/vect-gather-1.c: New test.
> 
> This is okay for trunk.  Thanks!
> 

Thanks Segher!  Committed as r12-5569.

BR,
Kewen
