* [PATCH] testsuite: Adjust possibly fragile slp-perm-9.c [PR104015]
@ 2022-01-18 3:06 Kewen.Lin
2022-01-18 8:06 ` Kewen.Lin
0 siblings, 1 reply; 5+ messages in thread
From: Kewen.Lin @ 2022-01-18 3:06 UTC
To: GCC Patches
Cc: Segher Boessenkool, Bill Schmidt, David Edelsohn,
Richard Sandiford, Andre Vieira (lists),
Richard Biener
Hi,
As discussed in PR104015, the test case slp-perm-9.c can be
fragile when the vectorizer tries different vectorisation
strategies.
As Richard suggested, this patch makes the check insensitive to
the number of re-tries by no longer checking how many times the
message appears. To retain test coverage for unnecessary
re-trying (which is what exposed PR104015 on Power9), I added two
test cases to the powerpc test bucket.
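To illustrate why the exact-count check is fragile, here is a small sketch that is not part of the patch. It models scan-tree-dump-times as an exact occurrence count and scan-tree-dump as a presence check. The helper names are hypothetical, and it uses plain substring matching where the real directives use regular expressions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of NEEDLE in DUMP.  Plain substring
   matching is an assumption made for brevity.  */
size_t
count_occurrences (const char *dump, const char *needle)
{
  size_t len = strlen (needle);
  size_t count = 0;

  assert (len > 0);
  for (const char *p = strstr (dump, needle); p != NULL;
       p = strstr (p + len, needle))
    count++;
  return count;
}

/* Model of scan-tree-dump-times: passes only on an exact count.  */
int
check_times (const char *dump, const char *needle, size_t expected)
{
  return count_occurrences (dump, needle) == expected;
}

/* Model of scan-tree-dump: passes when the message appears at all.  */
int
check_present (const char *dump, const char *needle)
{
  return strstr (dump, needle) != NULL;
}
```

If the dump contains the message twice, for instance because an epilogue re-try prints it again, check_times (dump, msg, 1) fails while check_present (dump, msg) still passes.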
Tested on x86_64-redhat-linux, aarch64-linux-gnu,
powerpc64-linux-gnu (Power8) and powerpc64le-linux-gnu
(Power9/Power10).
Is it ok for trunk?
BR,
Kewen
-----
gcc/testsuite/ChangeLog:
PR tree-optimization/104015
* gcc.dg/vect/slp-perm-9.c: Adjust.
* gcc.target/powerpc/pr104015-1.c: New test.
* gcc.target/powerpc/pr104015-2.c: New test.
---
gcc/testsuite/gcc.dg/vect/slp-perm-9.c | 4 +--
gcc/testsuite/gcc.target/powerpc/pr104015-1.c | 28 +++++++++++++++++++
gcc/testsuite/gcc.target/powerpc/pr104015-2.c | 28 +++++++++++++++++++
3 files changed, 57 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104015-1.c
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104015-2.c
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
index 873eddf223e..154c00af598 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
@@ -61,9 +61,7 @@ int main (int argc, const char* argv[])
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { { vect_perm_short || vect32 } || vect_load_lanes } } } } */
/* We don't try permutes with a group size of 3 for variable-length
vectors. */
-/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 1 "vect" { target { vect_perm_short && { { ! vect_perm3_short } && { ! vect_partial_vectors_usage_1 } } } xfail vect_variable_length } } } */
-/* Try to vectorize the epilogue using partial vectors. */
-/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 2 "vect" { target { vect_perm_short && { { ! vect_perm3_short } && vect_partial_vectors_usage_1 } } xfail vect_variable_length } } } */
+/* { dg-final { scan-tree-dump "permutation requires at least three vectors" "vect" { target { vect_perm_short && { ! vect_perm3_short } } xfail vect_variable_length } } } */
/* { dg-final { scan-tree-dump-not "permutation requires at least three vectors" "vect" { target vect_perm3_short } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { { ! { vect_perm3_short || vect32 } } || vect_load_lanes } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { { vect_perm3_short || vect32 } && { ! vect_load_lanes } } } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104015-1.c b/gcc/testsuite/gcc.target/powerpc/pr104015-1.c
new file mode 100644
index 00000000000..895c243aaf8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104015-1.c
@@ -0,0 +1,28 @@
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* As PR104015, we don't expect vectorizer will re-try some vector modes
+ for epilogues on Power9, since Power9 doesn't support partial vector
+ by default. */
+
+#include <stdarg.h>
+#define N 200
+
+void __attribute__((noinline))
+foo (unsigned short *__restrict__ pInput, unsigned short *__restrict__ pOutput)
+{
+ unsigned short i, a, b, c;
+
+ for (i = 0; i < N / 3; i++)
+ {
+ a = *pInput++;
+ b = *pInput++;
+ c = *pInput++;
+
+ *pOutput++ = a + b + c + 3;
+ *pOutput++ = a + b + c + 12;
+ *pOutput++ = a + b + c + 1;
+ }
+}
+
+/* { dg-final { scan-tree-dump-not "Re-trying epilogue analysis with vector mode" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104015-2.c b/gcc/testsuite/gcc.target/powerpc/pr104015-2.c
new file mode 100644
index 00000000000..1b66a64f47c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104015-2.c
@@ -0,0 +1,28 @@
+/* { dg-require-effective-target power10_ok } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* Power10 support partial vector for epilogue by default, it's expected
+ vectorizer would re-try for it once. */
+
+#include <stdarg.h>
+#define N 200
+
+void __attribute__((noinline))
+foo (unsigned short *__restrict__ pInput, unsigned short *__restrict__ pOutput)
+{
+ unsigned short i, a, b, c;
+
+ for (i = 0; i < N / 3; i++)
+ {
+ a = *pInput++;
+ b = *pInput++;
+ c = *pInput++;
+
+ *pOutput++ = a + b + c + 3;
+ *pOutput++ = a + b + c + 12;
+ *pOutput++ = a + b + c + 1;
+ }
+}
+
+/* Vector with length instructions lxvl/stxvl are only enabled for 64 bit. */
+/* { dg-final { scan-tree-dump-times "Re-trying epilogue analysis with vector mode" 1 "vect" {target { ! ilp32 } } } } */
* Re: [PATCH] testsuite: Adjust possibly fragile slp-perm-9.c [PR104015]
2022-01-18 3:06 [PATCH] testsuite: Adjust possibly fragile slp-perm-9.c [PR104015] Kewen.Lin
@ 2022-01-18 8:06 ` Kewen.Lin
2022-01-18 11:57 ` Richard Sandiford
0 siblings, 1 reply; 5+ messages in thread
From: Kewen.Lin @ 2022-01-18 8:06 UTC
To: GCC Patches
Cc: Segher Boessenkool, Richard Sandiford, Bill Schmidt, David Edelsohn
on 2022/1/18 11:06 AM, Kewen.Lin via Gcc-patches wrote:
> Hi,
>
> As discussed in PR104015, the test case slp-perm-9.c can be
> fragile when the vectorizer tries different vectorisation
> strategies.
>
> As Richard suggested, this patch makes the check insensitive to
> the number of re-tries by no longer checking how many times the
> message appears. To retain test coverage for unnecessary
> re-trying (which is what exposed PR104015 on Power9), I added two
> test cases to the powerpc test bucket.
>
> Tested on x86_64-redhat-linux, aarch64-linux-gnu,
> powerpc64-linux-gnu (Power8) and powerpc64le-linux-gnu
> (Power9/Power10).
>
> Is it ok for trunk?
>
> BR,
> Kewen
> -----
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/104015
> * gcc.dg/vect/slp-perm-9.c: Adjust.
> * gcc.target/powerpc/pr104015-1.c: New test.
> * gcc.target/powerpc/pr104015-2.c: New test.
An updated version is attached; it modifies pr104015-2.c slightly to
use the clearer required effective target lp64.
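As a rough sketch (not part of the patch) of why lp64 is clearer than "! ilp32": as I understand the effective-target keywords, ilp32 means int, long and pointers are all 32-bit, while lp64 means a 32-bit int with 64-bit long and pointers. The helpers below are hypothetical and take the sizes in bytes as parameters, so the two conditions can be compared for any data model.

```c
#include <stddef.h>

/* Hypothetical model of the ilp32 keyword: int, long and pointers
   are all 4 bytes.  */
int
is_ilp32 (size_t int_size, size_t long_size, size_t ptr_size)
{
  return int_size == 4 && long_size == 4 && ptr_size == 4;
}

/* Hypothetical model of the lp64 keyword: 4-byte int, 8-byte long
   and pointers.  */
int
is_lp64 (size_t int_size, size_t long_size, size_t ptr_size)
{
  return int_size == 4 && long_size == 8 && ptr_size == 8;
}
```

A model with a 32-bit long and 64-bit pointers (sizes 4, 4, 8) satisfies "! ilp32" but not lp64, so the two conditions are not interchangeable in general; requiring lp64 states the 64-bit intent directly.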
Tested as before.
BR,
Kewen
[-- Attachment #2: pr104015-test-v2.patch --]
[-- Type: text/plain, Size: 4270 bytes --]
gcc/testsuite/gcc.dg/vect/slp-perm-9.c | 4 +--
gcc/testsuite/gcc.target/powerpc/pr104015-1.c | 28 ++++++++++++++++++
gcc/testsuite/gcc.target/powerpc/pr104015-2.c | 29 +++++++++++++++++++
3 files changed, 58 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104015-1.c
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104015-2.c
diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
index 873eddf223e..154c00af598 100644
--- a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
+++ b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
@@ -61,9 +61,7 @@ int main (int argc, const char* argv[])
/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { { vect_perm_short || vect32 } || vect_load_lanes } } } } */
/* We don't try permutes with a group size of 3 for variable-length
vectors. */
-/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 1 "vect" { target { vect_perm_short && { { ! vect_perm3_short } && { ! vect_partial_vectors_usage_1 } } } xfail vect_variable_length } } } */
-/* Try to vectorize the epilogue using partial vectors. */
-/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 2 "vect" { target { vect_perm_short && { { ! vect_perm3_short } && vect_partial_vectors_usage_1 } } xfail vect_variable_length } } } */
+/* { dg-final { scan-tree-dump "permutation requires at least three vectors" "vect" { target { vect_perm_short && { ! vect_perm3_short } } xfail vect_variable_length } } } */
/* { dg-final { scan-tree-dump-not "permutation requires at least three vectors" "vect" { target vect_perm3_short } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { { ! { vect_perm3_short || vect32 } } || vect_load_lanes } } } } */
/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { { vect_perm3_short || vect32 } && { ! vect_load_lanes } } } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104015-1.c b/gcc/testsuite/gcc.target/powerpc/pr104015-1.c
new file mode 100644
index 00000000000..895c243aaf8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104015-1.c
@@ -0,0 +1,28 @@
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* As PR104015, we don't expect vectorizer will re-try some vector modes
+ for epilogues on Power9, since Power9 doesn't support partial vector
+ by default. */
+
+#include <stdarg.h>
+#define N 200
+
+void __attribute__((noinline))
+foo (unsigned short *__restrict__ pInput, unsigned short *__restrict__ pOutput)
+{
+ unsigned short i, a, b, c;
+
+ for (i = 0; i < N / 3; i++)
+ {
+ a = *pInput++;
+ b = *pInput++;
+ c = *pInput++;
+
+ *pOutput++ = a + b + c + 3;
+ *pOutput++ = a + b + c + 12;
+ *pOutput++ = a + b + c + 1;
+ }
+}
+
+/* { dg-final { scan-tree-dump-not "Re-trying epilogue analysis with vector mode" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/pr104015-2.c b/gcc/testsuite/gcc.target/powerpc/pr104015-2.c
new file mode 100644
index 00000000000..ab482b11629
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr104015-2.c
@@ -0,0 +1,29 @@
+/* { dg-require-effective-target power10_ok } */
+/* Vector with length instructions lxvl/stxvl are only enabled for 64 bit. */
+/* { dg-require-effective-target lp64 } */
+/* { dg-options "-mdejagnu-cpu=power10 -O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+/* Power10 support partial vector for epilogue by default, it's expected
+ vectorizer would re-try for it once. */
+
+#include <stdarg.h>
+#define N 200
+
+void __attribute__((noinline))
+foo (unsigned short *__restrict__ pInput, unsigned short *__restrict__ pOutput)
+{
+ unsigned short i, a, b, c;
+
+ for (i = 0; i < N / 3; i++)
+ {
+ a = *pInput++;
+ b = *pInput++;
+ c = *pInput++;
+
+ *pOutput++ = a + b + c + 3;
+ *pOutput++ = a + b + c + 12;
+ *pOutput++ = a + b + c + 1;
+ }
+}
+
+/* { dg-final { scan-tree-dump-times "Re-trying epilogue analysis with vector mode" 1 "vect" } } */
* Re: [PATCH] testsuite: Adjust possibly fragile slp-perm-9.c [PR104015]
2022-01-18 8:06 ` Kewen.Lin
@ 2022-01-18 11:57 ` Richard Sandiford
2022-01-18 21:34 ` Segher Boessenkool
0 siblings, 1 reply; 5+ messages in thread
From: Richard Sandiford @ 2022-01-18 11:57 UTC
To: Kewen.Lin; +Cc: GCC Patches, Segher Boessenkool, Bill Schmidt, David Edelsohn
"Kewen.Lin" <linkw@linux.ibm.com> writes:
> on 2022/1/18 11:06 AM, Kewen.Lin via Gcc-patches wrote:
>> Hi,
>>
>> As discussed in PR104015, the test case slp-perm-9.c can be
>> fragile when the vectorizer tries different vectorisation
>> strategies.
>>
>> As Richard suggested, this patch makes the check insensitive to
>> the number of re-tries by no longer checking how many times the
>> message appears. To retain test coverage for unnecessary
>> re-trying (which is what exposed PR104015 on Power9), I added two
>> test cases to the powerpc test bucket.
>>
>> Tested on x86_64-redhat-linux, aarch64-linux-gnu,
>> powerpc64-linux-gnu (Power8) and powerpc64le-linux-gnu
>> (Power9/Power10).
>>
>> Is it ok for trunk?
>>
>> BR,
>> Kewen
>> -----
>> gcc/testsuite/ChangeLog:
>>
>> PR tree-optimization/104015
>> * gcc.dg/vect/slp-perm-9.c: Adjust.
>> * gcc.target/powerpc/pr104015-1.c: New test.
>> * gcc.target/powerpc/pr104015-2.c: New test.
>
> An updated version is attached; it modifies pr104015-2.c slightly to
> use the clearer required effective target lp64.
>
> Tested as before.
>
> BR,
> Kewen
OK for the target-independent part, thanks. IMO it's OK independently
of the rs6000 tests.
Richard
> gcc/testsuite/gcc.dg/vect/slp-perm-9.c | 4 +--
> gcc/testsuite/gcc.target/powerpc/pr104015-1.c | 28 ++++++++++++++++++
> gcc/testsuite/gcc.target/powerpc/pr104015-2.c | 29 +++++++++++++++++++
> 3 files changed, 58 insertions(+), 3 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104015-1.c
> create mode 100644 gcc/testsuite/gcc.target/powerpc/pr104015-2.c
>
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
> index 873eddf223e..154c00af598 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
> +++ b/gcc/testsuite/gcc.dg/vect/slp-perm-9.c
> @@ -61,9 +61,7 @@ int main (int argc, const char* argv[])
> /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { { vect_perm_short || vect32 } || vect_load_lanes } } } } */
> /* We don't try permutes with a group size of 3 for variable-length
> vectors. */
> -/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 1 "vect" { target { vect_perm_short && { { ! vect_perm3_short } && { ! vect_partial_vectors_usage_1 } } } xfail vect_variable_length } } } */
> -/* Try to vectorize the epilogue using partial vectors. */
> -/* { dg-final { scan-tree-dump-times "permutation requires at least three vectors" 2 "vect" { target { vect_perm_short && { { ! vect_perm3_short } && vect_partial_vectors_usage_1 } } xfail vect_variable_length } } } */
> +/* { dg-final { scan-tree-dump "permutation requires at least three vectors" "vect" { target { vect_perm_short && { ! vect_perm3_short } } xfail vect_variable_length } } } */
> /* { dg-final { scan-tree-dump-not "permutation requires at least three vectors" "vect" { target vect_perm3_short } } } */
> /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 0 "vect" { target { { ! { vect_perm3_short || vect32 } } || vect_load_lanes } } } } */
> /* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { { vect_perm3_short || vect32 } && { ! vect_load_lanes } } } } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr104015-1.c b/gcc/testsuite/gcc.target/powerpc/pr104015-1.c
> new file mode 100644
> index 00000000000..895c243aaf8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr104015-1.c
> @@ -0,0 +1,28 @@
> +/* { dg-require-effective-target powerpc_p9vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power9 -O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
> +
> +/* As PR104015, we don't expect vectorizer will re-try some vector modes
> + for epilogues on Power9, since Power9 doesn't support partial vector
> + by default. */
> +
> +#include <stdarg.h>
> +#define N 200
> +
> +void __attribute__((noinline))
> +foo (unsigned short *__restrict__ pInput, unsigned short *__restrict__ pOutput)
> +{
> + unsigned short i, a, b, c;
> +
> + for (i = 0; i < N / 3; i++)
> + {
> + a = *pInput++;
> + b = *pInput++;
> + c = *pInput++;
> +
> + *pOutput++ = a + b + c + 3;
> + *pOutput++ = a + b + c + 12;
> + *pOutput++ = a + b + c + 1;
> + }
> +}
> +
> +/* { dg-final { scan-tree-dump-not "Re-trying epilogue analysis with vector mode" "vect" } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr104015-2.c b/gcc/testsuite/gcc.target/powerpc/pr104015-2.c
> new file mode 100644
> index 00000000000..ab482b11629
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr104015-2.c
> @@ -0,0 +1,29 @@
> +/* { dg-require-effective-target power10_ok } */
> +/* Vector with length instructions lxvl/stxvl are only enabled for 64 bit. */
> +/* { dg-require-effective-target lp64 } */
> +/* { dg-options "-mdejagnu-cpu=power10 -O2 -ftree-vectorize -fno-vect-cost-model -fdump-tree-vect-details" } */
> +
> +/* Power10 support partial vector for epilogue by default, it's expected
> + vectorizer would re-try for it once. */
> +
> +#include <stdarg.h>
> +#define N 200
> +
> +void __attribute__((noinline))
> +foo (unsigned short *__restrict__ pInput, unsigned short *__restrict__ pOutput)
> +{
> + unsigned short i, a, b, c;
> +
> + for (i = 0; i < N / 3; i++)
> + {
> + a = *pInput++;
> + b = *pInput++;
> + c = *pInput++;
> +
> + *pOutput++ = a + b + c + 3;
> + *pOutput++ = a + b + c + 12;
> + *pOutput++ = a + b + c + 1;
> + }
> +}
> +
> +/* { dg-final { scan-tree-dump-times "Re-trying epilogue analysis with vector mode" 1 "vect" } } */
* Re: [PATCH] testsuite: Adjust possibly fragile slp-perm-9.c [PR104015]
2022-01-18 11:57 ` Richard Sandiford
@ 2022-01-18 21:34 ` Segher Boessenkool
2022-01-19 6:14 ` Kewen.Lin
0 siblings, 1 reply; 5+ messages in thread
From: Segher Boessenkool @ 2022-01-18 21:34 UTC
To: Kewen.Lin, GCC Patches, Bill Schmidt, David Edelsohn, richard.sandiford
On Tue, Jan 18, 2022 at 11:57:32AM +0000, Richard Sandiford wrote:
> "Kewen.Lin" <linkw@linux.ibm.com> writes:
> >> PR tree-optimization/104015
> >> * gcc.dg/vect/slp-perm-9.c: Adjust.
> >> * gcc.target/powerpc/pr104015-1.c: New test.
> >> * gcc.target/powerpc/pr104015-2.c: New test.
> OK for the target-independent part, thanks. IMO it's OK independently
> of the rs6000 tests.
The rs6000 parts are fine as well. Thanks!
I see you got rid of the ilp32 tests, I was going to holler about that,
there is no reason this should only work (or only be tested) on 64-bit
systems :-)
Segher
* Re: [PATCH] testsuite: Adjust possibly fragile slp-perm-9.c [PR104015]
2022-01-18 21:34 ` Segher Boessenkool
@ 2022-01-19 6:14 ` Kewen.Lin
0 siblings, 0 replies; 5+ messages in thread
From: Kewen.Lin @ 2022-01-19 6:14 UTC
To: Segher Boessenkool, richard.sandiford
Cc: David Edelsohn, Bill Schmidt, GCC Patches
on 2022/1/19 5:34 AM, Segher Boessenkool wrote:
> On Tue, Jan 18, 2022 at 11:57:32AM +0000, Richard Sandiford wrote:
>> "Kewen.Lin" <linkw@linux.ibm.com> writes:
>>>> PR tree-optimization/104015
>>>> * gcc.dg/vect/slp-perm-9.c: Adjust.
>>>> * gcc.target/powerpc/pr104015-1.c: New test.
>>>> * gcc.target/powerpc/pr104015-2.c: New test.
>
>> OK for the target-independent part, thanks. IMO it's OK independently
>> of the rs6000 tests.
>
> The rs6000 parts are fine as well. Thanks!
>
> I see you got rid of the ilp32 tests, I was going to holler about that,
> there is no reason this should only work (or only be tested) on 64-bit
> systems :-)
>
Thanks Richard and Segher, committed as r12-6717.
BR,
Kewen