* adjust vectorization expectations for ppc costmodel 76b
@ 2021-03-10 9:12 Alexandre Oliva
  2024-04-22 9:28 ` [PATCH] " Alexandre Oliva
  0 siblings, 1 reply; 7+ messages in thread
From: Alexandre Oliva @ 2021-03-10 9:12 UTC
To: gcc-patches; +Cc: Rainer Orth, Mike Stump, Segher Boessenkool, David Edelsohn

This test expects vectorization at power8+ because strict alignment is
not required for vectors. For power7, vectorization is not to take
place because it's not deemed profitable: 12 iterations would be
required to make it so.

But for power6 and below, the test's 10 iterations are enough to make
vectorization profitable, but the test doesn't expect this. Assuming
the decision is indeed appropriate, I'm adjusting the expectations.

This was regstrapped on x86_64-linux-gnu, tested with a cross to a
ppc64-vxworks7r2, and I'm now also regstrapping on ppc64-linux-gnu just
to be sure. Ok to install?


for gcc/testsuite/ChangeLog

    * gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c: Adjust
    expectations for cpus below power7.
---
 .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
index 5da4343198c10..937985012286c 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
@@ -45,9 +45,10 @@ int main (void)
   return 0;
 }
 
-/* Peeling to align the store is used. Overhead of peeling is too high. */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { vector_alignment_reachable && {! vect_no_align} } } } } */
-/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { vector_alignment_reachable && {! vect_hw_misalign} } } } } */
+/* Peeling to align the store is used. Overhead of peeling is too high
+   for power7, but acceptable for earlier architectures. */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_no_align} } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_hw_misalign} } } } } } */
 
 /* Versioning to align the store is used. Overhead of versioning is not too high. */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} || {! has_arch_pwr7 } } } } } } */

-- 
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar
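
[Editorial sketch, not part of the original message.] The two rewritten "vectorized 1 loops" selectors in the patch above appear to be exact logical complements, so exactly one of them applies for any combination of the three effective targets. A minimal C model of that, assuming each effective-target keyword can be treated as an independent boolean (the function names are hypothetical and exist only for this illustration):

```c
#include <assert.h>
#include <stdbool.h>

/* Selector on the '"vectorized 1 loops" 0' line of the patch:
   has_arch_pwr7 && { vector_alignment_reachable && {! vect_no_align} }  */
bool
expects_no_vectorization (bool has_arch_pwr7, bool alignment_reachable,
                          bool vect_no_align)
{
  return has_arch_pwr7 && (alignment_reachable && !vect_no_align);
}

/* Selector on the '"vectorized 1 loops" 1' line of the patch:
   vect_no_align || { {! vector_alignment_reachable} || {! has_arch_pwr7} }  */
bool
expects_vectorization (bool has_arch_pwr7, bool alignment_reachable,
                       bool vect_no_align)
{
  return vect_no_align || (!alignment_reachable || !has_arch_pwr7);
}

/* Walk all 8 truth assignments and assert that exactly one of the two
   selectors matches each time (De Morgan's law in practice).  */
void
check_selectors_are_complementary (void)
{
  for (int bits = 0; bits < 8; bits++)
    {
      bool pwr7 = (bits & 1) != 0;
      bool reachable = (bits & 2) != 0;
      bool no_align = (bits & 4) != 0;

      assert (expects_no_vectorization (pwr7, reachable, no_align)
              != expects_vectorization (pwr7, reachable, no_align));
    }
}
```

Under this model, a pre-power7 target with reachable alignment and no vect_no_align falls under the "vectorized 1 loops" 1 expectation, which is the behaviour the patch description asks for. The third directive ("vectorization not profitable") keys on vect_hw_misalign instead and is not modelled here.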
* [PATCH] adjust vectorization expectations for ppc costmodel 76b
  2021-03-10 9:12 adjust vectorization expectations for ppc costmodel 76b Alexandre Oliva
@ 2024-04-22 9:28 ` Alexandre Oliva
  2024-04-24 8:24   ` Kewen.Lin
  0 siblings, 1 reply; 7+ messages in thread
From: Alexandre Oliva @ 2024-04-22 9:28 UTC
To: gcc-patches
Cc: Rainer Orth, Mike Stump, David Edelsohn, Segher Boessenkool, Kewen Lin

Ping?
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566525.html


This test expects vectorization at power8+ because strict alignment is
not required for vectors. For power7, vectorization is not to take
place because it's not deemed profitable: 12 iterations would be
required to make it so.

But for power6 and below, the test's 10 iterations are enough to make
vectorization profitable, but the test doesn't expect this. Assuming
the decision is indeed appropriate, I'm adjusting the expectations.


for gcc/testsuite/ChangeLog

    * gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c: Adjust
    expectations for cpus below power7.
---
 .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
index cbbfbb24658f8..0dab2c08acdb4 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
@@ -46,9 +46,10 @@ int main (void)
   return 0;
 }
 
-/* Peeling to align the store is used. Overhead of peeling is too high. */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { vector_alignment_reachable && {! vect_no_align} } } } } */
-/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { vector_alignment_reachable && {! vect_hw_misalign} } } } } */
+/* Peeling to align the store is used. Overhead of peeling is too high
+   for power7, but acceptable for earlier architectures. */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_no_align} } } } } } */
+/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_hw_misalign} } } } } } */
 
 /* Versioning to align the store is used. Overhead of versioning is not too high. */
-/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || {! vector_alignment_reachable} } } } } */
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} || {! has_arch_pwr7 } } } } } } */

-- 
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive
* Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
  2024-04-22 9:28 ` [PATCH] " Alexandre Oliva
@ 2024-04-24 8:24   ` Kewen.Lin
  2024-04-28 8:14     ` Alexandre Oliva
  0 siblings, 1 reply; 7+ messages in thread
From: Kewen.Lin @ 2024-04-24 8:24 UTC
To: Alexandre Oliva
Cc: Rainer Orth, Mike Stump, David Edelsohn, Segher Boessenkool, Kewen Lin, gcc-patches

Hi,

on 2024/4/22 17:28, Alexandre Oliva wrote:
> Ping?
> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566525.html
>
>
> This test expects vectorization at power8+ because strict alignment is
> not required for vectors. For power7, vectorization is not to take
> place because it's not deemed profitable: 12 iterations would be
> required to make it so.
>
> But for power6 and below, the test's 10 iterations are enough to make
> vectorization profitable, but the test doesn't expect this. Assuming
> the decision is indeed appropriate, I'm adjusting the expectations.

For the record, the cost difference between power6 and power7 comes from
the cost of vec_perm:

  * p6 *

  ic[i_23] 2 times vector_stmt costs 2 in prologue
  ic[i_23] 1 times vector_stmt costs 1 in prologue
  ic[i_23] 1 times vector_load costs 2 in body
  ic[i_23] 1 times vec_perm costs 1 in body

vs.

  * p7 *

  ic[i_23] 2 times vector_stmt costs 2 in prologue
  ic[i_23] 1 times vector_stmt costs 1 in prologue
  ic[i_23] 1 times vector_load costs 2 in body
  ic[i_23] 1 times vec_perm costs 3 in body

which in turn changes the minimum number of iterations needed for
profitability.

>
>
> for gcc/testsuite/ChangeLog
>
> * gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c: Adjust
> expectations for cpus below power7.
> ---
> .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> index cbbfbb24658f8..0dab2c08acdb4 100644
> --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> @@ -46,9 +46,10 @@ int main (void)
>    return 0;
>  }
>
> -/* Peeling to align the store is used. Overhead of peeling is too high. */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { vector_alignment_reachable && {! vect_no_align} } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { vector_alignment_reachable && {! vect_hw_misalign} } } } } */
> +/* Peeling to align the store is used. Overhead of peeling is too high
> +   for power7, but acceptable for earlier architectures. */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_no_align} } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_hw_misalign} } } } } } */
>
>  /* Versioning to align the store is used. Overhead of versioning is not too high. */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || {! vector_alignment_reachable} } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} || {! has_arch_pwr7 } } } } } } */

For !has_arch_pwr7 case, it still adopts peeling but as the comment (one line above)
shows the original intention of this case is to expect not profitable for peeling
so it's not expected to be handled here, can we just tweak the loop bound instead,
such as:

-#define N 14
+#define N 13
 #define OFF 4

?, it can make this loop not profitable to be vectorized for !vect_no_align with
peeling (both pwr7 and pwr6) and keep consistent.

BR,
Kewen
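
[Editorial sketch, not part of the original message.] The arithmetic behind the suggested bound can be written out in a few lines of C. It assumes the test loop has the shape `for (i = OFF; i < N; i++)`, which matches the "10 iterations" quoted earlier in the thread for N = 14 and OFF = 4 but is not shown in the quoted hunks. The thresholds 10 (power6) and 12 (power7) are the minimum iteration counts reported in the thread; the helper and constant names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimum iteration counts for profitability, as reported in the thread.  */
enum
{
  MIN_PROFITABLE_ITERS_PWR6 = 10,
  MIN_PROFITABLE_ITERS_PWR7 = 12
};

/* Iterations of a loop assumed to run as: for (i = off; i < n; i++).  */
int
loop_iterations (int n, int off)
{
  assert (n >= off);
  return n - off;
}

/* Simplified stand-in for the cost-model decision: the loop counts as
   profitable once it reaches the minimum iteration count.  */
bool
deemed_profitable (int iterations, int min_profitable_iters)
{
  return iterations >= min_profitable_iters;
}
```

With N = 14 the loop runs 10 times, which reaches the power6 threshold but not the power7 one. With N = 13 it runs 9 times and stays below both, which is the consistency the proposal is after.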
* Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
  2024-04-24 8:24 ` Kewen.Lin
@ 2024-04-28 8:14   ` Alexandre Oliva
  2024-04-28 9:31     ` Kewen.Lin
  0 siblings, 1 reply; 7+ messages in thread
From: Alexandre Oliva @ 2024-04-28 8:14 UTC
To: Kewen.Lin
Cc: Rainer Orth, Mike Stump, David Edelsohn, Segher Boessenkool, Kewen Lin, gcc-patches

On Apr 24, 2024, "Kewen.Lin" <linkw@linux.ibm.com> wrote:

> For !has_arch_pwr7 case, it still adopts peeling but as the comment (one line above)
> shows the original intention of this case is to expect not profitable for peeling
> so it's not expected to be handled here, can we just tweak the loop bound instead,
> such as:

> -#define N 14
> +#define N 13
>  #define OFF 4

> ?, it can make this loop not profitable to be vectorized for !vect_no_align with
> peeling (both pwr7 and pwr6) and keep consistent.

Like this? I didn't feel I could claim authorship of this one-liner
just because I turned it into a patch and tested it, so I took the
liberty of turning your own words above into the commit message. So
far, tested on ppc64le-linux-gnu (ppc9). Testing with vxworks targets
now. Would you like to tweak the commit message to your liking?
Otherwise, is this ok to install?

Thanks,


adjust iteration count for ppc costmodel 76b

From: Kewen Lin <linkw@linux.ibm.com>

The original intention of this case is to expect not profitable for
peeling. Tweak the loop bound to make this loop not profitable to be
vectorized for !vect_no_align with peeling (both pwr7 and pwr6) and
keep consistent.


for gcc/testsuite/ChangeLog

    * gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c (N): Tweak.
---
 .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
index cbbfbb24658f8..e48b0ab759e75 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
@@ -6,7 +6,7 @@
 
 /* On Power7 without misalign vector support, this case is to check it's not
    profitable to perform vectorization by peeling to align the store. */
-#define N 14
+#define N 13
 #define OFF 4
 
 /* Check handling of accesses for which the "initial condition" -

-- 
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive
* Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
  2024-04-28 8:14 ` Alexandre Oliva
@ 2024-04-28 9:31   ` Kewen.Lin
  2024-04-29 6:28     ` Alexandre Oliva
  0 siblings, 1 reply; 7+ messages in thread
From: Kewen.Lin @ 2024-04-28 9:31 UTC
To: Alexandre Oliva
Cc: Rainer Orth, Mike Stump, David Edelsohn, Segher Boessenkool, Kewen Lin, gcc-patches

Hi,

on 2024/4/28 16:14, Alexandre Oliva wrote:
> On Apr 24, 2024, "Kewen.Lin" <linkw@linux.ibm.com> wrote:
>
>> For !has_arch_pwr7 case, it still adopts peeling but as the comment (one line above)
>> shows the original intention of this case is to expect not profitable for peeling
>> so it's not expected to be handled here, can we just tweak the loop bound instead,
>> such as:
>
>> -#define N 14
>> +#define N 13
>>  #define OFF 4
>
>> ?, it can make this loop not profitable to be vectorized for !vect_no_align with
>> peeling (both pwr7 and pwr6) and keep consistent.
>
> Like this? I didn't feel I could claim authorship of this one-liner
> just because I turned it into a patch and tested it, so I took the
> liberty of turning your own words above into the commit message. So

Feel free to do so!

> far, tested on ppc64le-linux-gnu (ppc9). Testing with vxworks targets
> now. Would you like to tweak the commit message to your liking?

OK, tweaked as below.

> Otherwise, is this ok to install?
>
> Thanks,
>
>
> adjust iteration count for ppc costmodel 76b

Nit: Maybe add a prefix "testsuite: ".

>
> From: Kewen Lin <linkw@linux.ibm.com>

Thanks, you can just drop this. :)

>
> The original intention of this case is to expect not profitable for
> peeling. Tweak the loop bound to make this loop not profitable to be
> vectorized for !vect_no_align with peeling (both pwr7 and pwr6) and
> keep consistent.

For some hardware which doesn't support unaligned vector memory
access, test case costmodel-vect-76b.c expects the cost model to decide
that peeling is not profitable, according to the commit history, the
test case comments and the way the result is checked. For now, the
existing loop bound 14 works well for Power7, but it does not for
targets on which the cost of the vec_perm operation differs from
Power7, such as Power6 (3 on Power7 vs. 1 on Power6). This difference
in turn changes the minimum iteration count for profitability (12 on
Power7 vs. 10 on Power6) and causes the failure. To keep the original
test point, this patch tweaks the loop bound to ensure it's not
profitable to vectorize for !vect_no_align with peeling.

OK for trunk (assuming the testing goes well on p6/p7 too), thanks!

BR,
Kewen

>
>
> for gcc/testsuite/ChangeLog
>
> * gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c (N): Tweak.
> ---
> .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> index cbbfbb24658f8..e48b0ab759e75 100644
> --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> @@ -6,7 +6,7 @@
>
>  /* On Power7 without misalign vector support, this case is to check it's not
>     profitable to perform vectorization by peeling to align the store. */
> -#define N 14
> +#define N 13
>  #define OFF 4
>
>  /* Check handling of accesses for which the "initial condition" -
* Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
  2024-04-28 9:31 ` Kewen.Lin
@ 2024-04-29 6:28   ` Alexandre Oliva
  2024-04-29 8:56     ` Kewen.Lin
  0 siblings, 1 reply; 7+ messages in thread
From: Alexandre Oliva @ 2024-04-29 6:28 UTC
To: Kewen.Lin
Cc: Rainer Orth, Mike Stump, David Edelsohn, Segher Boessenkool, Kewen Lin, gcc-patches

On Apr 28, 2024, "Kewen.Lin" <linkw@linux.ibm.com> wrote:

> Nit: Maybe add a prefix "testsuite: ".

ACK

>>
>> From: Kewen Lin <linkw@linux.ibm.com>

> Thanks, you can just drop this. :)

I've turned it into Co-Authored-By, since you insist.

But unfortunately with the patch it still fails when testing for
-mcpu=power7 on ppc64le-linux-gnu: it does vectorize the loop with 13
iterations. We need 16 iterations, as in an earlier version of this
test, for it to pass for -mcpu=power7, but then it doesn't pass for
-mcpu=power6.

It looks like we're going to have to adjust the expectations.

-- 
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive
* Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
  2024-04-29 6:28 ` Alexandre Oliva
@ 2024-04-29 8:56   ` Kewen.Lin
  0 siblings, 0 replies; 7+ messages in thread
From: Kewen.Lin @ 2024-04-29 8:56 UTC
To: Alexandre Oliva
Cc: Rainer Orth, Mike Stump, David Edelsohn, Segher Boessenkool, Kewen Lin, gcc-patches

on 2024/4/29 14:28, Alexandre Oliva wrote:
> On Apr 28, 2024, "Kewen.Lin" <linkw@linux.ibm.com> wrote:
>
>> Nit: Maybe add a prefix "testsuite: ".
>
> ACK
>
>>>
>>> From: Kewen Lin <linkw@linux.ibm.com>
>
>> Thanks, you can just drop this. :)
>
> I've turned it into Co-Authored-By, since you insist.
>
> But unfortunately with the patch it still fails when testing for
> -mcpu=power7 on ppc64le-linux-gnu: it does vectorize the loop with 13
> iterations. We need 16 iterations, as in an earlier version of this
> test, for it to pass for -mcpu=power7, but then it doesn't pass for
> -mcpu=power6.
>
> It looks like we're going to have to adjust the expectations.
>

I had a look at the failure: it is caused by "vect_no_align"
unexpectedly evaluating to true.

  "selector_expression: ` vect_no_align || {! vector_alignment_reachable} ' 1"

Currently the powerpc* case checks check_p8vector_hw_available.
ppc64le-linux-gnu has at least Power8 support (that is, the testing
machine can run p8vector code), so it concludes that vect_no_align is
true.

proc check_effective_target_vect_no_align { } {
    return [check_cached_effective_target_indexed vect_no_align {
      expr { [istarget mipsisa64*-*-*]
             || [istarget mips-sde-elf]
             || [istarget sparc*-*-*]
             || [istarget ia64-*-*]
             || [check_effective_target_arm_vect_no_misalign]
             || ([istarget powerpc*-*-*] && [check_p8vector_hw_available])

I'll fix this in PR113535, which was filed previously to revisit the
powerpc-specific checks in these vect* effective targets.

If the testing just goes with the native cpu type, this issue becomes
invisible. I think you can still push the patch, as the testing just
exposes another issue.

BR,
Kewen
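
[Editorial sketch, not part of the original message.] The interaction described above can be modelled in C under two simplifying assumptions: only the powerpc clause of the quoted Tcl proc is considered, and the effective targets are treated as plain booleans. The function names are hypothetical. The point it illustrates is that the -mcpu value under test is not an input to the check at all, only the capability of the machine running the tests.

```c
#include <assert.h>
#include <stdbool.h>

/* Powerpc clause of the quoted proc:
   [istarget powerpc*-*-*] && [check_p8vector_hw_available].
   The target is assumed to be powerpc, so only the hardware probe remains.  */
bool
vect_no_align_on_powerpc (bool p8vector_hw_available)
{
  return p8vector_hw_available;
}

/* Selector from the failure log:
   vect_no_align || {! vector_alignment_reachable}  */
bool
versioning_selector (bool vect_no_align, bool vector_alignment_reachable)
{
  return vect_no_align || !vector_alignment_reachable;
}

/* On a Power8-or-later test machine the selector matches whether or not
   alignment is reachable, so "vectorized 1 loops" is expected once even
   when the test is compiled with -mcpu=power7.  */
void
check_p8_host_always_selects_versioning_line (void)
{
  bool no_align = vect_no_align_on_powerpc (true);

  assert (versioning_selector (no_align, true));
  assert (versioning_selector (no_align, false));
}
```

On a host without p8vector support the same selector depends on alignment reachability again, which fits the remark that the issue becomes invisible when testing with the native cpu type.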