Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b

From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Alexandre Oliva <oliva@adacore.com>
Cc: Rainer Orth <ro@CeBiTec.Uni-Bielefeld.DE>,
	Mike Stump <mikestump@comcast.net>,
	David Edelsohn <dje.gcc@gmail.com>,
	Segher Boessenkool <segher@kernel.crashing.org>,
	Kewen Lin <linkw@gcc.gnu.org>,
	gcc-patches@gcc.gnu.org
Subject: Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b
Date: Wed, 24 Apr 2024 16:24:23 +0800
Message-ID: <d041e080-3d58-9d35-d5f7-88415d647456@linux.ibm.com>
In-Reply-To: <oredaxoif1.fsf@lxoliva.fsfla.org>

Hi,

on 2024/4/22 17:28, Alexandre Oliva wrote:
> Ping?
> https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566525.html
> 
> 
> This test expects vectorization at power8+ because strict alignment is
> not required for vectors.  For power7, vectorization is not to take
> place because it's not deemed profitable: 12 iterations would be
> required to make it so.
> 
> But for power6 and below, the test's 10 iterations are enough to make
> vectorization profitable, but the test doesn't expect this.  Assuming
> the decision is indeed appropriate, I'm adjusting the expectations.

For the record, the cost difference between power6 and power7 comes from
the cost of vec_perm:

* p6 *

ic[i_23] 2 times vector_stmt costs 2 in prologue
ic[i_23] 1 times vector_stmt costs 1 in prologue
ic[i_23] 1 times vector_load costs 2 in body
ic[i_23] 1 times vec_perm costs 1 in body

vs.

* p7 *

ic[i_23] 2 times vector_stmt costs 2 in prologue
ic[i_23] 1 times vector_stmt costs 1 in prologue
ic[i_23] 1 times vector_load costs 2 in body
ic[i_23] 1 times vec_perm costs 3 in body

which in turn causes the difference in the minimum number of iterations
required for profitability.

> 
> 
> for  gcc/testsuite/ChangeLog
> 
> 	* gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c: Adjust
> 	expectations for cpus below power7.
> ---
>  .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c |    9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> index cbbfbb24658f8..0dab2c08acdb4 100644
> --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> @@ -46,9 +46,10 @@ int main (void)
>    return 0;
>  }
>  
> -/* Peeling to align the store is used. Overhead of peeling is too high.  */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { vector_alignment_reachable && {! vect_no_align} } } } } */
> -/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { vector_alignment_reachable && {! vect_hw_misalign} } } } } */
> +/* Peeling to align the store is used. Overhead of peeling is too high
> +   for power7, but acceptable for earlier architectures.  */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 0 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_no_align} } } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorization not profitable" 1 "vect" { target { has_arch_pwr7 && { vector_alignment_reachable && {! vect_hw_misalign} } } } } } */
>  
>  /* Versioning to align the store is used. Overhead of versioning is not too high.  */
> -/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || {! vector_alignment_reachable} } } } } */
> +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target { vect_no_align || { {! vector_alignment_reachable} || {! has_arch_pwr7 } } } } } } */

For the !has_arch_pwr7 case, it still adopts peeling, but as the comment
(one line above) shows, the original intention of this test case is to expect
peeling to be unprofitable, so that case isn't expected to be handled here.
Can we just tweak the loop bound instead, such as:

-#define N 14
+#define N 13
 #define OFF 4 

? That would make this loop unprofitable to vectorize for !vect_no_align with
peeling (on both pwr7 and pwr6) and keep the expectations consistent.

BR,
Kewen

> 
>