public inbox for gcc-bugs@sourceware.org
From: "hliu at amperecomputing dot com" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/110474] New: Vect: the epilog vect loop should have small VF if the loop is unrolled during vectorization
Date: Thu, 29 Jun 2023 04:35:23 +0000
Message-ID: <bug-110474-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110474

            Bug ID: 110474
           Summary: Vect: the epilog vect loop should have small VF if the
                    loop is unrolled during vectorization
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hliu at amperecomputing dot com
  Target Milestone: ---

Hi, I'm trying to tune loop unrolling during vectorization (see
suggested_unroll_factor in tree-vect-loop.cc). I find that unrolling may hurt
performance, because it also increases the VF (vectorization factor) of the
epilog vect loop.

For example:
int foo(short *A, char *B, int N) {
    int sum = 0;
    for (int i = 0; i < N; ++i) {
        sum += A[i] * B[i];
    }
    return sum;
}


Compile it with "-O3 -mtune=neoverse-n2 -mcpu=neoverse-n1 --param
aarch64-vect-unroll-limit=2" (I'm using -mcpu n1 as I want to try a target
without SVE). The GCC vectorization pass unrolls the loop by 2 and generates
code like the following:

if N >= 32:
    main vect loop ...

if N >= 16:   # This may hurt performance if N is small (e.g. 8)
    epilog vect loop ...

epilog scalar code ...


If the loop is not unrolled (i.e. with "--param aarch64-vect-unroll-limit=1"),
GCC generates code like the following:

if N >= 16:
    main vect loop ...

if N >= 8:
    epilog vect loop ...

epilog scalar code ...
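
As a rough illustration of why the larger epilog VF matters for small N, the
sketch below models how N scalar iterations are split between the three
regions. It is a simplified model, not GCC code: the names split_iterations
and struct split are hypothetical, and it assumes each vector loop simply
consumes as many full VF-sized chunks as remain, ignoring any cost-model
thresholds.

```c
#include <assert.h>

/* Hypothetical model (not GCC code): number of scalar iterations handled
   by the main vect loop, the epilog vect loop and the scalar epilog.  */
struct split {
    int main_iters;
    int epilog_iters;
    int scalar_iters;
};

static struct split split_iterations(int n, int main_vf, int epilog_vf)
{
    assert(n >= 0 && main_vf > 0 && epilog_vf > 0);

    struct split s;
    /* The main vect loop takes every full chunk of main_vf iterations.  */
    s.main_iters = (n / main_vf) * main_vf;
    int rem = n - s.main_iters;
    /* The epilog vect loop takes full chunks of epilog_vf from the rest.  */
    s.epilog_iters = (rem / epilog_vf) * epilog_vf;
    /* Whatever is left runs in the scalar epilog.  */
    s.scalar_iters = rem - s.epilog_iters;
    return s;
}
```

Under this model, split_iterations(8, 32, 16) (the unrolled case) leaves all
8 iterations to the scalar code, while split_iterations(8, 16, 8) (the
non-unrolled case) runs all 8 in the epilog vect loop.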


The runtime check is based on the VF of the epilog vectorization. There is code
in tree-vect-loop.cc (line 2990) to choose the epilog vect VF:
  /* If we're vectorizing an epilogue loop, the vectorized loop either needs
     to be able to handle fewer than VF scalars, or needs to have a lower VF
     than the main loop.  */
  if (LOOP_VINFO_EPILOGUE_P (loop_vinfo)
      && !LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)
      && maybe_ge (LOOP_VINFO_VECT_FACTOR (loop_vinfo),
                   LOOP_VINFO_VECT_FACTOR (orig_loop_vinfo)))
    return opt_result::failure_at (vect_location,
                                   "Vectorization factor too high for"
                                   " epilogue loop.\n");

But it doesn't take the suggested_unroll_factor into account. So I'm thinking
about adding the following code to unscale the orig_loop_vinfo's VF by the
unroll factor:
      unscaled_orig_vf = exact_div (LOOP_VINFO_VECT_FACTOR (orig_loop_vinfo),
                                    orig_loop_vinfo->suggested_unroll_factor);

Is this reasonable?
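
To make the effect of the proposal concrete, here is a minimal sketch of the
check before and after unscaling. It is only a model: the function names are
hypothetical, VFs are plain ints rather than GCC's poly_uint64 (so maybe_ge
becomes >=), and exact_div is modelled as an integer division that is
asserted to be exact.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the existing check: an epilog VF is accepted if
   partial vectors can be used, or if it is lower than the main loop VF.  */
static bool epilogue_vf_ok_current(int epilog_vf, int orig_vf,
                                   bool can_use_partial_vectors)
{
    return can_use_partial_vectors || epilog_vf < orig_vf;
}

/* Hypothetical model of the proposed check: compare against the main
   loop VF with the unroll factor divided back out.  */
static bool epilogue_vf_ok_proposed(int epilog_vf, int orig_vf,
                                    int unroll_factor,
                                    bool can_use_partial_vectors)
{
    /* Stand-in for exact_div: the division must leave no remainder.  */
    assert(unroll_factor > 0 && orig_vf % unroll_factor == 0);
    int unscaled_orig_vf = orig_vf / unroll_factor;
    return can_use_partial_vectors || epilog_vf < unscaled_orig_vf;
}
```

With the numbers from the example (main VF 32 after unrolling by 2), the
current check accepts an epilog VF of 16, while the proposed check rejects 16
and accepts 8, which matches the thresholds of the non-unrolled case.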

