public inbox for gcc-patches@gcc.gnu.org
From: Tamar Christina <tamar.christina@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: nd@arm.com, rguenther@suse.de, jlaw@ventanamicro.com
Subject: [PATCH]middle-end vect: adjust loop upper bounds when peeling for gaps and early break [PR114403]
Date: Thu, 4 Apr 2024 17:15:58 +0100
Message-ID: <patch-18385-tamar@arm.com>


Hi All,

The report shows that we end up in a situation where the code has been peeled
for gaps and we have an early break.

The code for peeling for gaps assumes that a scalar loop needs to perform at
least one iteration.  However, this doesn't take into account early break, where
the scalar loop may not need to be executed.

This scenario does not account for the fact that the early break loop can be
partial.  Loop partiality is normally handled by setting bias_for_lowest to 1,
but when peeling for gaps we end up with 0, which means that when the loop upper
bounds are calculated, a partial loop loses its final partial iteration:

Analyzing # of iterations of loop 1
  exit condition [8, + , 18446744073709551615] != 0
  bounds on difference of bases: -8 ... -8
  result:
    # of iterations 8, bounded by 8

and with VF=4 the calculation gives:

Loop 1 iterates at most 1 times.
Loop 1 likely iterates at most 1 times.
Analyzing # of iterations of loop 1
  exit condition [1, + , 1](no_overflow) < bnd.5505_39
  bounds on difference of bases: 0 ... 4611686018427387902
Matching expression match.pd:2011, generic-match-8.cc:27
Applying pattern match.pd:2067, generic-match-1.cc:4813
  result:
    # of iterations bnd.5505_39 + 18446744073709551615, bounded by 4611686018427387902
Estimating sizes for loop 1
...
   Induction variable computation will be folded away.
  size:   2 if (ivtmp_312 < bnd.5505_39)
   Exit condition will be eliminated in last copy.
size: 24-3, last_iteration: 24-5
  Loop size: 24
  Estimated size after unrolling: 26
;; Guessed iterations of loop 1 is 0.858446. New upper bound 1.

The upper bound should be 2, not 1.

This patch forces bias_for_lowest to be 1 even when peeling for gaps, by making
min_epilogue_iters 0 when the loop has early breaks.
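
The arithmetic behind the 1-versus-2 discrepancy can be sketched as follows.
This is a minimal illustration, not the code in tree-vect-loop.cc: it assumes
that bias_for_lowest is computed as 1 - min_epilogue_iters and that the vector
bound is a ceiling division by VF when the final iteration may be partial (a
floor division otherwise), minus one to get back to a latch count.  The helper
names udiv_ceil and vector_latch_bound are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: ceiling division for unsigned values.  */
static uint64_t
udiv_ceil (uint64_t a, uint64_t b)
{
  assert (b != 0);
  return a / b + (a % b != 0);
}

/* Sketch of turning a scalar latch-count bound into a vector latch-count
   bound.  ASSUMPTION: bias_for_lowest = 1 - min_epilogue_iters, and the
   rounding mode depends on whether the final iteration may be partial.  */
static uint64_t
vector_latch_bound (uint64_t scalar_latch_bound, int min_epilogue_iters,
                    uint64_t vf, bool final_iter_may_be_partial)
{
  assert (vf != 0);
  assert (min_epilogue_iters == 0 || min_epilogue_iters == 1);

  /* +1 converts a latch count to an iteration count; -min_epilogue_iters
     removes iterations reserved for the scalar epilogue.  */
  uint64_t bias_for_lowest = (uint64_t) (1 - min_epilogue_iters);
  uint64_t iters = scalar_latch_bound + bias_for_lowest;

  uint64_t vec_iters = final_iter_may_be_partial
                         ? udiv_ceil (iters, vf)
                         : iters / vf;

  /* This sketch does not model a vector loop with zero iterations.  */
  assert (vec_iters != 0);

  /* -1 converts the iteration count back to a latch count.  */
  return vec_iters - 1;
}
```

Under these assumptions, a scalar bound of 8 with VF=4 and a partial final
iteration gives 1 when min_epilogue_iters is 1 (bias 0) and 2 when
min_epilogue_iters is 0 (bias 1), matching the numbers above.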

However, I have not been able to write a standalone reproducer for this, so I
have no tests, but bootstrap and the LLVM build are fine now.

The testcase:

#define COUNT 9
#define SIZE (COUNT * 4)
#define TYPE unsigned long

TYPE x[SIZE], y[SIZE];

void __attribute__((noipa))
loop (TYPE val)
{
  for (int i = 0; i < COUNT; ++i)
    {
      if (x[i * 4] > val || x[i * 4 + 1] > val)
        return;
      x[i * 4] = y[i * 2] + 1;
      x[i * 4 + 1] = y[i * 2] + 2;
      x[i * 4 + 2] = y[i * 2 + 1] + 3;
      x[i * 4 + 3] = y[i * 2 + 1] + 4;
    }
}

does perform the peeling for gaps and early break; however, it creates a hybrid
loop which works fine.  Adjusting the indices to be non-linear also works.  So
I'd like to submit the fix and work on a testcase separately if needed.
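
If a runtime check around the testcase above is wanted later, a harness along
the following lines might be a starting point.  It is only a sketch: the helper
check_case, the value 100 for val and the initial contents of x and y are
assumptions made for illustration, and since the testcase itself does not
reproduce the failure, this harness is not expected to either.

```c
#include <assert.h>

#define COUNT 9
#define SIZE (COUNT * 4)
#define TYPE unsigned long

TYPE x[SIZE], y[SIZE];

/* The loop from the testcase above, unchanged.  */
void __attribute__((noipa))
loop (TYPE val)
{
  for (int i = 0; i < COUNT; ++i)
    {
      if (x[i * 4] > val || x[i * 4 + 1] > val)
        return;
      x[i * 4] = y[i * 2] + 1;
      x[i * 4 + 1] = y[i * 2] + 2;
      x[i * 4 + 2] = y[i * 2 + 1] + 3;
      x[i * 4 + 3] = y[i * 2 + 1] + 4;
    }
}

/* Hypothetical helper, not part of the patch: reset the arrays, arrange
   for the early break to trigger at iteration BREAK_AT (COUNT means the
   loop never breaks), run the loop and return the number of elements of
   x that differ from the scalar expectation.  */
int
check_case (int break_at)
{
  const TYPE val = 100;
  assert (break_at >= 0 && break_at <= COUNT);

  for (int i = 0; i < SIZE; ++i)
    {
      x[i] = 0;
      y[i] = (TYPE) i;
    }
  if (break_at < COUNT)
    x[break_at * 4] = val + 1;  /* Forces the early return.  */

  loop (val);

  int mismatches = 0;
  for (int i = 0; i < COUNT; ++i)
    {
      /* Iterations at or after the break must leave x untouched.  */
      TYPE expect[4] = { 0, 0, 0, 0 };
      if (i < break_at)
        {
          expect[0] = y[i * 2] + 1;
          expect[1] = y[i * 2] + 2;
          expect[2] = y[i * 2 + 1] + 3;
          expect[3] = y[i * 2 + 1] + 4;
        }
      else if (i == break_at)
        expect[0] = val + 1;

      for (int j = 0; j < 4; ++j)
        mismatches += (x[i * 4 + j] != expect[j]);
    }
  return mismatches;
}
```

Calling check_case for every break position from 0 to COUNT inclusive and
requiring a result of 0 each time exercises both the early exit and the
full-length path.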

Bootstrapped and regtested on x86_64-pc-linux-gnu with no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	PR tree-optimization/114403
	* tree-vect-loop.cc (vect_transform_loop): Adjust upper bounds for when
	peeling for gaps and early break.

---
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 4375ebdcb493a90fd0501cbb4b07466077b525c3..bf1bb9b005c68fbb13ee1b1279424865b237245a 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -12139,7 +12139,8 @@ vect_transform_loop (loop_vec_info loop_vinfo, gimple *loop_vectorized_call)
   /* The minimum number of iterations performed by the epilogue.  This
      is 1 when peeling for gaps because we always need a final scalar
      iteration.  */
-  int min_epilogue_iters = LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo) ? 1 : 0;
+  int min_epilogue_iters = LOOP_VINFO_PEELING_FOR_GAPS (loop_vinfo)
+			   && !LOOP_VINFO_EARLY_BREAKS (loop_vinfo) ? 1 : 0;
   /* +1 to convert latch counts to loop iteration counts,
      -min_epilogue_iters to remove iterations that cannot be performed
        by the vector code.  */



