public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/112774] New: Vectorize the loop by inferring nonwrapping information from arrays
@ 2023-11-30  9:01 hliu at amperecomputing dot com
  2023-11-30 12:27 ` [Bug tree-optimization/112774] " rguenth at gcc dot gnu.org
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: hliu at amperecomputing dot com @ 2023-11-30  9:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112774

            Bug ID: 112774
           Summary: Vectorize the loop by inferring nonwrapping
                    information from arrays
           Product: gcc
           Version: 14.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: hliu at amperecomputing dot com
  Target Milestone: ---

This case is extracted from another benchmark and is simpler than the case in
PR101450, as it has the additional boundary information from the array:

    int A[1024 * 2];

    int foo (unsigned offset, unsigned N) 
    {
      int sum = 0;

      for (unsigned i = 0; i < N; i++)
        sum += A[i + offset];

      return sum;
    }

The Gimple before the vectorization pass is:

    <bb 3> [local count: 955630224]:
    # sum_12 = PHI <sum_9(6), 0(5)>
    # i_14 = PHI <i_10(6), 0(5)>
    _1 = offset_8(D) + i_14;
    _2 = A[_1];
    sum_9 = _2 + sum_12;
    i_10 = i_14 + 1;

GCC fails to vectorize it because the chrec "{offset_8, +, 1}_1" may
overflow/wrap. I summarized more details in this email:
https://gcc.gnu.org/pipermail/gcc/2023-November/242854.html
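
To make the wrapping concern concrete, here is a minimal sketch in plain C. The helper names index_at and index_wraps_p are hypothetical and exist only for illustration; they are not part of GCC. Because `offset` and `i` are both unsigned, `i + offset` is computed modulo UINT_MAX + 1, so without extra information the index sequence could jump from UINT_MAX back to 0 instead of advancing by one element each iteration.

```c
#include <limits.h>

/* Index the loop computes in iteration i.  Unsigned arithmetic is
   reduced modulo UINT_MAX + 1, so this can wrap around to 0.  */
unsigned index_at (unsigned offset, unsigned i)
{
  return offset + i;
}

/* Hypothetical helper: returns 1 if offset + i wraps for some i in
   [0, n), else 0.  The sequence wraps exactly when its last element,
   offset + (n - 1), does not fit in an unsigned.  */
int index_wraps_p (unsigned offset, unsigned n)
{
  return n != 0 && n - 1 > UINT_MAX - offset;
}
```

For example, index_at (UINT_MAX, 1) yields 0, which is the kind of discontinuity that prevents the access from being treated as a simple consecutive one.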

Actually, GCC already knows the chrec won't wrap, because it infers the range
from the array (in estimate_numbers_of_iterations ->
infer_loop_bounds_from_undefined -> infer_loop_bounds_from_array):

    Induction variable (unsigned int) offset_8(D) + 1 * iteration does not wrap in statement _2 = A[_1];
     in loop 1.
    Statement _2 = A[_1];
     is executed at most 2047 (bounded by 2047) + 1 times in loop 1.

We can re-use this information to vectorize this case. I already have a
simple patch that does this, and will send it out later (after more
testing).
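
As a rough illustration of why the array bound helps, the sketch below restates the loop next to a hand-rewritten variant. The names A_LEN, foo_ptr and max_access_count are assumptions made for this example and do not appear in the patch. The reasoning is my reading of the dump above: the largest valid index of A is 2047 and the index advances by 1 per iteration, so the access can execute at most 2047 + 1 times and the index cannot wrap in any execution without undefined behavior. Under that condition `A[i + offset]` is the same element as `(A + offset)[i]`, which is a plain consecutive access.

```c
#define A_LEN (1024 * 2)

int A[A_LEN];

/* The loop from the report.  */
int foo (unsigned offset, unsigned N)
{
  int sum = 0;

  for (unsigned i = 0; i < N; i++)
    sum += A[i + offset];

  return sum;
}

/* Hypothetical hand-rewritten variant.  Only valid when
   offset + N <= A_LEN, i.e. when every access is in bounds and the
   index therefore never wraps.  */
int foo_ptr (unsigned offset, unsigned N)
{
  const int *p = A + offset;
  int sum = 0;

  for (unsigned i = 0; i < N; i++)
    sum += p[i];

  return sum;
}

/* Upper bound on how often the access can execute: the highest valid
   index (A_LEN - 1 = 2047) plus one for the first execution.  */
unsigned max_access_count (void)
{
  return (A_LEN - 1) + 1;
}
```

For in-bounds arguments such as foo (10, 3) and foo_ptr (10, 3), both functions read A[10], A[11] and A[12] and return the same sum.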


* [Bug tree-optimization/112774] Vectorize the loop by inferring nonwrapping information from arrays
From: rguenth at gcc dot gnu.org @ 2023-11-30 12:27 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112774

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-11-30
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Richard Biener <rguenth at gcc dot gnu.org> ---
Confirmed, though the fix should be to SCEV.


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/112774] Vectorize the loop by inferring nonwrapping information from arrays
From: cvs-commit at gcc dot gnu.org @ 2023-12-08  3:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112774

--- Comment #2 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Hao Liu <hliu@gcc.gnu.org>:

https://gcc.gnu.org/g:2efe3a7de0107618397264017fb045f237764cc7

commit r14-6299-g2efe3a7de0107618397264017fb045f237764cc7
Author: Hao Liu <hliu@os.amperecomputing.com>
Date:   Wed Dec 6 14:52:19 2023 +0800

    tree-optimization/112774: extend the SCEV CHREC tree with a nonwrapping flag

    The flag is defined as CHREC_NOWRAP(tree), and will be dumped from
    "{offset, +, 1}_1" to "{offset, +, 1}<nw>_1" (nw is short for nonwrapping).
    Two SCEV interfaces record_nonwrapping_chrec and nonwrapping_chrec_p are
    added to set and check the flag respectively.

    As resetting the SCEV cache (i.e., the chrec trees) may not reset the
    loop->estimate_state, free_numbers_of_iterations_estimates is called
    explicitly in loop vectorization to make sure the flag can be
    calculated appropriately by niter.

    gcc/ChangeLog:

            PR tree-optimization/112774
            * tree-pretty-print.cc: if nonwrapping flag is set, chrec will be
            printed with additional <nw> info.
            * tree-scalar-evolution.cc: add record_nonwrapping_chrec and
            nonwrapping_chrec_p to set and check the new flag respectively.
            * tree-scalar-evolution.h: Likewise.
            * tree-ssa-loop-niter.cc (idx_infer_loop_bounds,
            infer_loop_bounds_from_pointer_arith,
            infer_loop_bounds_from_signedness,
            scev_probably_wraps_p): call record_nonwrapping_chrec before
            record_nonwrapping_iv, call nonwrapping_chrec_p to check the flag
            is set and return false from scev_probably_wraps_p.
            * tree-vect-loop.cc (vect_analyze_loop): call
            free_numbers_of_iterations_estimates explicitly.
            * tree-core.h: document the nothrow_flag usage in CHREC_NOWRAP
            * tree.h: add CHREC_NOWRAP(NODE), base.nothrow_flag is used to
            represent the nonwrapping info.

    gcc/testsuite/ChangeLog:

            * gcc.dg/tree-ssa/scev-16.c: New test.
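
The set/check/dump scheme described in this commit message can be pictured with a small toy model in C. This is only a sketch under loud assumptions: struct toy_chrec and all toy_* functions are hypothetical names invented here, and the struct is not GCC's tree representation (the real flag is CHREC_NOWRAP, stored in base.nothrow_flag). It shows one boolean carried on the chrec, a setter and a getter, a conservative wrap query that answers "may wrap" unless the flag is set, and a printer that adds "<nw>" when it is.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy stand-in for a chrec {base, +, step}_loop.  Hypothetical; not
   GCC's data structure.  */
struct toy_chrec
{
  const char *base;    /* symbolic start value, e.g. "offset" */
  int step;            /* increment per iteration */
  unsigned loop_num;   /* loop the evolution belongs to */
  bool nowrap;         /* plays the role of the nonwrapping flag */
};

/* Set the flag (analogous in spirit to recording a nonwrapping chrec).  */
void toy_record_nonwrapping (struct toy_chrec *c)
{
  c->nowrap = true;
}

/* Check the flag.  */
bool toy_nonwrapping_p (const struct toy_chrec *c)
{
  return c->nowrap;
}

/* Conservative query: assume the chrec may wrap unless the flag says
   otherwise.  */
bool toy_probably_wraps_p (const struct toy_chrec *c)
{
  return !toy_nonwrapping_p (c);
}

/* Format the chrec, appending "<nw>" after the closing brace when the
   flag is set.  Returns the value returned by snprintf.  */
int toy_print_chrec (char *buf, size_t len, const struct toy_chrec *c)
{
  return snprintf (buf, len, "{%s, +, %d}%s_%u", c->base, c->step,
                   c->nowrap ? "<nw>" : "", c->loop_num);
}
```

With base "offset", step 1 and loop 1, the printer produces "{offset, +, 1}_1" before the flag is set and "{offset, +, 1}<nw>_1" afterwards, matching the two dump forms quoted in the commit message.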


* [Bug tree-optimization/112774] Vectorize the loop by inferring nonwrapping information from arrays
From: cvs-commit at gcc dot gnu.org @ 2024-01-18  7:52 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112774

--- Comment #3 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Jakub Jelinek <jakub@gcc.gnu.org>:

https://gcc.gnu.org/g:484f48f03cf9a382b3bcf4dadac09c4ee59c2ddf

commit r14-8210-g484f48f03cf9a382b3bcf4dadac09c4ee59c2ddf
Author: Jakub Jelinek <jakub@redhat.com>
Date:   Thu Jan 18 08:51:53 2024 +0100

    testsuite: Fix up scev-16.c test [PR113446]

    This test FAILs on i686-linux or e.g. sparc*-solaris*, because
    it uses vect_int effective target outside of */vect/ testsuite.
    That is wrong, vect_int assumes the extra added flags by vect.exp
    by default, which aren't added in other testsuites.

    The following patch fixes that by moving the test into gcc.dg/vect/
    and doing small tweaks.

    2024-01-18  Jakub Jelinek  <jakub@redhat.com>

            PR tree-optimization/112774
            PR testsuite/113446
            * gcc.dg/tree-ssa/scev-16.c: Move test ...
            * gcc.dg/vect/pr112774.c: ... here.  Add PR comment line, use
            dg-additional-options instead of dg-options and drop
            -fdump-tree-vect-details.
