public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug tree-optimization/116766] New: Missed loop vectorization case with gather/scatter
@ 2024-09-18 15:18 Dusan.Stojkovic@rt-rk.com
  2024-09-18 19:13 ` [Bug tree-optimization/116766] " pinskia at gcc dot gnu.org
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Dusan.Stojkovic@rt-rk.com @ 2024-09-18 15:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116766

            Bug ID: 116766
           Summary: Missed loop vectorization case with gather/scatter
           Product: gcc
           Version: 15.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: Dusan.Stojkovic@rt-rk.com
  Target Milestone: ---

Created attachment 59138
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59138&action=edit
A dump file of vect pass with details enabled

Hello this report is about a loop which GCC doesn't vectorize. 
This is the loop in question: 

void
foo(int *in, int *out1, int *out2, int n)
{
    for (int i = 0; i < n; i+=2)
    {
        out1[i/2] = in[i];
        out2[i/2] = in[i+1];
    }
}

GCC isn't able to vectorize this example:
ARM: https://godbolt.org/z/Eq7vs7ssx
RISC-V: https://godbolt.org/z/Y1sr4cr8K

Clang vectorizes this loop for ARM as well as RISC-V:
ARM: https://godbolt.org/z/YvaMqTz1h
RISC-V: https://godbolt.org/z/x1MxGc7Gx

In the vect optimization pass there are two misses regarding array indexing:
analyze_innermost: t.c:6:19: missed:  failed: evolution of base is not affine.
analyze_innermost: t.c:7:19: missed:  failed: evolution of base is not affine.

There are also these misses:
t.c:6:23: missed:   possible alias involving gather/scatter between *_3 and *_7
t.c:4:23: missed:  bad data dependence.
t.c:6:23: missed:   possible alias involving gather/scatter between *_3 and *_7
t.c:4:23: missed:  bad data dependence.
t.c:6:23: missed:   possible alias involving gather/scatter between *_3 and *_7
t.c:4:23: missed:  bad data dependence.
t.c:6:23: missed:   possible alias involving gather/scatter between *_3 and *_7
t.c:4:23: missed:  bad data dependence.
t.c:6:23: missed:   possible alias involving gather/scatter between *_3 and *_7
t.c:4:23: missed:  bad data dependence.
t.c:6:23: missed:   possible alias involving gather/scatter between *_3 and *_7
t.c:4:23: missed:  bad data dependence.
t.c:6:23: missed:   possible alias involving gather/scatter between *_3 and *_7
t.c:4:23: missed:  bad data dependence.
t.c:6:23: missed:   possible alias involving gather/scatter between *_3 and *_7
t.c:4:23: missed:  bad data dependence.
t.c:6:23: missed:   possible alias involving gather/scatter between *_3 and *_7
t.c:4:23: missed:  bad data dependence.

t.c:4:23: missed: couldn't vectorize loop

void foo (int * in, int * out1, int * out2, int n)
{
  int i;
  long unsigned int _1;
  long unsigned int _2;
  int * _3;
  int _4;
  long unsigned int _5;
  long unsigned int _6;
  int * _7;
  int _8;
  sizetype _9;
  sizetype _10;
  int * _11;
  int * _12;
  int _13;

  <bb 2> [local count: 118111600]:
  if (n_17(D) > 0)
    goto <bb 5>; [89.00%]
  else
    goto <bb 4>; [11.00%]

  <bb 5> [local count: 105119324]:

  <bb 3> [local count: 955630224]:
  # i_26 = PHI <i_23(6), 0(5)>
  _1 = (long unsigned int) i_26;
  _2 = _1 * 4;
  _3 = in_18(D) + _2;
  _4 = i_26 >> 1;
  _5 = (long unsigned int) _4;
  _6 = _5 * 4;
  _7 = out1_19(D) + _6;
  _8 = *_3;
  *_7 = _8;
  _9 = _1 + 1;
  _10 = _9 * 4;
  _11 = in_18(D) + _10;
  _12 = out2_21(D) + _6;
  _13 = *_11;
  *_12 = _13;
  i_23 = i_26 + 2;
  if (n_17(D) > i_23)
    goto <bb 6>; [89.00%]
  else
    goto <bb 7>; [11.00%]

  <bb 6> [local count: 850510900]:
  goto <bb 3>; [100.00%]

  <bb 7> [local count: 105119324]:

  <bb 4> [local count: 118111600]:
  return;

}

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug tree-optimization/116766] Missed loop vectorization case with gather/scatter
  2024-09-18 15:18 [Bug tree-optimization/116766] New: Missed loop vectorization case with gather/scatter Dusan.Stojkovic@rt-rk.com
@ 2024-09-18 19:13 ` pinskia at gcc dot gnu.org
  2024-09-18 22:45 ` pinskia at gcc dot gnu.org
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-09-18 19:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116766

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |53947
           Severity|normal                      |enhancement
           See Also|                            |https://gcc.gnu.org/bugzill
                   |                            |a/show_bug.cgi?id=104368


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug tree-optimization/116766] Missed loop vectorization case with gather/scatter
  2024-09-18 15:18 [Bug tree-optimization/116766] New: Missed loop vectorization case with gather/scatter Dusan.Stojkovic@rt-rk.com
  2024-09-18 19:13 ` [Bug tree-optimization/116766] " pinskia at gcc dot gnu.org
@ 2024-09-18 22:45 ` pinskia at gcc dot gnu.org
  2024-09-19  7:12 ` [Bug tree-optimization/116766] Missed loop vectorization case due to even IV division by two rguenth at gcc dot gnu.org
  2024-09-19  7:31 ` rguenth at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: pinskia at gcc dot gnu.org @ 2024-09-18 22:45 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116766

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I thought I had saw this one before ...

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug tree-optimization/116766] Missed loop vectorization case due to even IV division by two
  2024-09-18 15:18 [Bug tree-optimization/116766] New: Missed loop vectorization case with gather/scatter Dusan.Stojkovic@rt-rk.com
  2024-09-18 19:13 ` [Bug tree-optimization/116766] " pinskia at gcc dot gnu.org
  2024-09-18 22:45 ` pinskia at gcc dot gnu.org
@ 2024-09-19  7:12 ` rguenth at gcc dot gnu.org
  2024-09-19  7:31 ` rguenth at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-09-19  7:12 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116766

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-09-19
     Ever confirmed|0                           |1
            Summary|Missed loop vectorization   |Missed loop vectorization
                   |case with gather/scatter    |case due to even IV
                   |                            |division by two
             Status|UNCONFIRMED                 |NEW

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
This isn't about gather/scatter but SCEV analysis failing to handle i/2
(i >> 1 in the IL) as IV.  That's only affine because i is even so it's
an exact division.  We might want to "fold" i / 2 to i /[ex] 2 somewhere
in VRP to help, but possibly somehow SCEV could "optimistically" analyze
the shift and resolve it after it finds the initial value and increment.

Bottom line - rewrite your code to

    for (int i = 0; i < n/2; i++)
    {
        out1[i] = in[2*i];
        out2[i] = in[2*i+1];
    }

^ permalink raw reply	[flat|nested] 5+ messages in thread

* [Bug tree-optimization/116766] Missed loop vectorization case due to even IV division by two
  2024-09-18 15:18 [Bug tree-optimization/116766] New: Missed loop vectorization case with gather/scatter Dusan.Stojkovic@rt-rk.com
                   ` (2 preceding siblings ...)
  2024-09-19  7:12 ` [Bug tree-optimization/116766] Missed loop vectorization case due to even IV division by two rguenth at gcc dot gnu.org
@ 2024-09-19  7:31 ` rguenth at gcc dot gnu.org
  3 siblings, 0 replies; 5+ messages in thread
From: rguenth at gcc dot gnu.org @ 2024-09-19  7:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116766

--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Created attachment 59144
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59144&action=edit
hack

This works for the special case of >> 1;  something more general and nicer
written could work.  CHREC_LEFT might not need to be a multiple I think and
thus not constant.  EXACT_DIV_EXPR could be allowed directly in a CHREC_RIGHT,
even if we can't constant-fold it, so I think working towards that would be
better and instead handle EXACT_DIV_EXPR in SCEV.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2024-09-19  7:31 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-18 15:18 [Bug tree-optimization/116766] New: Missed loop vectorization case with gather/scatter Dusan.Stojkovic@rt-rk.com
2024-09-18 19:13 ` [Bug tree-optimization/116766] " pinskia at gcc dot gnu.org
2024-09-18 22:45 ` pinskia at gcc dot gnu.org
2024-09-19  7:12 ` [Bug tree-optimization/116766] Missed loop vectorization case due to even IV division by two rguenth at gcc dot gnu.org
2024-09-19  7:31 ` rguenth at gcc dot gnu.org

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).