public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/102564] New: Missed loop vectorization with reduction and ptr load/store inside loop
@ 2021-10-02 10:07 david.bolvansky at gmail dot com
  2021-10-02 12:55 ` [Bug tree-optimization/102564] " pinskia at gcc dot gnu.org
  2021-10-04  7:06 ` rguenth at gcc dot gnu.org
  0 siblings, 2 replies; 3+ messages in thread
From: david.bolvansky at gmail dot com @ 2021-10-02 10:07 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102564

            Bug ID: 102564
           Summary: Missed loop vectorization with reduction and ptr
                    load/store inside loop
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: david.bolvansky at gmail dot com
  Target Milestone: ---

void test1(int *p, int *t, int N) {
    for (int i = 0; i != N; i++) *t += p[i];
}

void test2(int *p, int *t, int N) {
    if (N > 1024) // hint, N is not small
        for (int i = 0; i != N; i++) *t += p[i];
}

void test3(int *p, int *t, int N) {
    if (N > 1024) { // hint, N is not small
        int s = 0;
        for (int i = 0; i != N; i++) s += p[i];
        *t += s;
    }
}

test3 is vectorized by LLVM, GCC, and ICC. Sadly, only ICC vectorizes
test1 and test2.

https://godbolt.org/z/PzoYd4eEK


* [Bug tree-optimization/102564] Missed loop vectorization with reduction and ptr load/store inside loop
From: pinskia at gcc dot gnu.org @ 2021-10-02 12:55 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102564

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
             Blocks|                            |53947
           Keywords|                            |alias, missed-optimization

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I suspect the vectorizer is not adding an alias check in the case of reduction.
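
For illustration, the kind of alias check meant here can be written by hand
as loop versioning. This is only a sketch, not what the vectorizer emits: the
name test1_versioned is hypothetical, the overlap test compares addresses via
uintptr_t (implementation-defined, but it behaves as expected on common flat
address spaces), and the fast path assumes the partial sum does not overflow
int.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical hand-versioned form of test1 (name is not from the report). */
void test1_versioned(int *p, int *t, int n) {
    assert(n >= 0);
    uintptr_t lo = (uintptr_t)p;        /* first byte of p[0..n) */
    uintptr_t hi = (uintptr_t)(p + n);  /* one past the last byte */
    uintptr_t ta = (uintptr_t)t;

    if (ta + sizeof(int) <= lo || ta >= hi) {
        /* *t does not overlap p[0..n): reduce in a local, store once. */
        int s = 0;
        for (int i = 0; i != n; i++) s += p[i];
        *t += s;
    } else {
        /* Possible overlap: keep the original load/store per iteration. */
        for (int i = 0; i != n; i++) *t += p[i];
    }
}
```

Only the first branch has the register-reduction shape that test3 uses; the
second branch preserves the original semantics when t points into p[].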


Referenced Bugs:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947
[Bug 53947] [meta-bug] vectorizer missed-optimizations


* [Bug tree-optimization/102564] Missed loop vectorization with reduction and ptr load/store inside loop
From: rguenth at gcc dot gnu.org @ 2021-10-04  7:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102564

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-10-04
            Version|unknown                     |12.0

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is that t can point anywhere into p[]. Andrew is correct that we
could in theory do a runtime check, but unfortunately vectorization relies on
reductions being done on registers and thus on store-motion having taken
place, and store-motion does not do any runtime alias checks.
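
A small sketch of why the store cannot simply be moved out of the loop when
t may point into p[]. The two functions below mirror the loops of test1 and
test3 (without the N > 1024 guard); their names are hypothetical and chosen
only for this illustration.

```c
#include <assert.h>

/* Same loop as test1: *t is loaded and stored on every iteration. */
void accumulate_in_memory(int *p, int *t, int n) {
    assert(n >= 0);
    for (int i = 0; i != n; i++) *t += p[i];
}

/* Same shape as test3, minus the guard: sum in a local, store once. */
void accumulate_in_register(int *p, int *t, int n) {
    assert(n >= 0);
    int s = 0;
    for (int i = 0; i != n; i++) s += p[i];
    *t += s;
}
```

With p = {1, 2, 3, 4} and t = &p[2], the first function leaves p[2] == 16
(the running total is read back through p[2] at i == 2), while the second
leaves p[2] == 13. The two forms only agree when *t does not overlap p[0..n).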

The fix is at the source level: add __restrict__ to p, for example, or
perform the store-motion yourself as you've done in test3.
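
A minimal sketch of the __restrict__ variant, assuming the GNU spelling
__restrict__ (C99 code can use restrict instead). The function name
test1_restrict is hypothetical. The qualifier is a promise by the caller that
*t does not overlap p[0..n); calling it with overlapping pointers is undefined
behavior.

```c
#include <assert.h>

/* test1 with p marked __restrict__: the caller guarantees that the store
   through t never modifies anything read through p. */
void test1_restrict(int *__restrict__ p, int *t, int n) {
    assert(n >= 0);
    for (int i = 0; i != n; i++) *t += p[i];
}
```

Whether a given compiler version then vectorizes the loop is best confirmed
from its optimization report or the generated assembly.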

In principle the vectorizer could do reduction vectorization on
stride zero memory accesses as well, but currently we give up on
such stores completely.
