public inbox for gcc-bugs@sourceware.org
* [Bug tree-optimization/105793] New: Missed vectorisation with conditional-select inside loop
@ 2022-05-31 14:55 ktkachov at gcc dot gnu.org
2022-05-31 14:59 ` [Bug tree-optimization/105793] " pinskia at gcc dot gnu.org
` (2 more replies)
0 siblings, 3 replies; 4+ messages in thread
From: ktkachov at gcc dot gnu.org @ 2022-05-31 14:55 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105793
Bug ID: 105793
Summary: Missed vectorisation with conditional-select inside loop
Product: gcc
Version: unknown
Status: UNCONFIRMED
Keywords: missed-optimization
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: ktkachov at gcc dot gnu.org
Target Milestone: ---
The code:
#define N 1024
float f(const float in[N], unsigned int n) {
  float a = 0.0f;
  for (unsigned i = 0; i < N; ++i) {
    float b = in[i];
    if (b < 10.f)
      a += b;
    else
      a -= b;
  }
  return a;
}
with -Ofast does not vectorise (on aarch64, for example):
f:
        movi    v0.2s, #0
        add     x1, x0, 4096
        fmov    s3, 1.0e+1
.L5:
        ldr     s1, [x0], 4
        fsub    s2, s0, s1
        fcmpe   s1, s3
        fadd    s0, s0, s1
        fcsel   s0, s0, s2, mi
        cmp     x1, x0
        bne     .L5
        ret
whereas clang does vectorise it. Commenting out the "else a -= b;" branch
allows GCC to vectorise the loop:
f:
        movi    v0.4s, 0
        add     x1, x0, 4096
        fmov    v3.4s, 1.0e+1
.L2:
        ldr     q2, [x0], 16
        fcmgt   v1.4s, v3.4s, v2.4s
        and     v1.16b, v1.16b, v2.16b
        fadd    v0.4s, v0.4s, v1.4s
        cmp     x1, x0
        bne     .L2
        faddp   v0.4s, v0.4s, v0.4s
        faddp   v0.4s, v0.4s, v0.4s
        ret
Examples at https://gcc.godbolt.org/z/qbn6T73qE
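For reference, the variant that the report says GCC does vectorise might look like the sketch below. It is simply the testcase with the else branch removed; the function name f_no_else is hypothetical, and the unused parameter n is dropped for brevity.

```c
#define N 1024

/* Sketch of the testcase with the "else a -= b;" branch removed.
   Only elements below 10 contribute to the sum, so the loop is a plain
   conditional reduction.  The name f_no_else is not from the report. */
float f_no_else(const float in[N]) {
  float a = 0.0f;
  for (unsigned i = 0; i < N; ++i) {
    float b = in[i];
    if (b < 10.f)
      a += b;
  }
  return a;
}
```

In the vectorised assembly above, this condition shows up as the fcmgt/and pair, which zeroes the lanes that fail the comparison before the fadd.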
* [Bug tree-optimization/105793] Missed vectorisation with conditional-select inside loop
From: pinskia at gcc dot gnu.org @ 2022-05-31 14:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105793
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|normal                      |enhancement
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
I think there is another bug about this. Basically, it comes down to
recognizing a conditional negate.
* [Bug tree-optimization/105793] Missed vectorisation with conditional-select inside loop
From: crazylht at gmail dot com @ 2022-06-01 5:59 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105793
--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
I guess the vectorizer expects something like
  tmp1 = cond ? b : -b;
  a_5 = a_4 + tmp1;
from ifcvt instead of the current
  a_13 = b_10 + a_16;
  # DEBUG a => NULL
  _4 = b_10 < 1.0e+1;
  # DEBUG BEGIN_STMT
  a_12 = a_16 - b_10;
  # DEBUG a => NULL
  a_5 = _4 ? a_13 : a_12;
* [Bug tree-optimization/105793] Missed vectorisation with conditional-select inside loop
From: rguenth at gcc dot gnu.org @ 2022-06-01 12:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105793
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
            Version|unknown                     |13.0
             Target|                            |aarch64
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2022-06-01
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
There's already some code in if-conversion to deal with the vectorizer's
restrictions on how reductions have to appear.  Basically, the
vectorizer currently does not accept

  for (..)
    a = b < 10. ? a + b : a - b;

because there are two uses of 'a' here.  Rewriting this to

  for (..)
    a = a + (b < 10. ? b : -b);

would indeed work.  See is_cond_scalar_reduction for the existing special
casing.
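As a source-level illustration of the two shapes discussed above, here is a minimal sketch of both loops written out in full. The function names f_two_uses and f_one_use are hypothetical, and the unused parameter n from the testcase is dropped. Whether a particular GCC version vectorises the second form depends on the version and flags; this only shows the rewrite being suggested, not what if-conversion currently emits.

```c
#define N 1024

/* Shape the vectorizer is said to reject: 'a' is used in both arms
   of the select, so the reduction variable has two uses per iteration. */
float f_two_uses(const float in[N]) {
  float a = 0.0f;
  for (unsigned i = 0; i < N; ++i) {
    float b = in[i];
    a = b < 10.f ? a + b : a - b;
  }
  return a;
}

/* Rewritten shape: the select only chooses between b and -b, and 'a'
   is used exactly once, in a single addition. */
float f_one_use(const float in[N]) {
  float a = 0.0f;
  for (unsigned i = 0; i < N; ++i) {
    float b = in[i];
    a = a + (b < 10.f ? b : -b);
  }
  return a;
}
```

Since a - b and a + (-b) produce the same value in IEEE arithmetic, the two functions should agree when compiled without reassociation. The difference is only in how the reduction is presented to the vectorizer.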