public inbox for gcc-bugs@sourceware.org
* [Bug middle-end/110711] New: possible missed optimization for std::max with -march=znver2
From: mrks2023 at proton dot me @ 2023-07-18 10:11 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711
            Bug ID: 110711
           Summary: possible missed optimization for std::max with -march=znver2
           Product: gcc
           Version: 13.1.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: middle-end
          Assignee: unassigned at gcc dot gnu.org
          Reporter: mrks2023 at proton dot me
  Target Milestone: ---
I think I found a missed optimization involving std::max() for -march=znver2
(sorry if it was already reported, but I didn't find anything related in the
bug tracker).
I have two functions that compute the maximum element of an array:
- function k_std_max uses std::max() and is never vectorized
- function k_max uses conditional assignment and is vectorized, when the
optimization flags allow for it
The code (also https://godbolt.org/z/hW49nbqMY):
#include <cassert>
#include <algorithm>

double k_std_max(size_t n_els, double * a)
{
    assert(n_els > 0);
    double m = a[0];
#ifdef _OPENMP
#pragma omp simd reduction(max:m)
#endif
    for (size_t i = 1; i < n_els; ++i) {
        m = std::max(m, a[i]);
    }
    return m;
}

double k_max(size_t n_els, double * a)
{
    assert(n_els > 0);
    double m = a[0];
#ifdef _OPENMP
#pragma omp simd reduction(max:m)
#endif
    for (size_t i = 1; i < n_els; ++i) {
        m = m < a[i] ? a[i] : m;
    }
    return m;
}
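For context, the two loop bodies should compute the same value: std::max(m, x) is specified to return x only when m < x, which is exactly the ternary in k_max. The following is a minimal sketch that checks this on a handful of values, including NaN and signed zeros. The helper names bits_of and loop_bodies_agree are hypothetical and not part of the report, and the check assumes the code is built without -ffast-math.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper (not from the report): raw bit pattern of a double,
// so that NaN and -0.0 can be compared exactly.
static std::uint64_t bits_of(double x)
{
    std::uint64_t u;
    std::memcpy(&u, &x, sizeof u);
    return u;
}

// Hypothetical helper: returns true if the body of k_std_max and the body
// of k_max give bit-identical results for every pair of sample values.
static bool loop_bodies_agree()
{
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double inf = std::numeric_limits<double>::infinity();
    const double vals[] = {-inf, -1.0, -0.0, 0.0, 1.0, inf, nan};

    for (double m : vals) {
        for (double x : vals) {
            const double via_std = std::max(m, x);     // body of k_std_max
            const double via_ternary = m < x ? x : m;  // body of k_max
            if (bits_of(via_std) != bits_of(via_ternary))
                return false;
        }
    }
    return true;
}
```

If loop_bodies_agree() returns true, any difference in the generated code for the two functions comes from how the compiler handles the two forms, not from a difference in their semantics.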
Compiling with "-O3 -fopenmp -march=znver2 -Wall -Wextra -DNDEBUG" vectorizes
k_max:
.L19:
        vmovupd ymm3, YMMWORD PTR [rax+8]
        add     rax, 32
        vmaxpd  ymm1, ymm3, ymm1
        cmp     rax, rdx
        jne     .L19
but k_std_max still uses scalar instructions:
.L3:
        vmovsd  xmm0, QWORD PTR [rax]
        add     rax, 8
        vmaxsd  xmm0, xmm0, xmm1
        cmp     rdx, rax
        jne     .L5
Note that I had to use -fopenmp as using only -fopenmp-simd did not vectorize
k_max.
Even when I use "-Ofast" or "-Ofast -fopenmp" instead of "-O3", k_std_max is
not vectorized:
.L3:
        vmaxsd  xmm0, xmm0, QWORD PTR [rax]
        add     rax, 8
        cmp     rdx, rax
        jne     .L3
* [Bug middle-end/110711] possible missed optimization for std::max with -march=znver2
From: crazylht at gmail dot com @ 2023-07-18 10:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711
Hongtao.liu <crazylht at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com
--- Comment #1 from Hongtao.liu <crazylht at gmail dot com> ---
You need to use -ffast-math, w/o it, operands order matters for floating point
max/min, they're not commutative.
* [Bug middle-end/110711] possible missed optimization for std::max with -march=znver2
From: crazylht at gmail dot com @ 2023-07-18 10:38 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711
--- Comment #2 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #1)
> You need to use -ffast-math, w/o it, operands order matters for floating
> point max/min, they're not commutative.
Sorry, I replied too quickly; please ignore this comment.
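As background only: the remark quoted above is withdrawn as an explanation for this bug, but the underlying property of std::max does hold. A minimal sketch, assuming IEEE 754 doubles and a build without -ffast-math; the helper names are hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: std::max(a, b) returns b only if a < b. Every
// comparison with NaN is false, so the first operand wins in both calls.
static bool max_is_order_sensitive_for_nan()
{
    const double nan = std::numeric_limits<double>::quiet_NaN();
    const double first_nan = std::max(nan, 1.0);   // first operand: NaN
    const double second_nan = std::max(1.0, nan);  // first operand: 1.0
    return std::isnan(first_nan) && second_nan == 1.0;
}

// Hypothetical helper: +0.0 and -0.0 compare equal, so again the first
// operand is returned and the sign of the result depends on operand order.
static bool max_is_order_sensitive_for_zero()
{
    const double pos_first = std::max(0.0, -0.0);  // +0.0
    const double neg_first = std::max(-0.0, 0.0);  // -0.0
    return !std::signbit(pos_first) && std::signbit(neg_first);
}
```

Both helpers are expected to return true under the stated assumptions, which is why a compiler may not swap the operands of a floating-point max unless flags such as -ffast-math permit it.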
* [Bug middle-end/110711] possible missed optimization for std::max with -march=znver2
From: rguenth at gcc dot gnu.org @ 2023-07-18 12:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |missed-optimization, openmp
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
Both are vectorized but somehow the OMP simd setup discards the vectorized
variant of k_std_max. The .GOMP_SIMD_VF (simduid.3_12(D)) seems to be
statically zero?!
* [Bug middle-end/110711] possible missed optimization for std::max with -march=znver2
From: rguenth at gcc dot gnu.org @ 2023-07-18 12:03 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711
Richard Biener <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2023-07-18
             Status|UNCONFIRMED                 |NEW
* [Bug middle-end/110711] possible missed optimization for std::max with -march=znver2
From: crazylht at gmail dot com @ 2023-07-19 6:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711
--- Comment #4 from Hongtao.liu <crazylht at gmail dot com> ---
  <bb 18> [local count: 105119324]:
  _6 = .LOOP_VECTORIZED (1, 2);
  if (_6 != 0)
    goto <bb 19>; [100.00%]
  else
    goto <bb 20>; [100.00%]

  <bb 19> [local count: 105119324]:

  <bb 5> [local count: 955630227]:
  # i_21 = PHI <i_16(10), 1(19)>
  # prephitmp_19 = PHI <prephitmp_24(10), _1(19)>
  _2 = i_21 * 8;
  _3 = a_10(D) + _2;
  _17 = MEM[(const double &)_3];
  _29 = MAX_EXPR <_17, prephitmp_19>;
  prephitmp_24 = _29;
  i_16 = i_21 + 1;
  if (n_els_7(D) > i_16)
    goto <bb 10>; [89.00%]
  else
    goto <bb 13>; [11.00%]

  <bb 13> [local count: 105119324]:
  # prephitmp_15 = PHI <_29(5), prephitmp_27(16)>
  goto <bb 8>; [100.00%]

  <bb 10> [local count: 850510903]:
  goto <bb 5>; [100.00%]

test.C:8:26: note: Analyze phi: prephitmp_19 = PHI <prephitmp_24(10), _1(19)>
test.C:8:26: note: reduction path: prephitmp_24 _29 prephitmp_19
test.C:8:26: note: reduction: unknown pattern
test.C:8:26: missed: Unknown def-use cycle pattern.
test.C:8:26: note: === vect_determine_precisions ===
It looks like ifcvt generates an extra move (prephitmp_24 = _29), which makes
the vectorizer think this is not a reduction. prephitmp_24 could be replaced
with _29, since it is not used elsewhere.
* [Bug middle-end/110711] possible missed optimization for std::max with -march=znver2
From: crazylht at gmail dot com @ 2023-07-19 8:34 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711
--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
Looks like it's a GCC 13 regression; GCC 12.3 successfully vectorizes k_std_max.
* [Bug middle-end/110711] possible missed optimization for std::max with -march=znver2
From: crazylht at gmail dot com @ 2023-07-19 8:37 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711
--- Comment #6 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Hongtao.liu from comment #5)
> Looks like it's a GCC 13 regression; GCC 12.3 successfully vectorizes k_std_max.
https://godbolt.org/z/6111MP354
* [Bug middle-end/110711] possible missed optimization for std::max with -march=znver2
From: crazylht at gmail dot com @ 2023-07-19 8:41 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110711
--- Comment #7 from Hongtao.liu <crazylht at gmail dot com> ---
git diff gcc12_171t.ifcvt gcc13.173.ifcvt
   <bb 5> [local count: 955630227]:
-  # i_22 = PHI <i_16(10), 1(19)>
+  # i_21 = PHI <i_16(10), 1(19)>
   # prephitmp_19 = PHI <prephitmp_24(10), _1(19)>
-  _2 = i_22 * 8;
+  _2 = i_21 * 8;
   _3 = a_10(D) + _2;
   _17 = MEM[(const double &)_3];
-  prephitmp_24 = MAX_EXPR <_17, prephitmp_19>;
-  i_16 = i_22 + 1;
+  _29 = MAX_EXPR <_17, prephitmp_19>;
+  prephitmp_24 = _29;
+  i_16 = i_21 + 1;
   if (n_els_7(D) > i_16)
     goto <bb 10>; [89.00%]
   else
     goto <bb 13>; [11.00%]

   <bb 13> [local count: 105119324]:
-  # prephitmp_13 = PHI <prephitmp_24(5), prephitmp_12(16)>
+  # prephitmp_15 = PHI <_29(5), prephitmp_27(16)>
   goto <bb 8>; [100.00%]