public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/94587] New: Intrinsics optimization bug with -O2 -march=skylake-avx512
From: lopresti at gmail dot com @ 2020-04-13 21:28 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94587
Bug ID: 94587
Summary: Intrinsics optimization bug with -O2 -march=skylake-avx512
Product: gcc
Version: 9.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: lopresti at gmail dot com
Target Milestone: ---
Created attachment 48265
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48265&action=edit
Minimal test case for skylake-avx512 intrinsics optimization bug
This bug appears to be present across at least GCC 8 and GCC 9.
Compile the attached program using "g++ -O2 -march=skylake-avx512" and run it.
Expected result: No output, exit status 0.
Actual result: "WRONG -0.99999999999999789058", exit status 2.
To make the problem go away, do any of the following:
- Use "-march=corei7"; or
- Remove the "__attribute__((noinline))" from the rot90() function; or
- Add "-DHEISENBUG" to the command line.
* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: marxin at gcc dot gnu.org @ 2020-04-14 6:47 UTC (permalink / raw)
Martin Liška <marxin at gcc dot gnu.org> changed:
What            |Removed     |Added
----------------------------------------------------------------------------
Status          |UNCONFIRMED |NEW
CC              |            |marxin at gcc dot gnu.org
Last reconfirmed|            |2020-04-14
Ever confirmed  |0           |1
--- Comment #1 from Martin Liška <marxin at gcc dot gnu.org> ---
Confirmed. I also see it with all releases starting from 6.1.0, where
skylake-avx512 was added. One can run it on a non-avx512 target.
* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: jakub at gcc dot gnu.org @ 2020-04-14 9:27 UTC (permalink / raw)
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
What            |Removed     |Added
----------------------------------------------------------------------------
CC              |            |jakub at gcc dot gnu.org
Status          |NEW         |RESOLVED
Resolution      |---         |INVALID
--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I don't see anything wrong: with -mavx512vl -O2, FMA is used, while it isn't
used without those options. Use -ffp-contract=off if you don't like this; the
default is -ffp-contract=fast, as documented.
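
To illustrate the rounding difference behind this explanation, here is a
minimal sketch. It is not the attached test case. It assumes IEEE 754 double
precision with round-to-nearest, and the function names are hypothetical. The
suggested inputs are a = 1 + 2^-27, b = 1 - 2^-27, c = -1, chosen so that the
exact product a*b = 1 - 2^-54 lies halfway between two doubles.

```cpp
#include <cmath>

// Two roundings: the product is rounded to double first, then the sum is
// rounded. The volatile store keeps the compiler from fusing the two steps,
// whatever -ffp-contract setting is in effect.
double separate_mul_add(double a, double b, double c) {
    volatile double product = a * b;  // first rounding happens here
    return product + c;               // second rounding happens here
}

// One rounding: std::fma computes a*b + c exactly and rounds once, which is
// what a hardware FMA instruction does.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

With the inputs above, separate_mul_add returns 0 (the product rounds to 1.0
before the add), while fused_mul_add returns -2^-54, about -5.55e-17. Both
answers are correctly rounded for the operations actually performed; they
differ only in how many roundings took place.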
* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: lopresti at gmail dot com @ 2020-04-14 15:25 UTC (permalink / raw)
Patrick J. LoPresti <lopresti at gmail dot com> changed:
What            |Removed     |Added
----------------------------------------------------------------------------
Status          |RESOLVED    |REOPENED
Resolution      |INVALID     |---
--- Comment #3 from Patrick J. LoPresti <lopresti at gmail dot com> ---
That works; thank you. However...
I realize there is no formal spec for intrinsics. But when I use them, I expect
deterministic behavior by default. This has been true on every compiler with
every set of optimization and architecture flags I have ever used (GCC before
AVX, many versions of Clang, many versions of the Intel compiler).
Also, the "-DHEISENBUG" example shows that simply adding a side-effect-free
assert() changes the behavior. This seems... unfriendly... as a default.
Wouldn't fp-contract be more appropriate as part of "-ffast-math"?
To my knowledge, no other compiler behaves this way. Are there any other
options I need to ensure deterministic behavior for SSE intrinsics on GCC? Will
there be more in the future? I do apologize if I missed the answer in the
1000-page GCC manual.
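
The pattern at issue can be sketched as below. This is not the attached test
case, and the names mul_then_add and was_contracted are hypothetical. The
sketch assumes that GCC's headers implement _mm_mul_pd and _mm_add_pd as
ordinary vector arithmetic, which would explain why the compiler is free to
fuse them; that assumption has not been checked against every GCC release.
Where SSE2 is unavailable, the code falls back to scalar arithmetic.

```cpp
#include <cmath>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

// Multiply and add written as two separate intrinsic calls.
double mul_then_add(double a, double b, double c) {
#if defined(__SSE2__)
    __m128d product = _mm_mul_pd(_mm_set1_pd(a), _mm_set1_pd(b));
    __m128d sum = _mm_add_pd(product, _mm_set1_pd(c));
    return _mm_cvtsd_f64(sum);  // extract the low lane
#else
    return a * b + c;           // scalar fallback for non-SSE2 targets
#endif
}

// Returns true if the multiply and add were evidently fused into one
// operation. The volatile inputs prevent compile-time evaluation.
bool was_contracted() {
    volatile double a = 1.0 + std::ldexp(1.0, -27);
    volatile double b = 1.0 - std::ldexp(1.0, -27);
    volatile double c = -1.0;
    // Two roundings give exactly 0; a single fused rounding gives -2^-54.
    return mul_then_add(a, b, c) != 0.0;
}
```

Going by the discussion in this thread, was_contracted() would be expected to
return true when built with "g++ -O2 -march=skylake-avx512" and false once
-ffp-contract=off is added, but the result depends on the compiler version
and flags.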
* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: pinskia at gcc dot gnu.org @ 2020-04-14 18:51 UTC (permalink / raw)
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What            |Removed     |Added
----------------------------------------------------------------------------
Resolution      |---         |WONTFIX
Status          |REOPENED    |RESOLVED
--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Patrick J. LoPresti from comment #3)
> That works; thank you. However...
>
> I realize there is no formal spec for intrinsics. But when I use them, I
> expect deterministic behavior by default. This has been true on every
> compiler with every set of optimization and architecture flags I have ever
> used (GCC before AVX, many versions of Clang, many versions of the Intel
> compiler).
>
> Also, the "-DHEISENBUG" example shows that simply adding a side-effect-free
> assert() changes the behavior. This seems... unfriendly... as a default.
Note that this is true even without intrinsics. You can get the same behavior
you are seeing using standard C code.
>
> Wouldn't fp-contract be more appropriate as part of "-ffast-math"?
No. This has been discussed many times, and the decision was no.
>
> To my knowledge, no other compiler behaves this way. Are there any other
> options I need to ensure deterministic behavior for SSE intrinsics on GCC?
> Will there be more in the future? I do apologize if I missed the answer in
> the 1000-page GCC manual.
https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Floating-point-implementation.html#Floating-point-implementation
* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: lopresti at gmail dot com @ 2020-04-14 19:35 UTC (permalink / raw)
--- Comment #5 from Patrick J. LoPresti <lopresti at gmail dot com> ---
(In reply to Andrew Pinski from comment #4)
>
> Note this is true even without using intrinsics really. You can get the
> same behavior you are seeing with using standard C code.
Yes, which is one reason I am using intrinsics: They provide deterministic
behavior on literally every compiler at every optimization level by default.
Except GCC when AVX512 is enabled, that is.
> >
> > Wouldn't fp-contract be more appropriate as part of "-ffast-math"?
>
> No. This has been discussed many times and decided no.
A ridiculous but not surprising decision.
> https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Floating-point-implementation.html#Floating-point-implementation
Well, that page is wrong.
"Expressions are currently only contracted if -ffp-contract=fast,
-funsafe-math-optimizations or -ffast-math are used."
I did not use -ffp-contract=fast nor -funsafe-math-optimizations nor
-ffast-math. Yet the statements were contracted. So the documentation has a
bug.
More to the point, it does not answer the question I asked, which is what
options are required to get deterministic behavior from intrinsics.
So I suppose I have to re-read that chapter on every release, then search the
rest of the documentation to learn what the defaults are, to figure out whether
and how you broke something further? OK thanks
* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: pinskia at gcc dot gnu.org @ 2020-04-14 20:44 UTC (permalink / raw)
--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Patrick J. LoPresti from comment #5)
> I did not use -ffp-contract=fast nor -funsafe-math-optimizations nor
> -ffast-math. Yet the statements were contracted. So the documentation has a
> bug.
Not exactly, because the default for -std=c11 is different from the default
for -std=gnu11 (which is the default mode, or the default may be -std=gnu17;
it depends on the compiler version).
Plus:
https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Optimize-Options.html#index-ffp-contract
* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: jakub at gcc dot gnu.org @ 2020-04-14 20:47 UTC (permalink / raw)
--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note that clang defaults to -ffp-contract=on, which is like =fast (except when
you use the FP_CONTRACT pragma).
* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: lopresti at gmail dot com @ 2020-04-14 21:33 UTC (permalink / raw)
--- Comment #8 from Patrick J. LoPresti <lopresti at gmail dot com> ---
(In reply to Andrew Pinski from comment #6)
> (In reply to Patrick J. LoPresti from comment #5)
> > I did not use -ffp-contract=fast nor -funsafe-math-optimizations nor
> > -ffast-math. Yet the statements were contracted. So the documentation has a
> > bug.
>
> Not exactly because the default for -std=c11 is different from -std=gnu11
> (which is the default, or is the default -std=gnu17 well it depends on the
> compiler version).
Yes, I know. So the floating-point implementation chapter can only be
understood if you also search the rest of the 1000-page documentation to learn
what the defaults are. And then repeat for every release because "This is
subject to change."
(In reply to Jakub Jelinek from comment #7)
> Note, clang defaults to -ffp-contract=on which is like =fast (except when
> you use FP_CONTRACT pragma).
No, it is not like "fast". At least, not according to my experience and this
StackOverflow answer:
https://stackoverflow.com/a/43357638
Note the link to the live example on gcc.godbolt.org.
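
The distinction between "on" and "fast" can be sketched as follows. The sketch
assumes the meanings given in Clang's documentation: "on" permits fusing only
within a single expression, as ISO C allows, while "fast" may also fuse a
multiply and an add that sit in different statements. The function names are
hypothetical, and which value each function returns depends on the compiler,
its flags, and the target.

```cpp
#include <cmath>

// Multiply and add in one expression: eligible for contraction under both
// the "on" and the "fast" settings.
double single_expression(double a, double b, double c) {
    return a * b + c;
}

// Product assigned to a named variable first. Under "on" semantics the
// product must be rounded at the assignment, so no fusing is permitted;
// under "fast" the compiler may still fuse across the two statements.
double across_statements(double a, double b, double c) {
    double product = a * b;
    return product + c;
}

// Probe value for either function above: exactly 0 means the product was
// rounded separately, -2^-54 means the operation was fused.
double probe(double (*f)(double, double, double)) {
    volatile double a = 1.0 + std::ldexp(1.0, -27);
    volatile double b = 1.0 - std::ldexp(1.0, -27);
    volatile double c = -1.0;
    return f(a, b, c);
}
```

If a compiler set to "on" behaved like "fast", probe(across_statements) would
return -2^-54 on an FMA-capable target; a result of exactly 0 there, alongside
-2^-54 from probe(single_expression), would indicate that the two settings
differ.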