public inbox for gcc-bugs@sourceware.org
* [Bug rtl-optimization/94587] New: Intrinsics optimization bug with -O2 -march=skylake-avx512
From: lopresti at gmail dot com @ 2020-04-13 21:28 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94587

            Bug ID: 94587
           Summary: Intrinsics optimization bug with -O2
                    -march=skylake-avx512
           Product: gcc
           Version: 9.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: rtl-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: lopresti at gmail dot com
  Target Milestone: ---

Created attachment 48265
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=48265&action=edit
Minimal test case for skylake-avx512 intrinsics optimization bug

This bug appears to be present across at least GCC 8 and GCC 9.

Compile the attached program using "g++ -O2 -march=skylake-avx512" and run it.

Expected result: No output, exit status 0.

Actual result: "WRONG -0.99999999999999789058", exit status 2.

To make the problem go away, do any of the following:

- Use "-march=corei7" ; or

- Remove the "__attribute__((noinline))" from the rot90() function ; or

- Add "-DHEISENBUG" to the command line

* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: marxin at gcc dot gnu.org @ 2020-04-14  6:47 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94587

Martin Liška <marxin at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
                 CC|                            |marxin at gcc dot gnu.org
   Last reconfirmed|                            |2020-04-14
     Ever confirmed|0                           |1

--- Comment #1 from Martin Liška <marxin at gcc dot gnu.org> ---
Confirmed. I also see it with every release starting from 6.1.0, where
skylake-avx512 was added. One can run it on a non-AVX512 target.

* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: jakub at gcc dot gnu.org @ 2020-04-14  9:27 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94587

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #2 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
I don't see anything wrong: with -mavx512vl -O2, FMA is used, while it isn't
used without those options.  Use -ffp-contract=off if you don't like this; the
default is -ffp-contract=fast, as documented.
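
[Editorial illustration, not part of the original comment.] The effect described in comment #2 can be reproduced without relying on compiler flags. The sketch below is a minimal example under stated assumptions: the function names and the sample inputs are hypothetical, and it is not the attached test case. It compares a multiply followed by an add (two roundings) with an explicit std::fma (one rounding). A volatile temporary forces the product to be rounded so the compiler cannot contract it.

```cpp
#include <cmath>

// Multiply, round the product to double, then add (two roundings).
// The volatile store prevents the compiler from fusing the two operations,
// whatever -ffp-contract setting is in effect.
double mul_then_add(double a, double b, double c) {
    volatile double product = a * b;
    return product + c;
}

// Fused multiply-add: a * b + c is computed exactly and rounded once.
// This is what a contracted expression evaluates to on FMA hardware.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Hypothetical sample inputs: the exact product (1 + 2^-30)(1 - 2^-30)
// equals 1 - 2^-60, which is not representable and rounds to 1.0.
double sample_a() { return 1.0 + std::ldexp(1.0, -30); }
double sample_b() { return 1.0 - std::ldexp(1.0, -30); }
```

With c = -1.0, mul_then_add(sample_a(), sample_b(), -1.0) yields 0.0, while fused_mul_add with the same arguments yields -2^-60. Neither value is wrong in isolation; they differ because the intermediate rounding step is skipped in the fused form.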

* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: lopresti at gmail dot com @ 2020-04-14 15:25 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94587

Patrick J. LoPresti <lopresti at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|INVALID                     |---

--- Comment #3 from Patrick J. LoPresti <lopresti at gmail dot com> ---
That works; thank you. However...

I realize there is no formal spec for intrinsics. But when I use them, I expect
deterministic behavior by default. This has been true on every compiler with
every set of optimization and architecture flags I have ever used (GCC before
AVX, many versions of Clang, many versions of the Intel compiler).

Also, the "-DHEISENBUG" example shows that simply adding a side-effect-free
assert() changes the behavior. This seems... unfriendly... as a default.

Wouldn't fp-contract be more appropriate as part of "-ffast-math"?

To my knowledge, no other compiler behaves this way. Are there any other
options I need to ensure deterministic behavior for SSE intrinsics on GCC? Will
there be more in the future? I do apologize if I missed the answer in the
1000-page GCC manual.

* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: pinskia at gcc dot gnu.org @ 2020-04-14 18:51 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94587

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|REOPENED                    |RESOLVED

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Patrick J. LoPresti from comment #3)
> That works; thank you. However...
> 
> I realize there is no formal spec for intrinsics. But when I use them, I
> expect deterministic behavior by default. This has been true on every
> compiler with every set of optimization and architecture flags I have ever
> used (GCC before AVX, many versions of Clang, many versions of the Intel
> compiler).
> 
> Also, the "-DHEISENBUG" example shows that simply adding a side-effect-free
> assert() changes the behavior. This seems... unfriendly... as a default.

Note that this is true even without using intrinsics.  You can get the same
behavior you are seeing using standard C code.

> 
> Wouldn't fp-contract be more appropriate as part of "-ffast-math"?

No.  This has been discussed many times and decided no.  

> 
> To my knowledge, no other compiler behaves this way. Are there any other
> options I need to ensure deterministic behavior for SSE intrinsics on GCC?
> Will there be more in the future? I do apologize if I missed the answer in
> the 1000-page GCC manual.

https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Floating-point-implementation.html#Floating-point-implementation
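
[Editorial illustration, not part of the original comment.] The remark above, that plain C or C++ code can show the same behavior, can be sketched with a difference of squares. This is a hypothetical example, not the reporter's rot90() function. The contracted form is written out by hand with std::fma so that the result does not depend on compiler flags, and the function names are assumptions made for illustration.

```cpp
#include <cmath>

// x*x - y*y with both products rounded separately.
// When x == y the two rounded products are identical, so the result is 0.
double diff_of_squares_separate(double x, double y) {
    volatile double xx = x * x;  // volatile forces each product to be rounded
    volatile double yy = y * y;
    return xx - yy;
}

// The same expression in the form a compiler may choose when contracting:
// fma(x, x, -(y*y)). Only y*y is rounded; x*x stays exact inside the fma,
// so the rounding error of the square survives even when x == y.
double diff_of_squares_contracted(double x, double y) {
    volatile double yy = y * y;
    return std::fma(x, x, -yy);
}
```

For x = y = 1 + 2^-30, the separately rounded version returns 0.0 and the contracted version returns 2^-60. Small changes to surrounding code can change which form the optimizer picks, which is one plausible reading of why adding an assert() altered the reported result.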

* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: lopresti at gmail dot com @ 2020-04-14 19:35 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94587

--- Comment #5 from Patrick J. LoPresti <lopresti at gmail dot com> ---
(In reply to Andrew Pinski from comment #4)
> 
> Note that this is true even without using intrinsics.  You can get the
> same behavior you are seeing using standard C code.

Yes, which is one reason I am using intrinsics: They provide deterministic
behavior on literally every compiler at every optimization level by default.
Except GCC when AVX512 is enabled, that is.

> > 
> > Wouldn't fp-contract be more appropriate as part of "-ffast-math"?
> 
> No.  This has been discussed many times and decided no.  

A ridiculous but not surprising decision.

> https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Floating-point-implementation.html#Floating-point-implementation

Well, that page is wrong.

"Expressions are currently only contracted if -ffp-contract=fast,
-funsafe-math-optimizations or -ffast-math are used."

I did not use -ffp-contract=fast nor -funsafe-math-optimizations nor
-ffast-math. Yet the statements were contracted. So the documentation has a
bug.

More to the point, it does not answer the question I asked, which is what
options are required to get deterministic behavior from intrinsics.

So I suppose I have to re-read that chapter on every release, then search the
rest of the documentation to learn what the defaults are, to figure out whether
and how you broke something further? OK thanks

* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: pinskia at gcc dot gnu.org @ 2020-04-14 20:44 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94587

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Patrick J. LoPresti from comment #5)
> I did not use -ffp-contract=fast nor -funsafe-math-optimizations nor
> -ffast-math. Yet the statements were contracted. So the documentation has a
> bug.

Not exactly, because the default for -std=c11 is different from -std=gnu11
(which is the default; or the default may be -std=gnu17, depending on the
compiler version).

Plus:
https://gcc.gnu.org/onlinedocs/gcc-9.3.0/gcc/Optimize-Options.html#index-ffp-contract

* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: jakub at gcc dot gnu.org @ 2020-04-14 20:47 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94587

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Note, clang defaults to -ffp-contract=on which is like =fast (except when you
use FP_CONTRACT pragma).

* [Bug target/94587] Intrinsics optimization bug with -O2 -march=skylake-avx512
From: lopresti at gmail dot com @ 2020-04-14 21:33 UTC
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94587

--- Comment #8 from Patrick J. LoPresti <lopresti at gmail dot com> ---
(In reply to Andrew Pinski from comment #6)
> (In reply to Patrick J. LoPresti from comment #5)
> > I did not use -ffp-contract=fast nor -funsafe-math-optimizations nor
> > -ffast-math. Yet the statements were contracted. So the documentation has a
> > bug.
> 
> Not exactly, because the default for -std=c11 is different from -std=gnu11
> (which is the default; or the default may be -std=gnu17, depending on the
> compiler version).

Yes, I know. So the floating-point implementation chapter can only be
understood if you also search the rest of the 1000-page documentation to learn
what the defaults are. And then repeat for every release because "This is
subject to change."

(In reply to Jakub Jelinek from comment #7)
> Note, clang defaults to -ffp-contract=on which is like =fast (except when
> you use FP_CONTRACT pragma).

No, it is not like "fast". At least, not according to my experience and this
StackOverflow answer:

https://stackoverflow.com/a/43357638

Note the link to the live example on gcc.godbolt.org.
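
[Editorial illustration, not part of the original comment.] The distinction argued over here can be sketched as follows, assuming the commonly documented meaning of the two settings: "on" permits contraction only within a single expression, while "fast" also permits it across statements. The function names are hypothetical. Which result each function produces depends on the compiler, the target, and the flags, so the sketch only provides the two reference values for comparison.

```cpp
#include <cmath>

// Multiply and add in ONE expression: a contraction candidate under both
// "on" and "fast" semantics.
double single_expression(double a, double b, double c) {
    return a * b + c;
}

// Multiply and add in SEPARATE statements: a contraction candidate only if
// the compiler fuses across statements ("fast" semantics).
double separate_statements(double a, double b, double c) {
    double product = a * b;
    return product + c;
}

// Reference value with the product rounded before the add (no contraction).
double reference_unfused(double a, double b, double c) {
    volatile double product = a * b;  // volatile blocks contraction
    return product + c;
}

// Reference value with a single rounding (what contraction produces).
double reference_fused(double a, double b, double c) {
    return std::fma(a, b, c);
}
```

Comparing the first two functions against the two references, for example with a = 1 + 2^-30, b = 1 - 2^-30, c = -1.0, shows which choice a given compiler and flag combination made for each form.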

Thread overview: 9+ messages
2020-04-13 21:28 [Bug rtl-optimization/94587] New: Intrinsics optimization bug with -O2 -march=skylake-avx512 lopresti at gmail dot com
2020-04-14  6:47 ` [Bug target/94587] " marxin at gcc dot gnu.org
2020-04-14  9:27 ` jakub at gcc dot gnu.org
2020-04-14 15:25 ` lopresti at gmail dot com
2020-04-14 18:51 ` pinskia at gcc dot gnu.org
2020-04-14 19:35 ` lopresti at gmail dot com
2020-04-14 20:44 ` pinskia at gcc dot gnu.org
2020-04-14 20:47 ` jakub at gcc dot gnu.org
2020-04-14 21:33 ` lopresti at gmail dot com
