public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug c++/100839] New: -O2 does dangerous optimizations using FMA (please don't break my cross product)
@ 2021-05-31 1:30 metalcaedes at gmail dot com
2021-05-31 1:39 ` [Bug middle-end/100839] " pinskia at gcc dot gnu.org
` (6 more replies)
0 siblings, 7 replies; 8+ messages in thread
From: metalcaedes at gmail dot com @ 2021-05-31 1:30 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100839
Bug ID: 100839
Summary: -O2 does dangerous optimizations using FMA (please
don't break my cross product)
Product: gcc
Version: 11.1.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: metalcaedes at gmail dot com
Target Milestone: ---
Created attachment 50893
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50893&action=edit
simple test case (same as on compiler explorer)
I'm having problems with a cross-product function returning (slightly) wrong
results when built with -O2 (or above) and with FMA enabled (like
-march=znver1) - more specifically, -fexpensive-optimizations causes the
problem.
-ffast-math or related flags were *not* used.
Values that should be exactly 0 aren't, because the intermediate of the
multiplication part of vfmsub132ss has "infinite precision" while the
subtracted value (also result of a multiplication) has single-precision.
I (or rather Doom3) has a 3D Vector class called idVec3 which has three
(single-precision) float members x, y, z.
The cross product is calculated like this:
idVec3 cross( const idVec3& v1, const idVec3 &v2 ) {
float x = v1.y * v2.z - v1.z * v2.y;
float y = v1.z * v2.x - v1.x * v2.z;
float z = v1.x * v2.y - v1.y * v2.x;
return idVec3(x, y, z);
}
For brevity (and because it was what caused the bug I investigated[1]) I only
looked at the calculation of z, but the same problem should happen with x and
y:
float crossZ(const idVec3& v1, const idVec3& v2) {
float z = v1.x * v2.y - v1.y * v2.x;
return z;
}
So if v1.x * v2.y == v1.y * v2.x (like when v1 == v2) z should be exactly 0.0
However, it's not when FMA and -fexpensive-optimizations are used, because then
that function is generated as:
vmulss xmm1,xmm1,xmm2
vfmsub132ss xmm0,xmm1,xmm3
ret
So the `v1.y * v2.x` part is calculated with a normal multiplication and stored
as single-precision floats.
Then `v1.x * v2.y - other_result` is calculated with vfmsub132ss, which means
that the result of `v1.x * v2.y` is never stored, but only exist as an
"infinite precision intermediate result"[2] from which the other
(single-precision) result of the `vmulss` is subtracted.
This means that, if both multiplication results should *theoretically* be
identical, it returns the rounding error between the result as "infinite
precision" float (double at least?) and single-precision float.
This rounding error fits well into a float because floats have great precisions
near zero - and it can be relatively big: With v1.x = -277.3333129883 and v1.y
= -69.3333282471 (result of multiplications: about 23665.775), crossZ(v1, v1)
returns 0.0002170140.
With -O1 (or -O2 -fno-expensive-optimizations) the generated ASM is more
straightforward and, as expected, does two multiplications and then an
addition:
vmovss xmm0,DWORD PTR [rdi]
vmulss xmm0,xmm0,DWORD PTR [rsi+0x4]
vmovss xmm1,DWORD PTR [rdi+0x4]
vmulss xmm1,xmm1,DWORD PTR [rsi]
vsubss xmm0,xmm0,xmm1
ret
IMHO an optimization that basically causes a*b - b*a to not return exactly 0.0
should only be enabled with dangerous flags like -ffast-math, not with plain
-O2.
Incidentally, it seems like this is what clang does: It only uses vfmsub* in
crossZ() if -ffast-math is set.
Here's a compiler explorer link with a reduced (to .z) testcase, printing the
results of that function when compiled with -O2 vs -O1:
https://gcc.godbolt.org/z/8K3vKh7b3
The problem happens with all GCC versions I tested, including the 11.1 and
"trunk" versions in compiler explorer. I didn't test this, but I wouldn't be
surprised if plain C was also affected (and not just C++).
[1] https://github.com/RobertBeckebans/RBDOOM-3-BFG/issues/436
[2] https://www.felixcloutier.com/x86/vfmsub132ss:vfmsub213ss:vfmsub231ss
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug middle-end/100839] -O2 does dangerous optimizations using FMA (please don't break my cross product)
2021-05-31 1:30 [Bug c++/100839] New: -O2 does dangerous optimizations using FMA (please don't break my cross product) metalcaedes at gmail dot com
@ 2021-05-31 1:39 ` pinskia at gcc dot gnu.org
2021-05-31 1:46 ` metalcaedes at gmail dot com
` (5 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-05-31 1:39 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100839
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |INVALID
Component|c++ |middle-end
Status|UNCONFIRMED |RESOLVED
--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
-ffp-contract=off
https://gcc.gnu.org/onlinedocs/gcc-11.1.0/gcc/Optimize-Options.html#index-ffp-contract
-ffp-contract=style
-ffp-contract=off disables floating-point expression contraction.
-ffp-contract=fast enables floating-point expression contraction such as
forming of fused multiply-add operations if the target has native support for
them. -ffp-contract=on enables floating-point expression contraction if allowed
by the language standard. This is currently not implemented and treated equal
to -ffp-contract=off.
The default is -ffp-contract=fast.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug middle-end/100839] -O2 does dangerous optimizations using FMA (please don't break my cross product)
2021-05-31 1:30 [Bug c++/100839] New: -O2 does dangerous optimizations using FMA (please don't break my cross product) metalcaedes at gmail dot com
2021-05-31 1:39 ` [Bug middle-end/100839] " pinskia at gcc dot gnu.org
@ 2021-05-31 1:46 ` metalcaedes at gmail dot com
2021-05-31 1:51 ` pinskia at gcc dot gnu.org
` (4 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: metalcaedes at gmail dot com @ 2021-05-31 1:46 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100839
Daniel Gibson <metalcaedes at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|INVALID |FIXED
--- Comment #2 from Daniel Gibson <metalcaedes at gmail dot com> ---
Do I understand this correctly that optimized builds by default violate the
standard?
At least that's what the difference between -ffp-contract=fast and the
unimplemented -ffp-contract=on suggests
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug middle-end/100839] -O2 does dangerous optimizations using FMA (please don't break my cross product)
2021-05-31 1:30 [Bug c++/100839] New: -O2 does dangerous optimizations using FMA (please don't break my cross product) metalcaedes at gmail dot com
2021-05-31 1:39 ` [Bug middle-end/100839] " pinskia at gcc dot gnu.org
2021-05-31 1:46 ` metalcaedes at gmail dot com
@ 2021-05-31 1:51 ` pinskia at gcc dot gnu.org
2021-05-31 1:57 ` metalcaedes at gmail dot com
` (3 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-05-31 1:51 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100839
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|FIXED |INVALID
--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Use strict language mode then, e.g. -std=c++11 or -std=c11 etc.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug middle-end/100839] -O2 does dangerous optimizations using FMA (please don't break my cross product)
2021-05-31 1:30 [Bug c++/100839] New: -O2 does dangerous optimizations using FMA (please don't break my cross product) metalcaedes at gmail dot com
` (2 preceding siblings ...)
2021-05-31 1:51 ` pinskia at gcc dot gnu.org
@ 2021-05-31 1:57 ` metalcaedes at gmail dot com
2021-05-31 2:08 ` pinskia at gcc dot gnu.org
` (2 subsequent siblings)
6 siblings, 0 replies; 8+ messages in thread
From: metalcaedes at gmail dot com @ 2021-05-31 1:57 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100839
Daniel Gibson <metalcaedes at gmail dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|INVALID |FIXED
--- Comment #4 from Daniel Gibson <metalcaedes at gmail dot com> ---
Nope.
1. -std=c++11 doesn't make a difference, just tested it on compiler explorer.
2. What's the point of -Ofast ("It also enables optimizations that are not
valid for all standard-compliant programs.") if the normal optimization levels
already violate the standard?
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug middle-end/100839] -O2 does dangerous optimizations using FMA (please don't break my cross product)
2021-05-31 1:30 [Bug c++/100839] New: -O2 does dangerous optimizations using FMA (please don't break my cross product) metalcaedes at gmail dot com
` (3 preceding siblings ...)
2021-05-31 1:57 ` metalcaedes at gmail dot com
@ 2021-05-31 2:08 ` pinskia at gcc dot gnu.org
2021-05-31 3:09 ` metalcaedes at gmail dot com
2021-05-31 3:26 ` pinskia at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-05-31 2:08 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100839
Andrew Pinski <pinskia at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Resolution|FIXED |INVALID
--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Daniel Gibson from comment #4)
> Nope.
>
> 1. -std=c++11 doesn't make a difference, just tested it on compiler explorer.
> 2. What's the point of -Ofast ("It also enables optimizations that are not
> valid for all standard-compliant programs.") if the normal optimization
> levels already violate the standard?
Oh that is because C++ allows it while C does not.
/* ISO C restricts floating-point expression contraction to within
source-language expressions (-ffp-contract=on, currently an alias
for -ffp-contract=off). */
if (flag_iso
&& !c_dialect_cxx ()
&& (global_options_set.x_flag_fp_contract_mode
== (enum fp_contract_mode) 0)
&& flag_unsafe_math_optimizations == 0)
flag_fp_contract_mode = FP_CONTRACT_OFF;
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug middle-end/100839] -O2 does dangerous optimizations using FMA (please don't break my cross product)
2021-05-31 1:30 [Bug c++/100839] New: -O2 does dangerous optimizations using FMA (please don't break my cross product) metalcaedes at gmail dot com
` (4 preceding siblings ...)
2021-05-31 2:08 ` pinskia at gcc dot gnu.org
@ 2021-05-31 3:09 ` metalcaedes at gmail dot com
2021-05-31 3:26 ` pinskia at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: metalcaedes at gmail dot com @ 2021-05-31 3:09 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100839
--- Comment #6 from Daniel Gibson <metalcaedes at gmail dot com> ---
Are you sure about this?
I couldn't find much about floats (and nothing about legal transformations) in
the C++ standard, and if it's in IEEE-754 it should be the same for C and C++,
right?
^ permalink raw reply [flat|nested] 8+ messages in thread
* [Bug middle-end/100839] -O2 does dangerous optimizations using FMA (please don't break my cross product)
2021-05-31 1:30 [Bug c++/100839] New: -O2 does dangerous optimizations using FMA (please don't break my cross product) metalcaedes at gmail dot com
` (5 preceding siblings ...)
2021-05-31 3:09 ` metalcaedes at gmail dot com
@ 2021-05-31 3:26 ` pinskia at gcc dot gnu.org
6 siblings, 0 replies; 8+ messages in thread
From: pinskia at gcc dot gnu.org @ 2021-05-31 3:26 UTC (permalink / raw)
To: gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100839
--- Comment #7 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Daniel Gibson from comment #6)
> Are you sure about this?
> I couldn't find much about floats (and nothing about legal transformations)
> in the C++ standard, and if it's in IEEE-754 it should be the same for C and
> C++, right?
C is just different here from C++ :).
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2021-05-31 3:26 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-31 1:30 [Bug c++/100839] New: -O2 does dangerous optimizations using FMA (please don't break my cross product) metalcaedes at gmail dot com
2021-05-31 1:39 ` [Bug middle-end/100839] " pinskia at gcc dot gnu.org
2021-05-31 1:46 ` metalcaedes at gmail dot com
2021-05-31 1:51 ` pinskia at gcc dot gnu.org
2021-05-31 1:57 ` metalcaedes at gmail dot com
2021-05-31 2:08 ` pinskia at gcc dot gnu.org
2021-05-31 3:09 ` metalcaedes at gmail dot com
2021-05-31 3:26 ` pinskia at gcc dot gnu.org
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).