public inbox for gcc-bugs@sourceware.org
help / color / mirror / Atom feed
* [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le
@ 2022-05-04 13:15 sebastian-gcc at sipsolutions dot net
  2022-05-04 14:21 ` [Bug target/105480] " sebastian-gcc at sipsolutions dot net
                   ` (11 more replies)
  0 siblings, 12 replies; 13+ messages in thread
From: sebastian-gcc at sipsolutions dot net @ 2022-05-04 13:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

            Bug ID: 105480
           Summary: Vectorized `isnan` appears to trigger FPE on ppc64le
           Product: gcc
           Version: 11.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: sebastian-gcc at sipsolutions dot net
  Target Milestone: ---

Compiling for ppc64le with `-O3` seems to use the `xvcmpgedp` instruction which
ends up setting the FPE flags.  I have created an example in godbolt:

https://godbolt.org/z/n8PfPPP1d

Clang will produce similar code (using the same instruction), but when enabling
`-ftrapping-math` (as they always want...) will switch to `vcmpgtud`.

I think the same issue happens with `==` and `!=` which (IIRC) should not trap.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
@ 2022-05-04 14:21 ` sebastian-gcc at sipsolutions dot net
  2022-05-05  8:14 ` sebastian-gcc at sipsolutions dot net
                   ` (10 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: sebastian-gcc at sipsolutions dot net @ 2022-05-04 14:21 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

--- Comment #1 from seberg <sebastian-gcc at sipsolutions dot net> ---
Actually, I think I likely misanalyzed, the actual code in question where I
found this was doing something like:


while (n--) {
     if (isnan(*input)) {
         *out = 1;
     }
     else {
         *out = (long)(*input)
     }
}

So probably it is rather the branch that gets optimized away.  I will try to
post a better example.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
  2022-05-04 14:21 ` [Bug target/105480] " sebastian-gcc at sipsolutions dot net
@ 2022-05-05  8:14 ` sebastian-gcc at sipsolutions dot net
  2022-11-02 20:00 ` pinskia at gcc dot gnu.org
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: sebastian-gcc at sipsolutions dot net @ 2022-05-05  8:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

--- Comment #2 from seberg <sebastian-gcc at sipsolutions dot net> ---
I still owed the correct example: https://godbolt.org/z/33Pj6xvPr

Now I think the cause is the indeed (somewhat understandandable) desire to
optimize away the branching.  And not the instruction (I am not sure how the
comparison instruction deals with trapping, so it could be both).

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
  2022-05-04 14:21 ` [Bug target/105480] " sebastian-gcc at sipsolutions dot net
  2022-05-05  8:14 ` sebastian-gcc at sipsolutions dot net
@ 2022-11-02 20:00 ` pinskia at gcc dot gnu.org
  2022-11-02 20:10 ` pinskia at gcc dot gnu.org
                   ` (8 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-11-02 20:00 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

--- Comment #3 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Created attachment 53821
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=53821&action=edit
testcase -O3

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
                   ` (2 preceding siblings ...)
  2022-11-02 20:00 ` pinskia at gcc dot gnu.org
@ 2022-11-02 20:10 ` pinskia at gcc dot gnu.org
  2022-11-08  7:14 ` linkw at gcc dot gnu.org
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: pinskia at gcc dot gnu.org @ 2022-11-02 20:10 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Target|powerpc64le-linux           |powerpc64*-linux-*

--- Comment #4 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
During expand:
(insn 20 19 21 (set (reg:V2DF 149)
        (unordered:V2DF (reg:V2DF 128 [ vect__7.7 ])
            (reg:V2DF 128 [ vect__7.7 ]))) -1
     (nil))

Which is correct.

after combine:
(insn 26 21 27 3 (set (reg:V2DF 154)
        (unordered:V2DF (reg:V2DF 144 [ vect__7.8 ])
            (reg:V2DF 144 [ vect__7.8 ]))) 1138 {vector_unorderedv2df}
     (expr_list:REG_DEAD (reg:V2DF 144 [ vect__7.8 ])
        (nil)))

So basically the splitter here goes wrong:
; For ltgt/uneq/ordered/unordered:
; ltgt: gt(a,b) | gt(b,a)
; uneq: ~(gt(a,b) | gt(b,a))
; ordered: ge(a,b) | ge(b,a)
; unordered: ~(ge(a,b) | ge(b,a))
(define_insn_and_split

Someone else needs to look into making sure these are valid for trapping math
case ...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
                   ` (3 preceding siblings ...)
  2022-11-02 20:10 ` pinskia at gcc dot gnu.org
@ 2022-11-08  7:14 ` linkw at gcc dot gnu.org
  2022-11-09  8:31 ` linkw at gcc dot gnu.org
                   ` (6 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-11-08  7:14 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org      |linkw at gcc dot gnu.org
     Ever confirmed|0                           |1
   Last reconfirmed|                            |2022-11-08
             Status|UNCONFIRMED                 |NEW
                 CC|                            |linkw at gcc dot gnu.org

--- Comment #5 from Kewen Lin <linkw at gcc dot gnu.org> ---
Confirmed, the vectorized version should not trap which is inconsistent with
scalar version, btw it's better to initialize some array members as explicit
NAN to make the trap reproducible all the time.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
                   ` (4 preceding siblings ...)
  2022-11-08  7:14 ` linkw at gcc dot gnu.org
@ 2022-11-09  8:31 ` linkw at gcc dot gnu.org
  2022-11-09  8:48 ` rguenth at gcc dot gnu.org
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-11-09  8:31 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

Kewen Lin <linkw at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bergner at gcc dot gnu.org,
                   |                            |jsm28 at gcc dot gnu.org,
                   |                            |rguenth at gcc dot gnu.org,
                   |                            |rsandifo at gcc dot gnu.org,
                   |                            |segher at gcc dot gnu.org

--- Comment #6 from Kewen Lin <linkw at gcc dot gnu.org> ---
It's certainly an issue reported here, for one complete fix I did some more
investigation how the option -ftrapping-math affects things, from what LLVM
generates, it looks to me that:
  1) with -ftrapping-math, we can assume fp operations can raise exceptions
(user visible as doc said), here __builtin_isnan should not raise exception
even for sNaN, so it uses vector int instructions instead.
  2) while with -fno-trapping-math, we can assume fp operations can't raise
exceptions, and as doc said "This option requires that -fno-signaling-nans be
in effect", so there is no sNaN at all, safe to use vector fp instructions
which don't trap for qNaN.

It's concluded that __builtin_isnan is supposed not to trap even if the given
value is sNaN.

But with one simple scalar version of case with __builtin_isnan, I don't see
GCC honor -ftrapping-math from the code generated on both Power and aarch64:

----
#define _GNU_SOURCE
#include "math.h"

int func(double x)
{
  return __builtin_isnan (x);
}
----

w/ -ftrapping-math or -fno-trapping-math, it gets the same insns like:

aarch64:
        fcmp    d0, d0
        cset    w0, vs
        ret

ppc64le:
        fcmpu 0,1,1
        mfcr 3,128
        rlwinm 3,3,4,1

Both fcmpu and fcmp would trap for sNaN, is it expected with the current GCC
implementation?

Tested with compiler explorer, I saw LLVM generates insns without fcmpu on
Power, like:

        mffprd  4, 1
        li 3, 2047
        rldic 3, 3, 52, 1
        clrldi  4, 4, 1
        subc    4, 3, 4
        subfe 3, 3, 3
        neg 3, 3
        blr

though I did still see LLVM uses fcmp on aarch64 (maybe a warning saying
"overriding currently unsupported use of floating point exceptions on this
target" can explain it).

Another question in mind is that I saw GCC lowed __builtin_isnan with tree code
UNORDERED_EXPR, does it mean that UNORDERED_EXPR has the same semantic as 
what's concluded for __builtin_isnan above (that is not to trap for both qNaN
and sNaN)? It seems internal documentation doesn't say much on this.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
                   ` (5 preceding siblings ...)
  2022-11-09  8:31 ` linkw at gcc dot gnu.org
@ 2022-11-09  8:48 ` rguenth at gcc dot gnu.org
  2022-11-09  8:59 ` linkw at gcc dot gnu.org
                   ` (4 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-11-09  8:48 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

--- Comment #7 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #6)
> It's certainly an issue reported here, for one complete fix I did some more
> investigation how the option -ftrapping-math affects things, from what LLVM
> generates, it looks to me that:
>   1) with -ftrapping-math, we can assume fp operations can raise exceptions
> (user visible as doc said), here __builtin_isnan should not raise exception
> even for sNaN, so it uses vector int instructions instead.
>   2) while with -fno-trapping-math, we can assume fp operations can't raise
> exceptions, and as doc said "This option requires that -fno-signaling-nans
> be in effect", so there is no sNaN at all, safe to use vector fp
> instructions which don't trap for qNaN.
> 
> It's concluded that __builtin_isnan is supposed not to trap even if the
> given value is sNaN.
> 
> But with one simple scalar version of case with __builtin_isnan, I don't see
> GCC honor -ftrapping-math from the code generated on both Power and aarch64:
> 
> ----
> #define _GNU_SOURCE
> #include "math.h"
> 
> int func(double x)
> {
>   return __builtin_isnan (x);
> }
> ----
> 
> w/ -ftrapping-math or -fno-trapping-math, it gets the same insns like:
> 
> aarch64:
>         fcmp    d0, d0
>         cset    w0, vs
>         ret
> 
> ppc64le:
>         fcmpu 0,1,1
>         mfcr 3,128
>         rlwinm 3,3,4,1
> 
> Both fcmpu and fcmp would trap for sNaN, is it expected with the current GCC
> implementation?

But the key is -fsignalling-nans (default off)

> Tested with compiler explorer, I saw LLVM generates insns without fcmpu on
> Power, like:
> 
>         mffprd  4, 1
>         li 3, 2047
>         rldic 3, 3, 52, 1
>         clrldi  4, 4, 1
>         subc    4, 3, 4
>         subfe 3, 3, 3
>         neg 3, 3
>         blr
> 
> though I did still see LLVM uses fcmp on aarch64 (maybe a warning saying
> "overriding currently unsupported use of floating point exceptions on this
> target" can explain it).
> 
> Another question in mind is that I saw GCC lowed __builtin_isnan with tree
> code UNORDERED_EXPR, does it mean that UNORDERED_EXPR has the same semantic
> as  what's concluded for __builtin_isnan above (that is not to trap for both
> qNaN and sNaN)? It seems internal documentation doesn't say much on this.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
                   ` (6 preceding siblings ...)
  2022-11-09  8:48 ` rguenth at gcc dot gnu.org
@ 2022-11-09  8:59 ` linkw at gcc dot gnu.org
  2022-11-09 13:08 ` rguenth at gcc dot gnu.org
                   ` (3 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-11-09  8:59 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

--- Comment #8 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #7)
> (In reply to Kewen Lin from comment #6)
> > Both fcmpu and fcmp would trap for sNaN, is it expected with the current GCC
> > implementation?
> 
> But the key is -fsignalling-nans (default off)
> 

Thanks for the hint, but the behaviors don't change with one more explicit
option -fsignalling-nans (before or after the option for trapping-math). I saw
the option description saying "This option is experimental and does not
currently guarantee to disable all GCC optimizations that affect signaling NaN
behavior.", does it mean target codes don't honor it much so far?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
                   ` (7 preceding siblings ...)
  2022-11-09  8:59 ` linkw at gcc dot gnu.org
@ 2022-11-09 13:08 ` rguenth at gcc dot gnu.org
  2022-11-09 17:30 ` joseph at codesourcery dot com
                   ` (2 subsequent siblings)
  11 siblings, 0 replies; 13+ messages in thread
From: rguenth at gcc dot gnu.org @ 2022-11-09 13:08 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Kewen Lin from comment #8)
> (In reply to Richard Biener from comment #7)
> > (In reply to Kewen Lin from comment #6)
> > > Both fcmpu and fcmp would trap for sNaN, is it expected with the current GCC
> > > implementation?
> > 
> > But the key is -fsignalling-nans (default off)
> > 
> 
> Thanks for the hint, but the behaviors don't change with one more explicit
> option -fsignalling-nans (before or after the option for trapping-math). I
> saw the option description saying "This option is experimental and does not
> currently guarantee to disable all GCC optimizations that affect signaling
> NaN behavior.", does it mean target codes don't honor it much so far?

That could be very well the case.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
                   ` (8 preceding siblings ...)
  2022-11-09 13:08 ` rguenth at gcc dot gnu.org
@ 2022-11-09 17:30 ` joseph at codesourcery dot com
  2022-11-10  5:58 ` linkw at gcc dot gnu.org
  2022-11-10 23:42 ` joseph at codesourcery dot com
  11 siblings, 0 replies; 13+ messages in thread
From: joseph at codesourcery dot com @ 2022-11-09 17:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

--- Comment #10 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
For scalar isnan see bug 66462.  (The claim in bug 66462 comment 19 that 
there was ever a working version of that patch ready to go in is 
incorrect: November 2016 is older than June 2017.)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
                   ` (9 preceding siblings ...)
  2022-11-09 17:30 ` joseph at codesourcery dot com
@ 2022-11-10  5:58 ` linkw at gcc dot gnu.org
  2022-11-10 23:42 ` joseph at codesourcery dot com
  11 siblings, 0 replies; 13+ messages in thread
From: linkw at gcc dot gnu.org @ 2022-11-10  5:58 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

--- Comment #11 from Kewen Lin <linkw at gcc dot gnu.org> ---
(In reply to joseph@codesourcery.com from comment #10)
> For scalar isnan see bug 66462.  (The claim in bug 66462 comment 19 that 
> there was ever a working version of that patch ready to go in is 
> incorrect: November 2016 is older than June 2017.)

Thanks for the pointer! So it's clear that the function should not set the
invalid flag for either NaN. I wonder if we should expect tree code
UNORDERED_EXPR and rtx code UNORDERED have the same semantic (will not set
invalid) since it's lowered to them accordingly. If yes, the current optimized
dump looks fine, otherwise it's undefined for tree and rtx code? I guess it
means we should lower it to something else which isn't expected to raise
exceptions instead?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [Bug target/105480] Vectorized `isnan` appears to trigger FPE on ppc64le
  2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
                   ` (10 preceding siblings ...)
  2022-11-10  5:58 ` linkw at gcc dot gnu.org
@ 2022-11-10 23:42 ` joseph at codesourcery dot com
  11 siblings, 0 replies; 13+ messages in thread
From: joseph at codesourcery dot com @ 2022-11-10 23:42 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105480

--- Comment #12 from joseph at codesourcery dot com <joseph at codesourcery dot com> ---
__builtin_isnan must not raise "invalid" for signaling NaN arguments.

__builtin_isunordered (i.e. UNORDERED / UNORDERED_EXPR; standard macro 
isunordered) must raise "invalid" for signaling NaN arguments.

The -ftrapping-math option (which is on by default) means code 
transformations that either add or remove exceptions should be avoided 
(though this isn't implemented very consistently, especially as regards 
transformations that remove exceptions).

Thus, transforming in either direction between __builtin_isnan and 
UNORDERED_EXPR is undesirable given -ftrapping-math -fsignaling-nans.  
Given -fno-trapping-math or (default) -fno-signaling-nans, the 
transformation is OK.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-11-10 23:42 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-05-04 13:15 [Bug target/105480] New: Vectorized `isnan` appears to trigger FPE on ppc64le sebastian-gcc at sipsolutions dot net
2022-05-04 14:21 ` [Bug target/105480] " sebastian-gcc at sipsolutions dot net
2022-05-05  8:14 ` sebastian-gcc at sipsolutions dot net
2022-11-02 20:00 ` pinskia at gcc dot gnu.org
2022-11-02 20:10 ` pinskia at gcc dot gnu.org
2022-11-08  7:14 ` linkw at gcc dot gnu.org
2022-11-09  8:31 ` linkw at gcc dot gnu.org
2022-11-09  8:48 ` rguenth at gcc dot gnu.org
2022-11-09  8:59 ` linkw at gcc dot gnu.org
2022-11-09 13:08 ` rguenth at gcc dot gnu.org
2022-11-09 17:30 ` joseph at codesourcery dot com
2022-11-10  5:58 ` linkw at gcc dot gnu.org
2022-11-10 23:42 ` joseph at codesourcery dot com

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).