[Bug c/115374] New: fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN

public inbox for gcc-bugs@sourceware.org

* [Bug c/115374] New: fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
@ 2024-06-06 19:01 k3x-devel at outlook dot com
  2024-06-06 19:04 ` [Bug c/115374] " pinskia at gcc dot gnu.org
                   ` (12 more replies)
  0 siblings, 13 replies; 14+ messages in thread
From: k3x-devel at outlook dot com @ 2024-06-06 19:01 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

            Bug ID: 115374
           Summary: fmod() in x86_64 -O3 not using return value from the
                    glibc's implementation if x87 FPU fprem returns NaN
           Product: gcc
           Version: 14.1.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: k3x-devel at outlook dot com
  Target Milestone: ---

Created attachment 58371
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58371&action=edit
code to reproduce the problem, compile with -O3

I believe I have found a minor bug in GCC when -O3 optimization is enabled. Due
to the complexity of the GCC codebase, I was unable to check the relevant parts,
but I have a reproducible example and commented assembly to illustrate the
problem.

I was able to reproduce the same behavior on Arch Linux GCC 14.1.1 20240522 and
Gentoo's GCC (Gentoo 13.2.1_p20240210 p14) 13.2.1 20240210.

Apparently, on the x86_64 target with -O3 optimization enabled, GCC tries to
use the x87 FPU for the fmod implementation. But if the resulting number is
NaN, it seems to fall back to glibc's implementation of fmod. The problem is
that it looks like it never uses the value returned from the call and instead
reuses the NaN from the previous FPU operation.

The NaN in the FPU can happen if some MMX instructions were used previously,
filling the FPU stack, without an EMMS instruction to bring the FPU back into a
usable state. This is how I found the bug. In such a case GCC could fall back
to glibc's implementation and actually use the resulting value, which was a
valid non-NaN number.
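
To illustrate the MMX/x87 interaction described above, here is a minimal
sketch, assuming the MMX intrinsics from <mmintrin.h>. The helper name
packed_add_then_scale is hypothetical and is not taken from FFmpeg or from the
attachment. The MMX registers alias the x87 register stack, so _mm_empty()
(EMMS) has to run before any x87 arithmetic, here the long double multiply.

```c
#if defined(__MMX__)
#include <mmintrin.h>  /* MMX intrinsics (x86 only) */
#endif

/* Hypothetical helper: add two 16-bit values in an MMX register, then use the
   sum in x87 arithmetic (long double is the 80-bit x87 type on x86_64 Linux). */
long double packed_add_then_scale(short a, short b, long double scale)
{
#if defined(__MMX__)
    __m64 va  = _mm_set1_pi16(a);      /* broadcast a into four 16-bit lanes */
    __m64 vb  = _mm_set1_pi16(b);      /* broadcast b into four 16-bit lanes */
    __m64 sum = _mm_add_pi16(va, vb);  /* lane-wise 16-bit add */
    short lane0 = (short)(_mm_cvtsi64_si32(sum) & 0xFFFF);  /* extract lane 0 */

    _mm_empty();  /* EMMS: mark all x87 registers as empty again */
#else
    short lane0 = (short)(a + b);  /* portable fallback so the sketch builds elsewhere */
#endif
    return (long double)lane0 * scale;  /* x87 arithmetic, after EMMS */
}
```

Leaving out the _mm_empty() call is the kind of omission that can leave the x87
stack marked full, so that later x87 loads produce NaN.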

The commented assembly illustrating the problem:

00000000000011c0 <do_fmod>:
    11c0:       48 83 ec 28             sub    $0x28,%rsp
    11c4:       f2 0f 11 44 24 08       movsd  %xmm0,0x8(%rsp)
    11ca:       f2 0f 11 4c 24 10       movsd  %xmm1,0x10(%rsp)
    11d0:       dd 44 24 10             fldl   0x10(%rsp)           # load numbers from stack to FPU
    11d4:       dd 44 24 08             fldl   0x8(%rsp)
    11d8:       d9 f8                   fprem                       # do fp partial remainder
    11da:       df e0                   fnstsw %ax                  # read FPU SW into AX
    11dc:       f6 c4 04                test   $0x4,%ah             # test if C2 (incomplete reduction) is set
    11df:       75 f7                   jne    11d8 <do_fmod+0x18>  # jump to fprem again if C2 was set (not taken)
    11e1:       dd d9                   fstp   %st(1)               # copy st0 over st1 and pop (drops the old st1)
    11e3:       dd 5c 24 18             fstpl  0x18(%rsp)           # store st0 to the stack slot and pop (stores NaN due to FPU stack fault and IE)
    11e7:       f2 0f 10 54 24 18       movsd  0x18(%rsp),%xmm2     # copy result to %xmm2
    11ed:       66 0f 2e d2             ucomisd %xmm2,%xmm2         # check if %xmm2 holds NaN
    11f1:       7a 09                   jp     11fc <do_fmod+0x3c>  # jump if so (PF set, taken), jumps to A
    11f3:       66 0f 28 c2             movapd %xmm2,%xmm0          # B: copy %xmm2 to %xmm0
    11f7:       48 83 c4 28             add    $0x28,%rsp
    11fb:       c3                      ret                         # return from the procedure with NaN from %xmm0 as the result!
    11fc:       f2 0f 10 4c 24 10       movsd  0x10(%rsp),%xmm1     # A: load original fmod args into xmm0 and xmm1
    1202:       f2 0f 10 44 24 08       movsd  0x8(%rsp),%xmm0
    1208:       e8 33 fe ff ff          call   1040 <fmod@plt>      # call libc fmod
    120d:       f2 0f 10 54 24 18       movsd  0x18(%rsp),%xmm2     # <== GCC BUG? Ignores libc fmod result and copies previous NaN to xmm2
    1213:       eb de                   jmp    11f3 <do_fmod+0x33>  # jumps to B
    1215:       66 66 2e 0f 1f 84 00    data16 cs nopw 0x0(%rax,%rax,1)
    121c:       00 00 00 00
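
The test of $0x4 against %ah works because fnstsw stores the 16-bit x87 status
word in AX and AH is its upper byte, so bit 2 of AH is bit 10 of the status
word, which is C2. A minimal sketch of that decoding in C follows. The function
and macro names are hypothetical; the bit positions are those of the x87 status
word (IE = bit 0, SF = bit 6, C2 = bit 10).

```c
#include <stdint.h>

#define X87_SW_IE  (1u << 0)   /* invalid-operation exception flag */
#define X87_SW_SF  (1u << 6)   /* stack fault flag */
#define X87_SW_C2  (1u << 10)  /* condition code C2 */

/* Nonzero if fprem reported an incomplete reduction and must be repeated.
   Same check as `test $0x4,%ah` on the value stored by fnstsw. */
int fprem_incomplete(uint16_t status_word)
{
    return (status_word & X87_SW_C2) != 0;
}

/* Nonzero if the status word shows an invalid operation caused by a
   stack fault, the condition the listing's comments attribute the NaN to. */
int x87_stack_fault(uint16_t status_word)
{
    unsigned mask = X87_SW_IE | X87_SW_SF;
    return (status_word & mask) == mask;
}
```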

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: pinskia at gcc dot gnu.org @ 2024-06-06 19:04 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

Andrew Pinski <pinskia at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2024-06-06
             Status|UNCONFIRMED                 |WAITING
     Ever confirmed|0                           |1

--- Comment #1 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
Your example code is undefined due to the way it changes the stack without
mentioning it to the compiler.

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: pinskia at gcc dot gnu.org @ 2024-06-06 19:06 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

--- Comment #2 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
And for MMX code, you can't combine it with x87 floating-point code in a nice
way without using EMMS.

So it might be a bug in the code you are testing with.

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: k3x-devel at outlook dot com @ 2024-06-06 19:13 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

Mario Hros <k3x-devel at outlook dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #58371|0                           |1
        is obsolete|                            |

--- Comment #3 from Mario Hros <k3x-devel at outlook dot com> ---
Created attachment 58372
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58372&action=edit
reproduction code, compile with -O3 -lm

Prints "is NaN and should not be"

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: k3x-devel at outlook dot com @ 2024-06-06 19:15 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

--- Comment #4 from Mario Hros <k3x-devel at outlook dot com> ---
I already reported the problem in the code using MMX without EMMS - it was
FFMPEG
(https://patchwork.ffmpeg.org/project/ffmpeg/patch/AM9P193MB194004554ECF79F43679CFB2AFF92@AM9P193MB1940.EURP193.PROD.OUTLOOK.COM/).

This bug report is about the strange behavior where GCC doesn't seem to be
using result of the fmod@plt at all. Or am I missing something?

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: pinskia at gcc dot gnu.org @ 2024-06-06 19:18 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

--- Comment #5 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
The inline-asm is still broken

https://gcc.gnu.org/onlinedocs/gcc-14.1.0/gcc/Extended-Asm.html#x86-Floating-Point-asm-Operands
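
The attachment is not reproduced in this thread, so the following is only a
general sketch of the operand rules the linked page describes, not a fix for
the reporter's code. x87 operands are passed with the "t" (top of stack)
constraint so that GCC can track the register stack across the asm. The
function name x87_fabs is hypothetical.

```c
/* Absolute value via the x87 `fabs` instruction.
   "=t" says the result is left in st(0); "0" says the input arrives in that
   same register, so the asm neither pushes nor pops the x87 stack. */
double x87_fabs(double x)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    double result;
    __asm__ ("fabs" : "=t" (result) : "0" (x));
    return result;
#else
    return x < 0 ? -x : x;  /* portable fallback so the sketch builds elsewhere */
#endif
}
```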

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: pinskia at gcc dot gnu.org @ 2024-06-06 19:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

--- Comment #6 from Andrew Pinski <pinskia at gcc dot gnu.org> ---
(In reply to Mario Hros from comment #4)
> 
> This bug report is about the strange behavior where GCC doesn't seem to be
> using result of the fmod@plt at all. Or am I missing something?

Yes, you are missing that the fmod function will set errno on error. If you
don't want to depend on that behavior, use -fno-math-errno.

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: k3x-devel at outlook dot com @ 2024-06-06 19:28 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

--- Comment #7 from Mario Hros <k3x-devel at outlook dot com> ---
Created attachment 58373
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=58373&action=edit
reproduction code, compile with -O3 -lm

prints:

is NaN and should not be

errno = 0

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: ubizjak at gmail dot com @ 2024-06-06 19:32 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

--- Comment #8 from Uroš Bizjak <ubizjak at gmail dot com> ---
When compiling the following testcase:

--cut here--
#include <math.h>

double
__attribute__((noinline))
do_fmod (double x, double y)
{
  return fmod(x, y);
}
--cut here--

one can find in _.265t.optimized:

__attribute__((noinline))
double do_fmod (double x, double y)
{
  double _5;

  <bb 2> [local count: 1073741824]:
  _5 = .FMOD (x_2(D), y_3(D));
  if (_5 == _5)
    goto <bb 4>; [99.95%]
  else
    goto <bb 3>; [0.05%]

  <bb 3> [local count: 536864]:
  fmod (x_2(D), y_3(D));

  <bb 4> [local count: 1073741824]:
  return _5;

}
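
Read as C, the dump above amounts to the following sketch. The function-pointer
parameters and the two stand-in functions are hypothetical, introduced only so
the control flow can be exercised without the x87 unit: inline_fmod plays the
role of .FMOD and library_fmod the role of the glibc call.

```c
#include <math.h>   /* NAN macro only; no libm function is called */

typedef double (*fmod_fn)(double, double);

/* Mirrors the optimized GIMPLE block by block. */
double do_fmod_model(fmod_fn inline_fmod, fmod_fn library_fmod,
                     double x, double y)
{
    double r = inline_fmod(x, y);   /* bb 2: _5 = .FMOD (x, y) */
    if (!(r == r))                  /* bb 2: _5 == _5 is false only for NaN */
        (void)library_fmod(x, y);   /* bb 3: called for errno, value discarded */
    return r;                       /* bb 4: return _5 */
}

/* Stand-in for an inline expansion that yields NaN, as in the report. */
double nan_inline_fmod(double x, double y)
{
    (void)x;
    (void)y;
    return NAN;
}

/* Stand-in for the library routine: counts its calls and returns a finite
   value (truncating remainder, adequate for small operands only). */
int library_fmod_calls = 0;

double counting_library_fmod(double x, double y)
{
    library_fmod_calls++;
    return x - y * (double)(long long)(x / y);
}
```

Calling do_fmod_model(nan_inline_fmod, counting_library_fmod, 7.5, 2.0) returns
NaN even though the stand-in library routine ran once and computed 1.5, which
is the behavior the report describes.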

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: rguenth at gcc dot gnu.org @ 2024-06-07  6:09 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

--- Comment #9 from Richard Biener <rguenth at gcc dot gnu.org> ---
Yep, it's the call DCE pass, which elides the errno-setting function call iff
the result is not NaN.

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: k3x-devel at outlook dot com @ 2024-06-07 15:20 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

--- Comment #10 from Mario Hros <k3x-devel at outlook dot com> ---
That _.265t.optimized posted matches my observation. So the call into glibc
fmod() is made to set errno eventually, ok. But shouldn't the returned value
from the glibc call be used instead of returning NaN? I am not the one to
decide. If that works as expected, then it is not a bug and this issue can be
closed.

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: jakub at gcc dot gnu.org @ 2024-06-07 15:23 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #11 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to Mario Hros from comment #10)
> That _.265t.optimized posted matches my observation. So the call into glibc
> fmod() is made to set errno eventually, ok. But shouldn't the returned value
> from the glibc call be used instead of returning NaN?

Why?  We know the result should be NaN and we have a NaN from the inline fmod
expansion.  The glibc call is purely to set errno.

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: k3x-devel at outlook dot com @ 2024-06-07 15:30 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

Mario Hros <k3x-devel at outlook dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |WONTFIX
             Status|WAITING                     |RESOLVED

--- Comment #12 from Mario Hros <k3x-devel at outlook dot com> ---
Because fmod() does not work when the FPU is in an invalid operation state,
despite the fact that it could if glibc's implementation had been used as a
"fallback". But maybe it's not worth it. I'll mark this as resolved. I am sorry
for bothering you.

* [Bug c/115374] fmod() in x86_64 -O3 not using return value from the glibc's implementation if x87 FPU fprem returns NaN
From: jakub at gcc dot gnu.org @ 2024-06-07 15:34 UTC (permalink / raw)
  To: gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115374

--- Comment #13 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
The bug is mixing MMX with floating point and expecting it to work. It will
never work properly: you need a manual emms in between or, better yet, avoid
MMX altogether; it really isn't worth it. Just use SSE and higher.
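
A minimal sketch of the SSE alternative suggested here, assuming the SSE2
intrinsics from <emmintrin.h>; the function name packed_add_sse2 is
hypothetical. SSE2 integer operations use the XMM registers, which do not alias
the x87 stack, so no EMMS is needed before floating-point code.

```c
#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics (x86 only) */
#endif

/* Hypothetical helper: lane-wise 16-bit add in an XMM register. */
short packed_add_sse2(short a, short b)
{
#if defined(__SSE2__)
    __m128i va  = _mm_set1_epi16(a);      /* broadcast a into eight 16-bit lanes */
    __m128i vb  = _mm_set1_epi16(b);      /* broadcast b into eight 16-bit lanes */
    __m128i sum = _mm_add_epi16(va, vb);  /* lane-wise 16-bit add */
    return (short)(_mm_cvtsi128_si32(sum) & 0xFFFF);  /* extract lane 0 */
#else
    return (short)(a + b);  /* portable fallback so the sketch builds elsewhere */
#endif
}
```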
