public inbox for glibc-cvs@sourceware.org
* [glibc/maskray/unnest] Fix sysdeps/x86/fpu/s_ffma.c for 32-bit FMA processor case
From: Fangrui Song @ 2021-09-24 20:12 UTC
  To: glibc-cvs

https://sourceware.org/git/gitweb.cgi?p=glibc.git;h=b26901b26e0b0b61a151ff18e53bee84d977ef7c

commit b26901b26e0b0b61a151ff18e53bee84d977ef7c
Author: Joseph Myers <joseph@codesourcery.com>
Date:   Fri Sep 24 17:59:22 2021 +0000

    Fix sysdeps/x86/fpu/s_ffma.c for 32-bit FMA processor case
    
    It turns out the __SSE2_MATH__ conditional in sysdeps/x86/fpu/s_ffma.c
    does not cover all cases where the x86 fenv_private.h macros might
    manipulate one of the SSE and 387 floating-point states, while the
    actual fma implementation uses the other.  Specifically, in the 32-bit
    case, with a compiler not defaulting to -mfpmath=sse, but testing on a
    processor with hardware FMA support, the multiarch fma function
    implementations will end up using SSE, while the fenv_private.h macros
    will use the 387 state for double.  Change the conditional to use the
    default macros rather than the optimized ones in all cases except when
    the compiler inlines an fma instruction (in which case, since all
    those instructions are SSE instructions and -mfpmath=sse must be in
    effect for them to be inlined, the optimized macros will only use the
    SSE state and it's OK for them to only use the SSE state).
    
    Tested for x86_64 and x86.  H.J. reports in
    <https://sourceware.org/pipermail/libc-alpha/2021-September/131367.html>
    that it fixes the problems he observed.
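
To make the change of conditional concrete, the following minimal sketch evaluates the old and the new preprocessor condition side by side. The two function names are hypothetical and exist only for illustration; only the macros __SSE2_MATH__ and __FP_FAST_FMA come from the commit. The old condition implies the new one, so the new code falls back to the default macros in every configuration the old code did, plus the 32-bit non-SSE-math case described above.

```c
#include <stdbool.h>

/* Hypothetical helper: true when the pre-commit conditional would have
   selected the default (SSE and 387) fenv macros.  */
bool old_condition_uses_default_macros (void)
{
#if defined __SSE2_MATH__ && !defined __FP_FAST_FMA
  return true;
#else
  return false;
#endif
}

/* Hypothetical helper: true when the post-commit conditional selects
   the default fenv macros.  */
bool new_condition_uses_default_macros (void)
{
#ifndef __FP_FAST_FMA
  return true;
#else
  return false;
#endif
}
```

Building this with, for example, -m32 and no -mfpmath=sse should make the two functions disagree (old false, new true), which is the case the commit addresses; the exact result depends on the compiler's defaults.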

Diff:
---
 sysdeps/x86/fpu/s_ffma.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/sysdeps/x86/fpu/s_ffma.c b/sysdeps/x86/fpu/s_ffma.c
index 95c2dcd7b7..da4bb55f9a 100644
--- a/sysdeps/x86/fpu/s_ffma.c
+++ b/sysdeps/x86/fpu/s_ffma.c
@@ -27,10 +27,14 @@
 
 #include <math-narrow.h>
 
-#if defined __SSE2_MATH__ && !defined __FP_FAST_FMA
+#ifndef __FP_FAST_FMA
 /* Depending on the details of the glibc configuration, fma might use
    either SSE or 387 arithmetic; ensure that both parts of the
-   floating-point state are handled in the round-to-odd code.  */
+   floating-point state are handled in the round-to-odd code.  If
+   __FP_FAST_FMA is defined, that implies that the compiler is using
+   SSE floating point and that the fma call will be inlined, so the
+   x86 macros will work with only the SSE state and that is
+   sufficient.  */
 # undef libc_feholdexcept_setround
 # define libc_feholdexcept_setround	default_libc_feholdexcept_setround
 # undef libc_feupdateenv_test

