public inbox for glibc-bugs@sourceware.org help / color / mirror / Atom feed
From: "David dot Warme at Group-W-Inc dot com" <sourceware-bugzilla@sourceware.org> To: glibc-bugs@sources.redhat.com Subject: [Bug math/3451] New: dangerous inlining of floor and ceil on i386. Date: Thu, 02 Nov 2006 22:42:00 -0000 [thread overview] Message-ID: <20061102224237.3451.David.Warme@Group-W-Inc.com> (raw) The implementation of floor() and ceil() contained in sysdeps/i386/fpu/bits/mathinline.h can cause incorrect code to be generated. This problem occurs when the optimizer decides to move surrounding floating- point instructions into the region between the two "fldcw" instructions: fldcw __cwtmp frndint fldcw __cw Whenever this happens, these instructions are executed with the rounding mode forced to "round down" (floor) or "round down" (ceil), instead of the proper rounding mode in effect for the general computation. I have short C and C++ programs that demonstrate the bug. The programs get different answers when compiled with -O2 and without. Compiling with -O2 -S and examining the generated assembly code shows TWO floating-point operations moved into the space between the first fldcw and the frndint instruction. The floating-point constants used in the program are such that the differences result directly from the different rounding mode for these two instructions, not from any 53-bit versus 64-bit mantissa issues. I can send you these example programs if you decide that really need to see gcc doing this. Here is a patch that corrects the problem by generating all three of these instructions in a single "asm" directive so that the optimizer won't move any other instructions into the middle of this sequence. In my humble opinion, the code correctness obtained via the patch outweighs any possible advantages that the original implementation might have had -- for example, opening up the 3-instruction sequence for instruction scheduling, etc. Getting the wrong answer at higher speed still gets you the wrong answer. ;^> *** glibc-2.5/sysdeps/i386/fpu/bits/mathinline.h 2004-09-07 18:23:42.000000000 -0400 --- glibc-2.5.fixed/sysdeps/i386/fpu/bits/mathinline.h 2006-11-02 16:04:43.000000000 -0500 *************** *** 526,559 **** register long double __exm1 = __expm1l (-__fabsl (__x + __x)); \ return __exm1 / (__exm1 + 2.0) * __sgn1l (-__x)) #endif __inline_mathcodeNP (floor, __x, \ register long double __value; \ __volatile unsigned short int __cw; \ __volatile unsigned short int __cwtmp; \ __asm __volatile ("fnstcw %0" : "=m" (__cw)); \ __cwtmp = (__cw & 0xf3ff) | 0x0400; /* rounding down */ \ ! __asm __volatile ("fldcw %0" : : "m" (__cwtmp)); \ ! __asm __volatile ("frndint" : "=t" (__value) : "0" (__x)); \ ! __asm __volatile ("fldcw %0" : : "m" (__cw)); \ return __value) __inline_mathcodeNP (ceil, __x, \ register long double __value; \ __volatile unsigned short int __cw; \ __volatile unsigned short int __cwtmp; \ __asm __volatile ("fnstcw %0" : "=m" (__cw)); \ __cwtmp = (__cw & 0xf3ff) | 0x0800; /* rounding up */ \ ! __asm __volatile ("fldcw %0" : : "m" (__cwtmp)); \ ! __asm __volatile ("frndint" : "=t" (__value) : "0" (__x)); \ ! __asm __volatile ("fldcw %0" : : "m" (__cw)); \ return __value) #ifdef __FAST_MATH__ # define __ldexp_code \ register long double __value; \ __asm __volatile__ \ ("fscale" \ : "=t" (__value) : "0" (__x), "u" ((long double) __y)); \ return __value --- 526,559 ---- register long double __exm1 = __expm1l (-__fabsl (__x + __x)); \ return __exm1 / (__exm1 + 2.0) * __sgn1l (-__x)) #endif __inline_mathcodeNP (floor, __x, \ register long double __value; \ __volatile unsigned short int __cw; \ __volatile unsigned short int __cwtmp; \ __asm __volatile ("fnstcw %0" : "=m" (__cw)); \ __cwtmp = (__cw & 0xf3ff) | 0x0400; /* rounding down */ \ ! __asm __volatile ("fldcw %2\n\tfrndint\n\tfldcw %3" \ ! : "=t" (__value) \ ! : "0" (__x), "m" (__cwtmp), "m" (__cw)); \ return __value) __inline_mathcodeNP (ceil, __x, \ register long double __value; \ __volatile unsigned short int __cw; \ __volatile unsigned short int __cwtmp; \ __asm __volatile ("fnstcw %0" : "=m" (__cw)); \ __cwtmp = (__cw & 0xf3ff) | 0x0800; /* rounding up */ \ ! __asm __volatile ("fldcw %2\n\tfrndint\n\tfldcw %3" \ ! : "=t" (__value) \ ! : "0" (__x), "m" (__cwtmp), "m" (__cw)); \ return __value) #ifdef __FAST_MATH__ # define __ldexp_code \ register long double __value; \ __asm __volatile__ \ ("fscale" \ : "=t" (__value) : "0" (__x), "u" ((long double) __y)); \ return __value -- Summary: dangerous inlining of floor and ceil on i386. Product: glibc Version: 2.4 Status: NEW Severity: normal Priority: P2 Component: math AssignedTo: aj at suse dot de ReportedBy: David dot Warme at Group-W-Inc dot com CC: glibc-bugs at sources dot redhat dot com GCC target triplet: i386-*-* http://sourceware.org/bugzilla/show_bug.cgi?id=3451 ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
next reply other threads:[~2006-11-02 22:42 UTC|newest] Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top 2006-11-02 22:42 David dot Warme at Group-W-Inc dot com [this message] 2006-11-02 22:45 ` [Bug math/3451] " David dot Warme at Group-W-Inc dot com 2006-11-10 17:05 ` drepper at redhat dot com 2007-01-12 15:31 ` cvs-commit at gcc dot gnu dot org
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20061102224237.3451.David.Warme@Group-W-Inc.com \ --to=sourceware-bugzilla@sourceware.org \ --cc=glibc-bugs@sources.redhat.com \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).