From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (qmail 29962 invoked by alias); 1 Mar 2008 18:22:21 -0000
Received: (qmail 29897 invoked by uid 22791); 1 Mar 2008 18:22:20 -0000
X-Spam-Check-By: sourceware.org
Received: from dessent.net (HELO dessent.net) (69.60.119.225)
    by sourceware.org (qpsmtpd/0.31) with ESMTP; Sat, 01 Mar 2008 18:21:50 +0000
Received: from localhost ([127.0.0.1] helo=dessent.net)
    by dessent.net with esmtp (Exim 4.50) id 1JVWLD-0002uQ-W1; Sat, 01 Mar 2008 18:21:48 +0000
Message-ID: <47C99EBA.59B1B90D@dessent.net>
Date: Sat, 01 Mar 2008 18:22:00 -0000
From: Brian Dessent
Reply-To: gcc-help@gcc.gnu.org
X-Mailer: Mozilla 4.79 [en] (Windows NT 5.0; U)
MIME-Version: 1.0
To: CSights
CC: gcc-help@gcc.gnu.org
Subject: Re: binary compiled with -O1 and w/ individual optimization flags are not the same
References: <200802291215.47293.csights@fastmail.fm> <200802291716.22197.csights@fastmail.fm> <47C88B66.A501ABD6@dessent.net> <200803011057.26145.csights@fastmail.fm>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
X-IsSubscribed: yes
Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Archive:
List-Post:
List-Help:
Sender: gcc-help-owner@gcc.gnu.org
X-SW-Source: 2008-03/txt/msg00005.txt.bz2

CSights wrote:

> Also, I didn't mention earlier (did I?) that the program's output when
> compiled on the Macintosh matched at all optimization levels. (O0 == O1 ==
> O2) (Though the output did not match any output from the program compiled on
> linux.) Is this possibly b/c the Mac has sse2 (Core 2 Duo) and able to use
> those instructions which have more meaningful decimal places?

Yes, it's probably using the sse2 unit.

> If this is the problem, what would be a good way of dealing with it?

Well first realize that it's not a problem per se. The results *are*
equivalent in the significant digits that actually represent what a
double can hold.
The only reason they seem different is because there are these extra
bits of precision that result from the value still being in a 387
register. But those bits shouldn't matter because as soon as the result
is moved into memory they are truncated away.

> Throwing away the meaningless decimal digits is okay with me, but avoiding
> the performance hit that comes with ffloat-store would be nice. Also, it

Like I said, you can use -mpc64 to explicitly set the 387 to 64 bits
precision, just like the sse2 unit. If you don't have a gcc new enough
to have this option or you don't want to depend on requiring an option,
you can simply configure the 387 manually at the beginning of your
program to disable the extended precision. See for a code snippet of
how to do this. (That relies on a glibc-specific fpu_control.h header
but the definitions in that header are pretty self-contained.)

> would be nice to not have the output depend on compiler flags.

But the output doesn't *really* depend on compiler flags! That's the
point I'm trying to make. It only seems like the output differs because
you're looking at something that's the equivalent of uninitialized
memory. Suppose you had a string buffer of 80 chars and you filled it
with a \0-terminated string of 40 chars, but to display it you print
all 80 chars of the buffer. Clearly two strings that have the same
first 40 chars before the \0 are semantically equivalent as C strings,
because the rest of the buffer is just junk. No reasonable programmer
would ever consider printing the junk past the \0 when displaying the
string, just like it's not reasonable to print more than 15 (or
whatever the limit is, I forget) significant digits of a double.

This can also cause issues if you are simply testing for equality, e.g.
assert((x/y) == (x/y)) can sometimes fail simply because one result is
in a register and another in memory.
But the solution here is to not use == for comparing floating point
values, but rather compare the absolute value of their difference to
some small delta. But this is something that you should do anyway with
floating point calculations because they are by their very design
inexact. Some details at .

Brian
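
For the manual 387 configuration described in the message, a minimal
sketch might look like the following. It assumes glibc on x86 and uses
the _FPU_GETCW/_FPU_SETCW macros and the _FPU_EXTENDED/_FPU_DOUBLE
constants from fpu_control.h. The function names
set_x87_double_precision and x87_precision_is_double are made up for
illustration, and this is not necessarily the snippet the message
points to.

```c
#include <stdlib.h>   /* any libc header; makes __GLIBC__ visible on glibc */

/* Assumption: only glibc on x86 provides the macros used below. */
#if defined(__GLIBC__) && (defined(__i386__) || defined(__x86_64__))
#include <fpu_control.h>
#define HAVE_X87_CONTROL 1
#endif

/* Hypothetical helper: switch the 387 precision-control field from
   extended (64-bit significand) to double (53-bit significand).
   Returns 1 if the control word was written, 0 if unsupported here. */
int set_x87_double_precision(void)
{
#ifdef HAVE_X87_CONTROL
    fpu_control_t cw;
    _FPU_GETCW(cw);          /* read the current control word */
    cw &= ~_FPU_EXTENDED;    /* clear both precision-control bits */
    cw |= _FPU_DOUBLE;       /* select double precision */
    _FPU_SETCW(cw);          /* write it back */
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: 1 if the precision field is set to double,
   0 if it is set to something else, -1 if unsupported here. */
int x87_precision_is_double(void)
{
#ifdef HAVE_X87_CONTROL
    fpu_control_t cw;
    _FPU_GETCW(cw);
    return (cw & _FPU_EXTENDED) == _FPU_DOUBLE;
#else
    return -1;
#endif
}
```

The idea would be to call set_x87_double_precision() once, first thing
in main(). Two caveats: the setting narrows only the significand, not
the exponent range of the 387 registers, and it also applies to long
double arithmetic done on the 387.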
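
The delta comparison suggested in the message can be sketched as
follows. The helper names within_delta and check_quotient and the
1e-12 tolerance are hypothetical choices for illustration; a suitable
delta depends on the magnitude of the values being compared.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: true when a and b differ by at most delta. */
bool within_delta(double a, double b, double delta)
{
    return fabs(a - b) <= delta;
}

/* The assert from the message, rewritten to tolerate one quotient
   being held in a register and the other rounded in memory.
   Assumes x / y is finite; the 1e-12 delta is an arbitrary example. */
void check_quotient(double x, double y)
{
    assert(within_delta(x / y, x / y, 1e-12));
}
```

A fixed absolute delta like this works when the values are of roughly
known size; for values that span many orders of magnitude, scaling the
delta by the larger operand is a common variation.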