From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <47C99188.2050309@myrealbox.com>
Date: Sat, 01 Mar 2008 17:26:00 -0000
From: Tim Prince
User-Agent: Thunderbird 2.0.0.12 (Windows/20080213)
MIME-Version: 1.0
To: CSights
CC: gcc-help@gcc.gnu.org, Brian Dessent
Subject: Re: binary compiled with -O1 and w/ individual optimization flags are not the same
References: <200802291215.47293.csights@fastmail.fm> <200802291716.22197.csights@fastmail.fm> <47C88B66.A501ABD6@dessent.net> <200803011057.26145.csights@fastmail.fm>
In-Reply-To: <200803011057.26145.csights@fastmail.fm>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Mailing-List: contact gcc-help-help@gcc.gnu.org; run by ezmlm
Sender: gcc-help-owner@gcc.gnu.org
X-SW-Source: 2008-03/txt/msg00004.txt.bz2

CSights wrote:
>
> Currently using doubles, but thanks for reminding me about the number of
> decimals that make sense.
>
>
>> By default calculations on the 387 are done by the hardware in 80 bits
>> precision, but truncated down to 64 (assuming double types) when moved
>> out of the registers.
>> There are a number of ways to deal with it, or at
>> least expose it:
>>
>> -ffloat-store will cause gcc to always move intermediate results out of
>> registers and into memory, which effectively gets rid of the excess
>> precision at the cost of a speed hit.
>>
>
> Progress!  Now the program output matching blocks are
> (O0 -ffloat-store == O1 ffloat-store == O2 ffloat-store) != (O0) != (O1 == O2
> == O3)  In other words, now the O0 matches 1,2 with the addition
> of -ffloat-store, even though it still doesn't match the Ox without
> ffloat-store.
>   Does this suggest to you the mismatching output was due to decimal point
> differences rather than other problems (aliasing for example)?
>

It suggests that you were in fact getting more than 53-bit double
precision somewhere, and that it's not an aliasing error.

>   Also, I didn't mention earlier (did I?) that the program's output when
> compiled on the Macintosh matched at all optimization levels.  (O0 == O1 ==
> O2)  (Though the output did not match any output from the program compiled on
> linux.)  Is this possibly b/c the Mac has sse2 (Core 2 Duo) and able to use
> those instructions which have more meaningful decimal places?
>

If you use SSE2, you have no extra precision for -ffloat-store to
suppress.  Assuming the machine where you used 387 has SSE2 hardware,
you could set -mfpmath=sse.  That is the default for 64-bit gcc.

>   I've tried using floats only in the what I guess is the key calculation
> involving the exp(), then casting to double (so that I don't have to modify
> all the code to be float), but this doesn't result in matching output between
> O1 and O0.  Does the compiler do any recasting of float->double double->float
> behind the scenes?
>
>

The 387 exp() performs all its calculations with extra precision.  Then,
if you don't set -ffloat-store, the result may never get rounded down to
double.
If you have no SSE2 math library, you will still get 387 exp() even if
you set -mfpmath=sse, but there will be an implicit -ffloat-store in the
conversion of the result to SSE2: the value has to be stored to memory
as a double before it can be loaded into an SSE2 register.