public inbox for gcc-help@gcc.gnu.org
 help / color / mirror / Atom feed
* Floating point performance issue
@ 2011-12-20  9:52 Ico
  2011-12-20 10:05 ` Marcin Mirosław
                   ` (2 more replies)
  0 siblings, 3 replies; 56+ messages in thread
From: Ico @ 2011-12-20  9:52 UTC (permalink / raw)
  To: gcc-help

Hello,

I'm running the program below twice with different command line arguments. The
argument is used as a floating point scaling factor in the code, but does not
change the algorithm in any way.  I am baffled by the difference in run time of
the two runs, since the program flow is not altered by the argument.

$ gcc -O3 t.c

$ time ./a.out 0.1

real	0m7.300s
user	0m7.286s
sys	0m0.007s

$ time ./a.out 0.0001

real	0m0.060s
user	0m0.058s
sys	0m0.003s


The second run is about 120 times faster than the first.

I did some quick tests using the 'perf' profiling utility on Linux, and
it seems that the slow run has about 70% branch misses, which I guess
might kill performance drastically.
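
For what it's worth, a command along these lines shows those counters
(exact event names can differ between perf versions):

$ perf stat -e branches,branch-misses ./a.out 0.1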

I am able to reproduce this on multiple i686 boxes using various gcc versions
(4.4, 4.6). Compiling on x86_64 does not show this behaviour.

Is anybody able to reproduce this issue, and how can this be explained ?

Thanks,

Ico



/* 
 * gcc -O3 test.c && ./a.out NUMBER
 */

#include <stdio.h>
#include <stdlib.h>

#define N 4000
#define S 5000

struct t {
        double a, b, f;
};

int main(int argc, char **argv)
{
        int i, j;
        struct t t[N];
        double f = atof(argv[1]);

        for(i=0; i<N; i++) {
                t[i].a = 0;
                t[i].b = 1;
                t[i].f = i * f;
        };

        for(j=0; j<S; j++) {
                for(i=0; i<N; i++) {
                        t[i].a += t[i].b * t[i].f;
                        t[i].b -= t[i].a * t[i].f;
                }
        }

        return t[1].a;
}





processor	: 1
vendor_id	: GenuineIntel
cpu family	: 6
model		: 15
model name	: Intel(R) Core(TM)2 Duo CPU     T7500  @ 2.20GHz
stepping	: 11


Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/i486-linux-gnu/4.6/lto-wrapper
Target: i486-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 4.6.2-7'
  --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs
  --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr
  --program-suffix=-4.6 --enable-shared --enable-linker-build-id
  --with-system-zlib --libexecdir=/usr/lib --without-included-gettext
  --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6
  --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
  --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc
  --enable-targets=all --with-arch-32=i586 --with-tune=generic
  --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu
  --target=i486-linux-gnu
Thread model: posix
gcc version 4.6.2 (Debian 4.6.2-7) 
-- 
:wq
^X^Cy^K^X^C^C^C^C

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20  9:52 Floating point performance issue Ico
@ 2011-12-20 10:05 ` Marcin Mirosław
  2011-12-20 10:20   ` Ico
  2011-12-20 10:46 ` Marc Glisse
  2011-12-20 12:21 ` Tim Prince
  2 siblings, 1 reply; 56+ messages in thread
From: Marcin Mirosław @ 2011-12-20 10:05 UTC (permalink / raw)
  To: gcc-help

W dniu 20.12.2011 10:52, Ico pisze:
> Hello,

Hi,

> I am able to reproduce this on multiple i686 boxes using various gcc versions
> (4.4, 4.6). Compiling on x86_64 does not show this behaviour.
> 
> Is anybody able to reproduce this issue, and how can this be explained ?

I can reproduce this situation too. I can only guess this happens
because the default on i686 is mfpmath=387, while on x86_64 it is
mfpmath=sse. If you compile your code using "-O3 -mfpmath=sse
-march=native <or anything else that has SSE support>" then
both times will be almost equal.
Regards,

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 10:05 ` Marcin Mirosław
@ 2011-12-20 10:20   ` Ico
  2011-12-20 10:34     ` Jonathan Wakely
  2011-12-20 11:44     ` David Brown
  0 siblings, 2 replies; 56+ messages in thread
From: Ico @ 2011-12-20 10:20 UTC (permalink / raw)
  To: gcc-help

* On Tue Dec 20 11:05:17 +0100 2011, Marcin Mirosław wrote:
 
> W dniu 20.12.2011 10:52, Ico pisze:

> > I am able to reproduce this on multiple i686 boxes using various gcc versions
> > (4.4, 4.6). Compiling on x86_64 does not show this behaviour.
> > 
> > Is anybody able to reproduce this issue, and how can this be explained ?
> 
> I can reproduce this situation too. I can only guess this happens
> because the default on i686 is mfpmath=387, while on x86_64 it is
> mfpmath=sse. If you compile your code using "-O3 -mfpmath=sse
> -march=native <or anything else that has SSE support>" then
> both times will be almost equal.

Thanks for testing this.

Still, I'm not sure if sse is part of the problem and/or solution.

I have been reducing the program to see what the smallest code is that still
shows this behaviour. Latest version is below. 


$ gcc -msse -mfpmath=sse -O3 -march=native test.c 
$ time ./a.out 0.9

real	0m2.653s
user	0m2.648s
sys	0m0.002s
$ time ./a.out 0.001

real	0m0.144s
user	0m0.140s
sys	0m0.002s


/* gcc -msse -mfpmath=sse -O3 -march=native test.c  */

#include <stdlib.h>

#define S 20000000
        
int main(int argc, char **argv)
{       
        int j;
        double a = 0;
        double b = 1;
        double f = atof(argv[1]);

        for(j=0; j<S; j++) {
                a = b * f;
                b = a * f;
        }

        return a;
}
-- 
:wq
^X^Cy^K^X^C^C^C^C

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 10:20   ` Ico
@ 2011-12-20 10:34     ` Jonathan Wakely
  2011-12-20 10:43       ` Ico
  2011-12-20 11:24       ` Vincent Lefevre
  2011-12-20 11:44     ` David Brown
  1 sibling, 2 replies; 56+ messages in thread
From: Jonathan Wakely @ 2011-12-20 10:34 UTC (permalink / raw)
  To: Ico; +Cc: gcc-help

On 20 December 2011 10:20, Ico wrote:
>
> Still, I'm not sure if sse is part of the problem and/or solution.

It's the solution.

> I have been reducing the program to see what the smallest code is that still
> shows this behaviour. Latest version is below.
>
>
> $ gcc -msse -mfpmath=sse -O3 -march=native test.c

What is "native" for your system, i686? (also, what does gcc -dumpmachine show?)
i686 doesn't support SSE, you need at least pentium3.

Remove the -msse and see if you get a warning telling you SSE
instructions are disabled.

Try -march=pentium3 -mfpmath=sse instead (without -msse)

If you don't have at least a pentium3, you're stuck with the 387 FP
registers, and have to use horrible code.

> $ time ./a.out 0.9
>
> real    0m2.653s
> user    0m2.648s
> sys     0m0.002s

That looks as though you're still not using SSE registers.

> $ time ./a.out 0.001
>
> real    0m0.144s
> user    0m0.140s
> sys     0m0.002s

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 10:34     ` Jonathan Wakely
@ 2011-12-20 10:43       ` Ico
  2011-12-20 11:24       ` Vincent Lefevre
  1 sibling, 0 replies; 56+ messages in thread
From: Ico @ 2011-12-20 10:43 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: gcc-help

* On Tue Dec 20 11:34:35 +0100 2011, Jonathan Wakely wrote:
 
> > I have been reducing the program to see what the smallest code is that still
> > shows this behaviour. Latest version is below.
> >
> > $ gcc -msse -mfpmath=sse -O3 -march=native test.c
> 
> What is "native" for your system, i686? (also, what does gcc -dumpmachine show?)

i486-linux-gnu

> i686 doesn't support SSE, you need at least pentium3.
> 
> Remove the -msse and see if you get a warning telling you SSE
> instructions are disabled.

True

> Try -march=pentium3 -mfpmath=sse instead (without -msse)
> 
> If you don't have at least a pentium3, you're stuck with the 387 FP
> registers, and have to use horrible code.

> That looks as though you're still not using SSE registers.

The inner loop boils down to this (-msse -mfpmath=sse -O3 -march=native)

 8048370:       66 0f 28 c1             movapd %xmm1,%xmm0
 8048374:       83 e8 01                sub    $0x1,%eax
 8048377:       f2 0f 59 c2             mulsd  %xmm2,%xmm0
 804837b:       66 0f 28 c8             movapd %xmm0,%xmm1
 804837f:       f2 0f 59 ca             mulsd  %xmm2,%xmm1
 8048383:       75 eb                   jne    8048370 <main+0x40>

or this (-march=pentium3 -mfpmath=sse -O3)

 8048360:       dd d9                   fstp   %st(1)
 8048362:       83 e8 01                sub    $0x1,%eax
 8048365:       d8 c9                   fmul   %st(1),%st
 8048367:       d9 c0                   fld    %st(0)
 8048369:       d8 ca                   fmul   %st(2),%st
 804836b:       75 f3                   jne    8048360 <main+0x30>

The first runs about twice as fast as the second, but I still see a huge difference
in run time depending on the 'f' in the original code.


-- 
:wq
^X^Cy^K^X^C^C^C^C

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20  9:52 Floating point performance issue Ico
  2011-12-20 10:05 ` Marcin Mirosław
@ 2011-12-20 10:46 ` Marc Glisse
  2011-12-20 11:11   ` Ico
  2011-12-20 11:16   ` Vincent Lefevre
  2011-12-20 12:21 ` Tim Prince
  2 siblings, 2 replies; 56+ messages in thread
From: Marc Glisse @ 2011-12-20 10:46 UTC (permalink / raw)
  To: Ico; +Cc: gcc-help

On Tue, 20 Dec 2011, Ico wrote:

> Hello,
>
> I'm running the program below twice with different command line arguments. The
> argument is used as a floating point scaling factor in the code, but does not
> change the algorithm in any way.  I am baffled by the difference in run time of
> the two runs, since the program flow is not altered by the argument.

Hello,

you are thinking about the program flow in terms of high-level code. Most 
float operations simply go through the hardware and complete in equal 
time, but that doesn't include operations on denormals (numbers very close 
to 0) which are emulated and take forever to complete. Notice that 
-ffast-math implies "I don't care about that" and makes it fast.
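
If you want to check whether a particular value is a denormal, C99
fpclassify() will tell you. A quick sketch (untested, compile with
-std=c99):

#include <stdio.h>
#include <math.h>
#include <float.h>

int main(void)
{
        double d = DBL_MIN / 16;   /* nonzero, but below the smallest normal double */
        printf("%g is %ssubnormal\n", d,
               fpclassify(d) == FP_SUBNORMAL ? "" : "not ");
        return 0;
}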

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 10:46 ` Marc Glisse
@ 2011-12-20 11:11   ` Ico
  2011-12-20 11:16   ` Vincent Lefevre
  1 sibling, 0 replies; 56+ messages in thread
From: Ico @ 2011-12-20 11:11 UTC (permalink / raw)
  To: gcc-help

* On Tue Dec 20 11:46:24 +0100 2011, Marc Glisse wrote:
 
> On Tue, 20 Dec 2011, Ico wrote:
> 
> > Hello,
> >
> > I'm running the program below twice with different command line arguments. The
> > argument is used as a floating point scaling factor in the code, but does not
> > change the algorithm in any way.  I am baffled by the difference in run time of
> > the two runs, since the program flow is not altered by the argument.
> 
> Hello,
> 
> you are thinking about the program flow in terms of high-level code. Most 
> float operations simply go through the hardware and complete in equal 
> time, but that doesn't include operations on denormals (numbers very close 
> to 0) which are emulated and take forever to complete. Notice that 
> -ffast-math implies "I don't care about that" and makes it fast.

So I could expect 

  gcc -g -ffast-math -O3  test.c

or 

  gcc -g -march=pentium3 -mfpmath=sse -ffast-math -O3  test.c

to solve the issue ? 

Unfortunately, from what I just tested, it does not.

However, it does when using

 gcc -g -msse -mfpmath=sse -O3 -march=native -ffast-math test.c

I will now go read the manual and learn about the *exact* meanings of all
these options to see if I can understand what exactly is going on under
the hood.

Thank you,

Ico


 
-- 
:wq
^X^Cy^K^X^C^C^C^C

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 10:46 ` Marc Glisse
  2011-12-20 11:11   ` Ico
@ 2011-12-20 11:16   ` Vincent Lefevre
  2011-12-20 12:00     ` Vincent Lefevre
  1 sibling, 1 reply; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-20 11:16 UTC (permalink / raw)
  To: gcc-help; +Cc: Ico

Hi,

On 2011-12-20 11:46:24 +0100, Marc Glisse wrote:
> you are thinking about the program flow in terms of high-level code. Most float
> operations simply go through the hardware and complete in equal time, but
> that doesn't include operations on denormals (numbers very close to 0) which
> are emulated and take forever to complete. Notice that -ffast-math implies
> "I don't care about that" and makes it fast.

I really don't think subnormals are emulated on x86 processors.
They may be slower (I haven't tested), but I'm quite sure that
they are implemented in hardware (x86 processors even implement
elementary functions, though not accurately).

Now, the test program is faster when the values are *closer* to 0.
This contradicts what you say about subnormals. I just think that
for small values, one gets 0 very early, and the multiplication by 0
is much faster than a generic multiplication (IIRC, for old SPARC
processors at least, this was the opposite!).

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 10:34     ` Jonathan Wakely
  2011-12-20 10:43       ` Ico
@ 2011-12-20 11:24       ` Vincent Lefevre
  2011-12-20 11:51         ` Dario Saccavino
  1 sibling, 1 reply; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-20 11:24 UTC (permalink / raw)
  To: Jonathan Wakely; +Cc: Ico, gcc-help

On 2011-12-20 10:34:35 +0000, Jonathan Wakely wrote:
> On 20 December 2011 10:20, Ico wrote:
> >
> > Still, I'm not sure if sse is part of the problem and/or solution.
> 
> It's the solution.
> 
> > I have been reducing the program to see what the smallest code is
> > that still shows this behaviour. Latest version is below.
> >
> >
> > $ gcc -msse -mfpmath=sse -O3 -march=native test.c
> 
> What is "native" for your system, i686? (also, what does gcc
> -dumpmachine show?) i686 doesn't support SSE, you need at least
> pentium3.

I can reproduce the "problem" on an x86_64 machine, so it is not
due to the traditional FPU. I just think that the multiplication
by 0 is faster (because much easier than the generic case), as
I've said in my other message. But to have such an optimization,
I wouldn't complain. :)

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 10:20   ` Ico
  2011-12-20 10:34     ` Jonathan Wakely
@ 2011-12-20 11:44     ` David Brown
  2011-12-20 11:49       ` David Brown
  1 sibling, 1 reply; 56+ messages in thread
From: David Brown @ 2011-12-20 11:44 UTC (permalink / raw)
  To: Ico; +Cc: gcc-help

On 20/12/2011 11:20, Ico wrote:
> * On Tue Dec 20 11:05:17 +0100 2011, Marcin Mirosław wrote:
>
>> W dniu 20.12.2011 10:52, Ico pisze:
>
>>> I am able to reproduce this on multiple i686 boxes using various gcc versions
>>> (4.4, 4.6). Compiling on x86_64 does not show this behaviour.
>>>
>>> Is anybody able to reproduce this issue, and how can this be explained ?
>>
>> I can reproduce this situation too. I can only guess this happens
>> because the default on i686 is mfpmath=387, while on x86_64 it is
>> mfpmath=sse. If you compile your code using "-O3 -mfpmath=sse
>> -march=native <or anything else that has SSE support>" then
>> both times will be almost equal.
>
> Thanks for testing this.
>
> Still, I'm not sure if sse is part of the problem and/or solution.
>
> I have been reducing the program to see what the smallest code is that still
> shows this behaviour. Latest version is below.
>
>
> $ gcc -msse -mfpmath=sse -O3 -march=native test.c
> $ time ./a.out 0.9
>
> real	0m2.653s
> user	0m2.648s
> sys	0m0.002s
> $ time ./a.out 0.001
>
> real	0m0.144s
> user	0m0.140s
> sys	0m0.002s
>
>
> /* gcc -msse -mfpmath=sse -O3 -march=native test.c  */
>
> #include <stdlib.h>
>
> #define S 20000000
>
> int main(int argc, char **argv)
> {
>          int j;
>          double a = 0;
>          double b = 1;
>          double f = atof(argv[1]);
>
>          for(j=0; j<S; j++) {
>                  a = b * f;
>                  b = a * f;
>          }
>
>          return a;
> }

I've just tried this code (on an i7-920, 64-bit Linux), compiled with 
gcc 4.5.1, command line:

	gcc fp.c -o fp -O2 -Wa,-ahdls=fp.lst

"time ./fp 0.50" takes 0.088 seconds, but "time ./fp 0.51" takes 2.584 
seconds.

The inner loop is using SSE, as can be seen from the listing file:

   18                    .L2:
   19 0020 660F28CA              movapd  %xmm2, %xmm1
   20 0024 83E801                subl    $1, %eax
   21 0027 F20F59C8              mulsd   %xmm0, %xmm1
   22 002b 660F28D1              movapd  %xmm1, %xmm2
   23 002f F20F59D0              mulsd   %xmm0, %xmm2
   24 0033 75EB                  jne     .L2

What is more interesting is that if I compile with the "-ffast-math" 
flag, exactly the same listing file is generated, yet the program runs 
in about 0.088 seconds for any input.

You can get a clue as to the reason behind this if you add a 
"printf("%g\t%g\n", a, b)" statement at the end of the program.  When I 
then run "fp.slow" (no "-ffast-math" flag) and "fp.fast" (with 
"-ffast-math") I get:

[david@davidquad c]$ time ./fp.slow 0.50
0       0.000000

real    0m0.088s
user    0m0.086s
sys     0m0.000s

[david@davidquad c]$ time ./fp.slow 0.51
4.94066e-324    0.000000

real    0m2.575s
user    0m2.567s
sys     0m0.000s

[david@davidquad c]$ time ./fp.fast 0.50
0       0.000000

real    0m0.087s
user    0m0.086s
sys     0m0.000s
[david@davidquad c]$ time ./fp.fast 0.51
0       0.000000

real    0m0.087s
user    0m0.087s
sys     0m0.000s


The key point here is that without the "-ffast-math" flag, rounding 
effects mean that input values strictly greater than 0.5, but less 
than 1, leave "a" stuck on a very small but non-zero value.  With 
input values of 0.5 or less, "a" quickly reduces to 0, and the 
processor short-cuts the multiply.  With the "-ffast-math" flag 
active, my guess is that the program's startup code sets up different 
rounding or saturation modes on the processor, avoiding this issue.

mvh.,

David

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 11:44     ` David Brown
@ 2011-12-20 11:49       ` David Brown
  0 siblings, 0 replies; 56+ messages in thread
From: David Brown @ 2011-12-20 11:49 UTC (permalink / raw)
  To: gcc-help; +Cc: gcc-help

On 20/12/2011 11:20, Ico wrote:
> * On Tue Dec 20 11:05:17 +0100 2011, Marcin Mirosław wrote:
>
>> W dniu 20.12.2011 10:52, Ico pisze:
>
>>> I am able to reproduce this on multiple i686 boxes using various gcc versions
>>> (4.4, 4.6). Compiling on x86_64 does not show this behaviour.
>>>
>>> Is anybody able to reproduce this issue, and how can this be explained ?
>>
>> I can reproduce this situation too. I can only guess this happens
>> because the default on i686 is mfpmath=387, while on x86_64 it is
>> mfpmath=sse. If you compile your code using "-O3 -mfpmath=sse
>> -march=native <or anything else that has SSE support>" then
>> both times will be almost equal.
>
> Thanks for testing this.
>
> Still, I'm not sure if sse is part of the problem and/or solution.
>
> I have been reducing the program to see what the smallest code is that still
> shows this behaviour. Latest version is below.
>
>
> $ gcc -msse -mfpmath=sse -O3 -march=native test.c
> $ time ./a.out 0.9
>
> real	0m2.653s
> user	0m2.648s
> sys	0m0.002s
> $ time ./a.out 0.001
>
> real	0m0.144s
> user	0m0.140s
> sys	0m0.002s
>
>
> /* gcc -msse -mfpmath=sse -O3 -march=native test.c  */
>
> #include <stdlib.h>
>
> #define S 20000000
>
> int main(int argc, char **argv)
> {
>          int j;
>          double a = 0;
>          double b = 1;
>          double f = atof(argv[1]);
>
>          for(j=0; j<S; j++) {
>                  a = b * f;
>                  b = a * f;
>          }
>
>          return a;
> }

I've just tried this code (on an i7-920, 64-bit Linux), compiled with 
gcc 4.5.1, command line:

	gcc fp.c -o fp -O2 -Wa,-ahdls=fp.lst

"time ./fp 0.50" takes 0.088 seconds, but "time ./fp 0.51" takes 2.584 
seconds.

The inner loop is using SSE, as can be seen from the listing file:

   18                    .L2:
   19 0020 660F28CA              movapd  %xmm2, %xmm1
   20 0024 83E801                subl    $1, %eax
   21 0027 F20F59C8              mulsd   %xmm0, %xmm1
   22 002b 660F28D1              movapd  %xmm1, %xmm2
   23 002f F20F59D0              mulsd   %xmm0, %xmm2
   24 0033 75EB                  jne     .L2

What is more interesting is that if I compile with the "-ffast-math" 
flag, exactly the same listing file is generated, yet the program runs 
in about 0.088 seconds for any input.

You can get a clue as to the reason behind this if you add a 
"printf("%g\t%g\n", a, b)" statement at the end of the program.  When I 
then run "fp.slow" (no "-ffast-math" flag) and "fp.fast" (with 
"-ffast-math") I get:

[david@davidquad c]$ time ./fp.slow 0.50
0       0.000000

real    0m0.088s
user    0m0.086s
sys     0m0.000s

[david@davidquad c]$ time ./fp.slow 0.51
4.94066e-324    0.000000

real    0m2.575s
user    0m2.567s
sys     0m0.000s

[david@davidquad c]$ time ./fp.fast 0.50
0       0.000000

real    0m0.087s
user    0m0.086s
sys     0m0.000s
[david@davidquad c]$ time ./fp.fast 0.51
0       0.000000

real    0m0.087s
user    0m0.087s
sys     0m0.000s


The key point here is that without the "-ffast-math" flag, rounding 
effects mean that input values strictly greater than 0.5, but less 
than 1, leave "a" stuck on a very small but non-zero value.  With 
input values of 0.5 or less, "a" quickly reduces to 0, and the 
processor short-cuts the multiply.  With the "-ffast-math" flag 
active, my guess is that the program's startup code sets up different 
rounding or saturation modes on the processor, avoiding this issue.

mvh.,

David


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 11:24       ` Vincent Lefevre
@ 2011-12-20 11:51         ` Dario Saccavino
  2011-12-20 12:02           ` Ico
  2011-12-20 12:12           ` Vincent Lefevre
  0 siblings, 2 replies; 56+ messages in thread
From: Dario Saccavino @ 2011-12-20 11:51 UTC (permalink / raw)
  To: Jonathan Wakely, Ico, gcc-help

2011/12/20 Vincent Lefevre <vincent+gcc@vinc17.org>:
> On 2011-12-20 10:34:35 +0000, Jonathan Wakely wrote:
>> On 20 December 2011 10:20, Ico wrote:
>> >
>> > Still, I'm not sure if sse is part of the problem and/or solution.
>>
>> It's the solution.
>>
>> > I have been reducing the program to see what the smallest code is
>> > that still shows this behaviour. Latest version is below.
>> >
>> >
>> > $ gcc -msse -mfpmath=sse -O3 -march=native test.c
>>
>> What is "native" for your system, i686? (also, what does gcc
>> -dumpmachine show?) i686 doesn't support SSE, you need at least
>> pentium3.
>
> I can reproduce the "problem" on an x86_64 machine, so it is not
> due to the traditional FPU. I just think that the multiplication
> by 0 is faster (because much easier than the generic case), as
> I've said in my other message. But to have such an optimization,
> I wouldn't complain. :)
>
> --
> Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
> 100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
> Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

The problem doesn't manifest when the hardware flush-to-zero (FTZ)
mode is enabled. This mode causes the hardware to round all denormal
values produced by an operation to zero.

In the second program, if 0.5 < f < 1 the values of a and b eventually
become the smallest representable denormal value and never change
afterwards, resulting in a large number of operations involving
denormal numbers.
When f <= 0.5, in the default rounding mode, once a is the smallest
representable number the result of (a * f) is zero. Therefore denormal
numbers are produced only a small number of times.
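
A tiny sketch (mine, untested; nextafter() is from <math.h>, so compile
with -std=c99 and link with -lm) shows that rounding behaviour right at
the smallest subnormal:

#include <stdio.h>
#include <math.h>

int main(void)
{
        double tiny = nextafter(0.0, 1.0);   /* smallest positive subnormal */
        printf("tiny        = %g\n", tiny);
        printf("tiny * 0.51 = %g\n", tiny * 0.51);   /* rounds back up to tiny */
        printf("tiny * 0.50 = %g\n", tiny * 0.50);   /* halfway case, rounds to 0 */
        return 0;
}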

gcc enables FTZ when using SSE and ffast-math (I think the specific
compiler flag is -funsafe-math-optimizations).
Therefore the flags needed are -msse2 -mfpmath=sse -ffast-math


    Dario

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 11:16   ` Vincent Lefevre
@ 2011-12-20 12:00     ` Vincent Lefevre
  0 siblings, 0 replies; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-20 12:00 UTC (permalink / raw)
  To: gcc-help, Ico

On 2011-12-20 12:16:34 +0100, Vincent Lefevre wrote:
> Now, the test program is faster where the values are *closer* to 0.
> This contradicts what you say about subnormals.

Actually, after thinking a bit more about it, there are also more
operations on subnormals in the 0.9 case, so the subnormals could
also be the problem, and indeed:

ypig:~> time ./a.out 0.9999999
./a.out 0.9999999  0.10s user 0.00s system 92% cpu 0.108 total

ypig:~> time ./a.out 0.9
./a.out 0.9  2.93s user 0.00s system 98% cpu 2.987 total

ypig:~> time ./a.out 0.1
./a.out 0.1  0.10s user 0.00s system 86% cpu 0.116 total

The fact that only the second case is slower tends to indicate that
the problem is the subnormals (whether they are implemented in
software or in hardware).

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 11:51         ` Dario Saccavino
@ 2011-12-20 12:02           ` Ico
  2011-12-20 12:12           ` Vincent Lefevre
  1 sibling, 0 replies; 56+ messages in thread
From: Ico @ 2011-12-20 12:02 UTC (permalink / raw)
  To: gcc-help

* On Tue Dec 20 12:48:53 +0100 2011, Dario Saccavino wrote:
 
> 2011/12/20 Vincent Lefevre <vincent+gcc@vinc17.org>:
> > On 2011-12-20 10:34:35 +0000, Jonathan Wakely wrote:
> >> On 20 December 2011 10:20, Ico wrote:
> >> >
> >> > I have been reducing the program to see what the smallest code is
> >> > that still shows this behaviour. Latest version is below.
> >> >

> The problem doesn't manifest when the hardware mode flush-to-zero
> (FTZ) is enabled. This flag causes the hardware to round all denormal
> values produced by an operation to zero.
> 
> In the second program, if 0.5 < f < 1 the values of a and b eventually
> become the smallest representable denormal value and never change
> afterwards, resulting in a large number of operations involving
> denormal numbers.
> When f <= 0.5, in the default rounding mode, once a is the smallest
> representable number the result of (a * f) is zero. Therefore denormal
> numbers are produced only a small number of times.

Clear, thank you for the thorough explanation. I was naive enough to assume
I knew 'enough' about floating point operations for daily use, but it seems
that I have some reading to do.

Ico
-- 
:wq
^X^Cy^K^X^C^C^C^C

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 11:51         ` Dario Saccavino
  2011-12-20 12:02           ` Ico
@ 2011-12-20 12:12           ` Vincent Lefevre
  2011-12-20 12:28             ` Tim Prince
                               ` (2 more replies)
  1 sibling, 3 replies; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-20 12:12 UTC (permalink / raw)
  To: Dario Saccavino; +Cc: Jonathan Wakely, Ico, gcc-help

On 2011-12-20 12:48:53 +0100, Dario Saccavino wrote:
> In the second program, if 0.5 < f < 1 the values of a and b eventually
> become the smallest representable denormal value and never change
> afterwards, resulting in a large number of operations involving
> denormal numbers.

Yes, I agree (I forgot about that)... except that if f is close enough
to 1, you won't have subnormals and the program will be fast (like in
the case f <= 0.5).

> gcc enables FTZ when using SSE and ffast-math (I think the specific
> compiler flag is -funsafe-math-optimizations).

Thanks, good to know...

> Therefore the flags needed are -msse2 -mfpmath=sse -ffast-math

I would discourage the use of -ffast-math, which can affect generic
code very badly (due to -funsafe-math-optimizations). Isn't there
an option to enable FTZ?

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20  9:52 Floating point performance issue Ico
  2011-12-20 10:05 ` Marcin Mirosław
  2011-12-20 10:46 ` Marc Glisse
@ 2011-12-20 12:21 ` Tim Prince
  2 siblings, 0 replies; 56+ messages in thread
From: Tim Prince @ 2011-12-20 12:21 UTC (permalink / raw)
  To: gcc-help

On 12/20/2011 4:52 AM, Ico wrote:
> Hello,
>
> I'm running the program below twice with different command line arguments. The
> argument is used as a floating point scaling factor in the code, but does not
> change the algorithm in any way.  I am baffled by the difference in run time of
> the two runs, since the program flow is not altered by the argument.
Evidently, your definition of program flow doesn't take into account 
changes in your choice of architecture or number of exceptions.
Is your interest in x87 behavior due to historic considerations?
>
> $ gcc -O3 t.c
>
> $ time ./a.out 0.1
>
> real	0m7.300s
> user	0m7.286s
> sys	0m0.007s
>
> $ time ./a.out 0.0001
>
> real	0m0.060s
> user	0m0.058s
> sys	0m0.003s
>
>
> The second run is about 120 times faster than the first.
>
> I did some quick tests using the 'perf' profiling utility on Linux, and
> it seems that the slow run has about 70% branch misses, which I guess
> might kill performance drastically.
>
> I am able to reproduce this on multiple i686 boxes using various gcc versions
> (4.4, 4.6). Compiling on x86_64 does not show this behaviour.
>
> Is anybody able to reproduce this issue, and how can this be explained ?
If you had turned on your search engine, you would have seen the 
articles about "x87 Floating Point Assist."
Did you also test SSE code with and without abrupt underflow?
>
> Thanks,
>
> Ico
>
>
>
> /*
>   * gcc -O3 test.c && ./a.out NUMBER
>   */
>
> #include <stdio.h>
> #include <stdlib.h>
>
> #define N 4000
> #define S 5000
>
> struct t {
>          double a, b, f;
> };
>
> int main(int argc, char **argv)
> {
>          int i, j;
>          struct t t[N];
>          double f = atof(argv[1]);
>
>          for(i=0; i<N; i++) {
>                  t[i].a = 0;
>                  t[i].b = 1;
>                  t[i].f = i * f;
>          };
>
>          for(j=0; j<S; j++) {
>                  for(i=0; i<N; i++) {
>                          t[i].a += t[i].b * t[i].f;
>                          t[i].b -= t[i].a * t[i].f;
>                  }
>          }
>
>          return t[1].a;
> }
>
>
>
>
>
> processor	: 1
> vendor_id	: GenuineIntel
> cpu family	: 6
> model		: 15
> model name	: Intel(R) Core(TM)2 Duo CPU     T7500  @ 2.20GHz
> stepping	: 11
>
>
> Using built-in specs.
> COLLECT_GCC=gcc
> COLLECT_LTO_WRAPPER=/usr/lib/gcc/i486-linux-gnu/4.6/lto-wrapper
> Target: i486-linux-gnu
> Configured with: ../src/configure -v --with-pkgversion='Debian 4.6.2-7'
>    --with-bugurl=file:///usr/share/doc/gcc-4.6/README.Bugs
>    --enable-languages=c,c++,fortran,objc,obj-c++,go --prefix=/usr
>    --program-suffix=-4.6 --enable-shared --enable-linker-build-id
>    --with-system-zlib --libexecdir=/usr/lib --without-included-gettext
>    --enable-threads=posix --with-gxx-include-dir=/usr/include/c++/4.6
>    --libdir=/usr/lib --enable-nls --enable-clocale=gnu --enable-libstdcxx-debug
>    --enable-libstdcxx-time=yes --enable-plugin --enable-objc-gc
>    --enable-targets=all --with-arch-32=i586 --with-tune=generic
>    --enable-checking=release --build=i486-linux-gnu --host=i486-linux-gnu
>    --target=i486-linux-gnu
> Thread model: posix
> gcc version 4.6.2 (Debian 4.6.2-7)


-- 
Tim Prince

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 12:12           ` Vincent Lefevre
@ 2011-12-20 12:28             ` Tim Prince
  2011-12-20 12:43             ` Segher Boessenkool
  2011-12-20 13:43             ` David Brown
  2 siblings, 0 replies; 56+ messages in thread
From: Tim Prince @ 2011-12-20 12:28 UTC (permalink / raw)
  To: gcc-help

On 12/20/2011 7:01 AM, Vincent Lefevre wrote:
> On 2011-12-20 12:48:53 +0100, Dario Saccavino wrote:
>> In the second program, if 0.5 < f < 1 the values of a and b eventually
>> become the smallest representable denormal value and never change
>> afterwards, resulting in a large number of operations involving
>> denormal numbers.
>
> Yes, I agree (I forgot about that)... except that if f is close enough
> to 1, you won't have subnormals and the program will be fast (like in
> the case f <= 0.5).
>
>> gcc enables FTZ when using SSE and ffast-math (I think the specific
>> compiler flag is -funsafe-math-optimizations).
>
> Thanks, good to know...
>
>> Therefore the flags needed are -msse2 -mfpmath=sse -ffast-math
>
> I would discourage the use of -ffast-math, which can affect generic
> code very badly (due to -funsafe-math-optimizations). Isn't there
> an option to enable FTZ?
>
-ffast-math appears to have been made much more sane in current gcc 
versions (e.g. observance of parentheses is on by default).  Back in the 
pre-SSE days over a decade ago, which we are revisiting in this thread, 
the most widely used mathinline.h implementations had a great deal of 
intentional breakage invoked along with -ffast-math.
CPUs introduced this year, such as Sandy Bridge, are designed to handle 
simple underflow situations such as these without serious performance 
degradation.  Like OP, I have a CPU which was introduced over 5 years 
ago, where many of the characteristics are of only historic interest.

-- 
Tim Prince

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 12:12           ` Vincent Lefevre
  2011-12-20 12:28             ` Tim Prince
@ 2011-12-20 12:43             ` Segher Boessenkool
  2011-12-20 13:02               ` Vincent Lefevre
  2011-12-20 13:43             ` David Brown
  2 siblings, 1 reply; 56+ messages in thread
From: Segher Boessenkool @ 2011-12-20 12:43 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

I tested this on a PowerPC 970 so I could get lovely charts from
the Shark.  The problem is much less severe there, but it is
totally obvious that the problem is that with the default rounding
mode (round to nearest, ties to even) the denormal sticks
around for f > 0.5.

>> Therefore the flags needed are -msse2 -mfpmath=sse -ffast-math
>
> I would discourage the use of -ffast-math, which can affect generic
> code very badly (due to -funsafe-math-optimizations). Isn't there
> an option to enable FTZ?

Dunno about that(*), but you can portably do

fesetround(FE_TOWARDZERO);

and that prevents the problem from occurring as well.


Segher


(*) So I looked it up, gcc/config/i386/crtfastmath.c, the code is
(for x86-64):

#define MXCSR_DAZ (1 << 6)      /* Enable denormals are zero mode */
#define MXCSR_FTZ (1 << 15)     /* Enable flush to zero mode */

   unsigned int mxcsr = __builtin_ia32_stmxcsr ();
   mxcsr |= MXCSR_DAZ | MXCSR_FTZ;
   __builtin_ia32_ldmxcsr (mxcsr);
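
If you would rather flip those bits yourself instead of linking the
-ffast-math startup file, something like this should do it (untested
sketch; the macros are the ones from GCC's <xmmintrin.h> and
<pmmintrin.h>, and the DAZ one needs SSE3 enabled to compile):

#include <xmmintrin.h>   /* _MM_SET_FLUSH_ZERO_MODE */
#include <pmmintrin.h>   /* _MM_SET_DENORMALS_ZERO_MODE */

static void enable_ftz_daz (void)
{
  _MM_SET_FLUSH_ZERO_MODE (_MM_FLUSH_ZERO_ON);          /* denormal results become 0 */
  _MM_SET_DENORMALS_ZERO_MODE (_MM_DENORMALS_ZERO_ON);  /* denormal inputs read as 0 */
}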

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 12:43             ` Segher Boessenkool
@ 2011-12-20 13:02               ` Vincent Lefevre
  2011-12-20 19:51                 ` Segher Boessenkool
  0 siblings, 1 reply; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-20 13:02 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 2011-12-20 13:30:30 +0100, Segher Boessenkool wrote:
> Dunno about that(*), but you can portably do
> 
> fesetround(FE_TOWARDZERO);
> 
> and that prevents the problem from occurring as well.

But this is a bad idea in general, as it would affect all FP operations,
not just subnormals (when one gets subnormals, one has already lost much
precision, so that FTZ may not be a problem in practice -- this depends
on the application, of course).

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 12:12           ` Vincent Lefevre
  2011-12-20 12:28             ` Tim Prince
  2011-12-20 12:43             ` Segher Boessenkool
@ 2011-12-20 13:43             ` David Brown
  2011-12-20 13:58               ` Vincent Lefevre
  2 siblings, 1 reply; 56+ messages in thread
From: David Brown @ 2011-12-20 13:43 UTC (permalink / raw)
  To: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 20/12/2011 13:01, Vincent Lefevre wrote:
> On 2011-12-20 12:48:53 +0100, Dario Saccavino wrote:
>> In the second program, if 0.5 < f < 1 the values of a and b eventually
>> become the smallest representable denormal value and never change
>> afterwards, resulting in a large number of operations involving
>> denormal numbers.
>
> Yes, I agree (I forgot about that)... except that if f is close enough
> to 1, you won't have subnormals and the program will be fast (like in
> the case f <= 0.5).
>
>> gcc enables FTZ when using SSE and ffast-math (I think the specific
>> compiler flag is -funsafe-math-optimizations).
>
> Thanks, good to know...
>
>> Therefore the flags needed are -msse2 -mfpmath=sse -ffast-math
>
> I would discourage the use of -ffast-math, which can affect generic
> code very badly (due to -funsafe-math-optimizations). Isn't there
> an option to enable FTZ?
>

There are times when you want IEEE-accurate floating point, because you 
are pushing the limits of accuracy, or you want high repeatability.  But 
normally when you use floating point, you are accepting a certain amount 
of inaccuracy.  If your code is going to produce incorrect answers 
because of the way the compiler/cpu rounds calculations, orders your 
sums, or treats denormals and other specials, then I would argue that 
your code is probably wrong - perhaps you should be using decimal types, 
long doubles, __float128, or some sort of multi-precision maths library 
instead.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 13:43             ` David Brown
@ 2011-12-20 13:58               ` Vincent Lefevre
  2011-12-20 14:25                 ` David Brown
  0 siblings, 1 reply; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-20 13:58 UTC (permalink / raw)
  To: David Brown; +Cc: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 2011-12-20 14:01:19 +0100, David Brown wrote:
> There are times when you want IEEE-accurate floating point, because you are
> pushing the limits of accuracy, or you want high repeatability.  But
> normally when you use floating point, you are accepting a certain amount of
> inaccuracy.  If your code is going to produce incorrect answers because of
> the way the compiler/cpu rounds calculations, orders your sums, or treats
> denormals and other specials, then I would argue that your code is probably
> wrong - perhaps you should be using decimal types, long doubles, __float128,
> or some sort of multi-precision maths library instead.

I disagree: the operations could be written in an order to avoid some
inaccuracies (such as huge cancellations) or to emulate more precision
(e.g. via Veltkamp's splitting) or to control the rounding (see some
rint() implementation http://sourceware.org/bugzilla/show_bug.cgi?id=602
for instance). On such code, unsafe optimizations could yield problems.
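
As a concrete example, here is a rough sketch of Veltkamp's splitting
(my illustration, not production code): it splits a double x into
hi + lo, with hi holding roughly the upper half of the significand,
and it is only correct if the compiler keeps the operations exactly as
written, in round-to-nearest:

void veltkamp_split (double x, double *hi, double *lo)
{
  double c = 134217729.0 * x;   /* 134217729 = 2^27 + 1 */
  *hi = c - (c - x);
  *lo = x - *hi;
}

With -funsafe-math-optimizations the compiler may legally reassociate
c - (c - x) into plain x, giving hi = x and lo = 0, which destroys the
algorithm.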

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 13:58               ` Vincent Lefevre
@ 2011-12-20 14:25                 ` David Brown
  2011-12-20 15:05                   ` Vincent Lefevre
  0 siblings, 1 reply; 56+ messages in thread
From: David Brown @ 2011-12-20 14:25 UTC (permalink / raw)
  To: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 20/12/2011 14:43, Vincent Lefevre wrote:
> On 2011-12-20 14:01:19 +0100, David Brown wrote:
>> There are times when you want IEEE-accurate floating point, because you are
>> pushing the limits of accuracy, or you want high repeatability.  But
>> normally when you use floating point, you are accepting a certain amount of
>> inaccuracy.  If your code is going to produce incorrect answers because of
>> the way the compiler/cpu rounds calculations, orders your sums, or treats
>> denormals and other specials, then I would argue that your code is probably
>> wrong - perhaps you should be using decimal types, long doubles, __float128,
>> or some sort of multi-precision maths library instead.
>
> I disagree: the operations could be written in an order to avoid some
> inaccuracies (such as huge cancellations) or to emulate more precision
> (e.g. via Veltkamp's splitting) or to control the rounding (see some
> rint() implementation http://sourceware.org/bugzilla/show_bug.cgi?id=602
> for instance). On such code, unsafe optimizations could yield problems.
>

I guess that's why it's an option - then we can choose.  I would still 
say that most floating point code does not need such control, and that 
situations where it matters are rather specialised.  But that's just my 
unfounded opinion - judging from your signature you /do/ need such tight 
control in your work, while I've only learned today that "-ffast-math" 
has effects other than possibly changing the generated code.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 14:25                 ` David Brown
@ 2011-12-20 15:05                   ` Vincent Lefevre
  2011-12-20 15:44                     ` David Brown
  2011-12-20 15:45                     ` Jeff Kenton
  0 siblings, 2 replies; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-20 15:05 UTC (permalink / raw)
  To: David Brown; +Cc: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 2011-12-20 14:57:16 +0100, David Brown wrote:
> On 20/12/2011 14:43, Vincent Lefevre wrote:
> >I disagree: the operations could be written in an order to avoid some
> >inaccuracies (such as huge cancellations) or to emulate more precision
> >(e.g. via Veltkamp's splitting) or to control the rounding (see some
> >rint() implementation http://sourceware.org/bugzilla/show_bug.cgi?id=602
> >for instance). On such code, unsafe optimizations could yield problems.
> 
> I guess that's why it's an option - then we can choose.

Really, it should have never been an option since it may produce
incorrect code. Such kinds of optimization should have only been
enabled via pragmas, and only in a well-documented manner so that
the developer can know how this code could be transformed (and
only the developer should be allowed to enable such optimizations).

> I would still say that most floating point code does not need such
> control, and that situations where it matters are rather
> specialised.

I think that if this were true, there would have never been an
IEEE 754 standard.

> But that's just my unfounded opinion - judging from your signature
> you /do/ need such tight control in your work, while I've only
> learned today that "-ffast-math" has effects other than possibly
> changing the generated code.

This is on a different matter, but you can look at

  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323

(see also the huge number of duplicates). Many people complain
about floating-point when it gives "unexpected" results...

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 15:05                   ` Vincent Lefevre
@ 2011-12-20 15:44                     ` David Brown
  2011-12-20 16:18                       ` Vincent Lefevre
  2011-12-21  1:19                       ` Miles Bader
  2011-12-20 15:45                     ` Jeff Kenton
  1 sibling, 2 replies; 56+ messages in thread
From: David Brown @ 2011-12-20 15:44 UTC (permalink / raw)
  To: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 20/12/2011 15:24, Vincent Lefevre wrote:
> On 2011-12-20 14:57:16 +0100, David Brown wrote:
>> On 20/12/2011 14:43, Vincent Lefevre wrote:
>>> I disagree: the operations could be written in an order to avoid some
>>> inaccuracies (such as huge cancellations) or to emulate more precision
>>> (e.g. via Veltkamp's splitting) or to control the rounding (see some
>>> rint() implementation http://sourceware.org/bugzilla/show_bug.cgi?id=602
>>> for instance). On such code, unsafe optimizations could yield problems.
>>
>> I guess that's why it's an option - then we can choose.
>
> Really, it should have never been an option since it may produce
> incorrect code. Such kinds of optimization should have only been
> enabled via pragmas, and only in a well-documented manner so that
> the developer can know how this code could be transformed (and
> only the developer should be allowed to enable such optimizations).
>

As a general point about unsafe optimisations or ones that cause 
deviations from the standards, I think it is important that such 
features are not enabled by default or by any general -O flag - and to 
my knowledge, gcc always follows that rule.

But I think compiler flags are a suitable choice as they make it easier 
for the user to apply them to the program as a whole.  An example of 
this would be "-fshort-double" or "-fsingle-precision-constant".  These 
specifically tell the compiler to generate non-conforming code, but are 
very useful for targets that have floating-point hardware for 
single-precision but not for double-precision.

>> I would still say that most floating point code does not need such
>> control, and that situations where it matters are rather
>> specialised.
>
> I think that if this were true, there would have never been an
> IEEE 754 standard.
>

The IEEE 754 standard is like many other such standards - it is very 
useful, perhaps critically so, to some users.  For others, it's just a 
pain.

I work with embedded systems.  If the code generators (of gcc or the 
other compilers I use) and the toolchain libraries conform strictly to 
IEEE 754, then the code can often be significantly slower with no 
possible benefits.  If I am working with a motor controller, I don't 
want the system to run slower just to get conforming behaviour if the 
speed is infinite, the position is denormal and the velocity might be 
positive or negative zero.  I don't care if the motor's position is 
rounded up or down to the nearest nanometer - but I /do/ care if it 
takes an extra microsecond to make that decision.

For processors which do not have hardware floating point support, IEEE 
754 is often not the best choice of format.  It takes time to pack and 
unpack the fields.  But in almost all cases, it's what compilers use - 
because that's what the C standards require.  Users would often prefer faster but 
non-conforming floating point, and they certainly don't want to waste 
time with non-normal numbers.

Again, I state my qualifications - I'm just a user, and have no 
statistics from other users to back up my claims.  My own opinions are 
my own, of course, but my extrapolations to other users are only based 
on how I've seen floating point being used.

My belief is that for the great majority of users and uses, floating 
point is used as rough numbers.  People use them knowing they are 
inaccurate, and almost never knowing or caring about exactly how 
accurate or inaccurate they are.  They use them for normal, finite 
numbers.  Such users mostly do not know or care what IEEE 754 is.

There are, of course, a percentage of users who think of floating point 
numbers as precise.  They are mistaken - and would be mistaken with or 
without IEEE 754.


>> But that's just my unfounded opinion - judging from your signature
>> you /do/ need such tight control in your work, while I've only
>> learned today that "-ffast-math" has effects other than possibly
>> changing the generated code.
>
> This is on a different matter, but you can look at
>
>    http://gcc.gnu.org/bugzilla/show_bug.cgi?id=323
>
> (see also the huge number of duplicates). Many people complain
> about floating-point when it gives "unexpected" results...
>

That's probably one of the most common mistakes with floating point - 
the belief that two floating point numbers can be equal just because 
mathematically they should be.  This "bug" is not, IMHO, a bug - it's a 
misunderstanding of floating point.  Making "-Wfloat-equal" a default 
flag would eliminate many of these mistakes.
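
A classic illustration of that mistake, for what it's worth:

#include <stdio.h>

int main(void)
{
        double x = 0.1 + 0.2;
        /* mathematically equal to 0.3, but prints "no" with binary doubles */
        printf("0.1 + 0.2 == 0.3 ? %s\n", x == 0.3 ? "yes" : "no");
        return 0;
}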


mvh.,

David

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 15:05                   ` Vincent Lefevre
  2011-12-20 15:44                     ` David Brown
@ 2011-12-20 15:45                     ` Jeff Kenton
  1 sibling, 0 replies; 56+ messages in thread
From: Jeff Kenton @ 2011-12-20 15:45 UTC (permalink / raw)
  To: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 12/20/2011 09:24 AM, Vincent Lefevre wrote:
> On 2011-12-20 14:57:16 +0100, David Brown wrote:
>> On 20/12/2011 14:43, Vincent Lefevre wrote:
>>> I disagree: the operations could be written in an order to avoid some
>>> inaccuracies (such as huge cancellations) or to emulate more precision
>>> (e.g. via Veltkamp's splitting) or to control the rounding (see some
>>> rint() implementation http://sourceware.org/bugzilla/show_bug.cgi?id=602
>>> for instance). On such code, unsafe optimizations could yield problems.
>> I guess that's why it's an option - then we can choose.
> Really, it should have never been an option since it may produce
> incorrect code. Such kinds of optimization should have only been
> enabled via pragmas, and only in a well-documented manner so that
> the developer can know how this code could be transformed (and
> only the developer should be allowed to enable such optimizations).
>
>> I would still say that most floating point code does not need such
>> control, and that situations where it matters are rather
>> specialised.
> I think that if this were true, there would have never been an
> IEEE 754 standard.
>
>

This argument has been going on since (or before) the IEEE 754 standard 
was approved.  Mathematical purists on one side and 
just-make-it-run-fast pragmatists on the other.  We won't solve it here.

--jeff

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 15:44                     ` David Brown
@ 2011-12-20 16:18                       ` Vincent Lefevre
  2011-12-20 22:32                         ` David Brown
  2011-12-21  1:19                       ` Miles Bader
  1 sibling, 1 reply; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-20 16:18 UTC (permalink / raw)
  To: David Brown; +Cc: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 2011-12-20 16:04:05 +0100, David Brown wrote:
> But I think compiler flags are a suitable choice as they make it
> easier for the user to apply them to the program as a whole.

This can be done with pragmas too.

> An example of this would
> be "-fshort-double" or "-fsingle-precision-constant".  These specifically
> tell the compiler to generate non-conforming code, but are very useful for
> targets that have floating-point hardware for single-precision but not for
> double-precision.

If the developer wants to allow the user to choose the precision
(or the type), he can do something like:

#ifndef FP_TYPE
/* default */
# define FP_TYPE double
#endif
typedef FP_TYPE fp_t;

or something more restrictive. At least, it is clear that the choice
of the precision is allowed in the program.

> >>I would still say that most floating point code does not need such
> >>control, and that situations where it matters are rather
> >>specialised.
> >
> >I think that if this were true, there would have never been an
> >IEEE 754 standard.
> >
> 
> The IEEE 754 standard is like many other such standards - it is very useful,
> perhaps critically so, to some users.  For others, it's just a pain.
> 
> I work with embedded systems. [...]

OK, so I would say that embedded systems are very specific and do not
concern most floating-point code. However, it's a pity that people from
the embedded world didn't take part in the discussion of IEEE 754R.
IIRC the only discussions about embedded systems concerned FTZ, and
IEEE 754-2008 has something similar (substitute / abruptUnderflow).

> My belief is that for the great majority of users and uses, floating point
> is used as rough numbers.

But the IEEE 754 standard is also a benefit for them (reproducibility
of the results, portability, detection of potential problems thanks
to exceptions...).

> People use them knowing they are inaccurate, and almost never
> knowing or caring about exactly how accurate or inaccurate they are.
> They use them for normal, finite numbers. Such users mostly do not
> know or care what IEEE 754 is.

Still, -ffast-math can be inappropriate for them due to enabled
options like -fassociative-math. Exception handling can allow them
to detect bugs or other problems.

> That's probably one of the most common mistakes with floating point
> - the belief that two floating point numbers can be equal just
> because mathematically they should be.

No, the problem is more than that, it's also about the consistency.
Assuming that

int foo (void)
{
  double x = 1.0/3.0;
  return x == 1.0/3.0;
}

returns true should not be regarded as a mistake (even though it might
be allowed to return false for some reasons, the default should be
true).

> This "bug" is not, IMHO, a bug - it's a misunderstanding of floating
> point.

Despite some users who may misunderstand FP, it's really a bug that
affects correct code. And the fact that some users do not understand
FP is not a reason to make their life difficult.

> Making "-Wfloat-equal" a default flag would eliminate many of these
> mistakes.

No. This bug is due to the extended precision. Floating-point code
affected by rounding errors vs discontinuous functions (such as ==)
will have problems, whether extended precision is used or not.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 13:02               ` Vincent Lefevre
@ 2011-12-20 19:51                 ` Segher Boessenkool
  2011-12-20 21:02                   ` Vincent Lefevre
  0 siblings, 1 reply; 56+ messages in thread
From: Segher Boessenkool @ 2011-12-20 19:51 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

>> Dunno about that(*), but you can portably do
>>
>> fesetround(FE_TOWARDZERO);
>>
>> and that prevents the problem from occurring as well.
>
> But this is a bad idea in general, as it would affect all FP operations,
> not just subnormals (when one gets subnormals, one has already lost much
> precision, so that FTZ may not be a problem in practice -- this depends
> on the application, of course).

*Any* specific rounding mode is a bad idea *in general*.  It all
depends on your algorithm.  The same is true for flush-to-zero:
for some algorithms it is great, for others it is disastrous.

For the OP's example, rounding towards zero does not give less
precise results.  To get accurate results requires (much) more
work.

In either case, the OP has his answer: the big slowdown he is
seeing is because his CPU handles calculations with denormals
much more slowly than it handles normal numbers.  And in this
thread various ways to avoid denormals have been pointed out.
Which of those is best for his particular actual problem is not
something we can answer.
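
For reference, a minimal sketch of one of those ways, the
flush-to-zero route on x86 (a sketch only; it assumes the hot code is
compiled for SSE, e.g. -mfpmath=sse -msse2, since the MXCSR bits do
not affect x87 arithmetic, which is the i686 default):

#include <xmmintrin.h>   /* _MM_SET_FLUSH_ZERO_MODE, needs SSE */

/* Sketch: make SSE arithmetic flush results that would be subnormal
   to zero.  This is roughly what -ffast-math arranges at program
   startup (via crtfastmath.o) on this target. */
static void enable_ftz(void)
{
        _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
}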


Segher

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 19:51                 ` Segher Boessenkool
@ 2011-12-20 21:02                   ` Vincent Lefevre
  2011-12-21  4:36                     ` Segher Boessenkool
  0 siblings, 1 reply; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-20 21:02 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 2011-12-20 17:20:32 +0100, Segher Boessenkool wrote:
> *Any* specific rounding mode is a bad idea *in general*.

No, rounding to nearest will give the best accuracy on average
(which is preferable for most algorithms). That's why it's the
default rounding mode.

> It all depends on your algorithm.

Algorithms that use directed rounding generally need to mix
different modes. So, setting one particular mode won't help
in such a case.

> The same is true for flush-to-zero: for some algorithms it is great,
> for others it is disastrous.

I would say that in general, algorithms that work with FTZ also
work with the usual default rounding. The main advantage of FTZ is
that it is faster.

> For the OP's example, rounding towards zero does not give less
> precise results.

I don't think this is true (or can you explain why?).

Also note that in general, with rounding to nearest, the errors will
tend to compensate, partly. This is not true for directed rounding,
except for particular algorithms.

> To get accurate results requires (much) more work.

If one wants accurate results to around 1 ulp, yes. However, without
this work, rounding to nearest is a bit better than directed rounding.

> In either case, the OP has his answer: the big slowdown he is
> seeing is because his CPU handles calculations with denormals
> much more slowly than it handles normal numbers.  And in this
> thread various ways to avoid denormals have been pointed out.
> Which of those is best for his particular actual problem is not
> something we can answer.

Trapping the underflow exception may be another solution, e.g. to
branch to specific code when the first underflow occurs, but this
requires some work.
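
A rough sketch of that idea, using the C99 flag interface rather than
a real trap handler (strictly this wants #pragma STDC FENV_ACCESS ON,
which GCC does not implement, so treat it as best-effort; the loop
body mirrors the one from the original post):

#include <fenv.h>

/* Sketch: run one pass of the hot loop, then check whether anything
   underflowed; the caller can then reset the state or switch to a
   different algorithm before the subnormal slowdown dominates. */
static int step_and_check(double *a, double *b, const double *f, int n)
{
        int i;

        feclearexcept(FE_UNDERFLOW);
        for (i = 0; i < n; i++) {
                a[i] += b[i] * f[i];
                b[i] -= a[i] * f[i];
        }
        return fetestexcept(FE_UNDERFLOW) != 0;
}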

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 16:18                       ` Vincent Lefevre
@ 2011-12-20 22:32                         ` David Brown
  2011-12-23 20:11                           ` Vincent Lefevre
  0 siblings, 1 reply; 56+ messages in thread
From: David Brown @ 2011-12-20 22:32 UTC (permalink / raw)
  To: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help


As someone else has pointed out, this is an age-old argument that will 
never be settled.  I have no hopes of changing anyone's opinion here, 
but perhaps the exchange of opinions will be informative.

On 20/12/11 17:14, Vincent Lefevre wrote:
> On 2011-12-20 16:04:05 +0100, David Brown wrote:
>> But I think compiler flags are a suitable choice as they make it
>> easier for the user to apply them to the program as a whole.
>
> This can be done with pragmas too.
>
>> An example of this would
>> be "-fshort-double" or "-fsingle-precision-constant".  These specifically
>> tell the compiler to generate non-conforming code, but are very useful for
>> targets that have floating-point hardware for single-precision but not for
>> double-precision.
>
> If the developer wants to allow the user to choose the precision
> (or the type), he can do something like:
>
> #ifndef FP_TYPE
> /* default */
> # define FP_TYPE double
> #endif
> typedef FP_TYPE fp_t;
>
> or something more restrictive. At least, it is clear that the choice
> of the precision is allowed in the program.
>

I understand what you are saying here - and I agree that it's very 
important that any such choice is made clear and explicit.  But that's 
why a program's makefile is part of its source code - compiler flags are 
an essential part of the source.  Sometimes making use of specific types 
like this is a better way, but not necessarily - it may make the code 
less clear, it leaves more chance of accidentally mixing in doubles 
(such as from a constant without the "f" suffix), and it means you can't 
mix in pre-written code that uses "double" without changing it.  The 
"-fshort-double" flag is a "use once and forget" way to get much faster 
code (on relevant cpus, of course) at the cost of lost precision.
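
A tiny sketch of that "accidental double" point (hypothetical function
names; without the f suffix the constant is a double, so the whole
expression is evaluated in double unless something like
-fsingle-precision-constant is in effect):

float scale_slow(float x)
{
        return x * 0.8;    /* 0.8 is a double constant: x is promoted, the
                              multiply is done in double, and the result
                              converted back to float */
}

float scale_fast(float x)
{
        return x * 0.8f;   /* stays in single precision throughout */
}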


>>>> I would still say that most floating point code does not need such
>>>> control, and that situations where it matters are rather
>>>> specialised.
>>>
>>> I think that if this were true, there would have never been an
>>> IEEE 754 standard.
>>>
>>
>> The IEEE 754 standard is like many other such standards - it is very useful,
>> perhaps critically so, to some users.  For others, it's just a pain.
>>
>> I work with embedded systems. [...]
>
> OK, so I would say that embedded systems are very specific and do not
> concern most floating-point code. However it's a pity that people from
> the embedded world didn't take part of the discussion of IEEE 754R.
> IIRC the only discussions about embedded systems concerned FTZ, and
> IEEE 754-2008 has something similar (substitute / abruptUnderflow).
>

The embedded world doesn't tend to have good representation on these 
sorts of things - I agree it's a pity.

And while there are relatively few embedded programmers compared to "big 
system" programmers, it's still very relevant to gcc.  Most gcc users 
may be x86/amd64 target users, but the majority of gcc targets are for 
embedded systems, and they are part of what drives gcc onwards.

>> My belief is that for the great majority of users and uses, floating point
>> is used as rough numbers.
>
> But the IEEE 754 standard is also a benefit for them (reproducibility
> of the results, portability, detection of potential problems thanks
> to exceptions...).
>

The programs would be equally portable if the last few bits of 
calculations varied, or the rounding was different on different 
machines.  And people do not expect bit-perfect repeatability of 
floating point, especially not across targets.

In my opinion, code that relies on exceptions to spot errors in 
calculations is normally bad code.  You don't do a division and handle 
divide-by-zero errors - you write code so that you never try to divide 
by zero.  At best, such exceptions are an aid during debugging.

>> People use them knowing they are inaccurate, and almost never
>> knowing or caring about exactly how accurate or inaccurate they are.
>> They use them for normal, finite numbers. Such users mostly do not
>> know or care what IEEE 754 is.
>
> Still, -ffast-math can be inappropriate for them due to enabled
> options like -fassociative-math. Exception handling can allow them
> to detect bugs or other problems.
>

Again, -fassociative-math is not a problem for code that does sensible 
calculations.  It's theoretical only, or for people who want to insist 
on bit-perfect repeatability of their floating point code.  The example 
case given in the gcc manual is "(x + 2**52) - 2**52".  Assuming the 
implication is that x is a small number, there are almost no real-world 
circumstances when such code exists.  And if you really want to add or 
subtract numbers that differ by 52 orders of binary magnitude, and are 
interested in accurate results, you don't use "float" or "double" types 
anyway.

And again about exceptions - I work in a world where bugs cause big 
problems.  It would not be acceptable to have an attitude that you do 
the calculations regardless of the input, and let exception handlers (or 
signal handlers) clear up the mess once it all flops.  The rule is 
garbage in, garbage out - if you don't want garbage out, don't put 
garbage in.  Then the guts of your code can be written to assume that 
the input is valid, and the output will be valid.  While embedded 
programmers are often more vigilant than desktop programmers, I fail to 
see why any other sort of programmer would accept floating point 
exceptions as appropriate behaviour for code.

>> That's probably one of the most common mistakes with floating point
>> - the belief that two floating point numbers can be equal just
>> because mathematically they should be.
>
> No, the problem is more than that, it's also about the consistency.
> Assuming that
>
> int foo (void)
> {
>    double x = 1.0/3.0;
>    return x == 1.0/3.0;
> }
>
> returns true should not be regarded as a mistake (even though it might
> be allowed to return false for some reasons, the default should be
> true).
>

You should not be able to rely on code like that giving either true or 
false consistently.  You say yourself it "should return true" but "might 
return false".  The code is therefore useless, and it would be a good 
idea for the compiler to warn about it.

>> This "bug" is not, IMHO, a bug - it's a misunderstanding of floating
>> point.
>
> Despite some users who may misunderstand FP, it's really a bug that
> affects correct code. And the fact that some users do not understand
> FP is not a reason to make their life difficult.
>
>> Making "-Wfloat-equal" a default flag would eliminate many of these
>> mistakes.
>
> No. This bug is due to the extended precision. Floating-point code
> affected by rounding errors vs discontinuous functions (such as ==)
> will have problems, whether extended precision is used or not.
>

Yes, I understand the nature of the bug mentioned - floating point 
hardware may use more than "double" precision during calculations.  But 
I don't see why it should be treated any differently from any other 
mistaken attempts at comparing floating point numbers.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 15:44                     ` David Brown
  2011-12-20 16:18                       ` Vincent Lefevre
@ 2011-12-21  1:19                       ` Miles Bader
  2011-12-21  2:19                         ` David Brown
  1 sibling, 1 reply; 56+ messages in thread
From: Miles Bader @ 2011-12-21  1:19 UTC (permalink / raw)
  To: David Brown; +Cc: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

David Brown <david@westcontrol.com> writes:
> Making "-Wfloat-equal" a default flag would eliminate many of these
> mistakes.

It also results in false positives, so it shouldn't be on by default.

[E.g. "float x = 0;  .... y = x; ... if (y == 0) ..." should not result
in a warning.]

-Miles

-- 
Hers, pron. His.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-21  1:19                       ` Miles Bader
@ 2011-12-21  2:19                         ` David Brown
  2011-12-21  4:03                           ` Miles Bader
  0 siblings, 1 reply; 56+ messages in thread
From: David Brown @ 2011-12-21  2:19 UTC (permalink / raw)
  To: Miles Bader; +Cc: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 21/12/11 01:27, Miles Bader wrote:
> David Brown<david@westcontrol.com>  writes:
>> Making "-Wfloat-equal" a default flag would eliminate many of these
>> mistakes.
>
> It also results in false positives, so it shouldn't be on by default.
>
> [E.g. "float x = 0;  .... y = x; ... if (y == 0) ..." should not result
> in a warning.]
>
> -Miles
>

I gather (from the bug report mentioned by Vincent Lefevre) that code 
like this will not always give the result you expect - so the compiler 
should definitely warn in such cases.  (Alternatively the compiler could 
be changed so that such code is always consistent regardless of 
architecture, floating point options, and optimisation options - I 
suspect giving a warning is easier.)

mvh.,

David

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-21  2:19                         ` David Brown
@ 2011-12-21  4:03                           ` Miles Bader
  2011-12-21  8:32                             ` David Brown
  0 siblings, 1 reply; 56+ messages in thread
From: Miles Bader @ 2011-12-21  4:03 UTC (permalink / raw)
  To: David Brown; +Cc: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

David Brown <david.brown@hesbynett.no> writes:
>> David Brown<david@westcontrol.com>  writes:
>>> Making "-Wfloat-equal" a default flag would eliminate many of these
>>> mistakes.
>>
>> It also results in false positives, so it shouldn't be on by default.
>>
>> [E.g. "float x = 0;  .... y = x; ... if (y == 0) ..." should not result
>> in a warning.]
>>
>> -Miles
>>
>
> I gather (from the bug report mentioned by Vincent Lefevre) that code
> like this will not always give the result you expect - so the compiler
> should definitely warn in such cases. 

No.  The bug cited by Vincent is a completely different case.

Calculation and assignment are not the same.

-Miles

-- 
Quotation, n. The act of repeating erroneously the words of another. The words
erroneously repeated.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 21:02                   ` Vincent Lefevre
@ 2011-12-21  4:36                     ` Segher Boessenkool
  2011-12-21  6:15                       ` Segher Boessenkool
  0 siblings, 1 reply; 56+ messages in thread
From: Segher Boessenkool @ 2011-12-21  4:36 UTC (permalink / raw)
  To: Vincent Lefevre; +Cc: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

[snip stuff I don't agree with, or rather, we probably agree but
have a different idea of what "in general" means; it all borders
on philosophy, and that is very off-topic here, so let's not :-) ]

>> For the OP's example, rounding towards zero does not give less
>> precise results.
>
> I don't think this is true (or can you explain why?).

First off, only a very small range of inputs does not completely
explode to 0 or Inf, of course.  Or the smallest denormal ;-)

Let me give you a testcase:

--- 8< ---
#include <stdio.h>
#include <math.h>
#include <fenv.h>

#define S 20000000

static double calc(double f)
{
         int j;
         double a = 0;
         double b = 1;

         for(j = 0; j < S; j++) {
                 a = b * f;
                 b = a * f;
         }

         return a;
}

int main(void)
{
         double x = 1 - 1./(2*S);

         double c1 = calc(x);
         fesetround(FE_TOWARDZERO);
         double c2 = calc(x);

         printf("nearest = %.20g\n", c1);
         printf("down    = %.20g\n", c2);

         return 0;
}
--- 8< ---

It outputs:

nearest = 0.36787944637187158792
down    = 0.36787944526181193261

while we have

correct = 0.36787943657294925905

so both rounding modes lose about half of the available precision.
Round to zero actually did slightly better.

Now you can say that this is because x already is rounded from the
true value, this however is not the case: taking S=2**25, we get

nearest = 0.36787944391225085860
down    = 0.36787944204965455918

with

correct = 0.36787943843052687818


For 2**N multiplies, you lose about N bits of precision with
either rounding mode.

>> To get accurate results requires (much) more work.
>
> If one wants accurate results to around 1 ulp, yes. However, without
> this work, rounding to nearest is a bit better than directed rounding.

I actually thought in terms of single precision numbers, which would
give totally bogus results; but 64-bit loses more than half of the
bits already.  Surprisingly round to zero did a bit better, I didn't
expect that either (I thought it would be a bit or maybe two worse).
Rats, now I have to look it up, there must be literature on this :-)

> Trapping the underflow exception may be another solution, e.g. to
> branch to specific code when the first underflow occurs, but this
> requires some work.

And in a real calculation, very much more work!


Segher

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-21  4:36                     ` Segher Boessenkool
@ 2011-12-21  6:15                       ` Segher Boessenkool
  2011-12-23 20:25                         ` Vincent Lefevre
  0 siblings, 1 reply; 56+ messages in thread
From: Segher Boessenkool @ 2011-12-21  6:15 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Vincent Lefevre, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

...And then, of course, I discover the program does not calculate
f**(2*S) but f**(2*S-1).  Which makes my "correct" results not so
very correct.  Ouch.

> It outputs:
>
> nearest = 0.36787944637187158792
> down    = 0.36787944526181193261
>
> while we have

correct = 0.36787944576993540330

so both rounding modes are about equal.

> Now you can say that this is because x already is rounded from the
> true value,

... and this is indeed ...

> the case: taking S=2**25, we get
>
> nearest = 0.36787944391225085860
> down    = 0.36787944204965455918
>
> with

correct = 0.36787944391235777181

so in that case round towards zero gets twice as many bits
wrong as round to nearest, just as a naive analysis would show.


Segher

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-21  4:03                           ` Miles Bader
@ 2011-12-21  8:32                             ` David Brown
  2011-12-21  9:02                               ` Miles Bader
  2011-12-23 21:15                               ` Vincent Lefevre
  0 siblings, 2 replies; 56+ messages in thread
From: David Brown @ 2011-12-21  8:32 UTC (permalink / raw)
  To: Miles Bader; +Cc: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 21/12/2011 03:18, Miles Bader wrote:
> David Brown<david.brown@hesbynett.no>  writes:
>>> David Brown<david@westcontrol.com>   writes:
>>>> Making "-Wfloat-equal" a default flag would eliminate many of these
>>>> mistakes.
>>>
>>> It also results in false positives, so it shouldn't be on by default.
>>>
>>> [E.g. "float x = 0;  .... y = x; ... if (y == 0) ..." should not result
>>> in a warning.]
>>>
>>> -Miles
>>>
>>
>> I gather (from the bug report mentioned by Vincent Lefevre) that code
>> like this will not always give the result you expect - so the compiler
>> should definitely warn in such cases.
>
> No.  The bug cited by Vincent is a completely different case.
>
> Calculation and assignment are not the same.
>
> -Miles
>

My impression was that they were the same, or at least related, since 
the calculations in the examples would probably be done in advance by 
the compiler and reduced to simple assignments.

But I expect that you know the details a lot better than me.  If the 
compiler can guarantee consistent and expected results in cases like 
yours involving simple assignments, then it would make sense to change 
the "-Wfloat-equal" not to trigger in such situations.  After all, the 
point of the warning is to help users avoid code that might not do what 
it seems to do - if it /does/ do the expected thing, then there is no 
need of a warning.

In other words, the best course (IMHO) is to fix -Wfloat-equal to 
eliminate common false positives, and /then/ enable it by default.

mvh.,

David

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-21  8:32                             ` David Brown
@ 2011-12-21  9:02                               ` Miles Bader
  2011-12-21  9:23                                 ` David Brown
  2011-12-23 21:15                               ` Vincent Lefevre
  1 sibling, 1 reply; 56+ messages in thread
From: Miles Bader @ 2011-12-21  9:02 UTC (permalink / raw)
  To: David Brown; +Cc: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

2011/12/21 David Brown <david@westcontrol.com>:
> My impression was that they were the same, or at least related, since the
> calculations in the examples would probably be done in advance by the
> compiler and reduced to simple assignments.

You can't depend on the values the compiler calculates at compile-time
any more than you can depend on values calculated at runtime (after
all, a decent compiler will strictly emulate the behavior of the
target hardware when doing constant folding).

For constant values, especially those meeting certain criteria, things
are simpler.  For instance, [positive] zero or smallish positive
integer values can be exactly represented by any reasonable
floating-point format, and no amount of copying around is going to
change them, even given  typical FPU wackiness.

> If the
> compiler can guarantee consistent and expected results in cases like yours
> involving simple assignments, then it would make sense to change the
> "-Wfloat-equal" not to trigger in such situations.  After all, the point of
> the warning is to help users avoid code that might not do what it seems to
> do - if it /does/ do the expected thing, then there is no need of a warning.
>
> In other words, the best course (IMHO) is to fix -Wfloat-equal to eliminate
> common false positives, and /then/ enable it by default.

The problem is that in many cases you simply can't tell, because you
don't know how a particular value was produced.

For instance, if you have code like:

   float var = zot ();
   if (var == 0.f)
      blahblah ();

You often don't know what "zot" did to calculate its return value.
If it just did "return 0.f;", then you're golden -- the comparison is
fine.  However, if it did "return 1.f - fsin (1.f);", wellllll.....

One thing that might work would be to have the warning code look at
the local context before the comparison, and only warn if it saw that
one of values being compared is actually calculated, rather than being
a constant or coming from some non-local source.  This would result in
a lot of false negatives, but that's better than false positives.

-miles

-- 
Cat is power.  Cat is peace.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-21  9:02                               ` Miles Bader
@ 2011-12-21  9:23                                 ` David Brown
  2011-12-21 11:58                                   ` Miles Bader
  0 siblings, 1 reply; 56+ messages in thread
From: David Brown @ 2011-12-21  9:23 UTC (permalink / raw)
  To: Miles Bader; +Cc: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 21/12/2011 09:31, Miles Bader wrote:
> 2011/12/21 David Brown<david@westcontrol.com>:
>> My impression was that they were the same, or at least related, since the
>> calculations in the examples would probably be done in advance by the
>> compiler and reduced to simple assignments.
>
> You can't depend on the values the compiler calculates at compile-time
> anymore than you can depend on values calculated at runtime (after
> all, a decent compiler will strictly emulate the behavior of the
> target hardware when doing constant folding).
>
> For constant values, especially those meeting certain criteria, things
> are simpler.  For instance, [positive] zero or smallish positive
> integer values can be exactly represented by any reasonable
> floating-point format, and no amount of copying around is going to
> change them, even given  typical FPU wackiness.
>
>> If the
>> compiler can guarantee consistent and expected results in cases like yours
>> involving simple assignments, then it would make sense to change the
>> "-Wfloat-equal" not to trigger in such situations.  After all, the point of
>> the warning is to help users avoid code that might not do what it seems to
>> do - if it /does/ do the expected thing, then there is no need of a warning.
>>
>> In other words, the best course (IMHO) is to fix -Wfloat-equal to eliminate
>> common false positives, and /then/ enable it by default.
>
> The problem is that in many cases you simply can't tell, because you
> don't know how a particular value was produced.
>
> For instance, if you have code like:
>
>     float var = zot ();
>     if (var == 0.f)
>        blahblah ();
>
> You often don't know what "zot" did to calculate its return value.
> If it just did "return 0.f;", then you're golden -- the comparison is
> fine.  However, if it did "return 1.f - fsin (1.f);", wellllll.....
>
> One thing that might work would be to have the warning code look at
> the local context before the comparison, and only warn if it saw that
> one of values being compared is actually calculated, rather than being
> a constant or coming from some non-local source.  This would result in
> a lot of false negatives, but that's better than false positives.
>
> -miles
>

I understand the problem you are describing, and the challenges of it. 
But I think I disagree a little on what you describe as false positives, 
and how much of a problem they are.  Basically, I believe that if you 
can't be sure that the code is consistent and correct in all 
circumstances, including different optimisation levels, then it's bad 
code - and it is fair to give a warning.  So I would want a warning on 
your "zot" code regardless of how it is implemented - to my mind, that's 
not a false positive even if zot() were a macro expanding to "0.f". 
It's bad code style to rely on it, so give a warning.

I am more a fan of strict warnings than most people, and have often said 
most warnings should be enabled by default - though I know perfectly 
well that they won't be.  I'm just evangelising for safer coding 
practices, as I see them.  In reality, of course, I'll just keep the 
warning flag in my own makefiles.

mvh.,

David

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-21  9:23                                 ` David Brown
@ 2011-12-21 11:58                                   ` Miles Bader
  2011-12-21 16:49                                     ` David Brown
  0 siblings, 1 reply; 56+ messages in thread
From: Miles Bader @ 2011-12-21 11:58 UTC (permalink / raw)
  To: David Brown; +Cc: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

2011/12/21 David Brown <david@westcontrol.com>:
> think I disagree a little on what you describe as false positives, and how
> much of a problem they are.  Basically, I believe that if you can't be sure
> that the code is consistent and correct in all circumstances, including
> different optimisation levels, then it's bad code - and it is fair to give a
> warning.  So I would want a warning on your "zot" code regardless of how it
> is implemented - to my mind, that's not a false positive even if zot() were
> a macro expanding to "0.f". It's bad code style to rely on it, so give a
> warning.

You seemed to have misunderstood what I'm trying to illustrate by the
two versions of "zot."

They are examples of two _fundamentally different_ operations.  If zot
returns a constant 0.f in certain cases, obviously it needs to make
that fact a documented part of its interface ("zot returns an exact
value 0.f when XXX happens").  The "return ...sin..." case, on the
other hand, is an arbitrary calculation, for which the interface of
zot would not make such guarantees.

Given the first version of zot, any subsequent comparison with
0.f is _not_ relying on implementation details, it's relying on
a documented guarantee.  That's certainly not "bad coding style."

However, the compiler has no way of knowing about this guarantee, so
any warning code needs to be conservative, and only warn when it's
quite clear that something dodgy is happening.

-Miles

-- 
Cat is power.  Cat is peace.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-21 11:58                                   ` Miles Bader
@ 2011-12-21 16:49                                     ` David Brown
  2011-12-22  3:23                                       ` Miles Bader
  0 siblings, 1 reply; 56+ messages in thread
From: David Brown @ 2011-12-21 16:49 UTC (permalink / raw)
  To: Miles Bader; +Cc: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 21/12/2011 10:22, Miles Bader wrote:
> 2011/12/21 David Brown<david@westcontrol.com>:
>> think I disagree a little on what you describe as false positives, and how
>> much of a problem they are.  Basically, I believe that if you can't be sure
>> that the code is consistent and correct in all circumstances, including
>> different optimisation levels, then it's bad code - and it is fair to give a
>> warning.  So I would want a warning on your "zot" code regardless of how it
>> is implemented - to my mind, that's not a false positive even if zot() were
>> a macro expanding to "0.f". It's bad code style to rely on it, so give a
>> warning.
>
> You seemed to have misunderstood what I'm trying to illustrate by the
> two versions of "zot."
>
> They are examples of two _fundamentally different_ operations.  If zot
> returns a constant 0.f in certain cases, obviously it needs to make
> that fact a documented part of its interface ("zot returns an exact
> value 0.f when XXX happens").   The "return ...sin..." case, on the
> other hand is an arbitrary calculation, for which interface of zot
> would not make such guarantees.
>
> Given the first version of zot, then, any subsequent comparison with
> 0.f, then, is _not_ relying on implementation details, it's relying on
> a documented guarantee.  That's certainly not "bad coding style."
>
> However, the compiler has no way of knowing about this guarantee, so
> any warning code needs to be conservative, and only warn when it's
> quite clear that something dodgy is happening.
>

I think our key difference of opinion here is that you want a warning 
only when it is quite clear that "something dodgy" is happening - I want 
the warning unless it is quite clear that something dodgy is /not/ 
happening.  I think it is bad coding style to write something that is 
not clearly correct - "might be right, might be dodgy" is, to me, simply 
"dodgy" and I want the warning.

So I'm pretty happy with the -Wfloat-equal flag - as I see it, there are 
no false positives.  One improvement might be to override it with double 
parentheses, so that "if ((x == 0.1f))" would not give a warning (in the 
same way that "if ((x = nextChar()))" avoids the "-Wparentheses" warning).

mvh.,

David


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-21 16:49                                     ` David Brown
@ 2011-12-22  3:23                                       ` Miles Bader
  0 siblings, 0 replies; 56+ messages in thread
From: Miles Bader @ 2011-12-22  3:23 UTC (permalink / raw)
  To: David Brown; +Cc: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

2011/12/21 David Brown <david@westcontrol.com>:
> I think our key difference of opinion here is that you want a warning only
> when it is quite clear that "something dodgy" is happening - I want the
> warning unless it is quite clear that something dodgy is /not/ happening.

Er, well feel free to turn on the warning for code you write...

But the _default_ settings (and what's included in "catch-all" options
like -Wall) are something altogether different.  They are a balancing
act, and must take into account many different programming styles, and
avoid excessive false positives[1], which are quite annoying,
especially when there isn't a clear alternative to what's being warned
about[2].  False negatives are less of an issue, as typically
"something is better than nothing"...

[1] Yes, in this case, a warning about a valid comparison _is_ a false
positive; the warning says "this is unsafe" for a comparison that is
safe

[2] E.g., "if (x = ...)" can be trivially rewritten as "x = ..; if
(x)", and doing so generally improves code readability without any
runtime penalty; however this isn't true for all constructs

-miles

-- 
Cat is power.  Cat is peace.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-20 22:32                         ` David Brown
@ 2011-12-23 20:11                           ` Vincent Lefevre
  2011-12-24  7:38                             ` Vincent Lefevre
  2011-12-24 11:11                             ` David Brown
  0 siblings, 2 replies; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-23 20:11 UTC (permalink / raw)
  To: David Brown; +Cc: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 2011-12-20 22:52:12 +0100, David Brown wrote:
> I understand what you are saying here - and I agree that it's very important
> that any such choice is made clear and explicit.  But that's why a program's
> makefile is part of its source code - compiler flags are an essential part
> of the source.

Very often there isn't a static makefile, for instance for all those
programs built via the autotools: the compiler options are provided
by the one who builds the program. Note that there have already been
complaints because some program gave incorrect results due to the
-ffast-math option provided by a 3rd-party (who didn't know that it
shouldn't have been used for this program).

BTW, I've found in my archives the following example I had posted:

#include <stdlib.h>

float x = 30.0;
int main()
{
  if ( 90.0/x != 3.0)
    abort();
  return 0;
}

fails with -ffast-math (on x86).

Note that more generally, depending on the application, one may be
able to prove that the division is exact (e.g. when computing some
determinant on simple data), so that the != should be perfectly safe
in such a case.

[...]
> The programs would be equally portable if the last few bits of calculations
> varied, or the rounding was different on different machines.

No, even if these last few bits are meaningless, full reproducibility
may be important in some cases, e.g. for debugging, or for checking
the results on a different machine.

> And people do not expect bit-perfect repeatability of floating
> point, especially not across targets.

Even across targets: many people complained that they got different
results on x86 (where extended precision is used) and on other
platforms (as done in the LHC@home project of CERN).

> In my opinion, code that relies on exceptions to spot errors in calculations
> is normally bad code.  You don't do a division and handle divide-by-zero
> errors - you write code so that you never try to divide by zero.  At best,
> such exceptions are an aid during debugging.

In many scientific codes, avoiding such exceptions would mean a loss
of performance. With the IEEE 754 spec, the code may still be correct,
though. That's why the infinities have been introduced in the standard
(otherwise a NaN would have been sufficient for all such exceptions).

> Again, -fassociative-math is not a problem for code that does sensible
> calculations.

Wrong. It is a problem in various codes.

>  It's theoretical only, or for people who want to insist on
> bit-perfect repeatability of their floating point code. The example
> case given in the gcc manual is "(x + 2**52) - 2**52". Assuming the
> implication is that x is a small number, there are almost no
> real-world circumstances when such code exists.

Many codes will fail when such math expressions are rewritten. See for
instance the rint() implementation, and the codes whose goal is to
improve the accuracy of floating-point computations (these codes are
generally based on algorithms like TwoSum/FastTwoSum and Veltkamp's
splitting). For instance, see the IBM Accurate Mathematical Library,
which is part of the glibc.
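
For reference, a minimal sketch of the FastTwoSum building block
mentioned above (Dekker's algorithm; it assumes |a| >= |b| and
round-to-nearest, and the error term is exactly what reassociation
would throw away):

/* Sketch: s is the rounded sum and err the exact rounding error, so
   that a + b == s + err holds exactly.  Valid when |a| >= |b| under
   round-to-nearest. */
static void fast_two_sum(double a, double b, double *s, double *err)
{
        double z;

        *s   = a + b;
        z    = *s - a;
        *err = b - z;   /* algebraically b - ((a + b) - a) == 0, but
                           numerically it is the lost low-order part */
}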

> And if you really want to add or subtract numbers that differ by 52
> orders of binary magnitude, and are interested in accurate results,
> you don't use "float" or "double" types anyway.

Wrong again. I suggest that you read our "Handbook of Floating-Point
Arithmetic".

GNU MPFR could be used to do reliable FP arithmetic, but it should
not be seen as a replacement for hardware FP arithmetic, when simple
algorithms based on IEEE 754 arithmetic (as described in our book)
could solve problems.
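
As a small illustration of the difference (a sketch only, assuming
MPFR 3.x for the MPFR_RNDN name; build with something like
"gcc demo.c -lmpfr -lgmp"):

#include <stdio.h>
#include <mpfr.h>

int main(void)
{
        double d;
        mpfr_t x, s;

        d = (1e16 + 1.0) - 1e16;     /* the +1 is absorbed in 53-bit doubles */

        mpfr_inits2(100, x, s, (mpfr_ptr) 0);
        mpfr_set_d(x, 1e16, MPFR_RNDN);
        mpfr_add_ui(s, x, 1, MPFR_RNDN);
        mpfr_sub(s, s, x, MPFR_RNDN);

        printf("double: %g\n", d);        /* 0 */
        mpfr_printf("mpfr:   %Rg\n", s);  /* 1 */

        mpfr_clears(x, s, (mpfr_ptr) 0);
        return 0;
}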

> >>That's probably one of the most common mistakes with floating point
> >>- the belief that two floating point numbers can be equal just
> >>because mathematically they should be.
> >
> >No, the problem is more than that, it's also about the consistency.
> >Assuming that
> >
> >int foo (void)
> >{
> >   double x = 1.0/3.0;
> >   return x == 1.0/3.0;
> >}
> >
> >returns true should not be regarded as a mistake (even though it might
> >be allowed to return false for some reasons, the default should be
> >true).
> >
> 
> You should not be able to rely on code like that giving either true or false
> consistently.  You say yourself it "should return true" but "might return
> false".  The code is therefore useless, and it would be a good idea for the
> compiler to warn about it.

It is not useless. The developer should have a way to control this
(and the ISO C99 standard provides some ways of checking that, though
this is rather limited).

> >No. This bug is due to the extended precision. Floating-point code
> >affected by rounding errors vs discontinuous functions (such as ==)
> >will have problems, whether extended precision is used or not.
> 
> Yes, I understand the nature of the bug mentioned - floating point
> hardware may use more than "double" precision during calculations.
> But I don't see why it should be treated any differently from any
> other mistaken attempts at comparing floating point numbers.

Because in various cases, using == is not a mistake. Really, I expect
that on current machines, 14.0/7.0 == 2.0 evaluates to 1 (true), which
was unfortunately not always the case 30 years ago.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-21  6:15                       ` Segher Boessenkool
@ 2011-12-23 20:25                         ` Vincent Lefevre
  0 siblings, 0 replies; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-23 20:25 UTC (permalink / raw)
  To: Segher Boessenkool; +Cc: Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 2011-12-21 05:35:53 +0100, Segher Boessenkool wrote:
> ...And then, of course, I discover the program does not calculate
> f**(2*S) but f**(2*S-1).  Which makes my "correct" results not so
> very correct.  Ouch.
> 
> >It outputs:
> >
> >nearest = 0.36787944637187158792
> >down    = 0.36787944526181193261
> >
> >while we have
> 
> correct = 0.36787944576993540330
> 
> so both rounding modes are about equal.
> 
> >Now you can say that this is because x already is rounded from the
> >true value,
> 
> ... and this is indeed ...
> 
> >the case: taking S=2**25, we get
> >
> >nearest = 0.36787944391225085860
> >down    = 0.36787944204965455918
> >
> >with
> 
> correct = 0.36787944391235777181
> 
> so in that case round towards zero gets twice as many bits
> wrong as round to nearest, just as a naive analysis would show.

and ideally, you should test programs on many inputs, and also prove
error bounds (which will probably be pessimistic, but guaranteed).

Now, what I wanted to say is that in general, leaving the rounding
to nearest and setting FTZ would be much better than changing the
rounding mode globally. Another drawback of changing the rounding
mode globally is that some libraries may not support it; this is
the case of the math library from the glibc on x86_64 (and other
processors for which the same code is used):

  http://sourceware.org/bugzilla/show_bug.cgi?id=3976

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-21  8:32                             ` David Brown
  2011-12-21  9:02                               ` Miles Bader
@ 2011-12-23 21:15                               ` Vincent Lefevre
  2011-12-24 17:18                                 ` David Brown
  1 sibling, 1 reply; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-23 21:15 UTC (permalink / raw)
  To: David Brown
  Cc: Miles Bader, David Brown, Dario Saccavino, Jonathan Wakely, Ico,
	gcc-help

On 2011-12-21 08:51:39 +0100, David Brown wrote:
> But I expect that you know the details a lot better than me.  If the
> compiler can guarantee consistent and expected results in cases like yours
> involving simple assignments, then it would make sense to change the
> "-Wfloat-equal" not to trigger in such situations.  After all, the point of
> the warning is to help users avoid code that might not do what it seems to
> do - if it /does/ do the expected thing, then there is no need of a warning.

Floating point is tricky. You can't ask the compiler to detect every
potential problem. That's not possible. Otherwise you would have too
many false positives (-Wfloat-equal being one of the causes).

For instance, according to Usenet posts in C and Perl groups, users
often regard something like

  double x = 0.1;

as exact. So, would you want the compiler to issue a warning every
time a constant that cannot be represented exactly is used? I'd say
no, even though the consequences can be disastrous[*]. Users should
learn how FP works instead of relying on the compiler to detect
their mistakes.
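
A one-line experiment makes the point (sketch):

#include <stdio.h>

int main(void)
{
        double x = 0.1;
        printf("%.20g\n", x);   /* prints 0.10000000000000000555, not 0.1 */
        return 0;
}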

[*] This example isn't much different from

  http://www.ima.umn.edu/~arnold/disasters/patriot.html

(where all calculations could have been done exactly, if the code
were better designed), which led to 28 people being killed.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-23 20:11                           ` Vincent Lefevre
@ 2011-12-24  7:38                             ` Vincent Lefevre
  2011-12-24 11:11                             ` David Brown
  1 sibling, 0 replies; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-24  7:38 UTC (permalink / raw)
  To: David Brown, David Brown, Dario Saccavino, Jonathan Wakely, Ico,
	gcc-help

On 2011-12-23 21:02:44 +0100, Vincent Lefevre wrote:
> On 2011-12-20 22:52:12 +0100, David Brown wrote:
> >  It's theoretical only, or for people who want to insist on
> > bit-perfect repeatability of their floating point code. The example
> > case given in the gcc manual is "(x + 2**52) - 2**52". Assuming the
> > implication is that x is a small number, there are almost no
> > real-world circumstances when such code exists.
> 
> Many codes will fail when rewriting such math expressions. See for
> instance the rint() implementation, and the codes whose goal is to
> improve the accuracy of floating-point computations (these codes are
> generally based on algorithms like TwoSum/FastTwoSum and Veltkamp's
> splitting). For instance, see the IBM Accurate Mathematical Library,
> which is part of the glibc.

I would add that even the C standard (C99 and C11) explicitly forbids
the rearrangements of such expressions. In 5.1.2.3 Program execution:

  14    EXAMPLE 5  Rearrangement for floating-point expressions is often
        restricted because of limitations in precision as well as range.
        The implementation cannot generally apply the mathematical
        associative rules for addition or multiplication, nor the
        distributive rule, because of roundoff error, even in the
        absence of overflow and underflow. Likewise, implementations
        cannot generally replace decimal constants in order to rearrange
        expressions. In the following fragment, rearrangements
        suggested by mathematical rules for real numbers are often not
        valid (see F.9).

          double x, y, z;
          /* ... */
          x = (x * y) * z;    //  not equivalent to x  *= y * z;
          z = (x - y) + y ;   //  not equivalent to z  = x;
          z = x + x * y;      //  not equivalent to z  = x * (1.0 + y);
          y = x / 5.0;        //  not equivalent to y  = x * 0.2;

Note: "***often*** not valid".

See also the following articles:

  http://csdl.computer.org/dl/mags/co/2005/05/r5091.pdf
  "An Open Question to Developers of Numerical Software", by
  W. Kahan and D. Zuras

  http://www.cs.berkeley.edu/~wkahan/Mindless.pdf
  "How Futile are Mindless Assessments of Roundoff in Floating-Point
  Computation?", by W. Kahan

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-23 20:11                           ` Vincent Lefevre
  2011-12-24  7:38                             ` Vincent Lefevre
@ 2011-12-24 11:11                             ` David Brown
  2011-12-26  1:15                               ` Vincent Lefevre
  1 sibling, 1 reply; 56+ messages in thread
From: David Brown @ 2011-12-24 11:11 UTC (permalink / raw)
  To: David Brown, Dario Saccavino, Jonathan Wakely, Ico, gcc-help

On 23/12/11 21:02, Vincent Lefevre wrote:
> On 2011-12-20 22:52:12 +0100, David Brown wrote:
>> I understand what you are saying here - and I agree that it's very important
>> that any such choice is made clear and explicit.  But that's why a program's
>> makefile is part of its source code - compiler flags are an essential part
>> of the source.
>
> Very often there isn't a static makefile, for instance for all those
> programs built via the autotools: the compiler options are provided
> by the one who builds the program. Note that there has already been
> complaints because some program gave incorrect results due to the
> -ffast-math option provided by a 3rd-party (who didn't know that it
> shouldn't have been used for this program).
>
> BTW, I've found in my archives the following example I had posted:
>
> float x = 30.0;
> int main()
> {
>    if ( 90.0/x != 3.0)
>      abort();
>    return 0;
> }
>
> fails with -ffast-math (on x86).
>
> Note that more generally, depending on the application, one may be
> able to prove that the division is exact (e.g. when computing some
> determinant on simple data), so that the != should be perfectly safe
> in such a case.
>

<snip>

I have read the rest of your post (and the next one) with interest - I'm 
just snipping here to save space.

It's quite clear that we are seeing a very different range of programs 
using floating point, and a different range of problems or questions 
around them.  You see a much wider range than me - it makes sense that 
default flags, settings and warnings on the compiler are picked to do a 
reasonable job for such programs.

I work on a more restricted type of program (mostly embedded 
programming) where the rules are stricter.  For example, a makefile or 
other clear and unambiguous indication of compiler flags is a 
requirement.  In my work, the code you wrote above is bad code - it 
doesn't matter if the floating-point comparison turns out to be 
consistent when used, it's still wrong.  So to me, there are no "false 
positives" with something like -Wfloat-equal.  I want to be warned on 
anything that even smells like it could be "dodgy" code.

And (regarding other examples in your posts) if floating-point code 
depends on things like the order of calculations, it is also wrong - 
thus "-ffast-math" will not affect the correctness of the program, but 
will sometimes greatly improve the speed.


I don't think I'm alone in wishing that more programmers used stricter 
coding practices - a great many bugs that cause software users grief 
could have been found if the authors used their compilers' warning 
flags, and fixed their code appropriately.  It is true that this would 
sometimes mean changing code that is actually correct, but looks wrong 
to the compiler - that's part of the cost.  In many such situations, the 
code would also look questionable to humans, and should be changed anyway.

But I know full well that this is not going to change.  So the default 
settings of the compiler provide the safest possible code in the widest 
possible circumstances, and programmers like me can use the flags to get 
the features they want.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-23 21:15                               ` Vincent Lefevre
@ 2011-12-24 17:18                                 ` David Brown
  2011-12-26  8:12                                   ` Vincent Lefevre
  0 siblings, 1 reply; 56+ messages in thread
From: David Brown @ 2011-12-24 17:18 UTC (permalink / raw)
  To: David Brown, Miles Bader, Dario Saccavino, Jonathan Wakely, Ico,
	gcc-help

On 23/12/11 21:25, Vincent Lefevre wrote:
> On 2011-12-21 08:51:39 +0100, David Brown wrote:
>> But I expect that you know the details a lot better than me.  If the
>> compiler can guarantee consistent and expected results in cases like yours
>> involving simple assignments, then it would make sense to change the
>> "-Wfloat-equal" not to trigger in such situations.  After all, the point of
>> the warning is to help users avoid code that might not do what it seems to
>> do - if it /does/ do the expected thing, then there is no need of a warning.
>
> Floating point is tricky. You can't ask the compiler to detect every
> potential problems. That's not possible. Otherwise you would have too
> many false positives (-Wfloat-equal being one of the causes).
>
> For instance, according to Usenet posts in C and Perl groups, users
> often regard something like
>
>    double x = 0.1;
>
> as exact. So, would you want the compiler to issue a warning every
> time a constant that cannot be represented exactly is used? I'd say
> no, even though the consequences can be disastrous[*]. Users should
> learn how FP works instead of relying on the compiler to detect
> their mistakes.
>

I agree that users need to understand floating point, its benefits and 
its restrictions.  And compilers most certainly can't warn about 
everything.  But when it /can/ warn you and help you, that's a good thing.

It is particularly useful when dealing with other people's code, of 
course.  The "-Wfloat-equal" flag would almost never trigger a warning 
on my own code - I would not write code that triggers it.  But it may 
catch a mistake I accidentally made, and it may help when handling other 
people's code.

> [*] This example isn't much different from
>
>    http://www.ima.umn.edu/~arnold/disasters/patriot.html
>
> (where all calculations could have been done exactly, if the code
> were better designed), which lead to 28 people killed.
>

This is an example where floating point should not have been used - no 
compiler can warn about such inappropriate use of tools and lack of 
understanding about the job in hand.

In defence of the programmer, however, it should be noted that the 
Patriots were never designed for intercepting Scuds, and there were a 
lot of reasons why they did a poor job of it (with issues in the 
software, electronics and mechanics).


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-24 11:11                             ` David Brown
@ 2011-12-26  1:15                               ` Vincent Lefevre
  2011-12-26 11:48                                 ` David Brown
  0 siblings, 1 reply; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-26  1:15 UTC (permalink / raw)
  To: gcc-help

On 2011-12-24 12:02:24 +0100, David Brown wrote:
> And (regarding other examples in your posts) if floating-point code
> depends on things like the order of calculations, it is also wrong -
> thus "-ffast-math" will not affect the correctness of the program,
> but will sometimes greatly improve the speed.

Whether you like it or not, a floating-point result does depend on the
order of calculations in general. How would you write code to compute
the mathematical expression a + b - c where you know that you have the
property 1/2 <= a/c <= 2 on the inputs? Or code to compute a*a - b*b
where a and b are close to each other? With the IEEE 754 rules, one
can use formulas that give very accurate results, but if the compiler
is allowed to rewrite the code (without control from the developer who
wrote the code), the results may no longer be accurate (and may even
be quite wrong).
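
As a concrete instance of the second case (a hedged sketch: when a and
b are within a factor of two, a - b is exact by Sterbenz's lemma, so
the factored form is the sort of accurate formula that must not be
rewritten behind the developer's back):

/* Sketch: for a close to b, (a + b) * (a - b) is much more accurate
   than a*a - b*b, because the cancellation happens exactly in (a - b)
   instead of after two rounded squarings. */
double diff_of_squares(double a, double b)
{
        return (a + b) * (a - b);
}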

> I don't think I'm alone in wishing that more programmers used stricter
> coding practices

On the contrary, we are very strict on coding practices in order to
get accurate results.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-24 17:18                                 ` David Brown
@ 2011-12-26  8:12                                   ` Vincent Lefevre
  2011-12-26 13:00                                     ` David Brown
  0 siblings, 1 reply; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-26  8:12 UTC (permalink / raw)
  To: gcc-help

On 2011-12-24 12:11:06 +0100, David Brown wrote:
> On 23/12/11 21:25, Vincent Lefevre wrote:
> >[*] This example isn't much different from
> >
> >   http://www.ima.umn.edu/~arnold/disasters/patriot.html
> >
> >(where all calculations could have been done exactly, if the code
> >were better designed), which lead to 28 people killed.
> 
> This is an example where floating point should not have been used - no
> compiler can warn about such inappropriate use of tools and lack of
> understanding about the job in hand.

Actually it wasn't floating point, but fixed point, i.e. something
like scaled integers. Anyway the problem is the same, whether it is
implemented in fixed point or floating point: an initial error that
accumulates... Now, since you can also use integers to represent
approximations of real values (planes, at least those before the
latest generation, use integer arithmetic for their calculations, so
you see...), would you also ban the equality test on integers?

> In defence of the programmer, however, it should be noted that the Patriots
> were never designed for intercepting Scuds, and there were a lot of reasons
> why they did a poor job of it (with issues in the software, electronics and
> mechanics).

If the limitations were properly documented, users would have detected
the potential failure. Or perhaps they didn't RTFM. :)

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-26  1:15                               ` Vincent Lefevre
@ 2011-12-26 11:48                                 ` David Brown
  2011-12-26 13:07                                   ` Vincent Lefevre
  0 siblings, 1 reply; 56+ messages in thread
From: David Brown @ 2011-12-26 11:48 UTC (permalink / raw)
  To: gcc-help

On 26/12/11 01:58, Vincent Lefevre wrote:
> On 2011-12-24 12:02:24 +0100, David Brown wrote:
>> And (regarding other examples in your posts) if floating-point code
>> depends on things like the order of calculations, it is also wrong -
>> thus "-ffast-math" will not affect the correctness of the program,
>> but will sometimes greatly improve the speed.
>
> Whether you like it or not, a floating-point result does depend on the
> order of calculations in general. How would you write code to compute
> the mathematical expression a + b - c where you know that you have the
> property 1/2<= a/c<= 2 on the inputs? Or code to compute a*a - b*b
> where a and b are close to each other? With the IEEE 754 rules, one
> can use formulas that give very accurate results, but if the compiler
> is allowed to rewrite the code (without control from the developer who
> wrote the code), the results may no longer be accurate (and may even
> be quite wrong).
>

If it matters that "a + b - c" be calculated "(a + b) - c" or "a + (b - 
c)", then use brackets.  That way you show the compiler and people 
reading the code what's going on.  In my type of programming, if you 
need to be sure of precise calculations, you always use integer 
arithmetic - it's clear and well-defined, with no scope for 
inaccuracies, rounding errors, etc.  And it's not uncommon that you have 
to use brackets like this to ensure everything works out correctly.

In your example, I don't see how a difference in the ordering of 
calculations will make a difference to the result of more than an LSB or 
two unless there were wildly differing magnitudes involved.  And that's 
something to avoid - you are never going to get accurate results by 
adding or subtracting values of wildly different magnitudes in floating 
point.  You may be able to get consistent, precise, well-defined, 
portable results - but you will not get correct or meaningful results.

My point is that you have to take that kind of effect into account when 
writing the algorithm and the code.  To my mind, it doesn't make sense 
to add a distance in kilometres to a distance in nanometres and expect 
sensible results.  If you have a situation where that is called for 
(maybe you work at CERN), then you are probably better off storing all 
your distances as int64_t femtometre units.
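
As a rough sketch of the sort of thing I mean (the constants and names
are made up purely for illustration, and the range tops out at about
9 km, so it is not a recommendation):

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

typedef int64_t dist_fm;                        /* distances in femtometres */

#define FM_PER_NM INT64_C(1000000)              /* 1 nm = 10^6  fm */
#define FM_PER_KM INT64_C(1000000000000000000)  /* 1 km = 10^18 fm */

int main(void)
{
    dist_fm d = 1 * FM_PER_KM + 5 * FM_PER_NM;  /* 1 km + 5 nm, exact */
    printf("%" PRId64 " fm\n", d);              /* 1000000000005000000 */
    return 0;
}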


>> I don't think I'm alone in wishing that more programmers used stricter
>> coding practices
>
> On the contrary, we are very strict on coding practices in order to
> get accurate results.
>

I realise that, but I don't think you count as "most programmers" in 
this sense - as far as I can tell, you have rather specialised needs. 
You have different requirements than mine, but at least as strict.

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-26  8:12                                   ` Vincent Lefevre
@ 2011-12-26 13:00                                     ` David Brown
  2011-12-26 13:22                                       ` Vincent Lefevre
  0 siblings, 1 reply; 56+ messages in thread
From: David Brown @ 2011-12-26 13:00 UTC (permalink / raw)
  To: gcc-help

On 26/12/11 02:15, Vincent Lefevre wrote:
> On 2011-12-24 12:11:06 +0100, David Brown wrote:
>> On 23/12/11 21:25, Vincent Lefevre wrote:
>>> [*] This example isn't much different from
>>>
>>>    http://www.ima.umn.edu/~arnold/disasters/patriot.html
>>>
>>> (where all calculations could have been done exactly, if the code
>>> were better designed), which led to 28 people killed.
>>
>> This is an example where floating point should not have been used - no
>> compiler can warn about such inappropriate use of tools and lack of
>> understanding about the job in hand.
>
> Actually it wasn't floating point, but fixed point, i.e. something
> like scaled integers. Anyway the problem is the same, whether it is
> implemented in fixed point or floating point: an initial error that
> accumulates... Now, since you can also use integers to represent
> approximations of real values (planes, at least not the latest ones,
> use integer arithmetic for their calculations, so you see...), would
> you also ban the equality test on integers?
>

No, integer arithmetic is well-defined and consistent.  However, I would 
put big question marks over code that relies on the accuracy of 
division, since division generally can't be done exactly in integers.  
But that's a matter of writing the code correctly, and such problems 
should be spotted by code reviews rather than by the compiler.

>> In defence of the programmer, however, it should be noted that the Patriots
>> were never designed for intercepting Scuds, and there were a lot of reasons
>> why they did a poor job of it (with issues in the software, electronics and
>> mechanics).
>
> If the limitations were properly documented, users would have detected
> the potential failure. Or perhaps they didn't RTFM. :)
>

I don't know the details of how the Patriots ended up being used here 
(though I do know someone who does).  It could have been something as 
obvious as "we know they are not ideal, but they are the best we have". 
The reported success rate of Patriot interceptions of Scuds was 
somewhere between 0% and 97%, depending on who you ask and how you 
measure it.  Of course, if both sides believed them to be successful at 
the time, then they /were/ successful in some ways - regardless of 
algorithm errors.

I think this is getting a touch off-topic for this newsgroup, however.


^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-26 11:48                                 ` David Brown
@ 2011-12-26 13:07                                   ` Vincent Lefevre
  2011-12-26 13:37                                     ` Tim Prince
  0 siblings, 1 reply; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-26 13:07 UTC (permalink / raw)
  To: gcc-help

On 2011-12-26 12:37:27 +0100, David Brown wrote:
> If it matters that "a + b - c" be calculated "(a + b) - c" or "a + (b - c)",
> then use brackets.

but brackets shouldn't change anything with -fassociative-math.

In C, brackets are purely syntactic, i.e. a + b - c is equivalent
to (a + b) - c.
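
To illustrate (only a sketch; what actually happens depends on the
compiler version and options):

/* With the default (IEEE-conforming) options this is evaluated as
   written. */
double f(double a, double b, double c)
{
    return (a + b) - c;
}

With -ffast-math / -fassociative-math the compiler is allowed to
regroup it, e.g. as a + (b - c).  Try a = 1, b = 1e308, c = 1e308:
the first grouping gives 0, the second gives 1.  So once re-association
is enabled, the brackets are no guarantee.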

>  That way you show the compiler and people reading the code what's
> going on.

Not in C.

>  In my type of programming, if you need to be sure of precise
> calculations, you always use integer arithmetic - it's clear and
> well-defined, with no scope for inaccuracies, rounding errors, etc.

It is the contrary: IEEE 754 specifies floating-point arithmetic
completely, even more completely than C specifies integer arithmetic.
In C, an overflow on signed integers gives undefined behavior, while
IEEE 754 specifies all the floating-point exceptions, and everything
is well defined.
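
A small sketch of the difference (only an illustration, assuming an
IEEE 754 double):

#include <float.h>
#include <limits.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    double y = DBL_MAX * 2;   /* IEEE 754: overflow raised, result is +inf */
    printf("%g (isinf: %d)\n", y, isinf(y));

    int i = INT_MAX;
    /* i + 1 would be undefined behavior in C: the compiler may assume
       it cannot happen and optimize accordingly. */
    (void) i;
    return 0;
}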

BTW, concerning integer arithmetic vs floating-point arithmetic,
there are good reasons why the aircraft industry is switching from
integer arithmetic to floating-point arithmetic... Except for
specific problems, integer arithmetic for computations on real
numbers is a thing of the past.

>  And it's not uncommon that you have to use brackets like this to
> ensure everything works out correctly.

If you assume that this will make a difference, that's a bad practice.

> In your example, I don't see how a difference in the ordering of
> calculations will make a difference to the result of more than an
> LSB or two unless there were wildly differing magnitudes involved.

The difference is precisely when differing magnitudes are involved.
This happens in practice. Very often.

>  And that's something to avoid - you are never going to get accurate
> results by adding or subtracting values of wildly different
> magnitudes in floating point. You may be able to get consistent,
> precise, well-defined, portable results - but you will not get
> correct or meaningful results.

You're wrong. When the operations are done in a well-specified
order, one can get very accurate results (one can even emulate
precisions higher than that of the floating-point system),
and one can *prove* these results. I really suggest that you
read our "Handbook of Floating-Point Arithmetic".

> My point is that you have to take that kind of effect into account
> when writing the algorithm and the code. To my mind, it doesn't make
> sense to add a distance in kilometres to a distance in nanometres
> and expect sensible results. If you have a situation where that is
> called for (maybe you work at CERN), then you are probably better
> off storing all your distances as int64_t femtometre units.

I don't work at CERN and personally don't work with physical
quantities. But for instance, if you want to calculate a derivative
numerically (something that happens very often in scientific code,
either directly or indirectly), you will have cancellations and the
magnitudes greatly differ. Magnitudes also differ when one wants to
emulate a greater precision as said above (something quite specific,
but necessary when one wants to implement some mathematical function
accurately).
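
For instance, a trivial sketch of the cancellation I mean (a forward
difference for the derivative of exp at 1):

/* gcc -O2 deriv.c -lm */
#include <math.h>
#include <stdio.h>

int main(void)
{
    double x = 1.0, h = 1e-8;
    /* exp(x+h) and exp(x) are both about 2.718..., and their difference
       is about 2.7e-8: most of the leading digits cancel, so the
       quotient is dominated by the rounding errors of the two
       evaluations. */
    double d = (exp(x + h) - exp(x)) / h;
    printf("approximation: %.17g  exact: %.17g\n", d, exp(x));
    return 0;
}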

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-26 13:00                                     ` David Brown
@ 2011-12-26 13:22                                       ` Vincent Lefevre
  0 siblings, 0 replies; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-26 13:22 UTC (permalink / raw)
  To: gcc-help

On 2011-12-26 12:56:50 +0100, David Brown wrote:
> No, integer arithmetic is well-defined and consistent.

No, with integer arithmetic you can have overflows, which are not
specified and with which things can fail badly, as in the Ariane 5
explosion:

  http://www.ima.umn.edu/~arnold/disasters/ariane.html

If you try to implement calculations on real numbers in integer
arithmetic instead of floating-point, you will have the same kinds
of problems (if you want to compare 2 approximations, the equality
operator won't work either), except that you will need to handle
more difficulties manually.
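
For example (just a sketch), with scaled integers one ends up writing
the same kind of tolerance test instead of ==, exactly as with doubles:

#include <stdint.h>

/* "Real" values stored as scaled integers (units of 1/1000).  Two
   independently computed approximations of the same quantity rarely
   compare equal, so a tolerance is needed, just as with floating
   point.  (Assumes a - b does not overflow.) */
static int approx_equal(int64_t a, int64_t b, int64_t tol)
{
    int64_t d = a - b;
    if (d < 0)
        d = -d;
    return d <= tol;
}

int main(void)
{
    /* 1.234 and 1.235 represented as 1234 and 1235 thousandths. */
    return approx_equal(1234, 1235, 2) ? 0 : 1;
}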

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-26 13:07                                   ` Vincent Lefevre
@ 2011-12-26 13:37                                     ` Tim Prince
  2011-12-26 14:01                                       ` Vincent Lefevre
  0 siblings, 1 reply; 56+ messages in thread
From: Tim Prince @ 2011-12-26 13:37 UTC (permalink / raw)
  To: gcc-help

On 12/26/2011 7:59 AM, Vincent Lefevre wrote:
> On 2011-12-26 12:37:27 +0100, David Brown wrote:
>> If it matters that "a + b - c" be calculated "(a + b) - c" or "a + (b - c)",
>> then use brackets.
>
> but brackets shouldn't change anything with -fassociative-math.
>
> In C, brackets are purely syntactic, i.e. a + b - c is equivalent
> to (a + b) - c.
This was so prior to 1989, but the rules changed with the advent of ISO 
standards.  Even where compilers support algebraic simplification across 
parentheses in violation of the standards, the results are unreliable as 
well as non-portable.
-- 
Tim Prince

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-26 13:37                                     ` Tim Prince
@ 2011-12-26 14:01                                       ` Vincent Lefevre
  2011-12-26 14:39                                         ` Tim Prince
  0 siblings, 1 reply; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-26 14:01 UTC (permalink / raw)
  To: gcc-help

On 2011-12-26 08:22:11 -0500, Tim Prince wrote:
> On 12/26/2011 7:59 AM, Vincent Lefevre wrote:
> >On 2011-12-26 12:37:27 +0100, David Brown wrote:
> >>If it matters that "a + b - c" be calculated "(a + b) - c" or "a + (b - c)",
> >>then use brackets.
> >
> >but brackets shouldn't change anything with -fassociative-math.
> >
> >In C, brackets are purely syntactic, i.e. a + b - c is equivalent
> >to (a + b) - c.
> This was so prior to 1989, but the rules changed with the advent of ISO
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> standards.  Even where compilers support algebraic simplification across
> parentheses in violation of the standards, the results are unreliable as
> well as non-portable.

Wrong! The ISO C standard even gives an example (5.1.2.3p14) saying
that an expression like a + b - c is equivalent to (a + b) - c.

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-26 14:01                                       ` Vincent Lefevre
@ 2011-12-26 14:39                                         ` Tim Prince
  2011-12-26 16:40                                           ` Vincent Lefevre
  0 siblings, 1 reply; 56+ messages in thread
From: Tim Prince @ 2011-12-26 14:39 UTC (permalink / raw)
  To: gcc-help

On 12/26/2011 8:36 AM, Vincent Lefevre wrote:
> On 2011-12-26 08:22:11 -0500, Tim Prince wrote:
>> On 12/26/2011 7:59 AM, Vincent Lefevre wrote:
>>> On 2011-12-26 12:37:27 +0100, David Brown wrote:
>>>> If it matters that "a + b - c" be calculated "(a + b) - c" or "a + (b - c)",
>>>> then use brackets.
>>>
>>> but brackets shouldn't change anything with -fassociative-math.
>>>
>>> In C, brackets are purely syntactic, i.e. a + b - c is equivalent
>>> to (a + b) - c.
>> This was so prior to 1989, but the rules changed with the advent of ISO
>    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> standards.  Even where compilers support algebraic simplification across
>> parentheses in violation of the standards, the results are unreliable as
>> well as non-portable.
>
> Wrong! The ISO C standard even gives an example (5.1.2.3p14) saying
> that an expression like a + b - c is equivalent to (a + b) - c.
>
True, in the case where left-to-right evaluation is not overruled by 
parens.  However, certain compilers that ignore parens also have no 
reliable left-to-right or right-to-left evaluation rules.

-- 
Tim Prince

^ permalink raw reply	[flat|nested] 56+ messages in thread

* Re: Floating point performance issue
  2011-12-26 14:39                                         ` Tim Prince
@ 2011-12-26 16:40                                           ` Vincent Lefevre
  0 siblings, 0 replies; 56+ messages in thread
From: Vincent Lefevre @ 2011-12-26 16:40 UTC (permalink / raw)
  To: gcc-help

On 2011-12-26 09:00:40 -0500, Tim Prince wrote:
> True, in the case where left-to-right evaluation is not over-ruled by
> parens.  However, certain compilers which ignore parens also have no
> reliable left-to-right or right-to-left evaluation rules.

OK. These compilers should really be avoided (at least for compiling
3rd-party code).

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 56+ messages in thread

end of thread, other threads:[~2011-12-26 14:39 UTC | newest]

Thread overview: 56+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-12-20  9:52 Floating point performance issue Ico
2011-12-20 10:05 ` Marcin Mirosław
2011-12-20 10:20   ` Ico
2011-12-20 10:34     ` Jonathan Wakely
2011-12-20 10:43       ` Ico
2011-12-20 11:24       ` Vincent Lefevre
2011-12-20 11:51         ` Dario Saccavino
2011-12-20 12:02           ` Ico
2011-12-20 12:12           ` Vincent Lefevre
2011-12-20 12:28             ` Tim Prince
2011-12-20 12:43             ` Segher Boessenkool
2011-12-20 13:02               ` Vincent Lefevre
2011-12-20 19:51                 ` Segher Boessenkool
2011-12-20 21:02                   ` Vincent Lefevre
2011-12-21  4:36                     ` Segher Boessenkool
2011-12-21  6:15                       ` Segher Boessenkool
2011-12-23 20:25                         ` Vincent Lefevre
2011-12-20 13:43             ` David Brown
2011-12-20 13:58               ` Vincent Lefevre
2011-12-20 14:25                 ` David Brown
2011-12-20 15:05                   ` Vincent Lefevre
2011-12-20 15:44                     ` David Brown
2011-12-20 16:18                       ` Vincent Lefevre
2011-12-20 22:32                         ` David Brown
2011-12-23 20:11                           ` Vincent Lefevre
2011-12-24  7:38                             ` Vincent Lefevre
2011-12-24 11:11                             ` David Brown
2011-12-26  1:15                               ` Vincent Lefevre
2011-12-26 11:48                                 ` David Brown
2011-12-26 13:07                                   ` Vincent Lefevre
2011-12-26 13:37                                     ` Tim Prince
2011-12-26 14:01                                       ` Vincent Lefevre
2011-12-26 14:39                                         ` Tim Prince
2011-12-26 16:40                                           ` Vincent Lefevre
2011-12-21  1:19                       ` Miles Bader
2011-12-21  2:19                         ` David Brown
2011-12-21  4:03                           ` Miles Bader
2011-12-21  8:32                             ` David Brown
2011-12-21  9:02                               ` Miles Bader
2011-12-21  9:23                                 ` David Brown
2011-12-21 11:58                                   ` Miles Bader
2011-12-21 16:49                                     ` David Brown
2011-12-22  3:23                                       ` Miles Bader
2011-12-23 21:15                               ` Vincent Lefevre
2011-12-24 17:18                                 ` David Brown
2011-12-26  8:12                                   ` Vincent Lefevre
2011-12-26 13:00                                     ` David Brown
2011-12-26 13:22                                       ` Vincent Lefevre
2011-12-20 15:45                     ` Jeff Kenton
2011-12-20 11:44     ` David Brown
2011-12-20 11:49       ` David Brown
2011-12-20 10:46 ` Marc Glisse
2011-12-20 11:11   ` Ico
2011-12-20 11:16   ` Vincent Lefevre
2011-12-20 12:00     ` Vincent Lefevre
2011-12-20 12:21 ` Tim Prince

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).