public inbox for gcc-help@gcc.gnu.org
* floating point precision  on gcc-4 differs using variables or arrays
@ 2005-03-25 11:34 Asfand Yar Qazi
  2005-03-25 11:47 ` Brian Budge
  0 siblings, 1 reply; 8+ messages in thread
From: Asfand Yar Qazi @ 2005-03-25 11:34 UTC (permalink / raw)
  To: gcc-help

Hi,

Consider the following little bunch of code (prepared specially for
this question):

-------------CODE---------------
#ifdef __cplusplus
extern "C"
#endif
int printf(const char *format, ...);

/*

Lesson: store stuff either as variables, or in arrays - not both!

*/

float a[4] = {0.123, -0.231, 0.652, 1};
float b[4] = {-0.523, -9.6421, 0.0123, 1};

float a0 = 0.123,  a1 = -0.231, a2 = 0.652, a3 = 1;
float b0 = -0.523, b1 = -9.6421, b2 = 0.0123, b3 = 1;

const int loopcount = 1000000;

void
thefunc1(void)
{
	int i;
	for(i = 0; i < loopcount; ++i) {
		a[0] = (b[0] * 0.9999f) + (b[0] * 0.00001f);
		a[1] = (b[1] * 0.9999f) + (b[1] * 0.00001f);
		a[2] = (b[2] * 0.9999f) + (b[2] * 0.00001f);

		b[0] = a[0];
		b[1] = a[1];
		b[2] = a[2];
	}
}

void
thefunc2(void)
{
	int i;
	for(i = 0; i < loopcount; ++i) {
		a0 = (b0 * 0.9999f) + (b0 * 0.00001f);
		a1 = (b1 * 0.9999f) + (b1 * 0.00001f);
		a2 = (b2 * 0.9999f) + (b2 * 0.00001f);

		b0 = a0;
		b1 = a1;
		b2 = a2;
	}
}

int
main()
{
	thefunc1();
	thefunc2();

	printf("t1: [%.24e, %.24e, %.24e, %.24e]\n", a[0], a[1], a[2], a[3]);
	printf("t2: [%.24e, %.24e, %.24e, %.24e]\n", a0, a1, a2, a3);

	return 0;
}
-------------CODE---------------

Now, consider the output, using gcc 4 cvs and gcc 3.4.3 (compilation 
flags:  -ffast-math -march=pentium3 -msse -mfpmath=387 -O3 
-fno-unroll-loops )

./gcc_error-3.4.3
t1: [-4.255309033630528751103380e-40, -7.845134420060880251139727e-39, 
1.000768933567725568230718e-41, 1.000000000000000000000000e+00]
t2: [-4.255309033630528751103380e-40, -7.845134420060880251139727e-39, 
1.000807363220784352053728e-41, 1.000000000000000000000000e+00]

./gcc_error-4
t1: [-4.255302206601370422491529e-40, -7.845133972967615879964511e-39, 
1.000768933567725568230718e-41, 1.000000000000000000000000e+00]
t2: [-4.255309033630528751103380e-40, -7.845134420060880251139727e-39, 
1.000807363220784352053728e-41, 1.000000000000000000000000e+00]

Note how on gcc 3.4.3, using variables or arrays of floats gives the 
same results.  However, on gcc 4, it seems this is no longer the case 
(using the above flags, anyhow.)

It's not a bug, I assume, but could someone explain to me why this
is happening?

Then, observe the following with sse fpmath (flags: -ffast-math 
-march=pentium3 -msse -mfpmath=sse -O3 -fno-unroll-loops)

./gcc_error-3.4.3
t1: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39, 
9.890364561204558886579683e-42, 1.000000000000000000000000e+00]
t2: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39, 
9.890364561204558886579683e-42, 1.000000000000000000000000e+00]

./gcc_error-4
t1: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39, 
9.890364561204558886579683e-42, 1.000000000000000000000000e+00]
t2: [-4.255449163476961232810473e-40, -7.845069960331521309554464e-39, 
9.890364561204558886579683e-42, 1.000000000000000000000000e+00]

Using SSE seems to give the same answers using variables or arrays.
Wha?!?!?!  Now I'm even more confused.

Could someone explain the above to me?  Why is there a difference 
using arrays or variables in 387 maths, and not in SSE maths?

Also, is it better (i.e. more efficient, faster, etc.) to use 
variables or arrays to hold the data in matrix/vector classes in C++?

Thanks,
	Asfand Yar


* Re: floating point precision on gcc-4 differs using variables or arrays
  2005-03-25 11:34 floating point precision on gcc-4 differs using variables or arrays Asfand Yar Qazi
@ 2005-03-25 11:47 ` Brian Budge
  2005-03-25 12:41   ` Asfand Yar Qazi
  0 siblings, 1 reply; 8+ messages in thread
From: Brian Budge @ 2005-03-25 11:47 UTC (permalink / raw)
  To: Asfand Yar Qazi; +Cc: gcc-help

I suspect that SSE is being used for the arrays because they
happen to be appropriately sized.  If I remember correctly, GCC 4 was
to introduce some autovectorization.  Perhaps that's what's going on.

  Brian




* Re: floating point precision on gcc-4 differs using variables or arrays
  2005-03-25 11:47 ` Brian Budge
@ 2005-03-25 12:41   ` Asfand Yar Qazi
  2005-03-25 16:10     ` Eljay Love-Jensen
  0 siblings, 1 reply; 8+ messages in thread
From: Asfand Yar Qazi @ 2005-03-25 12:41 UTC (permalink / raw)
  To: Brian Budge; +Cc: gcc-help

Brian Budge wrote:
> I suspect that SSE is being used for the arrays because they
> happen to be appropriately sized.  If I remember correctly, GCC 4 was
> to introduce some autovectorization.  Perhaps that's what's going on.
> 
>   Brian
> 
> 


SSE is being used because I told it to (-mfpmath=sse).

That's not the issue, though.  I don't want to insult your intelligence,
and I hope you won't be offended if I ask you to re-read the original post.


* Re: floating point precision on gcc-4 differs using variables  or arrays
  2005-03-25 12:41   ` Asfand Yar Qazi
@ 2005-03-25 16:10     ` Eljay Love-Jensen
  2005-03-25 18:56       ` Asfand Yar Qazi
  0 siblings, 1 reply; 8+ messages in thread
From: Eljay Love-Jensen @ 2005-03-25 16:10 UTC (permalink / raw)
  To: Asfand Yar Qazi; +Cc: gcc-help

Hi Asfand,

#1:  C does a lot of its calculations in double, or long double (depending on platform).

#2:  Intermediate results are often in long double.

#3:  Consider carefully if you should be using float, double, long double, integer or fixed point for your purposes.  The floating point numbers are finite precision.

If you are concerned about twiddly floating point number differences (which, otherwise, are to be EXPECTED)....

#4: Use -msoft-float to avoid twitchy hardware differences.  (I'm not sure if this is 100% IEEE 754 compliant; check the documentation.)  This will affect performance adversely.

Also...

As I recall, in the old days (pre-GCC 4.0), for many operations you'd have to explicitly use SSE instructions.  Still required the SSE flag to be explicitly given.  Hence, I believe Brian Budge's comments are valid and shouldn't be discounted out of hand.

#5: GCC 4.x has not been released yet.  Use with caveats.

HTH,
--Eljay


* Re: floating point precision on gcc-4 differs using variables  or arrays
  2005-03-25 16:10     ` Eljay Love-Jensen
@ 2005-03-25 18:56       ` Asfand Yar Qazi
  2005-03-25 19:31         ` Eljay Love-Jensen
  0 siblings, 1 reply; 8+ messages in thread
From: Asfand Yar Qazi @ 2005-03-25 18:56 UTC (permalink / raw)
  To: Eljay Love-Jensen; +Cc: gcc-help

Eljay Love-Jensen wrote:
> Hi Asfand,
> 
> #1:  C does a lot of its calculations in double, or long double (depending on platform).

Since this is for a game, I need only minimal precision.
  Probably only about 3-4 decimal places.  But since the "whole number
part" of the numbers (I forget the official term, forgive me) needs to
be arbitrarily large, I need to use floats.

> 
> #2:  Intermediate results are often in long double.

Didn't know that.  That kind of undermines the use of floats for 
maximum speed.

> 
> #3:  Consider carefully if you should be using float, double, long double, integer or fixed point for your purposes.  The floating point numbers are finite precision.
> 
> If you are concerned about twiddly floating point number differences (which, otherwise, are to be EXPECTED)....
> 

See above.  I'm not concerned about precision, I just need to know 
such errors are usual.

> #4: Use -msoft-float to avoid twitchy hardware differences.  (I'm not sure if this is 100% IEEE 754 compliant; check the documentation.)  This will affect performance adversely.
> 
> Also...
> 
> As I recall, in the old days (pre-GCC 4.0), for many operations you'd have to explicitly use SSE instructions.  Still required the SSE flag to be explicitly given.  Hence, I believe Brian Budge's comments are valid and shouldn't be discounted out of hand.

I thank him for his input, but I checked the assembler in gcc 3.4.3, 
and SSE floating point was being used.  It seems to be used more 
efficiently in gcc-4-cvs, which is a good thing.

> 
> #5: GCC 4.x has not been released yet.  Use with caveats.

I understand that - but the errors would have been far more off if it
had been a bug.  I just needed confirmation that this sort of stuff 
was expected - and it seems it is, so that's fine.

All the above is well and good, but please could someone answer the 
following question?  Is it easier for the compiler to work with 
numbers as variables (float a, float b, etc.) or as arrays (float 
data[4])  ?  By easier, I mean is it easier for it to perform 
optimisations.

Thanks,
	Asfand Yar


* Re: floating point precision on gcc-4 differs using variables  or arrays
  2005-03-25 18:56       ` Asfand Yar Qazi
@ 2005-03-25 19:31         ` Eljay Love-Jensen
  2005-03-25 19:43           ` Asfand Yar Qazi
  0 siblings, 1 reply; 8+ messages in thread
From: Eljay Love-Jensen @ 2005-03-25 19:31 UTC (permalink / raw)
  To: Asfand Yar Qazi; +Cc: gcc-help

Hi Asfand,

>All the above is well and good, but please could someone answer the following question?  Is it easier for the compiler to work with numbers as variables (float a, float b, etc.) or as arrays (float data[4])  ?  By easier, I mean is it easier for it to perform optimisations.

I do not know.

I do know that the only way to really be certain is to profile your code with float, and profile it with double and measure the actual performance difference.

I also know that in C, the float data type is a second rate citizen, compared to double.  That may cause code pessimization as the compiler promotes float to double for parameter passing.  (C++ makes float a first rate data type; but there are still some C legacy aspects to C++ that haunt float.)  Passing in float* and/or working with float arrays will curtail that promotion behavior.

In the olden days, the rule of thumb was to avoid floating point data types for performance driven games.  But with these new fangled CPUs, floating point data types are as good as (and in some cases, better than!) integer types.  So if you read something that recommends avoiding floating point, it may be out of step with current hardware.  Just an FYI.

HTH,
--Eljay


* Re: floating point precision on gcc-4 differs using variables  or arrays
  2005-03-25 19:31         ` Eljay Love-Jensen
@ 2005-03-25 19:43           ` Asfand Yar Qazi
  2005-03-25 21:54             ` Asfand Yar Qazi
  0 siblings, 1 reply; 8+ messages in thread
From: Asfand Yar Qazi @ 2005-03-25 19:43 UTC (permalink / raw)
  To: Eljay Love-Jensen; +Cc: gcc-help

Eljay Love-Jensen wrote:
> Hi Asfand,
> 
> 
>>All the above is well and good, but please could someone answer the following question?  Is it easier for the compiler to work with numbers as variables (float a, float b, etc.) or as arrays (float data[4])  ?  By easier, I mean is it easier for it to perform optimisations.
> 
> 
> I do not know.
> 
> I do know that the only way to really be certain is to profile your code with float, and profile it with double and measure the actual performance difference.
> 
> I also know that in C, the float data type is a second rate citizen, compared to double.  That may cause code pessimization as the compiler promotes float to double for parameter passing.  (C++ makes float a first rate data type; but there are still some C legacy aspects to C++ that haunt float.)  Passing in float* and/or working with float arrays will curtail that promotion behavior.

Ah!  I'm using C++, as it happens, in a template expression based 
component wise vector operations framework.  I'll use it in my 
software 3D engine.  I want to make several code-compatible vector 
classes, so I can create several libraries of my code, each using one 
of them, and then load the appropriate one depending on the processor 
the user has.

So one component-wise one for normal 387 operation for use on Pentium
2's and other stuff that only has a 387 unit, one SSE 1/2 one (using
xmmintrin.h or whatever it's called :-) for SSE 1/2 using stuff, one 3D
Now+ for older Duron and Athlon XP chips, one SSE 3 for my new Opteron
4200+ ( :-)  ), one AltiVec using one for when I own IBM, etc.

Why go to all the trouble?  'cos it's fun, of course.

> 
> In the olden days, the rule of thumb was to avoid floating point data types for performance driven games.  But with these new fangled CPUs, floating point data types are as good as (and in some cases, better than!) integer types.  So if you read something that recommends avoiding floating point, it may be out of step with current hardware.  Just an FYI.

Actually, Quake required a floating point unit, 'cos it used floats. 
I think.  I tried comparing adding integers to adding floats, and on a 
Pentium 2 anyway, it ended up being nearly the same.  The trouble 
comes when converting floating point coordinate values to integers for 
rasterization - I hope GCC's built-in routines are quick enough :-)

Anyway, the floats will be stored in classes, and as many inline 
functions will be used as possible.  So there's not really a need for 
passing too many floats around directly, just "here's a pointer to a 
scene graph, Mr. 3D renderer, render it!" or something.

As I said, lots of operations need to be performed on floating point 
data (i.e. 3D transformation of objects in memory, rasterisation, 
etc.) so the smaller the number, the simpler the operation, the better.

I think I'll bung them in a 'float data[4]' array, thanks.


* Re: floating point precision on gcc-4 differs using variables  or arrays
  2005-03-25 19:43           ` Asfand Yar Qazi
@ 2005-03-25 21:54             ` Asfand Yar Qazi
  0 siblings, 0 replies; 8+ messages in thread
From: Asfand Yar Qazi @ 2005-03-25 21:54 UTC (permalink / raw)
  To: Asfand Yar Qazi; +Cc: gcc-help

Guess what?  Using doubles ends up faster than floats with 387 maths.

Crazy, but hey, that teaches me.
