Re: Using crlibm as the default math library in GCC sources

public inbox for gcc@gcc.gnu.org
 help / color / mirror / Atom feed

* Re: Using crlibm as the default math library in GCC sources
@ 2007-11-12 16:24 François-Xavier Coudert
  0 siblings, 0 replies; 15+ messages in thread
From: François-Xavier Coudert @ 2007-11-12 16:24 UTC (permalink / raw)
  To: fortran; +Cc: GCC Development, Uros Bizjak

Hi Uros,

It sounds like a great idea! I'm forwarding this to the Fortran list,
because I think it might raise quite some interest there also :
http://gcc.gnu.org/ml/gcc/2007-11/msg00164.html

FX

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-17 13:41       ` Geert Bosch
  2007-11-17 14:39         ` Tim Prince
@ 2007-11-18 17:09         ` Vincent Lefevre
  1 sibling, 0 replies; 15+ messages in thread
From: Vincent Lefevre @ 2007-11-18 17:09 UTC (permalink / raw)
  To: Geert Bosch
  Cc: Michael Matz, Uros Bizjak, GCC, Florent de Dinechin, christoph.lauter

On 2007-11-16 19:33:38 -0500, Geert Bosch wrote:
> If a real number is rounded to the nearest number in a binary floating
> point format with 2*p+1 bits and that result rounded again to
> a p-bit float, the result is the same as if the original value was
> rounded to a p-bit float directly.

No, this is wrong: if you round a number x to 2p+1 bits and get
a number x' which is a rounding boundary for p-bit floats (i.e.
a p+1-bit number whose bit p+1 is 1), then x' will always be
rounded to some p-bit number x'', whether x was above or below x'.

For instance, consider
  log(1.1000101001001101000000010101000110010110101100010101 * 2^37).
The significand y of the result is:
  1.1010000101000001000011011100111011111110111101101101 0 1^56 0100...
  \_____________________ 53 bits ______________________/
where 1^56 means: a sequence of 56 bits 1. One has:

rnd_107(y) =
  1.1010000101000001000011011100111011111110111101101101 1
rnd_53(rnd_107(y)) =
  1.1010000101000001000011011100111011111110111101101110
due to the even-rounding rule.

But rnd_53(y) =
  1.1010000101000001000011011100111011111110111101101101

Of course, it is possible to prove that such a case doesn't happen
for some simple functions (perhaps the square root? I also have a
similar result for floor(x/y) [*]), but in general, no proofs are
known.

[*] http://www.vinc17.org/research/papers/rr_intdiv.pdf Section 5.2.

Alternatively, you need the ternary value for the first rounding (what
MPFR gives, but CRlibm doesn't give it).

-- 
Vincent LefÃ¨vre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-17 13:41       ` Geert Bosch
@ 2007-11-17 14:39         ` Tim Prince
  2007-11-18 17:09         ` Vincent Lefevre
  1 sibling, 0 replies; 15+ messages in thread
From: Tim Prince @ 2007-11-17 14:39 UTC (permalink / raw)
  To: Geert Bosch
  Cc: Vincent Lefevre, Michael Matz, Uros Bizjak, GCC,
	Florent de Dinechin, christoph.lauter

Geert Bosch wrote:
> 
> On Nov 14, 2007, at 05:27, Vincent Lefevre wrote:
> 
>>> Initially, float could simply use double and cast the result.
>>> For double->float the results will remain correctly rounded.
>>
>> Yes, very probably, but this needs to be proven for each supported
>> function, due to the double rounding problem (this may take some time
>> for functions like pow).
> 
> If a real number is rounded to the nearest number in a binary floating
> point format with 2*p+1 bits and that result rounded again to
> a p-bit float, the result is the same as if the original value was
> rounded to a p-bit float directly.
> 
>   -Geert
> 
> PS. Sorry, dont' have a reference to a proof handy at the moment.
In particular, when using x87 code to implement float, it makes no
difference for a single floating point operation followed by a store
whether 32- 53- or 64-bit precision mode is active.  The double rounding
problem becomes a big issue when substituting 64-bit precision mode for
double operations; hence the policy of Microsoft and certain BSD
variants of setting 53-bit precision mode.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-14 14:59     ` Vincent Lefevre
  2007-11-14 16:45       ` Joseph S. Myers
@ 2007-11-17 13:41       ` Geert Bosch
  2007-11-17 14:39         ` Tim Prince
  2007-11-18 17:09         ` Vincent Lefevre
  1 sibling, 2 replies; 15+ messages in thread
From: Geert Bosch @ 2007-11-17 13:41 UTC (permalink / raw)
  To: Vincent Lefevre
  Cc: Michael Matz, Uros Bizjak, GCC, Florent de Dinechin, christoph.lauter


On Nov 14, 2007, at 05:27, Vincent Lefevre wrote:

>> Initially, float could simply use double and cast the result.
>> For double->float the results will remain correctly rounded.
>
> Yes, very probably, but this needs to be proven for each supported
> function, due to the double rounding problem (this may take some time
> for functions like pow).

If a real number is rounded to the nearest number in a binary floating
point format with 2*p+1 bits and that result rounded again to
a p-bit float, the result is the same as if the original value was
rounded to a p-bit float directly.

   -Geert

PS. Sorry, dont' have a reference to a proof handy at the moment.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-13 18:55       ` Geert Bosch
@ 2007-11-15 21:31         ` Christoph Quirin Lauter
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Quirin Lauter @ 2007-11-15 21:31 UTC (permalink / raw)
  To: GCC; +Cc: Geert Bosch, Michael Matz, Uros Bizjak, Florent de Dinechin

Hello,

It's time for CRLibm developpers to step in this discussion.

We confirm that CRLibm is as fast as other portable libraries, or 
faster, and that it keeps improving (some benchmarks below). When we are 
slower, it is because we wanted cleaner code or smaller tables or we 
tuned the code on a given processor and it turns out later that it is a 
bad choice on others. In any case, we can still copy the faster code. 
Not to say that we are the best overall ever but we are concerned with 
performance.

We confirm that CRLibm can be turned into a 0.503 ulp (more or less) 
library at the cost of a few #ifs. We might even add a flag for that in 
the next release. Still it is proven 0.503 ulp, the proof won't go away. 
You will win on x86 5-20 cycles in average (depending on the function), 
and much more in worst case time.

Our opinion is that reproducibility (through correct rounding and in 
general C99 and standard compliance) should be the default for a system
AND that flags should be able to disable it if wanted.

Now we would like to know what GCC people mean with "having a libm with 
GCC". Is it
  a one-size-fits-all libm written in C ? With GCC-controlled #ifs ? 
With builtins etc ?
  a library written in GIMPLE ?
  a library generator within GCC ?
etc...

What we are interested in writing at the moment is a library generator 
(let us call it metalibm) that can output libms for mainstream 
precisions and optimize for various objectives (correct rounding or not, 
speed, memory footprint, register usage, parallelism, whatever).

We could right now write a simple metalibm that mostly removes the 
repetitive work from CRLibm development. This would be mostly printfs 
and ifs, and it would be simple enough one may imagine that it ends up 
in GCC codebase.

Then, since we have a fairly mature expertise with automatic generation 
of optimized polynomial approximation, we could enhance it so that it 
would also include such optimizations, and target single, 
double-extended and quad. Range reduction exploration is also at hand, 
but more target-specific optimisations are not. But then, this metalibm 
begins to depend on so many libraries and has such impredictible runtime 
(it should even have an ATLAS-like profiling mode) that nobody would 
want it in the GCC code base . And it is a lot more development, of 
course. And it is more interesting.

So what should we start with ?

We also confirm that many functions are missing from CRLibm. This is 
mostly a matter of workforce. We'd like to add some of the missing ones 
as a demonstration of a metalibm. Some are more difficult than the 
others, but the only really difficult one is gamma.
We may start with asinh and acosh, is that OK?

     Christoph and Florent

Here is a sample of performance results obtained on an Opteron
( Linux 2.6.17-2-amd64
   gcc (GCC) 4.1.2 20060901 (prerelease) (Debian 4.1.1-13) )

      log     avg time max time
default libm   210    150919      Rem: 12KB of tables
  CRLibm        146       903      Rem: using double-extended arithmetic
                                    (this was an experiment)
  CRLibm        266      1459      Rem: this one using SSE2 doubles

     exp      avg time max time
default libm   141   1105468      REM: 13KB of tables
  CRLibm        184      1210

      sin     avg time max time
default libm   147   1018622
  CRLibm        171      3895

      cos     avg time max time
default libm   152    028302
  CRLibm        171      3752

      tan     avg time max time
default libm   244   1101230
  CRLibm        280      9949

     asin     avg time max time
default libm   107   1018823   Rem: 20KB of tables shard with acos
  CRLibm        316      2296

     acos     avg time max time
default libm    83   1018786
  CRLibm        264      3660

     atan     avg time max time
default libm   144    100066     Rem: 43KB of tables
  CRLibm        243      4724

    log10     avg time max time
default libm   249      1061    NOT correctly rounded
  CRLibm        321      1706

     log2     avg time max time
default libm   174       274    NOT correctly rounded
  CRLibm        320      1663

(while building this table I just noticed a bug line 27 of 
sysdeps/ieee754/dbl-64/sincos.tbl : 40 should be 440)

The same, on an IBM power5 server (time units are arbitrary)

      log     avg time max time
default libm    61     23471
  CRLibm         57       307   Rem: using double-precision arithmetic

     exp      avg time max time
default libm    41     25019
  CRLibm         42       242

      sin     avg time max time
default libm    37    132435
  CRLibm         43      1910

      cos     avg time max time
default libm    38    134045
  CRLibm         44      1946

      tan     avg time max time
default libm    65    141885
  CRLibm         72      4671

     asin     avg time max time
default libm    29    132912
  CRLibm         62       465

     acos     avg time max time
default libm    24    132798
  CRLibm         57       765

     atan     avg time max time
default libm    35     18785
  CRLibm         53      2315

    log10     avg time max time
default libm    65       153
  CRLibm         65       355

     log2     avg time max time
default libm    47        59
  CRLibm         65       351

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-14 14:59     ` Vincent Lefevre
@ 2007-11-14 16:45       ` Joseph S. Myers
  2007-11-17 13:41       ` Geert Bosch
  1 sibling, 0 replies; 15+ messages in thread
From: Joseph S. Myers @ 2007-11-14 16:45 UTC (permalink / raw)
  To: Vincent Lefevre
  Cc: Geert Bosch, Michael Matz, Uros Bizjak, GCC, Florent de Dinechin,
	christoph.lauter

On Wed, 14 Nov 2007, Vincent Lefevre wrote:

> On 2007-11-12 21:29:44 -0500, Geert Bosch wrote:
> > On Nov 12, 2007, at 12:37, Michael Matz wrote:
> >> * only double is implemented, hence long double and float are missing at
> >>  least, at least the long double would need some implementation work,
> >>  as you can't simply enlarge the mantissa and hope all your results are
> >>  still correct
> > Initially, float could simply use double and cast the result.
> > For double->float the results will remain correctly rounded.
> 
> Yes, very probably, but this needs to be proven for each supported
> function, due to the double rounding problem (this may take some time
> for functions like pow).

As I understand the mention of moving towards automatically generating 
functions, crlibm might eventually be able to provide functions for float 
that do most computations in double but use double double for the hard 
cases (like functions for double presently use double double for most 
cases and triple double for hard cases), through automatic generation 
given the type used for function inputs/outputs (float in this case) and 
the types available for intermediate computations (double in this case) as 
well as information on worst cases for rounding?

> > For x86, the use of -mfpmath=sse addresses most, if not all, issues
> > related to excess precision for float and double.
> 
> But not all x86 processors support SSE2. However, I suppose you can
> have crlibm support for some architectures only.

It also seems to me that automatic generation might help address the x86 
issues as well, by providing both SSE2 versions (IEEE double as 
intermediate type) and x87 versions (using long double in all intermediate 
computations to avoid excess precision issues).

> and consistent. For instance, sin(x)*sin(x) + cos(x)*cos(x) will
> always be very close to 1, as mathematically expected.

FWIW, Nick Maclaren once told me that many applications don't care about 
accurate argument reduction for large arguments to sin and cos (for many 
purposes the answer just need be within a few ulp of the correct answer 
for an argument within a few ulp of the given argument - which would allow 
any answer between -1 and 1 for sin and cos of large values), but may 
still expect that sin^2 + cos^2 is very close to 1.

-- 
Joseph S. Myers
joseph@codesourcery.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-13  8:59   ` Geert Bosch
  2007-11-13 14:49     ` Michael Matz
@ 2007-11-14 14:59     ` Vincent Lefevre
  2007-11-14 16:45       ` Joseph S. Myers
  2007-11-17 13:41       ` Geert Bosch
  1 sibling, 2 replies; 15+ messages in thread
From: Vincent Lefevre @ 2007-11-14 14:59 UTC (permalink / raw)
  To: Geert Bosch
  Cc: Michael Matz, Uros Bizjak, GCC, Florent de Dinechin, christoph.lauter

On 2007-11-12 21:29:44 -0500, Geert Bosch wrote:
> On Nov 12, 2007, at 12:37, Michael Matz wrote:
>> * only double is implemented, hence long double and float are missing at
>>  least, at least the long double would need some implementation work,
>>  as you can't simply enlarge the mantissa and hope all your results are
>>  still correct
> Initially, float could simply use double and cast the result.
> For double->float the results will remain correctly rounded.

Yes, very probably, but this needs to be proven for each supported
function, due to the double rounding problem (this may take some time
for functions like pow).

> Proving correct rounding involves a lot of testing of problematic
> intervals. I'm not sure the work has been done for long double,
> or even whether it is feasible to do with the current state of
> the art.

If by "long double", you mean the x86 extended format, then this is
feasible in some domains for unary functions.

Now, it is always possible to implement Ziv's strategy up to a
sufficiently large precision; in practice, this would probably
be correct rounding (without being proven). You'll get good
performance in average. Of course, the functions may be slow on
the "worst cases", but such worst cases are rare (in practice,
I think this can be a problem only with real-time applications,
where you need to bound the worst case).

>> * many C99 functions are missing: a{sin,cos}h, frexp, ldexp, modf, error
>>  and gamma, remainder functions, I stoped looking somewhen
>>  many only partially or slowly implemented: pow, exp2
> Most of these can be easily implemented based on the other primitives.
> True, the property of correctly rounded results will be lost, but
> that does not throw away the usefulness of having the other functions
> be correctly rounded.

Worse than losing correct rounding, the accuracy may be quite bad in
some domains.

>> * relies on a IEEE-754 compatible processor, so it a) needs to have
>>  floating point at all and b) even has to use correct precision, e.g. a
>>  problem for x86.  That means that either crlibm doesn't work correctly
>>  when the processor's precision is reset by the user program (e.g. to use
>>  extended precision), or it has to save/set/restore the state on
>>  entry/exit of all it's routines
> For x86, the use of -mfpmath=sse addresses most, if not all, issues
> related to excess precision for float and double.

But not all x86 processors support SSE2. However, I suppose you can
have crlibm support for some architectures only.

> Even if GCC has its own math library, users should still be able to
> specify linking with the systems native math library. As -frounding

Yes.

> In case of the correctly rounded functions, it is incredibly useful
> to know that all answers are both deterministic and as accurate as
> possible.

and consistent. For instance, sin(x)*sin(x) + cos(x)*cos(x) will
always be very close to 1, as mathematically expected.

>> * slow: it's much faster than other correctly rounded libraries, but
>>  nevertheless also much slower then more mundane implementations of libm
> Do you have any numbers or approximations of how slow?

AFAIK, they are fast in average (Florent has some benchmarks).

> I think that in practice you'd probably have a number of
> implementations for the more popular functions, especially
> sin/cos, atan, log/exp. For reasonably good accuracy
> of the trigonometric functions (relative error less than
> 2 epsilon over entire domain), high-precision argument
> reduction is necessary. However, for many applications this
> is not an issue, and much more simplistic argument reduction
> can be used.

Yes, there can be a compile-time option and/or a pragma for that.

[...]
> It would be great for GCC to at least offer a mode in which the user
> can know that the math library gives the most accurate results possible.

I completely agree.

-- 
Vincent LefÃ¨vre <vincent@vinc17.org> - Web: <http://www.vinc17.org/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.org/blog/>
Work: CR INRIA - computer arithmetic / Arenaire project (LIP, ENS-Lyon)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-13 14:49     ` Michael Matz
@ 2007-11-13 18:55       ` Geert Bosch
  2007-11-15 21:31         ` Christoph Quirin Lauter
  0 siblings, 1 reply; 15+ messages in thread
From: Geert Bosch @ 2007-11-13 18:55 UTC (permalink / raw)
  To: Michael Matz; +Cc: Uros Bizjak, GCC, Florent de Dinechin, christoph.lauter

On Nov 13, 2007, at 08:17, Michael Matz wrote:
> You don't have to preach to the choir, I know why I looked at several
> libms in the past :-)
>
> I just say that crlibm is not it.  There are libms which would impose
> much less work, are more complete, faster and more proven in the real
> world.

Do you know if there is any documentation on the numerical aspects
and development objectives of this libm? I have looked at the
implementation of a few functions (such as double precision exp
and sin/cos) and they seem to be sound using accurate argument
reduction techniques and claim a maximum rounding error of 0.503 ulp,
which would be very good.

However, it is hard to look at specific functions and get a good
sense of the entire library and all precisions. It would be
great to have a library that provides good accuracy across all
functions and precisions, for a large range of architectures.
If the FreeBSD libm can provide that, I'm all for it.

When things get a bit more quiet here at work, I hope to be able
to look more into using this library and do some testing.

   -Geert

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-13  8:59   ` Geert Bosch
@ 2007-11-13 14:49     ` Michael Matz
  2007-11-13 18:55       ` Geert Bosch
  2007-11-14 14:59     ` Vincent Lefevre
  1 sibling, 1 reply; 15+ messages in thread
From: Michael Matz @ 2007-11-13 14:49 UTC (permalink / raw)
  To: Geert Bosch; +Cc: Uros Bizjak, GCC, Florent de Dinechin, christoph.lauter

Hi,

On Mon, 12 Nov 2007, Geert Bosch wrote:

> The big advantage of having a libm with GCC is that, instead of only 
> being able to rely on the lowest common denominator for accuracy of math 
> functions, you'll be able to rely on the same precision on all targets. 
> This is a huge benefit, especially on less popular operatings systems 
> and embedded systems, which often have crappy math libraries.

You don't have to preach to the choir, I know why I looked at several 
libms in the past :-)

I just say that crlibm is not it.  There are libms which would impose 
much less work, are more complete, faster and more proven in the real 
world.


Ciao,
Michael.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-12 18:16 ` Michael Matz
@ 2007-11-13  8:59   ` Geert Bosch
  2007-11-13 14:49     ` Michael Matz
  2007-11-14 14:59     ` Vincent Lefevre
  0 siblings, 2 replies; 15+ messages in thread
From: Geert Bosch @ 2007-11-13  8:59 UTC (permalink / raw)
  To: Michael Matz; +Cc: Uros Bizjak, GCC, Florent de Dinechin, christoph.lauter

On Nov 12, 2007, at 12:37, Michael Matz wrote:
> At the time we discussed and evaluated a bundled libm I've looked at
> several ones.  I rules out crlibm relatively quickly for these  
> reasons:
>
> * only double is implemented, hence long double and float are  
> missing at
>  least, at least the long double would need some implementation work,
>  as you can't simply enlarge the mantissa and hope all your results  
> are
>  still correct
Initially, float could simply use double and cast the result.
For double->float the results will remain correctly rounded.
Proving correct rounding involves a lot of testing of problematic
intervals. I'm not sure the work has been done for long double,
or even whether it is feasible to do with the current state of
the art.

> * many C99 functions are missing: a{sin,cos}h, frexp, ldexp, modf,  
> error
>  and gamma, remainder functions, I stoped looking somewhen
>  many only partially or slowly implemented: pow, exp2
Most of these can be easily implemented based on the other primitives.
True, the property of correctly rounded results will be lost, but
that does not throw away the usefulness of having the other functions
be correctly rounded.
>
> * relies on a IEEE-754 compatible processor, so it a) needs to have
>  floating point at all and b) even has to use correct precision,  
> e.g. a
>  problem for x86.  That means that either crlibm doesn't work  
> correctly
>  when the processor's precision is reset by the user program (e.g.  
> to use
>  extended precision), or it has to save/set/restore the state on
>  entry/exit of all it's routines
For x86, the use of -mfpmath=sse addresses most, if not all, issues
related to excess precision for float and double. Even if GCC has
its own math library, users should still be able to specify linking
with the systems native math library. As -frounding

The big advantage of having a libm with GCC is that, instead of
only being able to rely on the lowest common denominator for
accuracy of math functions, you'll be able to rely on the same
precision on all targets. This is a huge benefit, especially on
less popular operatings systems and embedded systems, which
often have crappy math libraries.

In case of the correctly rounded functions, it is incredibly useful
to know that all answers are both deterministic and as accurate as
possible.

In particular, it won't matter whether sin (x) will be computed
at compile time or at run time, the answer will be exactly the same.
As the GCC default is -fno-rounding-math, the use of crlibm
would be consistent with that.
>
> * slow: it's much faster than other correctly rounded libraries, but
>  nevertheless also much slower then more mundane implementations of  
> libm
Do you have any numbers or approximations of how slow?

I think that in practice you'd probably have a number of
implementations for the more popular functions, especially
sin/cos, atan, log/exp. For reasonably good accuracy
of the trigonometric functions (relative error less than
2 epsilon over entire domain), high-precision argument
reduction is necessary. However, for many applications this
is not an issue, and much more simplistic argument reduction
can be used.

Apart from fast and accurate implementations, there would
also be vectorized implementations and possibly implementations
that would be optimized for software floating point.
>
> I would advise against going the crlibm route.  Sorry.  It's very  
> nice for
> the things it's designed to do (provable and correctly rounded) but  
> is not
> a replacement for a general libm.
>
> I think the best way would be to use FreeBSDs libm, which is  
> maintained
> fairly well, implements all functions and is reasonably fast.  I heard
> rumors about other options (like Sony+IBMs SPU libm, which  
> supposedly also
> is written in C + intrinsics, but it never materialized in the wild).

It seems a mistake to view any libm bundled with GCC as a
full replacement of a system libm. For each function, there
lots of different implementations can be best:
   - most accurate
   - highest throughput while being accurate
   - highest throughput at reduced accuracy
   - shortest latency
   - best support for IEEE exceptions, rounding modes, etc
   - runs on widest range of hardware (portable, many precisions)
   - fastest with software floating-point

It would be great for GCC to at least offer a mode in which the user
can know that the math library gives the most accurate results possible.
In the Ada run time library, we've had to re-implement a lot of
the math functions because *some* system had poor accuracy in parts
of the domain.

Inadequate accuracy leads to failed tests, bug reports
and potentially even failure of deployed systems.
Performance sensitive users typically care about performance
of just a handful of functions (typically sin/cos/sqrt/atan)
and only in a small part of the domain.

We already optimize for those with -ffast-math and our
__builtin functions on at least x86. Having a basis that
is as accurate as possible and then the option to override
those with faster functions for highest performance where
appropriate would seem to give us the best of both worlds.

   -Geert

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-06 19:54 Uros Bizjak
  2007-11-11 15:06 ` Richard Sandiford
@ 2007-11-12 18:16 ` Michael Matz
  2007-11-13  8:59   ` Geert Bosch
  1 sibling, 1 reply; 15+ messages in thread
From: Michael Matz @ 2007-11-12 18:16 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: GCC, Florent de Dinechin, christoph.lauter

Hi,

On Tue, 6 Nov 2007, Uros Bizjak wrote:

> Forwarded is a short discussion with Mr. de Dinechin (forwarded with 
> permission), where the possibility to import crlibm as the default gcc 
> math library is discussed. I would like to ask members of the GCC SC for 
> the opinion on the idea of adopting crlibm as the default math library. 

At the time we discussed and evaluated a bundled libm I've looked at 
several ones.  I rules out crlibm relatively quickly for these reasons:

* only double is implemented, hence long double and float are missing at 
  least, at least the long double would need some implementation work,
  as you can't simply enlarge the mantissa and hope all your results are
  still correct
* many C99 functions are missing: a{sin,cos}h, frexp, ldexp, modf, error 
  and gamma, remainder functions, I stoped looking somewhen
  many only partially or slowly implemented: pow, exp2
* relies on a IEEE-754 compatible processor, so it a) needs to have 
  floating point at all and b) even has to use correct precision, e.g. a 
  problem for x86.  That means that either crlibm doesn't work correctly 
  when the processor's precision is reset by the user program (e.g. to use 
  extended precision), or it has to save/set/restore the state on 
  entry/exit of all it's routines
* slow: it's much faster than other correctly rounded libraries, but 
  nevertheless also much slower then more mundane implementations of libm

I would advise against going the crlibm route.  Sorry.  It's very nice for 
the things it's designed to do (provable and correctly rounded) but is not 
a replacement for a general libm.

I think the best way would be to use FreeBSDs libm, which is maintained 
fairly well, implements all functions and is reasonably fast.  I heard 
rumors about other options (like Sony+IBMs SPU libm, which supposedly also 
is written in C + intrinsics, but it never materialized in the wild).

Ciao,
Michael.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-12  2:34   ` Uros Bizjak
@ 2007-11-12 10:28     ` Joe Buck
  0 siblings, 0 replies; 15+ messages in thread
From: Joe Buck @ 2007-11-12 10:28 UTC (permalink / raw)
  To: Uros Bizjak
  Cc: GCC, Florent de Dinechin, christoph.lauter, rsandifo, David Edelsohn

On Sun, Nov 11, 2007 at 11:05:43AM +0100, Uros Bizjak wrote:
> Richard Sandiford wrote:
> >Uros Bizjak <ubizjak@gmail.com> writes:
> >  
> >>Forwarded is a short discussion with Mr. de Dinechin (forwarded with 
> >>permission), where the possibility to import crlibm as the default gcc 
> >>math library is discussed...
> >
> >That sounds like a great idea.  Do you know if the SC is already
> >considering it?  Or is it one of those cases where there needs
> >to be a bit more technical discussion first?
> 
> I don't have any further information from SC side. I have CC'd David in 
> the hope that perhaps he can update us with some info.

There hasn't been any SC discussion yet.  I suggest starting with
technical feasibility issues, so that there is a more concrete proposal to
consider.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-11 15:06 ` Richard Sandiford
@ 2007-11-12  2:34   ` Uros Bizjak
  2007-11-12 10:28     ` Joe Buck
  0 siblings, 1 reply; 15+ messages in thread
From: Uros Bizjak @ 2007-11-12  2:34 UTC (permalink / raw)
  To: Uros Bizjak, GCC, Florent de Dinechin, christoph.lauter, rsandifo
  Cc: David Edelsohn

Richard Sandiford wrote:
> Uros Bizjak <ubizjak@gmail.com> writes:
>   
>> Forwarded is a short discussion with Mr. de Dinechin (forwarded with 
>> permission), where the possibility to import crlibm as the default gcc 
>> math library is discussed. I would like to ask members of the GCC SC for 
>> the opinion on the idea of adopting crlibm as the default math library. 
>> This way, the library can be compiled using all knowledge that gcc has 
>> about the target processor. As hinted in the attached message - 
>> autovectorization and perhaps recently introduced fixed-point 
>> infrastructure can be used effectively in the implementation of this 
>> library, as well as new AMD's FMA instructions.
>>     
>
> That sounds like a great idea.  Do you know if the SC is already
> considering it?  Or is it one of those cases where there needs
> to be a bit more technical discussion first?
>   

I don't have any further information from SC side. I have CC'd David in 
the hope that perhaps he can update us with some info.

Uros.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: Using crlibm as the default math library in GCC sources
  2007-11-06 19:54 Uros Bizjak
@ 2007-11-11 15:06 ` Richard Sandiford
  2007-11-12  2:34   ` Uros Bizjak
  2007-11-12 18:16 ` Michael Matz
  1 sibling, 1 reply; 15+ messages in thread
From: Richard Sandiford @ 2007-11-11 15:06 UTC (permalink / raw)
  To: Uros Bizjak; +Cc: GCC, Florent de Dinechin, christoph.lauter

Uros Bizjak <ubizjak@gmail.com> writes:
> Forwarded is a short discussion with Mr. de Dinechin (forwarded with 
> permission), where the possibility to import crlibm as the default gcc 
> math library is discussed. I would like to ask members of the GCC SC for 
> the opinion on the idea of adopting crlibm as the default math library. 
> This way, the library can be compiled using all knowledge that gcc has 
> about the target processor. As hinted in the attached message - 
> autovectorization and perhaps recently introduced fixed-point 
> infrastructure can be used effectively in the implementation of this 
> library, as well as new AMD's FMA instructions.

That sounds like a great idea.  Do you know if the SC is already
considering it?  Or is it one of those cases where there needs
to be a bit more technical discussion first?

Richard

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Using crlibm as the default math library in GCC sources
@ 2007-11-06 19:54 Uros Bizjak
  2007-11-11 15:06 ` Richard Sandiford
  2007-11-12 18:16 ` Michael Matz
  0 siblings, 2 replies; 15+ messages in thread
From: Uros Bizjak @ 2007-11-06 19:54 UTC (permalink / raw)
  To: GCC; +Cc: Florent de Dinechin, christoph.lauter

Hello!

A little background:

Some time ago, Richard Guenther proposed that GCC imports libm math 
library from libc sources, but this proposal went nowhere, mostly due to 
non-technical reasons. At the time, the idea was that having a local 
math library, we could implement high-performance math library that will 
use SSE instructions with x86_64-like register passing convention also 
on i686-class 32bit targets. Since then, x86_64 gained soft-float 128bit 
support, but there was no 128bit libm available to fully exploit the 
potential of this high-precision infrastructure.

The proposal:

Forwarded is a short discussion with Mr. de Dinechin (forwarded with 
permission), where the possibility to import crlibm as the default gcc 
math library is discussed. I would like to ask members of the GCC SC for 
the opinion on the idea of adopting crlibm as the default math library. 
This way, the library can be compiled using all knowledge that gcc has 
about the target processor. As hinted in the attached message - 
autovectorization and perhaps recently introduced fixed-point 
infrastructure can be used effectively in the implementation of this 
library, as well as new AMD's FMA instructions.

The conclusion:

I think that including active maintained high-accuracy math library in 
GCC would be an important step to use GCC as a high-performance compiler 
in FP intensive applications.

Uros.

---forwarded message---

Hello,

We are very interested. Actually we are more and more working on 
automatic generation of libm functions, as opposed to writing the 
functions themselves. See
http://perso.ens-lyon.fr/christoph.lauter/intelportland.pdf
for a recent overview of what we have.
So our research direction is clearly towards compilers, not libraries.

However the development workforce is now reduced to one third-year PhD 
student (Christoph) and one lecturer (myself), both of whom work 
part-time on crlibm-related research. It is a small workforce for such a 
wide project. Do you think there could be some way to involve more 
people, or fund a post-doc or an engineer?

This was the political answer. More technical comments and issues below.

Uros Bizjak a Ã©crit :
> Hello!
>
> GCC (GNU Compiler Collection) community is looking to include a libm 
> library as a part of gcc support library in future releases of gcc. 
> This way, we plan to introduce certain optimizations and enhancement 
> in FP intensive code, that can be achieved only by tightly coupling 
> the compiler and high-performance libm implementation.

We totally agree. For the kind of optimizations related to scheduling, 
such as loop pipelining etc, we have little competence, however we are 
very interested in more arithmetic optimizations, such as expression 
fusing, compile-time generation of optimized polynomial approximations,
sharing argument reduction for sin and cos of the same value, etc. 
Actually we would like to know what static information gcc would 
currently be able to provide to a floating-point optimizer.

>
> Some time ago, there was an offer from your side [1] to gradually 
> include crlibm in glibc [1]. We think, that more appropriate place for 
> this library could be directly in the compiler infrastructure, so all 
> supported targets can automatically inherit the capabilities of crlibm 
> (newlib and uClinux targets, for example).

Yes.

>
> In addition to target-dependant optimizations that can be achieved by 
> coupling crlibm and the compiler, such as:
>
> a) usage of target processor specific instructions (cmove, FMA) and 
> instruction sets (SSEn)
> b) optimal scheduling of the instructions for certain processor
> c) various target microoptimizations
>
Yes to all.

On one hand, CRLibm can be retargetted relatively easily. It already 
uses macros for the FMA which accelerate it on PowerPC and IA64 for 
instance, and it was written with sufficient dataparallelism to enable 
SSE usage, although current gcc doesn't seem to be able to exploit it 
(hint).
However it will never be efficient on targets without an FPU, for instance.

On the other hand, we are prototyping the automatic implementation of 
libm functions. It is based on the Sollya tool
http://sollya.gforge.inria.fr/
Although Sollya was never intended to be included in a compiler, it can 
be used to generate target-specific code. It still needs quite a lot of 
development, though.

> gcc currently provides the full infrastructure for soft-FP 128bit long 
> double values on a mainstream processor (x86_64). Current libm 
> implementation doesn't use this infrastructure, and there are no plans 
> to do so. Unfortunately, this locks out important segment of users 
> (fluid dynamics) that would benefit greatly from increased precision; 
> both from correct libm implementation and from increased bit widths.
>
There are two issues here. The first is using 128-bit FP to provide 
64-bit FP correctly rounded elementary functions. This is possible, but 
probably less efficient than the current approach in CRLibm (using 
double-double and triple-double).

The second is developping a 128-bit libm, either using soft 128 bit FP, 
or in an ad-hoc way (using combinations of double, double-double, 
triple-double, etc, or going fixed-point, whichever proves more 
efficient). Again, we favor the ad-hoc way, and Sollya can be used for 
that, too. Again at the cost of some development, but this is something 
that interests us.

> I would like to establish "first contact" to kindly ask you what is 
> your opinion about including your library as a default math library 
> into gcc sources, before posting formal proposal to gcc@ mailing list. 
> I think that both, gcc and crlibm could benefit from using your 
> library as a default math library in gcc compiler.
>

We totally agree. Note that there should be no problem with copyright 
transfers.

Incidentally, would there be a possibility of physically  meeting 
somewhere in Europe?

    Florent and Christoph

> [1] http://sources.redhat.com/ml/libc-alpha/2007-02/msg00007.html
>
> Looking forward to hearing from you,
> Uros.
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2007-11-18  2:23 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-11-12 16:24 Using crlibm as the default math library in GCC sources François-Xavier Coudert
  -- strict thread matches above, loose matches on Subject: below --
2007-11-06 19:54 Uros Bizjak
2007-11-11 15:06 ` Richard Sandiford
2007-11-12  2:34   ` Uros Bizjak
2007-11-12 10:28     ` Joe Buck
2007-11-12 18:16 ` Michael Matz
2007-11-13  8:59   ` Geert Bosch
2007-11-13 14:49     ` Michael Matz
2007-11-13 18:55       ` Geert Bosch
2007-11-15 21:31         ` Christoph Quirin Lauter
2007-11-14 14:59     ` Vincent Lefevre
2007-11-14 16:45       ` Joseph S. Myers
2007-11-17 13:41       ` Geert Bosch
2007-11-17 14:39         ` Tim Prince
2007-11-18 17:09         ` Vincent Lefevre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).