* RFC Migrating PowerPC to IEEE 128-bit Floating Point
@ 2015-09-30 19:18 Steven Munroe
  2015-09-30 21:15 ` Joseph Myers
  2015-11-16 17:48 ` IEEE128 binary float to decimal float conversion routines Paul E. Murphy
  0 siblings, 2 replies; 46+ messages in thread
From: Steven Munroe @ 2015-09-30 19:18 UTC (permalink / raw)
To: Joseph Myers, Carlos O'Donell
Cc: libc-alpha, Tulio Magno Quites Machado Filho, Ulrich Weigand, Michael Meissner, David Edelsohn

Several of you have expressed concern about the limits of the current IBM long double format and have asked when the PowerPC platform could make the transition to the IEEE 128-bit standard format and rounding support.

At first we were reluctant to give up the hardware advantage of the double-double implementation. But with progress in the VSX architecture in P8 and future processors, I think we can address IEEE 128-bit with reasonable performance. But first we need a soft-fp implementation as the initial step to enable GCC for __float128 (IEEE 128-bit format), and then address VSX optimization as a parallel effort. We have made good progress on this, as documented here: https://gcc.gnu.org/wiki/Ieee128PowerPC

As the next steps, we need to address updates to the master soft-fp project, and then how to migrate GLIBC (libm and printf/scanf) long double from IBM long double to __float128.

Of course we need to continue supporting existing applications that currently use IBM long double while we enable __float128, and we will need to maintain backward compatibility (via versioned symbols) for some time. This will require some management of the symbol set in soft-fp, and eventually symbol versioning of long double functions in libm, to allow existing IBM long double and new __float128 applications to coexist.

The first issue is the symbol naming of the soft-fp implementation of __float128 for PowerPC64. IBM long double currently uses TFmode, and the runtime functions use the "tf" suffix (__floatditf, for example).
For the powerpc64 target in GCC we are planning to add __float128 as KFmode, which implies using the "kf" suffix in the runtime (__floatdikf, for example). This leaves the current long double runtime functions in place while we introduce __float128. It also implies some macro and build trickery to allow soft-fp builds for both TFmode and KFmode to coexist.

Mike offers some suggestions in his wiki above (section 3). We would appreciate feedback on which you would prefer. It would be good to resolve any such logistics in the soft-fp master source in time for GCC stage 3.

Then we can engage the GLIBC community on staging and migration of long double support within libm and libc. I suspect this will be of the same complexity as the original migration from long double == double and will take a while to resolve.

Thanks
* Re: RFC Migrating PowerPC to IEEE 128-bit Floating Point 2015-09-30 19:18 RFC Migrating PowerPC to IEEE 128-bit Floating Point Steven Munroe @ 2015-09-30 21:15 ` Joseph Myers 2015-10-13 18:06 ` Steven Munroe ` (3 more replies) 2015-11-16 17:48 ` IEEE128 binary float to decimal float conversion routines Paul E. Murphy 1 sibling, 4 replies; 46+ messages in thread From: Joseph Myers @ 2015-09-30 21:15 UTC (permalink / raw) To: Steven Munroe Cc: Carlos O'Donell, libc-alpha, Tulio Magno Quites Machado Filho, Ulrich Weigand, Michael Meissner, David Edelsohn, jakub [Jakub, see question below about your long double compatibility changes from 2004-6.] On Wed, 30 Sep 2015, Steven Munroe wrote: > We have made good progress on this as documented here: > https://gcc.gnu.org/wiki/Ieee128PowerPC I note there you refer to doing comparisons with a function __cmpkf2. I think you need at least two comparison functions (analogous to the fcmpu and fcmpo instructions), that differ on whether they raise the "invalid" exception for quiet NaN operands. (LT GT LE GE RTL operations would use the version that raises "invalid", other comparisons would use the version that doesn't, with some ambiguity for LTGT as discussed in the thread starting at <https://gcc.gnu.org/ml/gcc-patches/2015-02/threads.html#00555>. It's true that the powerpc port doesn't get this right for hardware comparisons at present, <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58684>, but it would seem a bad idea to leave the interface to the libgcc function ambiguous.) > Of course we need to continue support of existing applications that > currently use IBM long double while we enable __float128. And we will > need to maintain backward compatibility (via versioned symbols) for some > time. 
There are a couple of relevant glibc principles here: * It's not just "for some time", unless you have a complete ABI discontinuity (creating a new port with new symbol versions for everything so the two ports can't load any of the same executables or shared libraries) old glibc symbols need to stay supported. It's certainly possible to replace an old port for an architecture with a new incompatible ABI - ARM old-ABI was replaced by ARM EABI, with an overlap of a few years where both ports were supported before old-ABI support bit-rotted and then was removed. But I don't know if that's practical for powerpc64 LE (and the other ABIs are widely used on non-VSX systems). * The ABI exported by each glibc shared library must not depend on the GCC version used to compile glibc. That is, for glibc to export __float128 functions, the minimum GCC version for compiling glibc on the platform in question must be recent enough to support __float128. Now, the backport to older GCC suggested under item 8.2 might help there (but might be otherwise risky). But I tried and failed to obtain consensus simply to increase the minimum GCC version for building glibc 2.23 from 4.6 to 4.7; a 4.9 requirement, let alone a requirement for GCC 6, would seem likely to be controversial for some time (although users of at least some powerpc variants might be less concerned by such a version requirement; the version required doesn't need to be the same on all architectures). For reference, there are four current ABIs for powerpc glibc as listed at <https://sourceware.org/glibc/wiki/ABIList#powerpc>. I presume you are not proposing to increase that number (unless you do plan a complete ABI discontinuity) but to add some symbols to one or more of those existing ABIs. (It's fine for some functions to require particular instruction set extensions, e.g. for the ABI for the __float128 entry points to be that they require VSX support. 
This is not of course an issue for little-endian where VSX support is required anyway, only for big-endian if any __float128 support is added there. The build system would need to ensure the right options are passed when building __float128 files, without adding inappropriate VSX requirements when building other files.) > For the powerpc64 target in GCC we planning to add __float128 as KFmode > and implies use the "kf" suffix in the runtime (__floatdikf for > example). This leaves the current long double runtime functions in place > while we introduce the __float128. This implies come macro and build > trickery to allow soft-fp builds for both TFmode and KFmode to coexist. > > Mike offers some suggestions in his Wiki above (section 3). We would > appreciate feed back on what you would prefer. I think any of 3.2/3.3/3.4 (handling the architecture-specific variation in the mode name in some architecture-specific way) would seem reasonable. 3.3 and 3.4 are probably preferable to 3.2 to avoid needing large numbers of architecture-specific files checked in. > Then we can engage the GLIBC community on staging and migration of long > double support within libm and libc. I suspect this will of the same > complexity as the original migration from long double == double and will > take while to resolve. I think it will be substantially more complex, as you're building a library supporting four different types (and need to run all relevant tests for both long double variants), which hasn't been supported at all in glibc before. Although the very rough outline design is clear in some areas (e.g. follow TS 18661-4 for naming of functions explicitly using __float128, so e.g. sinf128, and avoid adding e.g. 
additional printf modifiers for explicit __float128 because TS 18661-4 chose a different approach there), there's a substantial amount of work to develop a detailed design for what the interfaces and implementation should look like, and to obtain consensus on it, and a substantial amount more to implement it. I'd be wary of trusting that the set of functions with existing support for variable long double types is in fact the complete set that need fixing for such a change. In particular, as far as I can see, none of that compatibility support is present for printf-like functions in argp.h, err.h and error.h. Unless there's some reason I'm missing why no compatibility support was needed for those functions in the original 2004-6 changes (Jakub?), there's an ABI bug there that we can't now do anything about for the original transition (because old and new binaries use the same symbol versions for those functions, with old binaries expecting long double = double and new binaries expecting distinct long double), but that we can avoid for a future transition. I don't know if any other affected functions are missing such compatibility support. On point 5.2 (uses of long double in double functions): the correct fix is to remove that use, along with the multiple-precision code that can make the functions very slow, after doing sufficient error analysis of the initial code in the functions to verify that by itself it yields results with better than 1ulp accuracy, and so well within glibc's accuracy goals. That will mean that long double performance has no effect on performance of double libm functions. On point 8.1, I don't think fma should be added to libgcc. If you want to use the soft-fp version in glibc instead of the default ldbl-128 version, that's easy to do (and should be much faster). 
Note that much of the glibc work could be done in the context of supporting explicit __float128 functions for x86_64 - all GCC versions supported for building glibc already support __float128, and differences between __float128 and TS 18661-4 _Float128 should be easy to work around when building with older GCC, even if proper GCC _Float128 support would be a preferred starting point (it allows you e.g. to write _Complex _Float128, which isn't possible with __float128). That wouldn't include anything related to changing the type of long double, and would still have a large amount of work on design, obtaining consensus and implementation, but would allow a large proportion of the work to go in even before the minimum GCC version for building powerpc glibc is increased and so __float128 support can be added for powerpc glibc.

--
Joseph S. Myers
joseph@codesourcery.com
* Re: RFC Migrating PowerPC to IEEE 128-bit Floating Point
  2015-09-30 21:15 ` Joseph Myers
@ 2015-10-13 18:06   ` Steven Munroe
  2015-10-13 18:53     ` Michael Meissner
  2015-10-13 23:11     ` Joseph Myers
  2015-10-26 18:05   ` [PATCH 0/2] Add minimal code for IEEE 128-bit floating point Tulio Magno Quites Machado Filho
  ` (2 subsequent siblings)
  3 siblings, 2 replies; 46+ messages in thread
From: Steven Munroe @ 2015-10-13 18:06 UTC (permalink / raw)
To: Joseph Myers
Cc: Carlos O'Donell, libc-alpha, Tulio Magno Quites Machado Filho, Ulrich Weigand, Michael Meissner, David Edelsohn, jakub

On Wed, 2015-09-30 at 21:15 +0000, Joseph Myers wrote:
> [Jakub, see question below about your long double compatibility changes
> from 2004-6.]
>
> On Wed, 30 Sep 2015, Steven Munroe wrote:
>
> > We have made good progress on this as documented here:
> > https://gcc.gnu.org/wiki/Ieee128PowerPC
>
> I note there you refer to doing comparisons with a function __cmpkf2. I
> think you need at least two comparison functions (analogous to the fcmpu
> and fcmpo instructions), that differ on whether they raise the "invalid"
> exception for quiet NaN operands. (LT GT LE GE RTL operations would use
> the version that raises "invalid", other comparisons would use the version
> that doesn't, with some ambiguity for LTGT as discussed in the thread
> starting at
> <https://gcc.gnu.org/ml/gcc-patches/2015-02/threads.html#00555>. It's
> true that the powerpc port doesn't get this right for hardware comparisons
> at present, <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58684>, but it
> would seem a bad idea to leave the interface to the libgcc function
> ambiguous.)

We agree to add ordered and unordered forms of compare for soft-float.

> > Of course we need to continue support of existing applications that
> > currently use IBM long double while we enable __float128. And we will
> > need to maintain backward compatibility (via versioned symbols) for some
> > time.
> > There are a couple of relevant glibc principles here: > > * It's not just "for some time", unless you have a complete ABI > discontinuity (creating a new port with new symbol versions for everything > so the two ports can't load any of the same executables or shared > libraries) old glibc symbols need to stay supported. Understood. We will need to maintain all three [1) long double == double, 2) long double == IBM long double, 3) long double == __float128] until there is a new (not backward compatible) ABI. > It's certainly > possible to replace an old port for an architecture with a new > incompatible ABI - ARM old-ABI was replaced by ARM EABI, with an overlap > of a few years where both ports were supported before old-ABI support > bit-rotted and then was removed. But I don't know if that's practical for > powerpc64 LE (and the other ABIs are widely used on non-VSX systems). > > * The ABI exported by each glibc shared library must not depend on the GCC > version used to compile glibc. That is, for glibc to export __float128 > functions, the minimum GCC version for compiling glibc on the platform in > question must be recent enough to support __float128. Now, the backport > to older GCC suggested under item 8.2 might help there (but might be > otherwise risky). But I tried and failed to obtain consensus simply to > increase the minimum GCC version for building glibc 2.23 from 4.6 to 4.7; > a 4.9 requirement, let alone a requirement for GCC 6, would seem likely to > be controversial for some time (although users of at least some powerpc > variants might be less concerned by such a version requirement; the > version required doesn't need to be the same on all architectures). > > For reference, there are four current ABIs for powerpc glibc as listed at > <https://sourceware.org/glibc/wiki/ABIList#powerpc>. 
> I presume you are
> not proposing to increase that number (unless you do plan a complete ABI
> discontinuity) but to add some symbols to one or more of those existing
> ABIs.

There are no plans to do that at this time, to the best of my knowledge.

> (It's fine for some functions to require particular instruction set
> extensions, e.g. for the ABI for the __float128 entry points to be that
> they require VSX support. This is not of course an issue for
> little-endian where VSX support is required anyway, only for big-endian if
> any __float128 support is added there. The build system would need to
> ensure the right options are passed when building __float128 files,
> without adding inappropriate VSX requirements when building other files.)

Technically only VMX support is required for parameter passing, which includes PowerMAC G5/970, Freescale e6500, IBM POWER6 and later. As you say, PPC64LE will have VMX/VSX by definition.

Modern PPC32BE and PPC64BE hard-float systems support VMX/VSX and can migrate to __float128 as this work is completed.

Older POWER servers (POWER4, POWER5, POWER5+) without VMX/VSX are going or have gone out of service and are not supported by current distros, and so will not be compiling for or using __float128. We will maintain versioned IBM long double (which only requires hard float) to support existing applications migrated to newer systems and distros.

The PPC32 soft-float ABI would continue to pass by reference, and those users could migrate to __float128 if they choose.

> > For the powerpc64 target in GCC we planning to add __float128 as KFmode
> > and implies use the "kf" suffix in the runtime (__floatdikf for
> > example). This leaves the current long double runtime functions in place
> > while we introduce the __float128. This implies come macro and build
> > trickery to allow soft-fp builds for both TFmode and KFmode to coexist.
> >
> > Mike offers some suggestions in his Wiki above (section 3). We would
> > appreciate feed back on what you would prefer.
> > I think any of 3.2/3.3/3.4 (handling the architecture-specific variation > in the mode name in some architecture-specific way) would seem reasonable. > 3.3 and 3.4 are probably preferable to 3.2 to avoid needing large numbers > of architecture-specific files checked in. > > > Then we can engage the GLIBC community on staging and migration of long > > double support within libm and libc. I suspect this will of the same > > complexity as the original migration from long double == double and will > > take while to resolve. > > I think it will be substantially more complex, as you're building a > library supporting four different types (and need to run all relevant > tests for both long double variants), which hasn't been supported at all > in glibc before. Although the very rough outline design is clear in some > areas (e.g. follow TS 18661-4 for naming of functions explicitly using > __float128, so e.g. sinf128, and avoid adding e.g. additional printf > modifiers for explicit __float128 because TS 18661-4 chose a different > approach there), there's a substantial amount of work to develop a > detailed design for what the interfaces and implementation should look > like, and to obtain consensus on it, and a substantial amount more to > implement it. > > This is complicated. I thank Uli Weigand for developing this break-down of long double usage. Looking at what various targets use today, it seems that all platforms supported by glibc use IEEE float and double. 
As to long double, the following options are in use:

(A) "long double" is 80-bit IEEE extended (no variants): i386, x86_64, ia64, m68k -- sysdeps directories: ieee754/ldbl-96
(B) "long double" is 128-bit IEEE quad (no variants): aarch64, mips64, sparc64 -- sysdeps directories: ieee754/ldbl-128
(C) "long double" is 128-bit IEEE quad (with "long double == double" variant): alpha, sparc32, s390* -- sysdeps directories: ieee754/ldbl-128, ieee754/ldbl-64-128, ieee754/ldbl-opt
(D) "long double" is IBM double-double (with "long double == double" variant): powerpc* -- sysdeps directories: ieee754/ldbl-128ibm, ieee754/ldbl-opt
(E) "long double" is double (no variants): all other platforms -- sysdeps directories: none

Dealing with one platform and ABI using ldbl-128ibm will be challenging. But ldbl-128 supports six (?) platforms directly or indirectly. So we need to think carefully as we add __float128 support to this mix. This is complicated by requiring both function symbol and type changes.

I suggest we start by creating a new source directory (./ieee754/ldbl-f128) specifically for introducing __float128 support. This allows us to begin work without impacting the existing platforms. Initially this can be source/macro stubs that overlay the existing ldbl-128 implementations, modifying the function names and parameter types.

In parallel we can begin the work of preparing ldbl-128ibm to move out of the ieee-*l name space, and prepare for the release where it will become the old version of long double for the power platform.

Finally, once the ldbl-f128 target is fully implemented and tested, interested platforms can migrate to ldbl-f128, or decide to merge ldbl-128 and ldbl-f128 into a single implementation supporting both names/types.

> > I'd be wary of trusting that the set of functions with existing support
> > for variable long double types is in fact the complete set that need
> > fixing for such a change.
> > In particular, as far as I can see, none of
> > that compatibility support is present for printf-like functions in argp.h,
> > err.h and error.h. Unless there's some reason I'm missing why no
> > compatibility support was needed for those functions in the original
> > 2004-6 changes (Jakub?), there's an ABI bug there that we can't now do
> > anything about for the original transition (because old and new binaries
> > use the same symbol versions for those functions, with old binaries
> > expecting long double = double and new binaries expecting distinct long
> > double), but that we can avoid for a future transition. I don't know if
> > any other affected functions are missing such compatibility support.

OK, we will investigate this. I don't yet see what the issues are with argp.h and err.h, but it looks like the error() and error_at_line() format strings could be a problem.

> On point 5.2 (uses of long double in double functions): the correct fix is
> to remove that use, along with the multiple-precision code that can make
> the functions very slow, after doing sufficient error analysis of the
> initial code in the functions to verify that by itself it yields results
> with better than 1ulp accuracy, and so well within glibc's accuracy goals.
> That will mean that long double performance has no effect on performance
> of double libm functions.

I only found one example of this (./sysdeps/ieee754/dbl-64/slowpow.c), and that usage was wrapped in a conditional (#ifdef USE_LONG_DOUBLE_FOR_MP). We will look at this.

> On point 8.1, I don't think fma should be added to libgcc. If you want to
> use the soft-fp version in glibc instead of the default ldbl-128 version,
> that's easy to do (and should be much faster).

We will handle this as we work on the port.
> > Note that much of the glibc work could be done in the context of
> > supporting explicit __float128 functions for x86_64 - all GCC versions
> > supported for building glibc already support __float128, and differences
> > between __float128 and TS 18661-4 _Float128 should be easy to work around
> > when building with older GCC even if proper GCC _Float128 support would be
> > a preferred starting point (it allows you e.g. to write _Complex _Float128,
> > which isn't possible with __float128). That wouldn't include anything
> > related to changing the type of long double, and would still have a large
> > amount of work on design, obtaining consensus and implementation, but
> > would allow a large proportion of the work to go in even before the
> > minimum GCC version for building powerpc glibc is increased and so
> > __float128 support can be added for powerpc glibc.

As I suggested above, creating ./ieee754/ldbl-f128 would be the simplest and safest way to start this effort. Yes, we will have to address _Float128, but we have to include dealing with the existing __float128 support as part of the solution.

This is a complex dance that has to be choreographed. Everyone is invited to participate.
* Re: RFC Migrating PowerPC to IEEE 128-bit Floating Point
  2015-10-13 18:06 ` Steven Munroe
@ 2015-10-13 18:53   ` Michael Meissner
  2015-10-13 23:11   ` Joseph Myers
  1 sibling, 0 replies; 46+ messages in thread
From: Michael Meissner @ 2015-10-13 18:53 UTC (permalink / raw)
To: Steven Munroe
Cc: Joseph Myers, Carlos O'Donell, libc-alpha, Tulio Magno Quites Machado Filho, Ulrich Weigand, Michael Meissner, David Edelsohn, jakub

On Tue, Oct 13, 2015 at 01:06:22PM -0500, Steven Munroe wrote:
> On Wed, 2015-09-30 at 21:15 +0000, Joseph Myers wrote:
> > (It's fine for some functions to require particular instruction set
> > extensions, e.g. for the ABI for the __float128 entry points to be that
> > they require VSX support. This is not of course an issue for
> > little-endian where VSX support is required anyway, only for big-endian if
> > any __float128 support is added there. The build system would need to
> > ensure the right options are passed when building __float128 files,
> > without adding inappropriate VSX requirements when building other files.)

In my current patches to libgcc adding the support for __float128, I added #pragma GCC target ("vsx,float128") to make sure the __float128 files are built with the VSX instruction set even if the rest of libgcc (and eventually glibc) is built with other options.

> Technically only VMX support is required for parameter passing which
> includes PowerMAC G5/970, Freescale e6500, IBM POWER6 and later.

The compiler currently requires VSX, not just Altivec, to enable __float128. I could relax it to just Altivec, but I would worry about unintentional unaligned __float128 accesses, given that the Altivec loads and stores ignore the bottom 4 address bits.

> As you say the PPC64LE will have VMX/VSX by definition.
>
> Modern PPC32BE and PPC64BE Hard-float systems support VMX/VSX ans can
> migrate to __float128 as this work is completed.
> > Older POWER servers (power4 power5 power5+) without VMX/VSX are or are
> > going out of service and are not supported by current distros, and so
> > will not be compiling for or using __float128. We will maintain
> > versioned IBM long double (which only requires hard float) to support
> > existing application migrated to newer systems and distros.
> >
> > The PPC32 Soft-float ABI would continue to pass by reference and could
> > migrate to __float128 if they choose.

In my original work, I allowed __float128 without the VSX instruction set, using different emulator functions depending on whether you used VSX or not, and it worked. However, it was getting cumbersome to document and explain the two different emulators, and I worried about user code inadvertently having some modules compiled with VSX and some without.

If there is a perceived need, we can resurrect this (assuming David is willing to allow it). Are there actual users asking for __float128 on embedded and other non-server systems?

--
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797
* Re: RFC Migrating PowerPC to IEEE 128-bit Floating Point 2015-10-13 18:06 ` Steven Munroe 2015-10-13 18:53 ` Michael Meissner @ 2015-10-13 23:11 ` Joseph Myers 2015-10-14 7:38 ` Andreas Schwab 2016-03-07 23:48 ` Paul E. Murphy 1 sibling, 2 replies; 46+ messages in thread From: Joseph Myers @ 2015-10-13 23:11 UTC (permalink / raw) To: Steven Munroe Cc: Carlos O'Donell, libc-alpha, Tulio Magno Quites Machado Filho, Ulrich Weigand, Michael Meissner, David Edelsohn, jakub On Tue, 13 Oct 2015, Steven Munroe wrote: > This is complicated. I thank Uli Weigand for developing this break-down > of long double usage. > > Looking at what various targets use today, it seems that all platforms > supported by glibc use IEEE float and double. As to long double, the > following options are in use: > (A) "long double" is 80-bit IEEE extended (no variants): i386, x86_64, > ia64, m68k -- sysdeps directories: ieee754/ldbl-96 (Actually, there are two variants - m68k has one additional normal exponent (values with exponent representation 0 are interpreted as if it were a normal exponent, but with the explicit MSB of the mantissa allowed to be 0), resulting in the least normal value and the least subnormal value being half what they are in the Intel variant. This is handled by e.g. gen-auto-libm-tests, but I strongly suspect that some of the code in sysdeps/ieee754/ldbl-96 does not handle the m68k variant fully correctly.) > (B) "long double" is 128-bit IEEE quad (no variants): aarch64, mips64, > sparc64 -- sysdeps directories: ieee754/ldbl-128 (There's the MIPS variation in the quiet / signaling NaN convention, but very little needs to care about that.) Neither of those variations I noted is of any direct relevance to the addition of __float128 support, except insofar as it probably involves updating testcases that deal with such variations. > I suggest we start by creating a new source directory > (./ieee574/ldbl-f128) specifically for introducing __float128 support. 
> This allows us to begin work without impacting to the existing > platforms. When it's being used for a type that's not long double, it's not clear that the name should start ldbl-. But that's a side issue. The main thing is that before adding any ABIs we should think very carefully about just what ABIs are appropriate, as well as about the implementation design. And the same applies to APIs. I think there is a large amount of work to do before any directories are added. > In parallel we can begin the work of preparing ldbl-128ibm to move out > of the ieee-*l name space and prepare for the release where they will > become of the old version of long double for the power platform. I don't really see the benefit in such a move. As in, it's not technically complicated to change a directory name, but when all the other main implementations for particular floating-point formats are in sysdeps/ieee754, does it really make sense for one format to have its implementation sitting off on the side somewhere? The name isn't exactly accurate, but is the move worth the disruption any more than removing the useless "sysv/" directory component from sysdeps/unix/sysv/linux/ would be? > Finally once the ldbl-f128 target is fully implemented and tested, > interested platforms can migrate to ldbl-f128 or decide to merge > ldbl-128 and ldbl-f128 in to a single implementation supporting both > names/types. You need separate sysdeps directories for the concepts of "long double is binary128" and "__float128 functions where that is not long double". The former is ldbl-128. The case of "*f128 as aliases for *l when long double is binary128" could probably be done through ldbl-128 as well. > > I'd be wary of trusting that the set of functions with existing support > > for variable long double types is in fact the complete set that need > > fixing for such a change. 
In particular, as far as I can see, none of > > that compatibility support is present for printf-like functions in argp.h, > > err.h and error.h. Unless there's some reason I'm missing why no > > compatibility support was needed for those functions in the original > > 2004-6 changes (Jakub?), there's an ABI bug there that we can't now do > > anything about for the original transition (because old and new binaries > > use the same symbol versions for those functions, with old binaries > > expecting long double = double and new binaries expecting distinct long > > double), but that we can avoid for a future transition. I don't know if > > any other affected functions are missing such compatibility support. > > > Ok we will investigate this. I don't seeing what the issues are with > argp.h and err.h but looks like the error() and error_at_line() format > strings could be a problem. argp_error and argp_failure take a format string and variable arguments. Thus, they are affected by the ABI for long double - users may legitimately pass long double arguments to those functions. The same applies to v?warnx? and v?errx?. > > On point 5.2 (uses of long double in double functions): the correct fix is > > to remove that use, along with the multiple-precision code that can make > > the functions very slow, after doing sufficient error analysis of the > > initial code in the functions to verify that by itself it yields results > > with better than 1ulp accuracy, and so well within glibc's accuracy goals. > > That will mean that long double performance has no effect on performance > > of double libm functions. > > I only found one example of this (./sysdeps/ieee754/dbl-64/slowpow.c) > and that usage was wrapped in a conditional (#ifdef > USE_LONG_DOUBLE_FOR_MP). slowexp.c also has such uses. I think those are the only cases. 
> > Note that much of the glibc work could be done in the context of > > supporting explicit __float128 functions for x86_64 - all GCC versions > > supported for building glibc already support __float128, and differences > > between __float128 and TS 18661-4 _Float128 should be easy to work around > > when building with older GCC even if proper GCC _Float128 support would be > > a preferred starting point (it allows you e.g. to write _Complex _Float128, > > which isn't possible with __float128). That wouldn't include anything > > related to changing the type of long double, and would still have a large > > amount of work on design, obtaining consensus and implementation, but > > would allow a large proportion of the work to go in even before the > > minimum GCC version for building powerpc glibc is increased and so > > __float128 support can be added for powerpc glibc. > > > As I suggested above, creating ./ieee754/ldbl-f128 would be the simplest > and safest way to start this effort. I don't think you'll get consensus to add such a directory without substantial design work first. I can think of several more indirectly related incremental changes that would be much more likely candidates for "simplest and safest", once things get to the point of such source code changes being appropriate. As one example, a refactoring of the type-generics used (both in installed headers and internally) for macros such as signbit. Because glibc would be supporting a set of floating-point types depending on the architecture, those macros would need to go via macros with architecture-specific definitions. And because it will no longer be possible to distinguish all such types using sizeof, it will be necessary to use other features such as __builtin_types_compatible_p or _Generic.
__builtin_types_compatible_p is GCC-specific, but supported by more GCC versions than C11 _Generic, which also runs into issues with different handling of qualified types in different implementations (see DR#423) that would need allowing for. So even such a small piece involves significant design and analysis work to determine and justify the design chosen - but is of much lower risk than things involving adding new functions or sysdeps directories. (The type-generics in <tgmath.h> are much more complicated than that, but are one of the many things that would need addressing eventually. See my recent observations on the WG14 reflector <http://www.open-std.org/jtc1/sc22/wg14/13831>.) As another example, going through testcases that test for one or more of float / double / long double and making those testcases type-generic as far as possible so that (a) they may test for existing types not currently covered and (b) it's easy to extend them to __float128 in future and we have a good understanding of what tests to extend. As a third example, refactoring the implementations of <complex.h> functions to make them type-generic using macros. This would mean less mechanical work in future to update multiple versions of a function when fixing a bug, as well as making it easier to add __float128 versions of those functions in future. I'm not saying that one of those should necessarily be the starting point; they're simply illustrative examples of some of the simpler pieces that might appear fairly early given a more detailed design for the issues that need to be addressed and the rough patch sequence involved in addressing them. Even if you prototype changes involving new directories before filling in such pieces, I think such pieces would need to appear early in the long patch series (and personally I think prefetching implementation and mainlining work on such pieces is generally beneficial). 
Now, here are some examples of issues to consider in the design and consensus-building process. Where I suggest answers here, I am *not* asserting that only a particular answer is acceptable - what is or is not required is determined by the community through a consensus process, and generally these are cases where if making such changes myself I'd still want to allow for the community to reach consensus first rather than acting as libm maintainer to treat something as having consensus in the absence of discussion. It's only the two principles I stated in <https://sourceware.org/ml/libc-alpha/2015-09/msg00751.html> about ABI compatibility requirements that I'm confident asserting as definite requirements here, that are well-enough established no further discussion of them is required to understand their application to this case. * We need to agree that the prior rejection of __float128 support in <https://sourceware.org/ml/libc-alpha/2004-03/msg00326.html> and <https://sourceware.org/ml/libc-alpha/2004-05/msg00055.html> no longer stands, in the light of the existence of standard C bindings in TS 18661-3. I don't think there should be any controversy about superseding such an old decision, but it should still be clear that we are doing so. * We need to agree on the appropriateness of APIs from TS 18661 - that is, to get consensus on a whole set of APIs being appropriate for glibc, so that doesn't need redebating every time an individual new API is proposed. The APIs from parts 1, 3 and 4, excluding those relating to decimal floating point, seem a reasonable set to consider in such a discussion (with a clear expectation that such APIs could then be added piecemeal, with individual patches still needing consensus on the merits of the implementation). 
(If controversial, the minimal set for present purposes would be the _Float128 APIs from part 3 that correspond to APIs already present in glibc for other floating-point types, but I hope it would be possible to obtain consensus on a wider set of APIs.) * We need to agree on rules for ABIs corresponding to such APIs. My suggestion is: where an API meets the ISO C rules for being usable without including a header, the function needs an exported symbol with the same name (for example, if we support sinf128 on a platform where long double is binary128, there does actually need to be a sinf128 exported symbol, not just a header redirecting calls to use sinl). But if it doesn't meet such rules, the number of exported symbols should be kept to a minimum (for example, functions that exist only for use by type-generic macros should not have such aliases). Functions would go in libc or libm according to what header has the declaration, unless consistency with existing functions dictates otherwise (e.g. __isinff128 might go in libc because other __isinf* functions are there). * We need to consider what to do about functions not included in TS 18661-3. Some possible cases: (a) function aliases such as drem or gamma or pow10 shouldn't have *f128 versions. (b) Obsolescent functions such as scalb shouldn't either (though they'll still need long double = binary128 versions, but not the *f128 APIs). (c) Where TS 18661 has clearly chosen a different direction, do not add *f128 APIs for the old functions (so no nexttoward or printf variants - maybe add nextdown, nextup, strfrom functions instead as preparatory patches - again, long double = binary128 versions will still be needed). (d) In some cases, *f128 functions as GNU APIs are clearly reasonable by analogy - for example, strtof128_l, wcstof128, lgammaf128_r. (e) As a community, we should think especially carefully about what to do in cases where the above might miss some functionality, e.g.
support for grouping when reading strings to __float128, or strfmon formatting of __float128 values (the answer might well end up being that this functionality can only be accessed in the case where long double = __float128). (f) M_* constants in math.h (needed in the implementations in any case). (Part of the design process is to get a *complete* list of such cases for consideration.) * Would it be best to avoid the new interfaces supporting matherr / _LIB_VERSION? It's been discussed before that we'd like to deprecate that functionality; proper deprecation could mean new symbol versions for affected functions that avoid using the _LIB_VERSION global at all. It would be unfortunate to add multiple versions of *f128 in quick succession, first with _LIB_VERSION support and then without. This seems desirable to discuss even if we end up concluding it's best not to make this a dependency. * Many more specific questions regarding details of the design and implementation approach in different areas. * While working on it, it's important to pay attention to how the work relates to the details of TS 18661 and to ongoing discussions in WG14 and to track cases where possible issues or ambiguities in TS 18661 are found - and to pass such information back to WG14 for consideration when it's considered whether to include parts of TS 18661 in the next revision of the C standard (actually having an implementation of parts of part 3 might make it more likely to be included in the next revision). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: RFC Migrating PowerPC to IEEE 128-bit Floating Point 2015-10-13 23:11 ` Joseph Myers @ 2015-10-14 7:38 ` Andreas Schwab 2015-10-14 12:13 ` Joseph Myers 2016-03-07 23:48 ` Paul E. Murphy 1 sibling, 1 reply; 46+ messages in thread From: Andreas Schwab @ 2015-10-14 7:38 UTC (permalink / raw) To: Joseph Myers Cc: Steven Munroe, Carlos O'Donell, libc-alpha, Tulio Magno Quites Machado Filho, Ulrich Weigand, Michael Meissner, David Edelsohn, jakub Joseph Myers <joseph@codesourcery.com> writes: > but I strongly suspect that some of the code in > sysdeps/ieee754/ldbl-96 does not handle the m68k variant fully > correctly.) m68k uses only a few of them, and I don't think any of them is testing for subnormal values. Andreas. -- Andreas Schwab, SUSE Labs, schwab@suse.de GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE 1748 E4D4 88E3 0EEA B9D7 "And now for something completely different."
* Re: RFC Migrating PowerPC to IEEE 128-bit Floating Point 2015-10-14 7:38 ` Andreas Schwab @ 2015-10-14 12:13 ` Joseph Myers 0 siblings, 0 replies; 46+ messages in thread From: Joseph Myers @ 2015-10-14 12:13 UTC (permalink / raw) To: Andreas Schwab Cc: Steven Munroe, Carlos O'Donell, libc-alpha, Tulio Magno Quites Machado Filho, Ulrich Weigand, Michael Meissner, David Edelsohn, jakub On Wed, 14 Oct 2015, Andreas Schwab wrote: > Joseph Myers <joseph@codesourcery.com> writes: > > > but I strongly suspect that some of the code in > > sysdeps/ieee754/ldbl-96 does not handle the m68k variant fully > > correctly.) > > m68k uses only a few of them, and I don't think any of them is testing > for subnormal values. I'm pretty sure that fmal is not correct for m68k. There are lots of conditionals to avoid internal underflow when the final result does not underflow, while ensuring correctly rounded results with underflow exception when such an exception is correct. All of those conditionals are analogous to those for flt-32 / dbl-64 / ldbl-128 and make no allowance for the m68k format - and testcases were generally written to cover problem cases for those formats and for the Intel version of ldbl-96 and so may not expose problems for m68k either. (Some conditionals may be safe for m68k for one reason or another, but probably not all.) -- Joseph S. Myers joseph@codesourcery.com
* Re: RFC Migrating PowerPC to IEEE 128-bit Floating Point 2015-10-13 23:11 ` Joseph Myers 2015-10-14 7:38 ` Andreas Schwab @ 2016-03-07 23:48 ` Paul E. Murphy 2016-03-08 1:31 ` Joseph Myers 1 sibling, 1 reply; 46+ messages in thread From: Paul E. Murphy @ 2016-03-07 23:48 UTC (permalink / raw) To: Joseph Myers Cc: Steven Munroe, Carlos O'Donell, libc-alpha, Tulio Magno Quites Machado Filho, Ulrich Weigand Hi Joseph, I'd like to continue on with this discussion. You've raised a number of valid issues we should consider before any code is modified. I really would like to avoid spinning a one off library for PPC. On 10/13/2015 06:11 PM, Joseph Myers wrote: > Now, here are some examples of issues to consider in the design and > consensus-building process. Where I suggest answers here, I am *not* > asserting that only a particular answer is acceptable - what is or is not > required is determined by the community through a consensus process, and > generally these are cases where if making such changes myself I'd still > want to allow for the community to reach consensus first rather than > acting as libm maintainer to treat something as having consensus in the > absence of discussion. It's only the two principles I stated in > <https://sourceware.org/ml/libc-alpha/2015-09/msg00751.html> about ABI > compatibility requirements that I'm confident asserting as definite > requirements here, that are well-enough established no further discussion > of them is required to understand their application to this case. If I understand you, for a PPC ABI to support __float128, the minimum compiler version for building glibc would need to be raised to guarantee the availability of the new ABI? > * We need to agree that the prior rejection of __float128 support in > <https://sourceware.org/ml/libc-alpha/2004-03/msg00326.html> and > <https://sourceware.org/ml/libc-alpha/2004-05/msg00055.html> no longer > stands, in the light of the existence of standard C bindings in TS > 18661-3. 
I don't think there should be any controversy about superseding > such an old decision, but it should still be clear that we are doing so. Those old patches appear to attempt to basically merge libquadmath into glibc (or do I have it wrong, and that was the genesis of quadmath?). Are you suggesting we should be figuring out a plan to support _Float128 as defined by TS 18661? > > * We need to agree on the appropriateness of APIs from TS 18661 - that is, > to get consensus on a whole set of APIs being appropriate for glibc, so > that doesn't need redebating every time an individual new API is proposed. > The APIs from parts 1, 3 and 4, excluding those relating to decimal > floating point, seem a reasonable set to consider in such a discussion > (with a clear expectation that such APIs could then be added piecemeal, > with individual patches still needing consensus on the merits of the > implementation). (If controversial, the minimal set for present purposes > would be the _Float128 APIs from part 3 that correspond to APIs already > present in glibc for other floating-point types, but I hope it would be > possible to obtain consensus on a wider set of APIs.) > > * We need to agree on rules for ABIs corresponding to such APIs. My > suggestion is: where an API meets the ISO C rules for being usable without > including a header, the function needs an exported symbol with the same > name (for example, if we support sinf128 on a platform where long double > is binary128, there does actually need to be a sinf128 exported symbol, > not just a header redirecting calls to use sinl). But if it doesn't meet > such rules, the number of exported symbols should be kept to a minimum > (for example, functions that exist only for use by type-generic macros > should not have such aliases). Functions would go in libc or libm > according to what header has the declaration, unless consistency with > existing functions dictates otherwise (e.g. 
__isinff128 might go in libc > because other __isinf* functions are there). Are you able to point out the relevant clauses in ISO C for this? I want to make sure I understand these rules well. > > * We need to consider what to do about functions not included in TS > 18661-3. Some possible cases: (a) function aliases such as drem or gamma > or pow10 shouldn't have *f128 versions. (b) Obsolescent functions such as > scalb shouldn't either (though they'll still need long double = binary128 > versions, but not the *f128 APIs). (c) Where TS 18661 has clearly chosen > a different direction, do not add *f128 APIs for the old functions (so no > nexttoward or printf variants - maybe add nextdown, nextup, strfrom > functions instead as preparatory patches - again, long double = binary128 > versions will still be needed). (d) In some cases, *f128 functions as GNU > APIs are clearly reasonable by analogy - for example, strtof128_l, > wcstof128, lgammaf128_r. (e) As a community, we should think especially > carefully about what to do in cases where the above might miss some > functionality, e.g. support for grouping when reading strings to > __float128, or strfmon formatting of __float128 values (the answer might > well end up being that this functionality can only be accessed in the case > where long double = __float128). (f) M_* constants in math.h (needed in > the implementations in any case). (Part of the design process is to get a > *complete* list of such cases for consideration.) > > * Would it be best to avoid the new interfaces supporting matherr / > _LIB_VERSION? It's been discussed before that we'd like to deprecate that > functionality; proper deprecation could mean new symbol versions for > affected functions that avoid using the _LIB_VERSION global at all. It > would be unfortunate to add multiple versions of *f128 in quick > succession, first with _LIB_VERSION support and then without.
This seems > desirable to discuss even if we end up concluding it's best not to make > this a dependency. > > * Many more specific questions regarding details of the design and > implementation approach in different areas. > > * While working on it, it's important to pay attention to how the work > relates to the details of TS 18661 and to ongoing discussions in WG14 and > to track cases where possible issues or ambiguities in TS 18661 are found > - and to pass such information back to WG14 for consideration when it's > considered whether to include parts of TS 18661 in the next revision of > the C standard (actually having an implementation of parts of part 3 might > make it more likely to be included in the next revision).
* Re: RFC Migrating PowerPC to IEEE 128-bit Floating Point 2016-03-07 23:48 ` Paul E. Murphy @ 2016-03-08 1:31 ` Joseph Myers 2016-04-19 17:26 ` Steven Munroe 0 siblings, 1 reply; 46+ messages in thread From: Joseph Myers @ 2016-03-08 1:31 UTC (permalink / raw) To: Paul E. Murphy Cc: Steven Munroe, Carlos O'Donell, libc-alpha, Tulio Magno Quites Machado Filho, Ulrich Weigand On Mon, 7 Mar 2016, Paul E. Murphy wrote: > On 10/13/2015 06:11 PM, Joseph Myers wrote: > > > Now, here are some examples of issues to consider in the design and > > consensus-building process. Where I suggest answers here, I am *not* > > asserting that only a particular answer is acceptable - what is or is not > > required is determined by the community through a consensus process, and > > generally these are cases where if making such changes myself I'd still > > want to allow for the community to reach consensus first rather than > > acting as libm maintainer to treat something as having consensus in the > > absence of discussion. It's only the two principles I stated in > > <https://sourceware.org/ml/libc-alpha/2015-09/msg00751.html> about ABI > > compatibility requirements that I'm confident asserting as definite > > requirements here, that are well-enough established no further discussion > > of them is required to understand their application to this case. > > If I understand you, for a PPC ABI to support __float128, the minimum > compiler version for building glibc would need to be raised to guarantee > the availability of the new ABI? Yes - that is, raised for all PPC platforms for which these functions are added to glibc - having a well defined ABI for a given library in a given glibc version on a given platform is a pretty fundamental principle. 
And since certain required features (complex float128 support and a few built-in functions) didn't get into GCC 6, that means that the minimum version would be at least GCC 7 (as I noted in <https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02222.html>). [There are ways of avoiding needing the complex arithmetic libgcc support - not much in libm uses it - but because of the other dependencies that will require GCC 7, and because you still need the compiler side of complex support, I don't think that would really help.] This does not of course prevent much of the preparatory work being done in the context of supporting such functions on x86_64, say (where the compiler support already exists in all compilers supported for building glibc) - the vast bulk of that work would be necessary to support a fourth floating-point format on any architecture at all, and I wouldn't be surprised if it's a 50-patch or more series of complicated changes building on each other (and each of which takes several attempts to get right and to reach consensus). (Given such preparation, simply adding powerpc64 to the architectures building float128 support would be a fairly small change, while supporting it as an alternative long double format - thus supporting three different long double formats - would be a much trickier change or series of changes; Jakub's original support for multiple long double formats took years to get in, and as I noted earlier in this discussion it still missed several functions that ought to have variants for different long double formats; getting all the cases right is hard. 
Furthermore, the design of how that handles different printf variants with a TLS __no_long_double variable is not AS-safe when different parts of a program use different long double formats, and given the view in bug 16060 that at least dprintf should be AS-safe, and given that glibc reviewers pay much more attention to such issues than they did ten years ago, repeating that design might well fail to obtain consensus.) (Given some of the preparatory work, it should be much easier to update libquadmath from glibc, and so get current code to users via GCC before the glibc ABI can be updated, than it is at present.) > > * We need to agree that the prior rejection of __float128 support in > > <https://sourceware.org/ml/libc-alpha/2004-03/msg00326.html> and > > <https://sourceware.org/ml/libc-alpha/2004-05/msg00055.html> no longer > > stands, in the light of the existence of standard C bindings in TS > > 18661-3. I don't think there should be any controversy about superseding > > such an old decision, but it should still be clear that we are doing so. > > Those old patches appear to attempt to basically merge libquadmath into > glibc (or do I have it wrong, and that was the genesis of quadmath?). Are These old patches long predate libquadmath (or at least its inclusion in GCC). > you suggesting we should be figuring out a plan to support _Float128 as > defined by TS 18661? I think any new APIs for float128 support should follow TS 18661-3, yes (this doesn't mean the whole of TS 18661-3 needs supporting, where the corresponding float / double / long double functions aren't in glibc). It does need extending for at least some floating-point functions in glibc with no float128 analogues in TS 18661-3 - see my previous points about the various classes of such functions that need considering individually to decide whether float128 versions should be added. 
Now, following TS 18661-3 means there is *nothing* in those 2004 changes that is actually wanted in glibc - because they deal with I/O, and TS 18661 does not add any printf/scanf formats for new floating point types; rather, it uses strto* and strfrom* functions (so one or more of the preparatory patches in the 50-patch series would be adding the strfrom* functions from TS 18661-1 for the existing floating-point types, with thorough testcases written in a type-generic way, in preparation for later adding them for float128). (And while sysdeps directory structures are one of the more tricky design questions, I don't think those old changes have anything useful regarding strto* either.) There are certainly questions to work out regarding what extra compiler support might be desirable in conjunction with float128 support in glibc - but for now I think it's desirable to be able to build such support in glibc (for x86_64 etc.) with existing compilers that support the __float128 name but not _Float128 (and __attribute__ ((__mode__ (__TC__))) for complex _Float128, etc.), although maybe with an expectation to phase out that support in time to simplify the code eventually (e.g. doing compiler support in parallel with the glibc support). __float128 would become a built-in typedef for _Float128 when _Float128 is supported in the compiler. > > * We need to agree on the appropriateness of APIs from TS 18661 - that is, > > to get consensus on a whole set of APIs being appropriate for glibc, so > > that doesn't need redebating every time an individual new API is proposed. > > The APIs from parts 1, 3 and 4, excluding those relating to decimal > > floating point, seem a reasonable set to consider in such a discussion > > (with a clear expectation that such APIs could then be added piecemeal, > > with individual patches still needing consensus on the merits of the > > implementation).
(If controversial, the minimal set for present purposes > > would be the _Float128 APIs from part 3 that correspond to APIs already > > present in glibc for other floating-point types, but I hope it would be > > possible to obtain consensus on a wider set of APIs.) Regarding such consensus, see the discussion I started at <https://sourceware.org/ml/libc-alpha/2015-11/msg00162.html> in an attempt to get consensus in advance of any implementation. There wasn't opposition to the appropriateness of those APIs, but not many people commented either. Note that consensus in principle for various new APIs does not mean consensus for completely arbitrary subsets thereof; there should be some coherence to the sets added. For example, one thing to consider would be whether TS 18661-3 functions should first be added in cases where they alias existing functions (including for _Float32, _Float64, _Float32x etc. - and including _Float128 on architectures where it aliases the existing long double), with support for them as separate being later in the patch series. > > * We need to agree on rules for ABIs corresponding to such APIs. My > > suggestion is: where an API meets the ISO C rules for being usable without > > including a header, the function needs an exported symbol with the same > > name (for example, if we support sinf128 on a platform where long double > > is binary128, there does actually need to be a sinf128 exported symbol, > > not just a header redirecting calls to use sinl). But if it doesn't meet > > such rules, the number of exported symbols should be kept to a minimum > > (for example, functions that exist only for use by type-generic macros > > should not have such aliases). Functions would go in libc or libm > > according to what header has the declaration, unless consistency with > > existing functions dictates otherwise (e.g. __isinff128 might go in libc > > because other __isinf* functions are there). 
> > Are you able to point out the relevant clauses in ISO C for this? I want to > make sure I understand these rules well. The rule about using a function without a header is in C11 7.1.4#2: "Provided that a library function can be declared without reference to any type defined in a header, it is also permissible to declare the function and use it without including its associated header.". This applies to the vast majority of libm functions. The rule about which library a function goes in, based on the header, comes from POSIX's definition of the c99 utility, saying libm is for functions declared in <math.h>, <complex.h> and <fenv.h>. I think this continues to work fine for TS 18661 parts 1, 3 and 4. > > * While working on it, it's important to pay attention to how the work > > relates to the details of TS 18661 and to ongoing discussions in WG14 and > > to track cases where possible issues or ambiguities in TS 18661 are found > > - and to pass such information back to WG14 for consideration when it's > > considered whether to include parts of TS 18661 in the next revision of > > the C standard (actually having an implementation of parts of part 3 might > > make it more likely to be included in the next revision). Note especially that there is a change to <tgmath.h> (for an issue I identified regarding different feature test macros being defined when different headers are included - ISO C feature test macros working differently from the existing glibc ones) in the floating-point working group's current draft of *part 2* of the TS, postdating the publication of parts 1 to 4, which may be relevant to part 3. Probably the fixes in the current drafts will soon become DRs against the published parts of the TS. 
Changes to C11 + TS18661-1: To 7.25 #1, append the sentence: It declares function identifiers that would be declared if <math.h> and <complex.h> were included by <tgmath.h>, even if <math.h> or <complex.h> were included prior to <tgmath.h> in a context where the identifiers were not declared. There are implications for the design of how glibc handles declaring functions for different floating-point types. It's not clear how best to address those. -- Joseph S. Myers joseph@codesourcery.com
* Re: RFC Migrating PowerPC to IEEE 128-bit Floating Point 2016-03-08 1:31 ` Joseph Myers @ 2016-04-19 17:26 ` Steven Munroe 2016-04-21 20:47 ` Joseph Myers 0 siblings, 1 reply; 46+ messages in thread From: Steven Munroe @ 2016-04-19 17:26 UTC (permalink / raw) To: Joseph Myers Cc: Paul E. Murphy, Carlos O'Donell, libc-alpha, Tulio Magno Quites Machado Filho, Ulrich Weigand On Tue, 2016-03-08 at 01:31 +0000, Joseph Myers wrote: > On Mon, 7 Mar 2016, Paul E. Murphy wrote: > > > On 10/13/2015 06:11 PM, Joseph Myers wrote: > > > > > Now, here are some examples of issues to consider in the design and > > > consensus-building process. Where I suggest answers here, I am *not* > > > asserting that only a particular answer is acceptable - what is or is not > > > required is determined by the community through a consensus process, and > > > generally these are cases where if making such changes myself I'd still > > > want to allow for the community to reach consensus first rather than > > > acting as libm maintainer to treat something as having consensus in the > > > absence of discussion. It's only the two principles I stated in > > > <https://sourceware.org/ml/libc-alpha/2015-09/msg00751.html> about ABI > > > compatibility requirements that I'm confident asserting as definite > > > requirements here, that are well-enough established no further discussion > > > of them is required to understand their application to this case. > > > > If I understand you, for a PPC ABI to support __float128, the minimum > > compiler version for building glibc would need to be raised to guarantee > > the availability of the new ABI? > Sorry for my delay in responding to this thread, I was waiting for clarity on the GCC 6 side. > Yes - that is, raised for all PPC platforms for which these functions are > added to glibc - having a well defined ABI for a given library in a given > glibc version on a given platform is a pretty fundamental principle.
And > since certain required features (complex float128 support and a few > built-in functions) didn't get into GCC 6, that means that the minimum > version would be at least GCC 7 (as I noted in > <https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02222.html>). > > [There are ways of avoiding needing the complex arithmetic libgcc support > - not much in libm uses it - but because of the other dependencies that > will require GCC 7, and because you still need the compiler side of > complex support, I don't think that would really help.] > Waiting for GCC 7 is not acceptable for starting this work. We believe we have the agreements in place to complete the required __float128 enablement in GCC 6.1. This will deliver parity with the existing x86_64 __float128 support, including complex. > This does not of course prevent much of the preparatory work being done in > the context of supporting such functions on x86_64, say (where the > compiler support already exists in all compilers supported for building > glibc) - the vast bulk of that work would be necessary to support a fourth > floating-point format on any architecture at all, and I wouldn't be > surprised if it's a 50-patch or more series of complicated changes > building on each other (and each of which takes several attempts to get > right and to reach consensus). > As stated above we plan to complete the __float128 enablement for PowerPC by GCC 6.1. We have requirements and deadlines that do not allow for a delay of a whole year. If we can't get consensus from the community on reasonable accommodations on this timeline we will be forced to look at alternatives. We would like to spend the development effort for GLIBC-2.24 preparing the GLIBC infrastructure for adding additional floating-point types to libm (as implied and required by TS18661).
This would include extending the configure and Makefiles as needed to add another directory under ieee754 containing implementations with new type-related suffixes (other than '', 'f', and 'l'). This would include: * getting agreement on additional controlling Makefile variables similar to 'long-m-routines' for f128 and f64x. This will be used to extend and control ./math/Makefile. * getting agreement on additional controlling macros similar to __LONG_DOUBLE_MATH and __NO_LONG_DOUBLE_MATH for f128 and f64x. This will be used to extend and control math.h. * getting agreement on how we use __STDC_WANT_IEC_60559_BFP_EXT__ within math.h, complex.h and tgmath.h. This is not necessarily to complete all the requirements of TS18661 but to prepare the way for that work. * getting agreement on directory naming and file structure. * getting agreement on the best way to share implementations across functionally equivalent but linguistically different types. The infrastructure above should remain latent (inactive) until explicitly enabled for a specific platform. This should be similar to the current mechanism used for long double, based on platform-specific Implies files naming additional implementation directories, containing Makefile fragments that enable the base ./math/Makefile machinery. This should have minimal (preferably no) impact on any other platform. With this infrastructure in place we then use the GLIBC-2.25 development period for submitting the libm functional implementations.
> (Given such preparation, simply adding powerpc64 to the architectures > building float128 support would be a fairly small change, while supporting > it as an alternative long double format - thus supporting three different > long double formats - would be a much trickier change or series of > changes; Jakub's original support for multiple long double formats took > years to get in, and as I noted earlier in this discussion it still missed > several functions that ought to have variants for different long double > formats; getting all the cases right is hard. Furthermore, the design of > how that handles different printf variants with a TLS __no_long_double > variable is not AS-safe when different parts of a program use different > long double formats, and given the view in bug 16060 that at least dprintf > should be AS-safe, and given that glibc reviewers pay much more attention > to such issues than they did ten years ago, repeating that design might > well fail to obtain consensus.) > > (Given some of the preparatory work, it should be much easier to update > libquadmath from glibc, and so get current code to users via GCC before > the glibc ABI can be updated, than it is at present.) > > > > * We need to agree that the prior rejection of __float128 support in > > > <https://sourceware.org/ml/libc-alpha/2004-03/msg00326.html> and > > > <https://sourceware.org/ml/libc-alpha/2004-05/msg00055.html> no longer > > > stands, in the light of the existence of standard C bindings in TS > > > 18661-3. I don't think there should be any controversy about superseding > > > such an old decision, but it should still be clear that we are doing so. > > > > Those old patches appear to attempt to basically merge libquadmath into > > glibc (or do I have it wrong, and that was the genesis of quadmath?). Are > > These old patches long predate libquadmath (or at least its inclusion in > GCC). 
> > > you suggesting we should be figuring out a plan to support _Float128 as > > defined by TS 18661? > > I think any new APIs for float128 support should follow TS 18661-3, yes > (this doesn't mean the whole of TS 18661-3 needs supporting, where the > corresponding float / double / long double functions aren't in glibc). > It does need extending for at least some floating-point functions in glibc > with no float128 analogues in TS 18661-3 - see my previous points about > the various classes of such functions that need considering individually > to decide whether float128 versions should be added. > > Now, following TS 18661-3 means there is *nothing* in those 2004 changes > that is actually wanted in glibc - because they deal with I/O, and TS > 18661 does not add any printf/scanf formats for new floating point types; > rather, it uses strto* and strfrom* functions (so one or more of the > preparatory patches in the 50-patch series would be adding the strfrom* > functions from TS 18661-1 for the existing floating-point types, with > thorough testcases written in a type-generic way, in preparation for later > adding them for float128). (And while sysdeps directory structures are > one of the more tricky design questions, I don't think those old changes > have anything useful regarding strto* either.) > > There are certainly questions to work out regarding what extra compiler > support might be desirable in conjunction with float128 support in glibc - > but for now I think it's desirable to be able to build such support in > glibc (for x86_64 etc.) with existing compilers that support the > __float128 name but not _Float128 (and __attribute__ ((__mode__ > ((__TC__)))) for complex _Float128, etc.), although maybe with an > expectation to phase out that support in time to simplify the code > eventually (e.g. doing compiler support in parallel with the glibc > support). 
__float128 would become a built-in typedef for _Float128 when > _Float128 is supported in the compiler. > It is my assumption that we will be working on PPC64LE and ABI V2 to start. There is no need to wait for GCC-7. x86_64 has a different set of issues (80-bit long double) and a large set of ./sysdeps/ platform-specific implementations that we don't have the skills to address. I'm sure that the Intel team can address the specifics of float128 when and if they decide to. They will benefit from our early work on common Makefile machinery. > > > * We need to agree on the appropriateness of APIs from TS 18661 - that is, > > > to get consensus on a whole set of APIs being appropriate for glibc, so > > > that doesn't need redebating every time an individual new API is proposed. > > > The APIs from parts 1, 3 and 4, excluding those relating to decimal > > > floating point, seem a reasonable set to consider in such a discussion > > > (with a clear expectation that such APIs could then be added piecemeal, > > > with individual patches still needing consensus on the merits of the > > > implementation). (If controversial, the minimal set for present purposes > > > would be the _Float128 APIs from part 3 that correspond to APIs already > > > present in glibc for other floating-point types, but I hope it would be > > > possible to obtain consensus on a wider set of APIs.) > > Regarding such consensus, see the discussion I started at > <https://sourceware.org/ml/libc-alpha/2015-11/msg00162.html> in an attempt > to get consensus in advance of any implementation. There wasn't > opposition to the appropriateness of those APIs, but not many people > commented either. > > Note that consensus in principle for various new APIs does not mean > consensus for completely arbitrary subsets thereof; there should be some > coherence to the sets added.
For example, one thing to consider would be > whether TS 18661-3 functions should first be added in cases where they > alias existing functions (including for _Float32, _Float64, _Float32x etc. > - and including _Float128 on architectures where it aliases the existing > long double), with support for them as separate being later in the patch > series. > > > > * We need to agree on rules for ABIs corresponding to such APIs. My > > > suggestion is: where an API meets the ISO C rules for being usable without > > > including a header, the function needs an exported symbol with the same > > > name (for example, if we support sinf128 on a platform where long double > > > is binary128, there does actually need to be a sinf128 exported symbol, > > > not just a header redirecting calls to use sinl). But if it doesn't meet > > > such rules, the number of exported symbols should be kept to a minimum > > > (for example, functions that exist only for use by type-generic macros > > > should not have such aliases). Functions would go in libc or libm > > > according to what header has the declaration, unless consistency with > > > existing functions dictates otherwise (e.g. __isinff128 might go in libc > > > because other __isinf* functions are there). > > > > Are you able to point out the relevant clauses in ISO C for this? I want to > > make sure I understand these rules well. > > The rule about using a function without a header is in C11 7.1.4#2: > "Provided that a library function can be declared without reference to any > type defined in a header, it is also permissible to declare the function > and use it without including its associated header.". This applies to the > vast majority of libm functions. > > The rule about which library a function goes in, based on the header, > comes from POSIX's definition of the c99 utility, saying libm is for > functions declared in <math.h>, <complex.h> and <fenv.h>. I think this > continues to work fine for TS 18661 parts 1, 3 and 4. 
> > > > * While working on it, it's important to pay attention to how the work > > > relates to the details of TS 18661 and to ongoing discussions in WG14 and > > > to track cases where possible issues or ambiguities in TS 18661 are found > > > - and to pass such information back to WG14 for consideration when it's > > > considered whether to include parts of TS 18661 in the next revision of > > > the C standard (actually having an implementation of parts of part 3 might > > > make it more likely to be included in the next revision). > > Note especially that there is a change to <tgmath.h> (for an issue I > identified regarding different feature test macros being defined when > different headers are included - ISO C feature test macros working > differently from the existing glibc ones) in the floating-point working > group's current draft of *part 2* of the TS, postdating the publication of > parts 1 to 4, which may be relevant to part 3. Probably the fixes in the > current drafts will soon become DRs against the published parts of the TS. > > Changes to C11 + TS18661-1: > > To 7.25 #1, append the sentence: > > It declares function identifiers that would be declared if <math.h> and > <complex.h> were included by <tgmath.h>, even if <math.h> or <complex.h> > were included prior to <tgmath.h> in a context where the identifiers were > not declared. > > There are implications for the design of how glibc handles declaring > functions for different floating-point types. It's not clear how best to > address those. > ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: RFC Migrating PowerPC to IEEE 128-bit Floating Point 2016-04-19 17:26 ` Steven Munroe @ 2016-04-21 20:47 ` Joseph Myers 0 siblings, 0 replies; 46+ messages in thread From: Joseph Myers @ 2016-04-21 20:47 UTC (permalink / raw) To: Steven Munroe Cc: Paul E. Murphy, Carlos O'Donell, libc-alpha, Tulio Magno Quites Machado Filho, Ulrich Weigand On Tue, 19 Apr 2016, Steven Munroe wrote: > It is my assumption that we will be working on PPC64LE and ABI V2 to > start. There is no need to wait for GCC-7. In that case the key question is whether consensus can be reached for requiring GCC 6.1 or later (supposing 6.1 has all the required support, both complex arithmetic and the handful of built-in functions for __float128 present on other architectures with __float128 support) to build glibc 2.25 for PPC64LE (without requiring it for other architectures including big-endian powerpc). I don't know what the views of the PPC64LE glibc community on such a requirement might be. (If you propose such a version requirement and no-one objects, then as architecture maintainer you can take that to be consensus.) -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 0/2] Add minimal code for IEEE 128-bit floating point 2015-09-30 21:15 ` Joseph Myers 2015-10-13 18:06 ` Steven Munroe @ 2015-10-26 18:05 ` Tulio Magno Quites Machado Filho 2015-10-26 18:12 ` Joseph Myers 2015-10-26 18:06 ` [PATCH 2/2] soft-fp: Add new KF routines Tulio Magno Quites Machado Filho 2015-10-26 18:06 ` [PATCH 1/2] soft-fp: Automatically create KF files from TF ones Tulio Magno Quites Machado Filho 3 siblings, 1 reply; 46+ messages in thread From: Tulio Magno Quites Machado Filho @ 2015-10-26 18:05 UTC (permalink / raw) To: libc-alpha Cc: joseph, munroesj, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos The following 2 patches continue this discussion around IEEE 128-bit floating point. They're just the first step and provide only the minimal set of functions required by libgcc at this moment. Michael Meissner (1): soft-fp: Add new KF routines Tulio Magno Quites Machado Filho (1): soft-fp: Automatically create KF files from TF ones soft-fp/.gitignore | 4 +++ soft-fp/Makefile | 16 ++++++++++ soft-fp/cmpukf2.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++ soft-fp/extendkftf2.c | 59 ++++++++++++++++++++++++++++++++++ soft-fp/trunctfk2.c | 65 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 231 insertions(+) create mode 100644 soft-fp/.gitignore create mode 100644 soft-fp/cmpukf2.c create mode 100644 soft-fp/extendkftf2.c create mode 100644 soft-fp/trunctfk2.c -- 2.1.0 ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCH 0/2] Add minimal code for IEEE 128-bit floating point 2015-10-26 18:05 ` [PATCH 0/2] Add minimal code for IEEE 128-bit floating point Tulio Magno Quites Machado Filho @ 2015-10-26 18:12 ` Joseph Myers 2015-10-26 18:45 ` Michael Meissner 2015-10-26 19:51 ` Steven Munroe 0 siblings, 2 replies; 46+ messages in thread From: Joseph Myers @ 2015-10-26 18:12 UTC (permalink / raw) To: Tulio Magno Quites Machado Filho Cc: libc-alpha, munroesj, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos On Mon, 26 Oct 2015, Tulio Magno Quites Machado Filho wrote: > The following 2 patches continue this discussion around IEEE 128-bit floating > point. > They're just the first step and provide only the minimal set of functions > required by libgcc at this moment. These architecture-specific libgcc files are nothing to do with glibc. They belong in libgcc/config/rs6000 in GCC. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCH 0/2] Add minimal code for IEEE 128-bit floating point 2015-10-26 18:12 ` Joseph Myers @ 2015-10-26 18:45 ` Michael Meissner 2015-10-26 19:51 ` Steven Munroe 1 sibling, 0 replies; 46+ messages in thread From: Michael Meissner @ 2015-10-26 18:45 UTC (permalink / raw) To: Joseph Myers Cc: Tulio Magno Quites Machado Filho, libc-alpha, munroesj, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos On Mon, Oct 26, 2015 at 06:12:18PM +0000, Joseph Myers wrote: > On Mon, 26 Oct 2015, Tulio Magno Quites Machado Filho wrote: > > > The following 2 patches continue this discussion around IEEE 128-bit floating > > point. > > They're just the first step and provide only the minimal set of functions > > required by libgcc at this moment. > > These architecture-specific libgcc files are nothing to do with glibc. > They belong in libgcc/config/rs6000 in GCC. Fair enough. I just wasn't certain which part was glibc and which is libgcc. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797 ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCH 0/2] Add minimal code for IEEE 128-bit floating point 2015-10-26 18:12 ` Joseph Myers 2015-10-26 18:45 ` Michael Meissner @ 2015-10-26 19:51 ` Steven Munroe 2015-10-26 22:31 ` Joseph Myers 1 sibling, 1 reply; 46+ messages in thread From: Steven Munroe @ 2015-10-26 19:51 UTC (permalink / raw) To: Joseph Myers Cc: Tulio Magno Quites Machado Filho, libc-alpha, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos On Mon, 2015-10-26 at 18:12 +0000, Joseph Myers wrote: > On Mon, 26 Oct 2015, Tulio Magno Quites Machado Filho wrote: > > > The following 2 patches continue this discussion around IEEE 128-bit floating > > point. > > They're just the first step and provide only the minimal set of functions > > required by libgcc at this moment. > > These architecture-specific libgcc files are nothing to do with glibc. > They belong in libgcc/config/rs6000 in GCC. > So even if GLIBC is the master for soft-fp, it is expected that the TF to KF rename is only in libgcc? And is it OK that any libm <*>f128 functions that need to call KF versions of soft-fp will link to those functions in libgcc? What about any convert functions between TF and KF types (for example __extendkftf2)? Do those only exist in the platform-specific libgcc source or do they need to be included in the soft-fp master source? ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCH 0/2] Add minimal code for IEEE 128-bit floating point 2015-10-26 19:51 ` Steven Munroe @ 2015-10-26 22:31 ` Joseph Myers 0 siblings, 0 replies; 46+ messages in thread From: Joseph Myers @ 2015-10-26 22:31 UTC (permalink / raw) To: Steven Munroe Cc: Tulio Magno Quites Machado Filho, libc-alpha, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos On Mon, 26 Oct 2015, Steven Munroe wrote: > So even if GLIBC is the master for soft-fp, It is expected that the TF > to KF rename is only in libgcc? Yes, I think so. I don't see a benefit to having it in glibc. Some files only used in libgcc benefit from being present in glibc's copy of soft-fp because that way they naturally get included in global changes to internal soft-fp interfaces (conversions involving TImode, for example). But that doesn't seem to apply to this sort of Makefile code. > And it is Ok for any libm <*>f128 > functions that need to call KF versions of soft-fp will link to those > functions in libgcc? Yes. That's just like the existing long double functions calling TFmode functions from libgcc; it's nothing unusual at all. (There are some cases where glibc has its own copies of libgcc functions for one reason or another, but I don't see any of those reasons applying here. glibc should never need to care what the names of the KFmode libgcc functions are, just relying on GCC to generate calls to them. It may need to know that the mode is called KFmode in order to declare complex functions for GCC versions supporting the __float128 built-in typedef but not the _Float128 keyword, but it should be possible to localize that knowledge to a small bits/ header.) > What about any convert functions between TF and KF types (for example > __extendkftf2) Do those only exist in the platform specific libgcc > source or do they need to be included in the soft-fp master source? Only in platform-specific libgcc source. 
They're logically more like the existing IBM long double support in libgcc/config/rs6000/ibm-ldouble.c than the generic conversions between different IEEE formats in the main soft-fp code. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 2/2] soft-fp: Add new KF routines 2015-09-30 21:15 ` Joseph Myers 2015-10-13 18:06 ` Steven Munroe 2015-10-26 18:05 ` [PATCH 0/2] Add minimal code for IEEE 128-bit floating point Tulio Magno Quites Machado Filho @ 2015-10-26 18:06 ` Tulio Magno Quites Machado Filho 2015-10-26 18:34 ` Joseph Myers 2015-10-26 18:06 ` [PATCH 1/2] soft-fp: Automatically create KF files from TF ones Tulio Magno Quites Machado Filho 3 siblings, 1 reply; 46+ messages in thread From: Tulio Magno Quites Machado Filho @ 2015-10-26 18:06 UTC (permalink / raw) To: libc-alpha Cc: joseph, munroesj, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos From: Michael Meissner <meissner@linux.vnet.ibm.com> Add cmpukf2, extendkftf2 and trunctfk2 to soft-fp. This is the minimal set of routines required to be imported in GCC for IEEE 128-bit floating point support. 2015-10-26 Michael Meissner <meissner@linux.vnet.ibm.com> Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> * soft-fp/.gitignore: Negate cmpukf2.c, extendkftf2.c and trunctfk2.c. * soft-fp/Makefile (gcc-kf-routines): New variable with all KF routines. 
* soft-fp/cmpukf2.c: New file * soft-fp/extendkftf2.c: Likewise * soft-fp/trunctfk2.c: Likewise --- soft-fp/.gitignore | 3 ++ soft-fp/Makefile | 2 ++ soft-fp/cmpukf2.c | 87 +++++++++++++++++++++++++++++++++++++++++++++++++++ soft-fp/extendkftf2.c | 59 ++++++++++++++++++++++++++++++++++ soft-fp/trunctfk2.c | 65 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 216 insertions(+) create mode 100644 soft-fp/cmpukf2.c create mode 100644 soft-fp/extendkftf2.c create mode 100644 soft-fp/trunctfk2.c diff --git a/soft-fp/.gitignore b/soft-fp/.gitignore index 833e136..f68a4fa 100644 --- a/soft-fp/.gitignore +++ b/soft-fp/.gitignore @@ -1 +1,4 @@ *kf*.c +!cmpukf2.c +!extendkftf2.c +!trunctfk2.c \ No newline at end of file diff --git a/soft-fp/Makefile b/soft-fp/Makefile index e392b9d..b398615 100644 --- a/soft-fp/Makefile +++ b/soft-fp/Makefile @@ -41,6 +41,8 @@ gcc-quad-routines := negtf2 addtf3 subtf3 multf3 divtf3 eqtf2 \ gcc-kf-routines-auto := $(subst tf,kf,\ $(filter-out sqrttf2,$(gcc-quad-routines))) +gcc-kf-routines := cmpukf2 extendtfkf2 trunctfk2 $(gcc-kf-routines-auto) + generate-routines: $(addsuffix .c,$(gcc-kf-routines-auto)) clean: diff --git a/soft-fp/cmpukf2.c b/soft-fp/cmpukf2.c new file mode 100644 index 0000000..d27c544 --- /dev/null +++ b/soft-fp/cmpukf2.c @@ -0,0 +1,87 @@ +/* Software IEEE 128-bit floating-point emulation for PowerPC. + Copyright (C) 2015 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Michael Meissner (meissner@linux.vnet.ibm.com) + Code is based on the main soft-fp library written by: + Richard Henderson (rth@cygnus.com) and + Jakub Jelinek (jj@ultra.linux.cz). + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+ + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.) + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +/* Force the use of the VSX instruction set. */ +#if defined(_ARCH_PPC) && (!defined(__VSX__) || !defined(__FLOAT128__)) +#pragma GCC target ("vsx,float128") +#endif + +#include "soft-fp.h" +#include "double.h" +#include "single.h" +#include "quad-float128.h" + +/* PowerPC condition register bits. */ +#define PPC_UNORDERED 0x1 /* isnan (a) || isnan (b). */ +#define PPC_EQUAL 0x2 /* a == b. */ +#define PPC_GREATER_THEN 0x4 /* a > b. */ +#define PPC_LESS_THEN 0x8 /* a < b. */ + +/* Map FP_CMP_Q output to PowerPC condition register bits. */ +#define CMP_UNORDERED (-2) /* isnan (a) || isnan (b). */ +#define CMP_LESS_THEN (-1) /* a < b. */ +#define CMP_EQUAL 0 /* a == b. */ +#define CMP_GREATER_THEN 1 /* a > b. */ +#define CMP_INVALID 2 /* raise invalid exception. */ + +#define CMP_LOW CMP_UNORDERED /* comparison low value. */ +#define CMP_HIGH CMP_INVALID /* comparison high value. */ + +static const unsigned char ppc_cr_map[] = { + PPC_UNORDERED, /* -2: unordered. */ + PPC_LESS_THEN, /* -1: a < b. */ + PPC_EQUAL, /* 0: a == b. 
*/ + PPC_GREATER_THEN, /* 1: a > b. */ + PPC_UNORDERED, /* 2: invalid. */ +}; + +/* Compare two IEEE 128-bit floating point values and return the status. We + return the status as a 4-bit value that can be copied into an appropriate + PowerPC condition code register. */ + +CMPtype +__cmpukf2 (TFtype a, TFtype b) +{ + FP_DECL_EX; + FP_DECL_Q (A); + FP_DECL_Q (B); + CMPtype r; + + FP_INIT_EXCEPTIONS; + FP_UNPACK_RAW_Q (A, a); + FP_UNPACK_RAW_Q (B, b); + FP_CMP_Q (r, A, B, 2, 2); + if (r == CMP_INVALID) + FP_SET_EXCEPTION (FP_EX_INVALID); + FP_HANDLE_EXCEPTIONS; + + return (r < CMP_LOW || r > CMP_HIGH) ? PPC_UNORDERED : ppc_cr_map[r-CMP_LOW]; +} diff --git a/soft-fp/extendkftf2.c b/soft-fp/extendkftf2.c new file mode 100644 index 0000000..4b78b86 --- /dev/null +++ b/soft-fp/extendkftf2.c @@ -0,0 +1,59 @@ +/* Software IEEE 128-bit floating-point emulation for PowerPC. + Copyright (C) 2015 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Michael Meissner (meissner@linux.vnet.ibm.com) + Code is based on the main soft-fp library written by: + Richard Henderson (rth@cygnus.com) and + Jakub Jelinek (jj@ultra.linux.cz). + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.)
+ + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +/* Convert IEEE 128-bit floating point to IBM long double. */ + +/* Force the use of the VSX instruction set. */ +#if defined(_ARCH_PPC) && (!defined(__VSX__) || !defined(__FLOAT128__)) +#pragma GCC target ("vsx,float128") +#endif + +extern __ibm128 __extendkftf2 (__float128); + +__ibm128 +__extendkftf2 (__float128 value) +{ + double high, low; + + high = (double) value; + if (__builtin_isnan (high) || __builtin_isinf (high)) + low = high; + + else + { + low = (double) (value - (__float128)high); + + /* Use copysign to propagate the sign bit so that -0.0Q becomes -0.0L. */ + low = __builtin_copysign (low, high); + } + + return __builtin_pack_longdouble (high, low); +} diff --git a/soft-fp/trunctfk2.c b/soft-fp/trunctfk2.c new file mode 100644 index 0000000..0d38c5a --- /dev/null +++ b/soft-fp/trunctfk2.c @@ -0,0 +1,65 @@ +/* Software IEEE 128-bit floating-point emulation for PowerPC. + Copyright (C) 2015 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Michael Meissner (meissner@linux.vnet.ibm.com) + Code is based on the main soft-fp library written by: + Richard Henderson (rth@cygnus.com) and + Jakub Jelinek (jj@ultra.linux.cz). + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+ + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.) + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +/* Convert IBM long double to IEEE 128-bit floating point. */ + +/* Force the use of the VSX instruction set. */ +#if defined(_ARCH_PPC) && (!defined(__VSX__) || !defined(__FLOAT128__)) +#pragma GCC target ("vsx,float128") +#endif + +extern __float128 __trunctfkf2 (__ibm128); + +#ifdef __LITTLE_ENDIAN__ +#define HIGH_WORD 1 +#define LOW_WORD 0 +#else +#define HIGH_WORD 0 +#define LOW_WORD 1 +#endif + +__float128 +__trunctfkf2 (__ibm128 value) +{ + double high = __builtin_unpack_longdouble (value, HIGH_WORD); + double low = __builtin_unpack_longdouble (value, LOW_WORD); + + /* Handle the special cases of NaN and infinity. */ + if (__builtin_isnan (high) || __builtin_isinf (high)) + return (__float128) high; + + /* If low is 0.0, there is no need to do the add. In addition, avoiding the add + produces the correct sign if high is -0.0. */ + if (low == 0.0) + return (__float128) high; + + return ((__float128)high) + ((__float128)low); +} -- 2.1.0 ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCH 2/2] soft-fp: Add new KF routines 2015-10-26 18:06 ` [PATCH 2/2] soft-fp: Add new KF routines Tulio Magno Quites Machado Filho @ 2015-10-26 18:34 ` Joseph Myers 0 siblings, 0 replies; 46+ messages in thread From: Joseph Myers @ 2015-10-26 18:34 UTC (permalink / raw) To: Tulio Magno Quites Machado Filho Cc: libc-alpha, munroesj, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos On Mon, 26 Oct 2015, Tulio Magno Quites Machado Filho wrote: > From: Michael Meissner <meissner@linux.vnet.ibm.com> > > Add cmpukf2, extendkftf2 and trunctfk2 to soft-fp. > This is the minimal set of routines required to be imported in GCC for > IEEE 128-bit floating point support. As noted, these belong directly in libgcc, in an architecture-specific directory. The soft-fp directory in glibc is for generic, architecture-independent implementations for the standard architecture-independent modes. It has some files that in fact aren't used in glibc, but not anything inherently architecture-specific like this. > +CMPtype > +__cmpukf2 (TFtype a, TFtype b) > +{ > + FP_DECL_EX; > + FP_DECL_Q (A); > + FP_DECL_Q (B); > + CMPtype r; > + > + FP_INIT_EXCEPTIONS; > + FP_UNPACK_RAW_Q (A, a); > + FP_UNPACK_RAW_Q (B, b); > + FP_CMP_Q (r, A, B, 2, 2); > + if (r == CMP_INVALID) > + FP_SET_EXCEPTION (FP_EX_INVALID); No, with current FP_CMP macros you never need to set exceptions explicitly. If the last argument is 2 exceptions should be raised for all NaN operands, 1 only for signaling NaNs (so for __cmpukf2 you want 1 as the last argument, for __cmpokf2 you want 2 there). > + FP_HANDLE_EXCEPTIONS; > + > + return (r < CMP_LOW || r > CMP_HIGH) ? PPC_UNORDERED : ppc_cr_map[r-CMP_LOW]; The r value is -1, 0, 1 or 2 (the fourth argument to FP_CMP_Q). A result < CMP_LOW is not possible. > +__ibm128 > +__extendkftf2 (__float128 value) > +{ > + double high, low; > + > + high = (double) value; > + if (__builtin_isnan (high) || __builtin_isinf (high)) > + low = high; No, that's incorrect for infinities. 
See the ibm-ldouble-format file (in libgcc/config/rs6000/): the low part of an infinity must be 0 or -0, not another infinity (the low part of a NaN doesn't matter). > + else > + { > + low = (double) (value - (__float128)high); There are cases where this will produce an invalid IBM long double value, where the low part is 0.5ulp of the high part and the high part has the least significant bit of the mantissa set (note that glibc contains code relying on this aspect of the IBM long double format (see sysdeps/ieee754/ldbl-128ibm/s_rintl.c: "However, if we have a canonical long double with the low double 0.5 or -0.5, then the high double must be even.")). Thus, you need to renormalize after computing the initial approximations to the high and low parts. Something like: high_new = high + low; low = (high - high_new) + low; high = high_new; > + /* Use copysign to propagate the sign bit so that -0.0Q becomes -0.0L. */ > + low = __builtin_copysign (low, high); And that's completely wrong. The correct low part may be nonzero with either sign. You need to special-case zeroes, but you can do that by simply including all of zero, infinity and NaN in the initial case where the low part is set to 0 (a low part of either +0 or -0 is fine for all those cases). Also watch out for spacing in casts throughout this patch; it should be "(type) value" not "(type)value". -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCH 1/2] soft-fp: Automatically create KF files from TF ones 2015-09-30 21:15 ` Joseph Myers ` (2 preceding siblings ...) 2015-10-26 18:06 ` [PATCH 2/2] soft-fp: Add new KF routines Tulio Magno Quites Machado Filho @ 2015-10-26 18:06 ` Tulio Magno Quites Machado Filho 2015-10-26 18:16 ` Joseph Myers 3 siblings, 1 reply; 46+ messages in thread From: Tulio Magno Quites Machado Filho @ 2015-10-26 18:06 UTC (permalink / raw) To: libc-alpha Cc: joseph, munroesj, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos Use a sed script to automatically generate KF files based on their respective TF. 2015-10-26 Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> Michael Meissner <meissner@linux.vnet.ibm.com> * soft-fp/Makefile: Generate KF files from TF. * soft-fp/.gitignore: Ignore auto-generated files. --- soft-fp/.gitignore | 1 + soft-fp/Makefile | 14 ++++++++++++++ 2 files changed, 15 insertions(+) create mode 100644 soft-fp/.gitignore diff --git a/soft-fp/.gitignore b/soft-fp/.gitignore new file mode 100644 index 0000000..833e136 --- /dev/null +++ b/soft-fp/.gitignore @@ -0,0 +1 @@ +*kf*.c diff --git a/soft-fp/Makefile b/soft-fp/Makefile index 28f9f0c..e392b9d 100644 --- a/soft-fp/Makefile +++ b/soft-fp/Makefile @@ -37,4 +37,18 @@ gcc-quad-routines := negtf2 addtf3 subtf3 multf3 divtf3 eqtf2 \ fixunstfdi floatditf extendsftf2 trunctfsf2 extenddftf2 \ trunctfdf2 sqrttf2 floatunsitf floatunditf +# Auto-generate KF routines list, removing unused files. +gcc-kf-routines-auto := $(subst tf,kf,\ + $(filter-out sqrttf2,$(gcc-quad-routines))) + +generate-routines: $(addsuffix .c,$(gcc-kf-routines-auto)) + +clean: + -rm -f $(addsuffix .c,$(gcc-kf-routines-auto)) + include ../Rules + +.SECONDEXPANSION: +$(addsuffix .c,$(gcc-kf-routines-auto)): $$(subst kf,tf,$$@) + @sed -e 's/\(__[a-z]\+\)tf\([a-z0-9]*\)/\1kf\2/g' \ + -e 's/quad[.]h/quad-float128.h/g' $< > $@ -- 2.1.0 ^ permalink raw reply [flat|nested] 46+ messages in thread
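For illustration, the sed expressions from the Makefile rule above rewrite a TF source into its KF counterpart like this; the two input lines are made up, but the expressions are the ones from the patch.

```shell
# Feed two representative lines through the patch's sed expressions:
# routine names gain a "kf" infix, and quad.h becomes quad-float128.h.
printf '%s\n' 'CMPtype __eqtf2 (TFtype a, TFtype b)' '#include "quad.h"' |
  sed -e 's/\(__[a-z]\+\)tf\([a-z0-9]*\)/\1kf\2/g' \
      -e 's/quad[.]h/quad-float128.h/g'
# -> CMPtype __eqkf2 (TFtype a, TFtype b)
# -> #include "quad-float128.h"
```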
* Re: [PATCH 1/2] soft-fp: Automatically create KF files from TF ones 2015-10-26 18:06 ` [PATCH 1/2] soft-fp: Automatically create KF files from TF ones Tulio Magno Quites Machado Filho @ 2015-10-26 18:16 ` Joseph Myers 2015-10-26 18:44 ` Michael Meissner 2015-10-26 20:01 ` [PATCHv2] " Tulio Magno Quites Machado Filho 0 siblings, 2 replies; 46+ messages in thread From: Joseph Myers @ 2015-10-26 18:16 UTC (permalink / raw) To: Tulio Magno Quites Machado Filho Cc: libc-alpha, munroesj, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos On Mon, 26 Oct 2015, Tulio Magno Quites Machado Filho wrote: > Use a sed script to automatically generate KF files based on their > respective TF. > > 2015-10-26 Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> > Michael Meissner <meissner@linux.vnet.ibm.com> > > * soft-fp/Makefile: Generate KF files from TF. > * soft-fp/.gitignore: Ignore auto-generated files. Even in GCC, creating generated files like that in the source directory would be wrong. If you can create them with a sed script, you can do so at build time and put the files in the build directory, not the source directory (which should be able to be on a read-only filesystem). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCH 1/2] soft-fp: Automatically create KF files from TF ones 2015-10-26 18:16 ` Joseph Myers @ 2015-10-26 18:44 ` Michael Meissner 2015-10-26 20:01 ` [PATCHv2] " Tulio Magno Quites Machado Filho 1 sibling, 0 replies; 46+ messages in thread From: Michael Meissner @ 2015-10-26 18:44 UTC (permalink / raw) To: Joseph Myers Cc: Tulio Magno Quites Machado Filho, libc-alpha, munroesj, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos On Mon, Oct 26, 2015 at 06:16:38PM +0000, Joseph Myers wrote: > On Mon, 26 Oct 2015, Tulio Magno Quites Machado Filho wrote: > > > Use a sed script to automatically generate KF files based on their > > respective TF. > > > > 2015-10-26 Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> > > Michael Meissner <meissner@linux.vnet.ibm.com> > > > > * soft-fp/Makefile: Generate KF files from TF. > > * soft-fp/.gitignore: Ignore auto-generated files. > > Even in GCC, creating generated files like that in the source directory > would be wrong. If you can create them with a sed script, you can do so > at build time and put the files in the build directory, not the source > directory (which should be able to be on a read-only filesystem). In my original patch for libgcc, it places the created .c files in the build directory. I tend to think it is better to put it in the build directory. -- Michael Meissner, IBM IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA email: meissner@linux.vnet.ibm.com, phone: +1 (978) 899-4797 ^ permalink raw reply [flat|nested] 46+ messages in thread
* [PATCHv2] soft-fp: Automatically create KF files from TF ones 2015-10-26 18:16 ` Joseph Myers 2015-10-26 18:44 ` Michael Meissner @ 2015-10-26 20:01 ` Tulio Magno Quites Machado Filho 2015-10-26 22:32 ` Joseph Myers 1 sibling, 1 reply; 46+ messages in thread From: Tulio Magno Quites Machado Filho @ 2015-10-26 20:01 UTC (permalink / raw) To: joseph Cc: libc-alpha, munroesj, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos Joseph Myers <joseph@codesourcery.com> writes: > Even in GCC, creating generated files like that in the source directory > would be wrong. If you can create them with a sed script, you can do so > at build time and put the files in the build directory, not the source > directory (which should be able to be on a read-only filesystem). Ack. What about this new version? ----8<---- Use a sed script to automatically generate KF files based on their respective TF. 2015-10-26 Tulio Magno Quites Machado Filho <tuliom@linux.vnet.ibm.com> Michael Meissner <meissner@linux.vnet.ibm.com> * soft-fp/Makefile: Generate KF files from TF. --- soft-fp/Makefile | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/soft-fp/Makefile b/soft-fp/Makefile index 28f9f0c..f5ea630 100644 --- a/soft-fp/Makefile +++ b/soft-fp/Makefile @@ -37,4 +37,22 @@ gcc-quad-routines := negtf2 addtf3 subtf3 multf3 divtf3 eqtf2 \ fixunstfdi floatditf extendsftf2 trunctfsf2 extenddftf2 \ trunctfdf2 sqrttf2 floatunsitf floatunditf +# Auto-generate KF routines list, removing unused files. 
+gcc-kf-routines-auto := $(subst tf,kf,\ + $(filter-out sqrttf2,$(gcc-quad-routines))) +gcc-kf-routines-auto-files := $(addprefix $(objpfx),\ + $(addsuffix .c,\ + $(gcc-kf-routines-auto))) + +generate-routines: $(gcc-kf-routines-auto-files) + +clean: + -rm -f $(gcc-kf-routines-auto-files) + include ../Rules + +.SECONDEXPANSION: +$(gcc-kf-routines-auto-files): $$(subst kf,tf,$$(@F)) + @mkdir -p $(objpfx) + @sed -e 's/\(__[a-z]\+\)tf\([a-z0-9]*\)/\1kf\2/g' \ + -e 's/quad[.]h/quad-float128.h/g' $< > $@ -- 2.1.0 ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv2] soft-fp: Automatically create KF files from TF ones 2015-10-26 20:01 ` [PATCHv2] " Tulio Magno Quites Machado Filho @ 2015-10-26 22:32 ` Joseph Myers 2015-10-27 11:20 ` Tulio Magno Quites Machado Filho 0 siblings, 1 reply; 46+ messages in thread From: Joseph Myers @ 2015-10-26 22:32 UTC (permalink / raw) To: Tulio Magno Quites Machado Filho Cc: libc-alpha, munroesj, meissner, Ulrich.Weigand, dje.gcc, jakub, carlos On Mon, 26 Oct 2015, Tulio Magno Quites Machado Filho wrote: > Joseph Myers <joseph@codesourcery.com> writes: > > > Even in GCC, creating generated files like that in the source directory > > would be wrong. If you can create them with a sed script, you can do so > > at build time and put the files in the build directory, not the source > > directory (which should be able to be on a read-only filesystem). > > Ack. What about this new version? I still say this belongs in the libgcc makefiles, not in glibc at all. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: [PATCHv2] soft-fp: Automatically create KF files from TF ones 2015-10-26 22:32 ` Joseph Myers @ 2015-10-27 11:20 ` Tulio Magno Quites Machado Filho 0 siblings, 0 replies; 46+ messages in thread From: Tulio Magno Quites Machado Filho @ 2015-10-27 11:20 UTC (permalink / raw) To: Joseph Myers; +Cc: libc-alpha, meissner Joseph Myers <joseph@codesourcery.com> writes: > On Mon, 26 Oct 2015, Tulio Magno Quites Machado Filho wrote: > >> Joseph Myers <joseph@codesourcery.com> writes: >> >> > Even in GCC, creating generated files like that in the source directory >> > would be wrong. If you can create them with a sed script, you can do so >> > at build time and put the files in the build directory, not the source >> > directory (which should be able to be on a read-only filesystem). >> >> Ack. What about this new version? > > I still say this belongs in the libgcc makefiles, not in glibc at all. OK. I thought you were referring only to the other patch. Thanks! -- Tulio Magno ^ permalink raw reply [flat|nested] 46+ messages in thread
* IEEE128 binary float to decimal float conversion routines 2015-09-30 19:18 RFC Migrating PowerPC to IEEE 128-bit Floating Point Steven Munroe 2015-09-30 21:15 ` Joseph Myers @ 2015-11-16 17:48 ` Paul E. Murphy 2015-11-16 18:24 ` Joseph Myers 1 sibling, 1 reply; 46+ messages in thread From: Paul E. Murphy @ 2015-11-16 17:48 UTC (permalink / raw) To: joseph, libc-alpha Cc: Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner Hi Joseph, I think there may have been a question as to where the conversion routines between IEEE 128 binary float and decimal float should live. Observing existing precedent and machinery, I think the appropriate place to house them is within libdfp. The conversion routines between the existing types reside in here already. Examining the source, we (IBM) would be adding the following: __{dpd,bid}_extend {sd,dd,td} <-> kf __{dpd,bid}_trunc {sd,dd,td} <-> kf Similarly, it looks like __int128 support never got added: __{dpd,bid}_fix {sd,dd,td} -> {ti,unsigned ti} __{dpd,bid}_float {ti, unsigned ti} -> {sd,dd,td} Thanks, Paul ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-11-16 17:48 ` IEEE128 binary float to decimal float conversion routines Paul E. Murphy @ 2015-11-16 18:24 ` Joseph Myers 2015-11-16 18:40 ` Joseph Myers ` (2 more replies) 0 siblings, 3 replies; 46+ messages in thread From: Joseph Myers @ 2015-11-16 18:24 UTC (permalink / raw) To: Paul E. Murphy Cc: libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner On Mon, 16 Nov 2015, Paul E. Murphy wrote: > Hi Joseph, > > I think there may have been a question as to where the > conversion routines between IEEE 128 binary float and > decimal float should live. > > Observing existing precedent and machinery, I think > the appropriate place to house them is within libdfp. I think existing practice is that there's a copy of such functions in libgcc and another copy in libdfp - and maybe the libdfp version supports exceptions and rounding modes (software or hardware) but the libgcc version doesn't? The BID conversions between binary and decimal float involve several MB of tables (whereas libgcc DPD conversions go via strings). Several MB of tables are not of course needed; if doing correctly-rounded conversions (required for IEEE 754 conformance) there's a speed/space trade-off in how much precomputed data you use versus how much computation you do at runtime, and it's up to you what you think the right trade-off for powerpc is. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-11-16 18:24 ` Joseph Myers @ 2015-11-16 18:40 ` Joseph Myers 2015-11-16 22:07 ` Christoph Lauter 2015-11-16 23:45 ` Paul E. Murphy 2 siblings, 0 replies; 46+ messages in thread From: Joseph Myers @ 2015-11-16 18:40 UTC (permalink / raw) To: Paul E. Murphy Cc: libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner On Mon, 16 Nov 2015, Joseph Myers wrote: > The BID conversions between binary and decimal float involve several MB of > tables (whereas libgcc DPD conversions go via strings). Several MB of I should comment a bit more on conversions via strings. (a) Current glibc has correctly rounded conversions between binary floating-point and strings, in both directions (older glibc did not). (b) Thus, converting via strings is fine as an approach for a correctly rounded conversion when converting from decimal to binary. (c) It's not however correct for non-default rounding modes when converting from binary to decimal, because glibc's conversions from binary floating-point to decimal strings use the binary rounding mode, in accordance with IEEE 754, but for a conversion from binary floating-point to decimal floating-point the correct rounding mode to use is the decimal rounding mode, again as per IEEE 754. (If the libgcc code doesn't try to handle non-default rounding modes, that may not matter for libgcc, however.) (d) Conversions via strings also aren't a good approach at present for the present case where the binary type is __float128, because the conversions between __float128 and strings aren't in glibc (that would be part of the large and complicated project of supporting TS 18661-3 functions for __float128, which in this case also involves a dependency on part of TS 18661-1 to support strfrom*). This is why such conversions aren't just an easy matter of "build libgcc functions for KFmode like they're built for other binary floating-point modes". -- Joseph S. 
Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-11-16 18:24 ` Joseph Myers 2015-11-16 18:40 ` Joseph Myers @ 2015-11-16 22:07 ` Christoph Lauter 2015-11-16 22:42 ` Joseph Myers 2015-11-16 23:45 ` Paul E. Murphy 2 siblings, 1 reply; 46+ messages in thread From: Christoph Lauter @ 2015-11-16 22:07 UTC (permalink / raw) To: Joseph Myers, Paul E. Murphy Cc: libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner Hi everyone, Joseph Myers wrote on 11/16/2015 07:24 PM: > The BID conversions between binary and decimal float involve several MB of > tables (whereas libgcc DPD conversions go via strings). Several MB of > tables are not of course needed; if doing correctly-rounded conversions > (required for IEEE 754 conformance) there's a speed/space trade-off in how > much precomputed data you use versus how much computation you do at > runtime, and it's up to you what you think the right trade-off for powerpc > is. > For what it's worth: some colleagues of mine and I have recently published a paper on exact binary-to-decimal conversions between all IEEE754 formats, assuming BID as the encoding for decimal. None of the algorithms we proposed uses a table larger than a couple of dozen kilobytes. Converting between binary and decimal FP with correct rounding is essentially the same problem, so people might wish to have a look at these results: > http://www.computer.org/csdl/trans/tc/preprint/07271015-abs.html Best regards, Christoph Lauter -- Christoph Lauter Maître de conférences - Associate Professor Équipe PEQUAN - LIP6 - UPMC Paris 6 4, place Jussieu, 75252 Paris Cedex 05, 26-00/301 Tel.: +33144278029 / +33182521777 http://www.christoph-lauter.org/ ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-11-16 22:07 ` Christoph Lauter @ 2015-11-16 22:42 ` Joseph Myers 2015-12-18 21:12 ` Steven Munroe 0 siblings, 1 reply; 46+ messages in thread From: Joseph Myers @ 2015-11-16 22:42 UTC (permalink / raw) To: Christoph Lauter Cc: Paul E. Murphy, libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner On Mon, 16 Nov 2015, Christoph Lauter wrote: > For what it's worth: some colleagues of mine and myself have recently > published a paper on exact binary-to-decimal conversions between all IEEE754 > formats, assuming BID as the encoding for decimal. None of the algorithms we > proposed uses a table larger that a couple of dozens of kilobytes. Converting > between binary and decimal FP with correct rounding is essentially the same > problem, so people might wish to have a look at these results: > > > http://www.computer.org/csdl/trans/tc/preprint/07271015-abs.html Thanks. I'm not at all expert on decimal floating-point or on its state in the GNU toolchain (I just noted the absence of these conversions in the course of reviewing patches for __float128 libgcc support for powerpc). My general impression is that the IEEE 754 conformance state is probably similar to or worse than that for binary floating-point - that is, various miscellaneous local issues along with the same general issues of optimizations not respecting the state involved in exceptions and rounding modes (but because decimal floating-point is less widely used, such issues are less likely to have been found, especially if some code is correct for binary floating-point and no-one thought about decimal when writing an optimization, and especially when involving issues such as preferred quantum that don't exist for binary). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-11-16 22:42 ` Joseph Myers @ 2015-12-18 21:12 ` Steven Munroe 2015-12-18 22:13 ` Joseph Myers 0 siblings, 1 reply; 46+ messages in thread From: Steven Munroe @ 2015-12-18 21:12 UTC (permalink / raw) To: Joseph Myers, Christoph Lauter Cc: Paul E. Murphy, libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner On Mon, 2015-11-16 at 22:42 +0000, Joseph Myers wrote: > On Mon, 16 Nov 2015, Christoph Lauter wrote: > > > For what it's worth: some colleagues of mine and myself have recently > > published a paper on exact binary-to-decimal conversions between all IEEE754 > > formats, assuming BID as the encoding for decimal. None of the algorithms we > > proposed uses a table larger that a couple of dozens of kilobytes. Converting > > between binary and decimal FP with correct rounding is essentially the same > > problem, so people might wish to have a look at these results: > > > > > http://www.computer.org/csdl/trans/tc/preprint/07271015-abs.html > It took a while to get this from behind the paywall ... At this time I can NOT say this analysis applies to my case (IEEE754-2008 Densely Packed Decimal floating point). This paper is focused on the Intel-centric BID format that assumes software emulation. POWER has been shipping full IEEE Decimal Floating point Hardware since 2007 and libdfp has been available since that time-frame. Also this paper fails to mention (or cite) the decades of published research by the esteemed Michael F. Cowlishaw. Mike and IBM brought the original Decimal Floating Point proposal to the IEEE committee and contributed DecNumber to open source communities to support Decimal Floating point adoption. http://speleotrove.com/decimal/ This oversight, together with the paper's direct statements assuming the BID format and the later assumption that binary (bit) shifts (vs. decimal digit shifts) can be used, raises doubts about the applicability of this specific analysis.
Christoph, if you are interested in new analysis extending to IEEE-754-2008 DPD implementation and performance, let me know. I can provide access to the latest public documentation and POWER systems for testing if needed. > Thanks. I'm not at all expert on decimal floating-point or on its state > in the GNU toolchain (I just noted the absence of these conversions in the > course of reviewing patches for __float128 libgcc support for powerpc). > My general impression is that the IEEE 754 conformance state is probably > similar to or worse than that for binary floating-point - that is, various > miscellaneous local issues along with the same general issues of > optimizations not respecting the state involved in exceptions and rounding > modes (but because decimal floating-point is less widely used, such issues > are less likely to have been found, especially if some code is correct for > binary floating-point and no-one thought about decimal when writing an > optimization, and especially when involving issues such as preferred > quantum that don't exist for binary). > Joseph, Please do not assume the Decimal Floating Point carries all the sins of the much maligned IBM long double. Both the soft-dfp and PowerISA DFP implement the 5 IEEE (FE_DEC_*) rounding modes. The only caveat that I am aware of is that the HW and soft-dfp (via the underlying DecNumber library) have additional rounding modes that are not available at the fe_dec_setround API. This includes the "Round to Prepare for Shorter Precision" mode, which we will expose to address some of your concerns about double rounding in conversions. And preferred quantum is automatic in the implementation or easily enforced via quantize or quantize immediate instructions. We are happy to get bug reports about the libdfp implementation and will work with you to resolve them. But we have a strong bias toward leveraging our high-performance Decimal hardware and existing libdfp infrastructure.
The _Decimal128 operations are already faster (POWER8) than the equivalent operations on Haswell for BID _Decimal128 (20x for add, 5x for multiply) or in _float128 from libquadmath (12x for add, 6% for multiply). This trend will continue into the next generation of processors, and as we implement the PowerISA 3.0 quad-precision floating point in hardware, we expect even better results. So why would I even consider using software emulation to get one additional ULP of accuracy? I get very few complaints about proper rounding. I do get constant pressure to improve performance. I would consider using precision extension techniques that leverage PowerISA and POWER hardware. For example, leveraging the DFP unit's fast add/multiply/digit-shifter to get 64 or 96 digit precision for internal operations. I have not coded this yet, but it does not look that hard to do on existing hardware. Also, looking forward, the PowerISA 3.0 quad-precision fused multiply-add can be used to deliver 226-bit resolution (a la IBM long double) for internal conversions. So we will work on resolving the issues that you have raised, but within the context of leveraging PowerISA and POWER hardware capabilities. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-12-18 21:12 ` Steven Munroe @ 2015-12-18 22:13 ` Joseph Myers 2015-12-19 5:03 ` Steven Munroe 0 siblings, 1 reply; 46+ messages in thread From: Joseph Myers @ 2015-12-18 22:13 UTC (permalink / raw) To: Steven Munroe Cc: Christoph Lauter, Paul E. Murphy, libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner On Fri, 18 Dec 2015, Steven Munroe wrote: > At this time I can NOT say this analysis applies to my case > (IEEE754-2008 Densely Packed Decimal floating point) Since BID and DPD have exactly the same set of values, there is no significant difference for the purpose of determining the precision needed for conversions. There's little difference in whether you extract the DFP mantissa by frexp / multiplication / converting to an integer type (as in libdfp), or directly from the binary representation (as can be done for BID). Much the same applies for storing an integer mantissa when working in the other direction. > > Thanks. I'm not at all expert on decimal floating-point or on its state > > in the GNU toolchain (I just noted the absence of these conversions in the > > course of reviewing patches for __float128 libgcc support for powerpc). > > My general impression is that the IEEE 754 conformance state is probably > > similar to or worse than that for binary floating-point - that is, various > > miscellaneous local issues along with the same general issues of > > optimizations not respecting the state involved in exceptions and rounding > > modes (but because decimal floating-point is less widely used, such issues > > are less likely to have been found, especially if some code is correct for > > binary floating-point and no-one thought about decimal when writing an > > optimization, and especially when involving issues such as preferred > > quantum that don't exist for binary). 
> > > > Joseph, > > Please do not assume the Decimal Floating carries all the sins of the > much maligned IBM long double. I'm not assuming that - I'm comparing with support for *IEEE* binary types, which also has plenty of deficiencies in GCC. I'm working from, for example, the various open DFP bugs in GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32330 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39878 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43374 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53319 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53332 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58429 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65833 I think that illustrates how there are a range of miscellaneous local issues (similar to the state for binary floating-point) as well as the same general issues with optimizations as for binary floating-point. Or cf. Fred Tydeman's comments in the October CFP minutes <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1981.pdf>. Maybe some of those bugs describe issues that are invalid or already fixed - but that doesn't really affect my point. > So why would I even consider using software emulation to get one > additional ULP of accuracy? Well, I'd think the starting point is to implement operations following the correct semantics, which are quite clear in both TR 24732:2009 and TS 18661-2: as per IEEE 754-2008, conversions are correctly rounded with correct exceptions. Then, depending on how common such conversions between binary and decimal types actually are (I'd guess not very common), and which cases are more or less common, optimize them for the common cases, taking full advantage of hardware facilities in the process (and with the potential for e.g. separate -fno-trapping-math versions if correct exceptions involve significant performance cost). 
That leaves software emulation, most likely, only for rare worst cases - most cases should be able to use fma plus Dekker-style precision extension to avoid software emulation, provided you take special care about exact and exactly-half-way cases. If it were a function not fully bound to an IEEE 754 operation, then, yes, you probably wouldn't use software emulation for 1ulp extra precision. But that's for things such as the slow paths in dbl-64 functions that I recently proposed removing. It's not for IEEE operations such as conversions, sqrt or fma. (Of course the fma that Jakub implemented following the method of Boldo and Melquiond, to replace a particularly stupid fallback on processors without fma instructions, and that I fixed up for various exceptions and rounding modes issues, is slower than a fallback that's not correctly rounding. But fma is an IEEE operation, which means doing the slow thing when no faster way of achieving correct results is available.) glibc no longer works, as it used to, on the basis of implementing some vague approximation to the desired semantics with whatever deviations from the standard someone felt like having, although there is still plenty of old code like that (but I've been gradually working through libm functions over the past few years to ensure they follow consistent accuracy goals and that there is adequate test coverage for this). For anything potentially controversial, especially new interfaces, we work to obtain consensus in the community, including a common understanding of the relevant standard semantics and how best to implement them in glibc (and, if it seems the standard may be defective, a common understanding in that regard - working with the relevant standards committees to resolve any such defects). This means much more regard than a few years ago for standard semantics first, optimizations only where consistent with the standard. Of course the libdfp maintainers can do what they want in libdfp.
But since this discussion was started on libc-alpha, I'm considering things in terms of standard glibc accuracy goals (which for operations fully bound to IEEE 754, on IEEE types, means exact correctly-rounded results and exceptions). And for any new libm functions (e.g. for float128), getting consensus on the design and implementation approach at an early stage, working with the community, and following glibc standards at least to the extent that the existing ldbl-128 functions follow them, would be particularly important. (It would not be OK, for example, to have architecture-specific optimized versions that fail to follow the standard semantics when the architecture-independent versions do follow those semantics, though architecture maintainers can e.g. decide which cases are important to optimize for on their architecture, while still keeping the slow cases correct. Note that we removed various x86 function implementations using x87 trig instructions a few years ago because of inaccuracy.) -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-12-18 22:13 ` Joseph Myers @ 2015-12-19 5:03 ` Steven Munroe 2015-12-19 13:15 ` Joseph Myers 2015-12-19 16:40 ` Joseph Myers 0 siblings, 2 replies; 46+ messages in thread From: Steven Munroe @ 2015-12-19 5:03 UTC (permalink / raw) To: Joseph Myers Cc: Christoph Lauter, Paul E. Murphy, libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner On Fri, 2015-12-18 at 22:12 +0000, Joseph Myers wrote: > On Fri, 18 Dec 2015, Steven Munroe wrote: > > > At this time I can NOT say this analysis applies to my case > > (IEEE754-2008 Densely Packed Decimal floating point) > > Since BID and DPD have exactly the same set of values, there is no > significant difference for the purpose of determining the precision needed > for conversions. There's little difference in whether you extract the DFP > mantissa by frexp / multiplication / converting to an integer type (as in > libdfp), or directly from the binary representation (as can be done for > BID). Much the same applies for storing an integer mantissa when working > in the other direction. > > > > Thanks. I'm not at all expert on decimal floating-point or on its state > > > in the GNU toolchain (I just noted the absence of these conversions in the > > > course of reviewing patches for __float128 libgcc support for powerpc). 
> > > My general impression is that the IEEE 754 conformance state is probably > > > similar to or worse than that for binary floating-point - that is, various > > > miscellaneous local issues along with the same general issues of > > > optimizations not respecting the state involved in exceptions and rounding > > > modes (but because decimal floating-point is less widely used, such issues > > > are less likely to have been found, especially if some code is correct for > > > binary floating-point and no-one thought about decimal when writing an > > > optimization, and especially when involving issues such as preferred > > > quantum that don't exist for binary). > > > > > > > Joseph, > > > > Please do not assume the Decimal Floating carries all the sins of the > > much maligned IBM long double. > > I'm not assuming that - I'm comparing with support for *IEEE* binary > types, which also has plenty of deficiencies in GCC. I'm working from, > for example, the various open DFP bugs in GCC: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=32330 Intel BID > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=39878 GCC4.3 retest > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43374 Intel BID > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53319 Intel BID > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53332 Intel BID > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=58429 Bad test, -mpowerpc64 is not a valid target > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65833 TImode noted being fixed. I am not sure what you want from us here. Most of these are not my platform or bad test cases. We are fixing the TImode issue; it was an oversight. > > I think that illustrates how there are a range of miscellaneous local > issues (similar to the state for binary floating-point) as well as the > same general issues with optimizations as for binary floating-point. Or > cf. Fred Tydeman's comments in the October CFP minutes > <http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1981.pdf>.
> Maybe some of those bugs describe issues that are invalid or already fixed > - but that doesn't really affect my point. > The bugzillas listed above are NOT an indicator of systematic problems in libdfp or decnumber. There do seem to be a few problems in the BID implementation. > > So why would I even consider using software emulation to get one > > additional ULP of accuracy? > > Well, I'd think the starting point is to implement operations following > the correct semantics, which are quite clear in both TR 24732:2009 and TS > 18661-2: as per IEEE 754-2008, conversions are correctly rounded with > correct exceptions. Then you should be specific! Specific examples with test cases. > Then, depending on how common such conversions > between binary and decimal types actually are (I'd guess not very common), > and which cases are more or less common, optimize them for the common > cases, taking full advantage of hardware facilities in the process (and > with the potential for e.g. separate -fno-trapping-math versions if > correct exceptions involve significant performance cost). That leaves > software emulation, most likely, only for rare worst cases - most cases > should be able to use fma plus Dekker-style precision extension to avoid > software emulation, provided you take special care about exact and > exactly-half-way cases. > > If it were a function not fully bound to an IEEE 754 operation, then, yes, > you probably wouldn't use software emulation for 1ulp extra precision. > But that's for things such as the slow paths in dbl-64 functions that I > recently proposed removing. It's not for IEEE operations such as > conversions, sqrt or fma. (Of course the fma that Jakub implemented > following the method of Boldo and Melquiond, to replace a particularly > stupid fallback on processors without fma instructions, and that I fixed > up for various exceptions and rounding modes issues, is slower than a > fallback that's not correctly rounding. 
But fma is an IEEE operation, > which means doing the slow thing when no faster way of achieving correct > results is available.) > > glibc no longer works, as it used to, on the basis of implementing some > vague approximation to the desired semantics with whatever deviations from > the standard someone felt like having, although there is still plenty of > old code like that (but I've been gradually working through libm functions > over the past few years to ensure they follow consistent accuracy goals > and that there is adequate test coverage for this). For anything > potentially controversial, especially new interfaces, we work to obtain > consensus in the community, including a common understanding of the > relevant standard semantics and how best to implement them in glibc (and, > if it seems the standard may be defective, a common understanding in that > regard - working with the relevant standards committees to resolve any > such defects). This means much more regard than a few years ago for > standard semantics first, optimizations only where consistent with the > standard. > > Of course the libdfp maintainers can do what they want in libdfp. The SD/DD/TD conversion will be implemented in libdfp and will not be submitted to glibc. > But > since this discussion was started on libc-alpha, Yes, glibc is the primary home of soft-fp, so we started there. Now we are working on using soft-fp to implement KFmode in libgcc. Where you pointed out specific issues, we have made the appropriate corrections. But you seem to be on some type of crusade that I do not understand the scope of. I am concerned that you are trying to hold our efforts hostage to this crusade. I am trying to establish boundaries that we can agree to. > I'm considering things in > terms of standard glibc accuracy goals (which for operations fully bound > to IEEE 754, on IEEE types, means exact correctly-rounded results and > exceptions). And for any new libm functions (e.g. 
for float128), getting > consensus on the design and implementation approach at an early stage, > working with the community, and following glibc standards at least to the > extent that the existing ldbl-128 functions follow them, would be > particularly important. (It would not be OK, for example, to have > architecture-specific optimized versions that fail to follow the standard > semantics when the architecture-independent versions do follow those > semantics, though architecture maintainers can e.g. decide which cases are > important to optimize for on their architecture, while still keeping the > slow cases correct. Note that we removed various x86 function > implementations using x87 trig instructions a few years ago because of > inaccuracy.) > ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-12-19 5:03 ` Steven Munroe @ 2015-12-19 13:15 ` Joseph Myers 2015-12-19 16:40 ` Joseph Myers 1 sibling, 0 replies; 46+ messages in thread From: Joseph Myers @ 2015-12-19 13:15 UTC (permalink / raw) To: Steven Munroe Cc: Christoph Lauter, Paul E. Murphy, libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner On Fri, 18 Dec 2015, Steven Munroe wrote: > > > Joseph, > > > > > > Please do not assume the Decimal Floating carries all the sins of the > > > much maligned IBM long double. > > > > I'm not assuming that - I'm comparing with support for *IEEE* binary > > types, which also has plenty of deficiencies in GCC. I'm working from, > > for example, the various open DFP bugs in GCC: > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43374 > Intel BID "The tests also fail on powerpc64-linux" - just because a bug is reported for one case doesn't mean it doesn't apply to other cases as well. > I am not sure what you want from us here. Most of these are not my > platform or bad test cases. We are fixing the TIMode issue as an > oversite. I don't know who you mean by "us". These examples are to illustrate the basis for my impression of the state of DFP support in the GNU toolchain as similar to that for (IEEE) binary floating-point, which is *not* based on a comparison with IBM long double - they are not to request fixes. What I want from *all* people in the glibc community is: 1. Assume good faith in other contributors - do not jump to conclusions about motivations other than seeking to improve glibc, the GNU toolchain, the GNU system and free software in general. 2. Pay sufficient attention to detail for the discussion in question. 
When discussing algorithms for floating-point conversions, this means understanding and taking into account relevant general issues around rational approximation, worst cases for correct rounding, precision extension techniques, avoiding spurious exceptions, etc. - a general notion that rounding to odd might be relevant is not very helpful to such a discussion without a good grasp of the uses and limitations of that technique. 3. Work constructively with the community on reaching consensus. 4. See <https://sourceware.org/ml/libc-alpha/2015-07/msg00766.html> regarding conflicts of interest. > But you seem to be on some type of crusade that I do not understand the > scope of. I am not. See point 1 above. > I am concerned that you trying to hold our efforts hostage to this > crusade. I am not. See point 1 above. Because glibc works by community consensus, and subsystem (*including architecture*) maintainers just provide a starting point for consensus in their areas rather than determining it, it's not possible for one contributor to hold efforts hostage, as other, unconnected contributors are free to speak up if they think someone is being unreasonable or contrary to consensus. As far as I'm concerned, what you do with decimal floating point cannot block anything to do with float128. I explicitly said when first mentioning the conversions issue in the GCC context that work on the DFP conversions could be deferred. But since you asked for advice, I looked at the existing code. And, having noticed issues in that code that were extremely obvious given a general understanding of the relevant floating-point issues, I pointed out those issues for your information, as part of the general goal to improve the quality of free software everywhere - and described, with appropriate attention to detail, the considerations relevant to determining whether an approach for implementing conversions is correct or not. 
(I'm sure you know more about whether a particular approach is *efficient* on POWER hardware - my comments were about *correctness*.) > I am trying to establish boundaries that we can agree to. See points 1, 2, 3 and 4 above. It's community consensus that ultimately determines what goes into glibc. It is very unlikely that large contributions - say, float128 libm functions - can get into glibc without following those points sufficiently. Saying this is not an attempt to hold efforts hostage; it's simply advice about community expectations. People are of course free to jump in with long patch series without sufficient attention to detail and to obtaining community consensus at all stages, but doing so would be setting oneself up for huge amounts of wasted work. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-12-19 5:03 ` Steven Munroe 2015-12-19 13:15 ` Joseph Myers @ 2015-12-19 16:40 ` Joseph Myers 2015-12-23 17:33 ` Steven Munroe 1 sibling, 1 reply; 46+ messages in thread From: Joseph Myers @ 2015-12-19 16:40 UTC (permalink / raw) To: Steven Munroe Cc: Christoph Lauter, Paul E. Murphy, libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner On Fri, 18 Dec 2015, Steven Munroe wrote: > > Well, I'd think the starting point is to implement operations following > > the correct semantics, which are quite clear in both TR 24732:2009 and TS > > 18661-2: as per IEEE 754-2008, conversions are correctly rounded with > > correct exceptions. > > Then you should be specific! Specific examples with test cases. https://github.com/libdfp/libdfp/issues/29 https://github.com/libdfp/libdfp/issues/30 https://github.com/libdfp/libdfp/issues/31 https://github.com/libdfp/libdfp/issues/32 https://github.com/libdfp/libdfp/issues/33 https://github.com/libdfp/libdfp/issues/34 All include testcases, tested with current git libdfp on POWER8 little-endian. Five of the six work correctly with the libgcc conversions. All should be understood to be likely to apply to most of the binary/decimal conversions in libdfp, in both directions, not just to the conversions from _Decimal128 to double used to illustrate them, because of the similarity of code used for different conversions. I think together those issues provide reasonable coverage of the problems with libdfp conversions that I've identified in this thread. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-12-19 16:40 ` Joseph Myers @ 2015-12-23 17:33 ` Steven Munroe 0 siblings, 0 replies; 46+ messages in thread From: Steven Munroe @ 2015-12-23 17:33 UTC (permalink / raw) To: Joseph Myers Cc: Christoph Lauter, Paul E. Murphy, libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner On Sat, 2015-12-19 at 16:40 +0000, Joseph Myers wrote: > On Fri, 18 Dec 2015, Steven Munroe wrote: > > > > Well, I'd think the starting point is to implement operations following > > > the correct semantics, which are quite clear in both TR 24732:2009 and TS > > > 18661-2: as per IEEE 754-2008, conversions are correctly rounded with > > > correct exceptions. > > > > Then you should be specific! Specific examples with test cases. > > https://github.com/libdfp/libdfp/issues/29 > https://github.com/libdfp/libdfp/issues/30 > https://github.com/libdfp/libdfp/issues/31 > https://github.com/libdfp/libdfp/issues/32 > https://github.com/libdfp/libdfp/issues/33 > https://github.com/libdfp/libdfp/issues/34 > > All include testcases, tested with current git libdfp on POWER8 > little-endian. Five of the six work correctly with the libgcc > conversions. All should be understood to be likely to apply to most of > the binary/decimal conversions in libdfp, in both directions, not just to > the conversions from _Decimal128 to double used to illustrate them, > because of the similarity of code used for different conversions. I think > together those issues provide reasonable coverage of the problems with > libdfp conversions that I've identified in this thread. > Yes, this is specific and sufficient; we will start working on resolving these. I suspect some of these are errors copied from the IEEE sources long ago that you have since corrected. We will use this as a guide. As for the conversions and rounding, libgcc meets your goal because it is using the decnumber soft-dfp library. So we have that to fall back on. 
That libdfp is only 1ULP off is actually reassuring. As the tables used for these conversions are program-generated and need to be expanded for __float128, now is a good time to revisit this. Also it is a good time to measure and quantify the performance delta to make appropriate engineering trade-offs. Thank you. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-11-16 18:24 ` Joseph Myers 2015-11-16 18:40 ` Joseph Myers 2015-11-16 22:07 ` Christoph Lauter @ 2015-11-16 23:45 ` Paul E. Murphy 2015-11-17 0:07 ` Joseph Myers 2 siblings, 1 reply; 46+ messages in thread From: Paul E. Murphy @ 2015-11-16 23:45 UTC (permalink / raw) To: Joseph Myers Cc: libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner On 11/16/2015 12:24 PM, Joseph Myers wrote: > On Mon, 16 Nov 2015, Paul E. Murphy wrote: > >> Hi Joseph, >> >> I think there may have been a question as to where the >> conversion routines between IEEE 128 binary float and >> decimal float should live. >> >> Observing existing precedent and machinery, I think >> the appropriate place to house them is within libdfp. > > I think existing practice is that there's a copy of such functions in > libgcc and another copy in libdfp - and maybe the libdfp version supports > exceptions and rounding modes (software or hardware) but the libgcc > version doesn't? Thanks for pointing that out. libgcc implements a subset of the API provided by libdfp. However, libdfp should have better compliance and optimization for powerpc targets. They take different approaches when converting formats, among other things. This opinion may not be shared, but we consider the libdfp implementation primary, at least for dpd encoding. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-11-16 23:45 ` Paul E. Murphy @ 2015-11-17 0:07 ` Joseph Myers [not found] ` <201511180131.tAI1Vs2L023118@d03av01.boulder.ibm.com> 0 siblings, 1 reply; 46+ messages in thread From: Joseph Myers @ 2015-11-17 0:07 UTC (permalink / raw) To: Paul E. Murphy Cc: libc-alpha, Steve Munroe, Tulio Magno Quites Machado Filho, Michael R Meissner On Mon, 16 Nov 2015, Paul E. Murphy wrote: > Thanks for pointing that out. libgcc implements a subset > of the API provided by libdfp. However, libdfp should have > better compliance and optimization for powerpc targets. They > take different approaches when converting formats, among other > things. Thanks for the explanation. I don't know my way around libdfp, so may be looking at the wrong source files that aren't actually used in practice. But it's not apparent to me that e.g. base-math/trunctdsf.c would be correctly rounding; even if the conversion to double were exact, _Decimal128 has enough precision that I'd expect you to get double values exactly half way between two floats (but not exactly equal to the original decimal value) with incorrect results following from double rounding. (And hardcoding infinity and zero as overflow and underflow results is obviously wrong in non-default rounding modes.) -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
[parent not found: <201511180131.tAI1Vs2L023118@d03av01.boulder.ibm.com>]
* Re: IEEE128 binary float to decimal float conversion routines [not found] ` <201511180131.tAI1Vs2L023118@d03av01.boulder.ibm.com> @ 2015-11-18 2:03 ` Joseph Myers [not found] ` <201511182301.tAIN1Igc011083@d03av02.boulder.ibm.com> 2015-11-19 17:57 ` Paul E. Murphy 0 siblings, 2 replies; 46+ messages in thread From: Joseph Myers @ 2015-11-18 2:03 UTC (permalink / raw) To: Steve Munroe Cc: libc-alpha, Michael R Meissner, Paul E. Murphy, Tulio Magno Quites Machado Filho On Wed, 18 Nov 2015, Steve Munroe wrote: > I only see one rounds associated with the BINPOWOF10[sexp] multiply/divide. > > The mant = a_norm * 1E+15DL operation is a scaling in decimal and should be > exact. > > The temp = mant operation is a decimal to long conversion which will cause > a truncation to 15 digits. > > So my analysis is this code does not double round. Do you think the > truncation is an issue. The problem I see is with the final "result = temp;" which converts double to float. The earlier steps are probably accurate to within 1ulp. But if temp (a double) is half way between two representable float values - while the original argument is very close to that half way value, but not exact - then the final conversion will round to even, which may or may not be correct depending on which side of that double value the original _Decimal128 value was. (Much the same applies in other rounding modes when the double value equals a float value but the original value isn't exactly that float value.) I haven't done the detailed analysis with continued fractions to determine the worst cases for conversion of _Decimal128 to float (it is, however, clearly possible to determine the worst cases like that with only a small amount of computation needed, unlike the large exhaustive searches needed for worst cases for correctly rounded transcendental functions). Nor have I read the paper Christoph helpfully pointed out. 
But heuristically, if you have a 128-bit input, you can expect there to be some input values for which, on converting to binary, the initial 24 bits are followed by (1 then about 127 0s, then other nonzero bits, or likewise with 0 followed by about 127 1s), just by random chance, and so you expect to need about 24 + 128 bits internal precision for the conversion so as to get a result that rounds correctly when truncated to float. (Actually you expect a few bits less than that to be needed because almost all the exponent range of _Decimal128 is outside the range of float. But that doesn't change the basic analysis, that neither double, long double nor __float128 is expected to have enough precision as an intermediate type for correctly rounded results. Cf. the BID code in libgcc using at least 256-bit precision for internal computations.) -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
[parent not found: <201511182301.tAIN1Igc011083@d03av02.boulder.ibm.com>]
* Re: IEEE128 binary float to decimal float conversion routines [not found] ` <201511182301.tAIN1Igc011083@d03av02.boulder.ibm.com> @ 2015-11-18 23:53 ` Joseph Myers [not found] ` <201511190052.tAJ0qd4x018924@d03av02.boulder.ibm.com> 2015-12-08 17:16 ` Steven Munroe 0 siblings, 2 replies; 46+ messages in thread From: Joseph Myers @ 2015-11-18 23:53 UTC (permalink / raw) To: Steve Munroe Cc: libc-alpha, Michael R Meissner, Paul E. Murphy, Tulio Magno Quites Machado Filho On Wed, 18 Nov 2015, Steve Munroe wrote: > > The problem I see is with the final "result = temp;" which converts > double > > to float. > > > > The earlier steps are probably accurate to within 1ulp. But if temp (a > > double) is half way between two representable float values - while the > > original argument is very close to that half way value, but not exact - > > then the final conversion will round to even, which may or may not be > > correct depending on which side of that double value the original > > _Decimal128 value was. (Much the same applies in other rounding modes > > when the double value equals a float value but the original value isn't > > exactly that float value.) > > > Would changing the the decimal to binary conversion to be round to odd, > offset the following round double to float? > > http://www.exploringbinary.com/gcc-avoids-double-rounding-errors-with-round-to-odd/ No, because it would just offload the problem onto getting a conversion from _Decimal128 to double that is correctly rounded to odd, which is no easier (indeed, requires more work, not less) than the original problem of converting to float. The existing code loses some of the original precision when taking just 15 digits of the mantissa for conversion to double (not OK when you want to determine the exact value rounded to odd after further operations - in the hard cases, the final decimal digit will affect the correct rounding). 
Then the multiplications / divisions by precomputed powers of 10 use a table of long double values - while that gives extra precision (though probably not enough extra precision), it's also incompatible with doing rounding to odd, since IBM long double doesn't give meaningful "inexact" exceptions or work in non-default rounding modes, while rounding to odd requires working in round-to-zero mode and then checking the "inexact" flag. > We could look at this if it requires a few additional instructions. But I > would be very reluctant to resort to heavy handed (and extremely slow) > solutions to get perfect rounding for a few corner cases. It is of course possible to achieve IEEE-conforming results by first doing an approximate conversion with rigorous error bounds, then only doing the slower conversion if the result of the first conversion was very close to half way / exact (depending on the rounding mode), within the error bounds (so only using the slow case rarely, as long as you avoid it in the cases where the conversion is exact). Cf. the dbl-64 libm functions that do things like that (and get complaints for the slowness of the slow case, because they use far more precision than is actually needed for correct rounding - in the case of conversions it's much easier to determine how much precision is actually needed). (Now most of those libm functions don't actually need to be correctly rounded at all - TS 18661-4 defines separate names such as crexp for correctly rounded functions - whereas conversions between binary and decimal are defined to be correctly rounded by both TS 18661-2 and the older TR 24732 specification of C bindings for decimal floating-point.) 
Another issue I see with the implementation: the "Obvious underflow" case for exponents below -39 includes a substantial part of the subnormal range, so that decimal values in that range will be wrongly converted to zero instead of appropriate subnormal floats (so being wildly inaccurate rather than the incorrect last place of the issue discussed above). Likewise for truncation to double (trunctddf.c). -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
[parent not found: <201511190052.tAJ0qd4x018924@d03av02.boulder.ibm.com>]
* Re: IEEE128 binary float to decimal float conversion routines [not found] ` <201511190052.tAJ0qd4x018924@d03av02.boulder.ibm.com> @ 2015-11-19 1:22 ` Joseph Myers 0 siblings, 0 replies; 46+ messages in thread From: Joseph Myers @ 2015-11-19 1:22 UTC (permalink / raw) To: Steve Munroe Cc: libc-alpha, Michael R Meissner, Paul E. Murphy, Tulio Magno Quites Machado Filho On Thu, 19 Nov 2015, Steve Munroe wrote: > Trying again. Please take some time to study PowerISA-2.07B :Book I Decimal > Floating-Point (DFP) > Facility Overview" and consider the implications that Decimal unit has > rounding modes that separate and independent of binary float. And the one > of the rounding modes is "Round to Prepare for Shorter Precision". > > This seems to be the decimal analog of round to odd ? Yes, it is, but it doesn't help here. What it helps for is implementing formatOf arithmetic operations that take wider operands and round just once to narrower precision *with the same radix*, such as d32muld64 from TS 18661-2 - you'd do the multiplication in decimal64 format using that mode, restore the original mode and do the conversion to decimal32 in the original mode (and it can also be used in implementing fma). But when what you want to do doesn't have a DFP instruction that can use that rounding mode, or when you are producing a binary result and so the binary rounding mode is the relevant rounding mode and you essentially need all decimal computations to be exact, it doesn't solve your problem. In other words: if you had a DFP instruction "convert decimal128 to binary64, using the decimal rounding mode", but no instruction "convert decimal128 to binary32, using the binary rounding mode", you could use the former, in the "Round to Prepare for Shorter Precision" mode, to implement the latter. But you don't have such instructions, and I don't think this mode helps implement these particular conversions at all. -- Joseph S. 
Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-11-18 23:53 ` Joseph Myers [not found] ` <201511190052.tAJ0qd4x018924@d03av02.boulder.ibm.com> @ 2015-12-08 17:16 ` Steven Munroe 2015-12-08 18:25 ` Joseph Myers 1 sibling, 1 reply; 46+ messages in thread From: Steven Munroe @ 2015-12-08 17:16 UTC (permalink / raw) To: Joseph Myers Cc: Steve Munroe, libc-alpha, Michael R Meissner, Paul E. Murphy, Tulio Magno Quites Machado Filho On Wed, 2015-11-18 at 23:53 +0000, Joseph Myers wrote: > On Wed, 18 Nov 2015, Steve Munroe wrote: > > > > The problem I see is with the final "result = temp;" which converts > > double > > > to float. > > > > > > The earlier steps are probably accurate to within 1ulp. But if temp (a > > > double) is half way between two representable float values - while the > > > original argument is very close to that half way value, but not exact - > > > then the final conversion will round to even, which may or may not be > > > correct depending on which side of that double value the original > > > _Decimal128 value was. (Much the same applies in other rounding modes > > > when the double value equals a float value but the original value isn't > > > exactly that float value.) > > > > > Would changing the the decimal to binary conversion to be round to odd, > > offset the following round double to float? > > > > http://www.exploringbinary.com/gcc-avoids-double-rounding-errors-with-round-to-odd/ > > No, because it would just offload the problem onto getting a conversion > from _Decimal128 to double that is correctly rounded to odd, which is no > easier (indeed, requires more work, not less) than the original problem of > converting to float. > Joseph, I think we are talking past each other on this. The PowerISA (2.05 and later) Decimal Floating-point "Round to Prepare for Shorter Precision" mode would not address the Decimal128 convert/truncate to shorter binary floating-point (double or float). 
But it will address the Float128 convert/truncate to shorter decimal floating-point (_Decimal64 and _Decimal32). > The existing code loses some of the original precision when taking just 15 > digits of the mantissa for conversion to double (not OK when you want to > determine the exact value rounded to odd after further operations - in the > hard cases, the final decimal digit will affect the correct rounding). > Then the multiplications / divisions by precomputed powers of 10 use a > table of long double values - while that gives extra precision (though > probably not enough extra precision), it's also incompatible with doing > rounding to odd, since IBM long double doesn't give meaningful "inexact" > exceptions or work in non-default rounding modes, while rounding to odd > requires working in round-to-zero mode and then checking the "inexact" > flag. > So in the case of TImode or KFmode conversion to _Decimal64/_Decimal32 we can save the current rounding mode (fe_dec_getround()) then use fe_dec_setround (DEC_ROUND_05UP) to set "Round to Prepare for Shorter Precision" before the multiply that converts the mantissa to the target radix. Then, just before the instruction that rounds to the final (_Decimal64 or _Decimal32) type, we restore the caller's rounding mode and execute the final conversion in the correct rounding mode. I believe that addresses your double-rounding concern for these conversions. And now we are addressing the similar issues with float128 with the recent public release of the PowerISA-3.0: https://www.ibm.com/developerworks/community/blogs/fe313521-2e95-46f2-817d-44a4f27eba32/entry/Announcing_a_New_Era_of_Openness_with_Power_3_0?lang=en The PowerISA-3.0 document is here: http://ibm.biz/power-isa3 As an addition to the Vector-Scalar extensions we add support for IEEE 128-bit binary floating point. This architecture provides the option to override the default round-mode and force "round to odd" on a per-instruction basis for all the key operators. 
This obviously includes add/subtract/multiply/divide/sqrt and convert quad to double. This will be sufficient to support the _Decimal128 to double and float conversion that you mentioned as problematic. While we are waiting for the hardware implementation of the PowerISA-3.0, we need to address the specifics of these conversions in the soft-fp implementation. My observation is that a common element of these conversions is a large-precision multiply (to convert the radix of the mantissa) then a possible truncation (with rounding) to the final precision in the new radix. It seems a simple effort to provide a soft-fp implementation that combines the multiply and truncation without intermediate rounding. As the soft-fp implementation performs rounding in the FP_PACK operation, it is simple to avoid the intermediate PACK/UNPACK steps between the FP_MUL and FP_TRUNC operations. The only trick is that arithmetic operations use the canonical PACK/UNPACK operations while the truncate operations use the SEMIRAW PACK/UNPACK. Some adjustments of the exponent are required to flow the (unrounded) result of the FP_MUL directly into the FP_TRUNC operation, but this does not seem to require a large effort. The final FP_PACK operation following the FP_TRUNC will perform the final and correct rounding to the required precision. This seems sufficient to address the issues you have raised and seems much simpler than wholesale additions of round to odd to the soft-fp implementation. ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-12-08 17:16 ` Steven Munroe @ 2015-12-08 18:25 ` Joseph Myers 2015-12-15 21:18 ` Steven Munroe 0 siblings, 1 reply; 46+ messages in thread From: Joseph Myers @ 2015-12-08 18:25 UTC (permalink / raw) To: Steven Munroe Cc: Steve Munroe, libc-alpha, Michael R Meissner, Paul E. Murphy, Tulio Magno Quites Machado Filho On Tue, 8 Dec 2015, Steven Munroe wrote: > The PowerISA (2.05 and later ) Decimal Floating-point "Round to Prepare > for Shorter Precision" mode would not address the Decimal128 > convert/truncate to shorter binary floating-point (double or float). > > But it will address the Float128 convert/truncate to to shorter decimal > floating-pointer (_Decimal64 and _Decimal32). Yes, if you have a conversion from _Float128 to _Decimal128 that works for Round to Prepare for Shorter Precision then you could use that as an intermediate step in converting to _Decimal64 and _Decimal32 (it's not the most efficient approach, but it's certainly simpler than having multiple variants of the full conversion code). The hardest part is converting from _Float128 to _Decimal128. Once you can do that (for all rounding modes and with correct exceptions), converting to the narrower types is easy, whether you have multiple variants of the same code or use Round to Prepare for Shorter Precision. Likewise for conversions in the other direction - _Decimal128 to _Float128 is the hardest part, if you can do that then converting to narrower types is straightforward. > So in the case of TIMode or KFmode conversion to _Decimal64/_Decimal32 > we can save the current rounding mode (fe_dec_getround()) then use > fe_dec_setround (DEC_ROUND_05UP) to set the "Round to Prepare for > Shorter Precision" before the multiply that converts the mantissa to the > target radix. 
Then just before the instruction that rounds to the > final (_Decimal64 or _Decimal32) type, we restore the caller's rounding > mode and execute the final conversion in the correct rounding mode. > > I believe that addresses your double rounding concern for these > conversions. For TImode it's not hard to avoid double rounding this way, by splitting the TImode number into two numbers that are exactly convertible to _Decimal128, so the only inexact operation is a single addition, which can be done in the Round to Prepare for Shorter Precision mode (and then you can convert to _Decimal64 / _Decimal32 in the original mode). [In all cases, getting the preferred quantum for decimal results is a minor matter to deal with at the end.] For _Float128, this only reduces the problem to doing a conversion of _Float128 to _Decimal128 in that mode. Which is not simply a single multiply. Not all mantissa values for _Float128 can be represented in _Decimal128 (2**113 > 10**34). And nor can all powers of 2 that you need to multiply / divide by be represented in _Decimal128. And when you have more than one inexact operation, the final result is generally not correctly rounded for any rounding mode. And so the complexity goes massively up (compare the fmaf implementation with round-to-odd on double - a single inexact addition on double done in round-to-odd followed by converting back to float in the original rounding mode - with the sysdeps/ieee754/dbl-64/s_fma.c code, which also uses round-to-odd, but with far more complexity in order to achieve the precision extension required for intermediate computations). You may well be able to use precision-extension techniques - so doing a conversion that produces a sum of two or three _Decimal128 values (the exact number needed being determined by a continued fraction analysis) and then adding up those values in the Round to Prepare for Shorter Precision mode. 
But I'd be surprised if there is a simple and correct implementation of the conversion that doesn't involve extending intermediate precision to have about 128 extra bits, given the complexity and extra precision described in the papers on this subject such as the one referenced in this thread. > My observation is that a common element of these conversion is a large > precision multiply (to convert the radix of the mantissa) then a > possible truncation (with rounding) to the final precision in the new > radix. Where large precision means about 256 bits (not simply 128 * 128 -> 256 multiplication, but also having the powers of 2 or 10 to that precision, so more like 128 * 256 -> 384 which can be truncated to about 256). Again, exact precisions to be determined by continued fraction analysis. > It seem a simple effort to provide a soft-fp implementation that > combines the multiple and truncation, without intermediate rounding. That much is simple (the soft-fp code expects to produce a binary result, but you could make it produce integer * power of 10 for the conversions to decimal); cf. the _FP_FMA implementation that does a double-width multiply plus addition before truncating. You do need to determine the right intermediate precision and add the implementations of that extra-precision multiply. > This seems sufficient to address the issues you have raised and seems > much simpler then wholesale additions of round to odd to the soft-fp > implementation. Adding round-to-odd would be simple enough as well (only a few places check for particular rounding modes); I just don't think it would help much. Anything using round-to-odd when working with separate floating-point operations is better done in the soft-fp context by keeping the sticky bit and avoiding intermediate roundings (and much the same applies to Dekker-style precision extension - it makes much less sense with soft-fp than with hardware floating point). -- Joseph S. 
Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
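Joseph's TImode scheme above - split the wide integer into two halves that convert exactly, do the single inexact addition in "Round to Prepare for Shorter Precision", then narrow in the caller's mode - can be sketched with Python's decimal module, which exposes the same rounding mode as ROUND_05UP. The helper names and the split at 10**18 are my own illustrative choices, not glibc code.

```python
from decimal import Decimal, Context, ROUND_05UP, ROUND_HALF_EVEN

def split_convert(n: int) -> Decimal:
    """Convert a wide (e.g. 128-bit) integer to 34-digit 'Decimal128'
    precision, with the one inexact step done in round-05up
    ('Round to Prepare for Shorter Precision')."""
    hi, lo = divmod(n, 10 ** 18)          # both halves are exact in 34 digits
    ctx = Context(prec=34, rounding=ROUND_05UP)
    return ctx.add(ctx.scaleb(Decimal(hi), 18), Decimal(lo))

def narrow(d: Decimal, prec: int) -> Decimal:
    """Round to a shorter decimal precision in the caller's (half-even) mode."""
    return Context(prec=prec, rounding=ROUND_HALF_EVEN).plus(d)

# A 39-digit value engineered so that rounding half-even twice
# (39 -> 34 -> 16 digits) differs from rounding once:
n = int("1000000000000001" + "4" + "9" * 17 + "50000")

direct = narrow(Decimal(n), 16)             # one correct rounding
via_05up = narrow(split_convert(n), 16)     # 05UP, then caller's mode
naive = narrow(narrow(Decimal(n), 34), 16)  # half-even twice

assert via_05up == direct                   # 05UP preserves the sticky info
assert naive != direct                      # classic double rounding
```

The 05UP result's last digit avoids 0 and 5 whenever the 34-digit step was inexact, so any later rounding to 16 digits sees the value on the correct side of every boundary.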
* Re: IEEE128 binary float to decimal float conversion routines 2015-12-08 18:25 ` Joseph Myers @ 2015-12-15 21:18 ` Steven Munroe 2015-12-16 0:07 ` Joseph Myers 0 siblings, 1 reply; 46+ messages in thread From: Steven Munroe @ 2015-12-15 21:18 UTC (permalink / raw) To: Joseph Myers Cc: Steve Munroe, libc-alpha, Michael R Meissner, Paul E. Murphy, Tulio Magno Quites Machado Filho On Tue, 2015-12-08 at 18:25 +0000, Joseph Myers wrote: > On Tue, 8 Dec 2015, Steven Munroe wrote: > > > The PowerISA (2.05 and later ) Decimal Floating-point "Round to Prepare > > for Shorter Precision" mode would not address the Decimal128 > > convert/truncate to shorter binary floating-point (double or float). > > > > But it will address the Float128 convert/truncate to to shorter decimal > > floating-pointer (_Decimal64 and _Decimal32). > > Yes, if you have a conversion from _Float128 to _Decimal128 that works for > Round to Prepare for Shorter Precision then you could use that as an > intermediate step in converting to _Decimal64 and _Decimal32 (it's not the > most efficient approach, but it's certainly simpler than having multiple > variants of the full conversion code). > > The hardest part is converting from _Float128 to _Decimal128. Once you > can do that (for all rounding modes and with correct exceptions), > converting to the narrower types is easy, whether you have multiple > variants of the same code or use Round to Prepare for Shorter Precision. > Likewise for conversions in the other direction - _Decimal128 to _Float128 > is the hardest part, if you can do that then converting to narrower types > is straightforward. > > > So in the case of TIMode or KFmode conversion to _Decimal64/_Decimal32 > > we can save the current rounding mode (fe_dec_getround()) then use > > fe_dec_setround (DEC_ROUND_05UP) to set the "Round to Prepare for > > Shorter Precision" before the multiply that converts the mantissa to the > > target radix. 
Then just before the the instruction that rounds to the > > final (_Decimal64 or _Decimal32) type, we restore the callers rounding > > more and execute the final version in the correct rounding mode. > > > > I believe that addresses you double rounding concern for these > > conversions. > > For TImode it's not hard to avoid double rounding this way, by splitting > the TImode number into two numbers that are exactly convertible to > _Decimal128, so the only inexact operation is a single addition, which can > be done in the Round to Prepare for Shorter Precision mode (and then you > can convert to _Decimal64 / _Decimal32 in the original mode). [In all > cases, getting the preferred quantum for decimal results is a minor matter > to deal with at the end.] > > For _Float128, this only reduces the problem to doing a conversion of > _Float128 to _Decimal128 in that mode. Which is not simply a single > multiply. Not all mantissa values for _Float128 can be represented in > _Decimal128 (2**113 > 10**34). And nor can all powers of 2 that you need > to multiply / divide by be represented in _Decimal128. And when you have > more than one inexact operation, the final result is generally not > correctly rounded for any rounding mode. And so the complexity goes > massively up (compare the fmaf implementation with round-to-odd on double > - a single inexact addition on double done in round-to-odd followed by > converting back to float in the original rounding mode - with the > sysdeps/ieee754/dbl-64/s_fma.c code, which also uses round-to-odd, but > with far more complexity in order to achieve the precision extension > required for intermediate computations). > > You may well be able to use precision-extension techniques - so doing a > conversion that produces a sum of two or three _Decimal128 values (the > exact number needed being determined by a continued fraction analysis) and > then adding up those values in the Round to Prepare for Shorter Precision > mode. 
But I'd be surprised if there is a simple and correct > implementation of the conversion that doesn't involve extending > intermediate precision to have about 128 extra bits, given the complexity > and extra precision described in the papers on this subject such as the > one referenced in this thread. > > > My observation is that a common element of these conversion is a large > > precision multiply (to convert the radix of the mantissa) then a > > possible truncation (with rounding) to the final precision in the new > > radix. > > Where large precision means about 256 bits (not simply 128 * 128 -> 256 > multiplication, but also having the powers of 2 or 10 to that precision, > so more like 128 * 256 -> 384 which can be truncated to about 256). > Again, exact precisions to be determined by continued fraction analysis. > Ok, let me try with the simpler case of _Decimal128 to _Float128, where the significand conversion is exact (log2(10^34) -> 112.9 -> <= 113 bits). You mention "continued fraction analysis", which was not part of my formal education (40+ years ago), but I will try. The question is how many significant bits it takes to represent a power of 10. This is interesting because my implementation of trunctfkf involves a multiply of the converted (to _Float128) mantissa by 10^N, where N is the exponent of the original _Decimal128. So which powers of 10 can be represented exactly as a _Float128? The required significant bits would be log2(10^N), but the binary form of an exact power of 10 has one trailing zero bit for each factor of 10 (1000 has 3 trailing zeros, 10000000 has 7, ...). So the number of significant bits is log2(10^N)-N. A quick binary search shows that values up to 10^48 require less than 113 bits and so can be represented exactly in _Float128. So any _Decimal128 < 9999999999999999999999999999999999e48 (1.0e82) can be converted with one _Float128 multiply of 2 exact values, giving a result correctly rounded to 1ULP. 
This does not require conversion to string and back, or carrying more precision than is naturally available in the _Float128. Now, as the exponent of the _Decimal128 input exceeds 48, the table of _Float128 powers of 10 will contain values that have been rounded. I assume that some additional exponent range can be covered by ensuring that the table of _Float128 powers_of_10 has been pre-rounded to odd? Do you agree with this analysis? ^ permalink raw reply [flat|nested] 46+ messages in thread
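The trailing-zero argument above can be checked mechanically: 10^N = 2^N * 5^N, so the 2^N factor contributes only trailing zero bits and the significant-bit count of 10^N is just the bit length of 5^N. A few lines of exact integer arithmetic (illustrative only) confirm the 10^48 cutoff for a 113-bit mantissa:

```python
# 10**n = 2**n * 5**n: the 2**n factor only adds trailing zero bits,
# so the significant-bit count of 10**n is the bit length of 5**n.
def significant_bits_pow10(n: int) -> int:
    return (5 ** n).bit_length()

exact_in_binary128 = [n for n in range(60) if significant_bits_pow10(n) <= 113]

assert exact_in_binary128 == list(range(49))   # 10**0 .. 10**48 are exact
assert significant_bits_pow10(48) == 112       # fits in 113 bits
assert significant_bits_pow10(49) == 114       # first power that does not
```

So a table of _Float128 powers of 10 is exact exactly through 10^48, matching the binary-search result in the message above.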
* Re: IEEE128 binary float to decimal float conversion routines 2015-12-15 21:18 ` Steven Munroe @ 2015-12-16 0:07 ` Joseph Myers 0 siblings, 0 replies; 46+ messages in thread From: Joseph Myers @ 2015-12-16 0:07 UTC (permalink / raw) To: Steven Munroe Cc: Steve Munroe, libc-alpha, Michael R Meissner, Paul E. Murphy, Tulio Magno Quites Machado Filho On Tue, 15 Dec 2015, Steven Munroe wrote: > Ok let my try with the simpler case of _Decimal128 to Float128 where the > significand conversion is exact (log2(10e34) -> 112.9 -> <= 113 bits). > So you mention "continued fraction analysis" which was not part of my > formal education (40+ years ago) but I will try. The theory of rational approximations (determining how close a/b can be to given x for integer a and b, given bounds on b) applies here, because you want to determine how close a*10^m and b*2^n can be, for a and b in the ranges of mantissa values and m and n corresponding exponents, which is essentially equivalent to determining how close a/b can be to 2^n/10^m. Earlier in this thread, Christoph Lauter referred to a paper <http://www.computer.org/csdl/trans/tc/preprint/07271015-abs.html> about comparison between binary and decimal floating-point values - conversions are essentially the same issue, and that paper references previous work on conversions. Section 4.2 of that paper discusses how the classical theory of continued fractions can be applied to find the closest cases (the exact details of the relevant analysis depend on the details of the code you use to implement the conversions). > So any _Decimal128 < 9999999999999999999999999999999999e48 (1.0e82) can > be converted with one _Float128 multiply, of 2 exact values, giving a > rounded result to 1ULP. Those cases are indeed straightforward (and if you round-to-odd then that gives conversions of numbers within that range to narrower binary types, and you can similarly divide instead of multiplying in cases of integer * 1e-n, n <= 48). 
> Now as the exponent of _Decimal128 input exceeds 48 the table of > _float128 powers of 10 will contain values that have been rounded. Now I > assume that some additional exponent range can be covered by by insuring > that the table _float128 powers_of_10 have been pre-rounded to odd? But pre-rounding the larger powers to odd doesn't help; rounding to odd generally only helps when it's the very last operation before rounding to a narrower type that gets rounded to odd, with all previous operations being exact. In this case, you'd have an inexact power followed by a multiplication, and because one of the arguments to the multiplication is inexact (and the multiplication is not by a power of 2), the result of the multiplication may not be correctly rounded. Instead, the larger powers need to be stored with extra precision, and extra precision used for the multiplication. (You don't need to store all the powers in the range of _Float128 because you have a speed/space trade-off; you could store fewer powers and then compute the one you need at runtime by doing an extra-precision multiply of two or more stored powers. Or you could store them all if that's the right trade-off for the processors in question.) The amount of extra precision needed depends on both how close the closest cases for correct rounding are, and on how much error can accumulate from the multiplications - there needs to be enough precision that the inaccuracy in the stored values and subsequent computations is small enough that it does not affect the rounding of the final result. -- Joseph S. Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
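The "store the larger powers with extra precision, keep the inexactness visible" idea can be sketched with integers: hold each power of 10 as a roughly 256-bit mantissa plus a binary exponent and a sticky flag, and build large powers at runtime from smaller ones by extra-precision multiplies. This is a toy model with invented names (`normalize`, `mul_ext`, `pow10_ext`), not the proposed implementation; a real version would fix PREC from the continued-fraction error analysis rather than my arbitrary choice.

```python
from fractions import Fraction

PREC = 256  # working precision in bits (to be fixed by error analysis)

def normalize(m: int, e: int, sticky: bool):
    """Truncate mantissa m to PREC bits, folding discarded bits into sticky."""
    excess = m.bit_length() - PREC
    if excess > 0:
        sticky = sticky or (m & ((1 << excess) - 1)) != 0
        m >>= excess
        e += excess
    return m, e, sticky

def mul_ext(a, b):
    """Full-width multiply of two extended values, truncated back + sticky."""
    return normalize(a[0] * b[0], a[1] + b[1], a[2] or b[2])

def pow10_ext(n: int):
    """10**n by binary exponentiation in extended precision."""
    result = normalize(1, 0, False)
    base = normalize(10, 0, False)
    while n:
        if n & 1:
            result = mul_ext(result, base)
        base = mul_ext(base, base)
        n >>= 1
    return result

# 10**100 needs only 233 significant bits, so it comes out exact:
m, e, sticky = pow10_ext(100)
assert Fraction(m) * Fraction(2) ** e == 10 ** 100 and not sticky

# 10**200 does not fit in 256 bits: inexact, flagged, and tightly bounded
m, e, sticky = pow10_ext(200)
err = abs(Fraction(m) * Fraction(2) ** e - 10 ** 200) / 10 ** 200
assert sticky and err < Fraction(1, 2 ** 240)
```

The sticky flag plays the role Joseph describes for soft-fp: it records that a truncation happened, so the final single rounding still sees the inexactness instead of rounding a silently pre-rounded value.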
* Re: IEEE128 binary float to decimal float conversion routines 2015-11-18 2:03 ` Joseph Myers [not found] ` <201511182301.tAIN1Igc011083@d03av02.boulder.ibm.com> @ 2015-11-19 17:57 ` Paul E. Murphy 2015-11-19 18:14 ` Joseph Myers 1 sibling, 1 reply; 46+ messages in thread From: Paul E. Murphy @ 2015-11-19 17:57 UTC (permalink / raw) To: Joseph Myers Cc: Steve Munroe, libc-alpha, Michael R Meissner, Tulio Magno Quites Machado Filho On 11/17/2015 08:03 PM, Joseph Myers wrote: > I read the paper Christoph helpfully pointed out. But heuristically, if > you have a 128-bit input, you can expect there to be some input values for > which, on converting to binary, the initial 24 bits are followed by (1 > then about 127 0s, then other nonzero bits, or likewise with 0 followed by > about 127 1s), just by random chance, and so you expect to need about 24 + > 128 bits internal precision for the conversion so as to get a result that > rounds correctly when truncated to float. Joseph, can you elaborate on this a bit further? I agree with your point that you need more precision to properly convert, but I'm having trouble following this bit. 128-bit input == IEEE decimal128? binary == IEEE binary32? ^ permalink raw reply [flat|nested] 46+ messages in thread
* Re: IEEE128 binary float to decimal float conversion routines 2015-11-19 17:57 ` Paul E. Murphy @ 2015-11-19 18:14 ` Joseph Myers 0 siblings, 0 replies; 46+ messages in thread From: Joseph Myers @ 2015-11-19 18:14 UTC (permalink / raw) To: Paul E. Murphy Cc: Steve Munroe, libc-alpha, Michael R Meissner, Tulio Magno Quites Machado Filho On Thu, 19 Nov 2015, Paul E. Murphy wrote: > On 11/17/2015 08:03 PM, Joseph Myers wrote: > > I read the paper Christoph helpfully pointed out. But heuristically, if > > you have a 128-bit input, you can expect there to be some input values for > > which, on converting to binary, the initial 24 bits are followed by (1 > > then about 127 0s, then other nonzero bits, or likewise with 0 followed by > > about 127 1s), just by random chance, and so you expect to need about 24 + > > 128 bits internal precision for the conversion so as to get a result that > > rounds correctly when truncated to float. > > Joseph, can you elaborate on this a bit further? I agree with your point that > you need more precision to properly convert, but I'm having trouble following > this bit. 128-bit input == IEEE decimal128? binary == IEEE binary32? Yes. If you are converting from a decimal format with A significant bits in the representation, to a binary format with B bits in the mantissa, you heuristically expect to need about A+B bits internal precision for correct rounding (in the worst case - if you're careful about error bounds and avoiding spurious exceptions, you can do an initial trial conversion with less precision and then test whether that was good enough to know the correctly rounded result). See Christoph's paper for a more detailed continued fraction analysis bounding the amount of precision needed (that paper deals with mixed-radix comparisons, but conversions are essentially the same issue). Much the same applies to conversions in the other direction (binary to decimal). -- Joseph S. 
Myers joseph@codesourcery.com ^ permalink raw reply [flat|nested] 46+ messages in thread
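Joseph's "trial conversion with less precision, then test whether that was good enough" strategy can be sketched with exact rationals standing in for working-precision arithmetic. Everything here is illustrative: `trial_round` and the constructed hard case are my own, and a real implementation would use fixed-width integers with a proven error bound rather than Fractions. The hard case is built exactly as the heuristic describes: a decimal value whose binary expansion after the first 24 bits is a 1 followed by a long run of 0s.

```python
from fractions import Fraction

PREC = 24  # binary32 mantissa bits

def trial_round(x: Fraction, extra_bits: int):
    """Round positive x to PREC bits using only PREC + extra_bits of it.
    Returns the rounded mantissa, or None when the truncated view is too
    close to a rounding boundary to decide (retry with more precision)."""
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1                                  # 2**e <= x < 2**(e+1)
    scaled = x / Fraction(2) ** (e - PREC + 1 - extra_bits)
    t = scaled.numerator // scaled.denominator  # fixed-point truncation
    frac = t & ((1 << extra_bits) - 1)          # bits below the target ulp
    half = 1 << (extra_bits - 1)
    if abs(frac - half) <= 1:                   # could be a tie: undecidable
        return None
    return (t >> extra_bits) + (1 if frac > half else 0)

# A decimal input 0.1 above an exact halfway point between two binary32
# values, so bit 25 is a 1 followed by ~42 zeros:
halfway = (2 ** 24 + 1) << 39          # exact tie between adjacent floats
x = Fraction(10 * halfway + 1, 10)     # the decimal value halfway + 0.1

assert trial_round(x, 16) is None      # 16 guard bits: ambiguous
assert trial_round(x, 64) == 2 ** 23 + 1   # 64 guard bits: resolves upward
```

This is the shape of the fallback loop: a cheap trial at modest precision answers almost all inputs, and only near-boundary cases like `x` force the expensive high-precision path.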