From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <joseph_myers@mentor.com>
Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180])
 by sourceware.org (Postfix) with ESMTPS id 54AB83858C2C
 for <libc-alpha@sourceware.org>; Tue,  4 Jan 2022 19:15:25 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 54AB83858C2C
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none)
 header.from=codesourcery.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com
IronPort-SDR: 5W9P7zbTjmP9o/PymOHMyM5WAxfy4oEa8y47mCgRJQpViY3W2Q5M4cPW8qFRML90DSqr5+NTft
 dp7SW92OGK3LR4oRwF0nbPSEsROLAkaWfra/XVIsX2QFuThRFMRSMNnCmg/fs/6AXKVuxhfdPX
 WVYDu6f8DbfM/9VMQnV/XbocHDLFHDe+DPxtJ8lJTdx0lzwda8AGvEgpifuzoyDxH+4TU4zacC
 ufi+ORKL6ZXGkceGe5v38SgJb9AdIo3f3hfc7d26w3JBJ/VzXucUInQ7NbiJU03k+hgXL4k4zT
 1DqfMkTQ3YuWBhzvsklzv++Z
X-IronPort-AV: E=Sophos;i="5.88,261,1635235200"; d="scan'208";a="70242017"
Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167])
 by esa3.mentor.iphmx.com with ESMTP; 04 Jan 2022 11:15:25 -0800
IronPort-SDR: gPWlF0hEzq30/WbaN5F67YPi6cfRBtRoz7Al/aaYX2MpQeP4SjFqHK5AyRlYYygiU0I0/HiIMl
 EekfScztbE2/VoKfRHxjuUv8+4dt0LcfV+acvUhsap9s5afwnxqiba/gkrWsJJmmFWrK5OhIQ/
 qbTYVm5QGXGZf3fZVDB5Cu7Uwy+iKUb6jJkZDf0KDReQuhOs4IWwv4c0VgkrpqXbEk80w+jdIs
 DlMN0pWzWofjhgTCZMd/hn46bwJcCUM4BvFAb1k4rvceuH/4Ak0XsdlAJ+95LmOui/ZrhgjA41
 uEM=
Date: Tue, 4 Jan 2022 19:15:19 +0000
From: Joseph Myers <joseph@codesourcery.com>
X-X-Sender: jsm28@digraph.polyomino.org.uk
To: Paul Zimmermann <Paul.Zimmermann@inria.fr>
CC: <libc-alpha@sourceware.org>
Subject: Re: correctly rounded mathematical functions
In-Reply-To: <mwfsq5ouii.fsf@tomate.loria.fr>
Message-ID: <alpine.DEB.2.22.394.2201041838300.2547324@digraph.polyomino.org.uk>
References: <mwfsq5ouii.fsf@tomate.loria.fr>
User-Agent: Alpine 2.22 (DEB 394 2020-01-19)
MIME-Version: 1.0
Content-Type: text/plain; charset="US-ASCII"
X-Originating-IP: [137.202.0.90]
X-ClientProxiedBy: svr-ies-mbx-05.mgc.mentorg.com (139.181.222.5) To
 svr-ies-mbx-01.mgc.mentorg.com (139.181.222.1)
X-Spam-Status: No, score=-3115.8 required=5.0 tests=BAYES_00,
 HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS,
 TXREP autolearn=no autolearn_force=no version=3.4.4
X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on
 server2.sourceware.org
X-BeenThere: libc-alpha@sourceware.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Libc-alpha mailing list <libc-alpha.sourceware.org>
List-Unsubscribe: <https://sourceware.org/mailman/options/libc-alpha>,
 <mailto:libc-alpha-request@sourceware.org?subject=unsubscribe>
List-Archive: <https://sourceware.org/pipermail/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-request@sourceware.org?subject=help>
List-Subscribe: <https://sourceware.org/mailman/listinfo/libc-alpha>,
 <mailto:libc-alpha-request@sourceware.org?subject=subscribe>
X-List-Received-Date: Tue, 04 Jan 2022 19:15:27 -0000

On Mon, 3 Jan 2022, Paul Zimmermann wrote:

> We propose to add such correctly rounded functions to the GNU libc,
> for the three IEEE formats (binary32, binary64, binary128) and the
> "extended double" format (long double on x86_64).

Do you have some way of determining the worst cases for correct rounding 
for binary128 with feasible resource usage, for functions where it's 
necessary to determine such worst cases by exhaustive search (which is 
most of them)?

> cr_exp/cr_log/cr_sin/... functions. Our goal is nevertheless to get the
> best performance possible, and in some cases our implementation outperforms
> the GNU libc one.

If the correctly rounded implementation is faster than the current one, it 
might make sense to build it twice for glibc, so the non-correctly-rounded 
function names get the same implementation as the correctly-rounded names 
but without any additional slower-case checks or computations that are 
only needed for being correctly rounded and not for results within 1ulp of 
the correctly rounded value.

(In general, I expect that many of the less accurate libm functions could 
be reimplemented to get results within 1ulp of the correctly rounded 
results while becoming faster at the same time, even if correctly rounded 
versions of those functions are hard.)

> We have started to work on two functions (cbrt and acos), for which we
> provide presumably correctly rounded implementations (up to the knowledge
> of hard-to-round cases) [2]. [These implementations do replace the current

For cbrt I suppose you can produce functions that are correctly rounded, 
without needing exhaustive search, based on elementary results on rational 
approximation that imply you don't need more than about three times the 
precision to determine the correctly rounded result - is that what you're 
doing here?  (Unless there are any effective versions of Thue-Siegel-Roth 
that give a bound that's actually better for cbrt for the specific 
precisions relevant here, rather than just asymptotically.)

Note that cr_cbrt is not actually one of the names reserved in the current 
C23 draft (N2731) (presumably because the cr_* names are for the 
recommended correctly rounded operations in IEEE 754, and cbrt isn't one 
of those).  So there may be two separate questions here: do we want cr_* 
functions reserved in C23, and do we want other cr_* variants of math.h 
functions for which C23 doesn't reserve the name (such as cr_cbrt or 
cr_tgamma)?


A key question for adding such functions is what impact they have on glibc 
maintainability.  At present libm functions are generally provided across 
all supported types - the headers and build system need to know which 
types are supported, and sometimes which types have the same format as 
which other types, but they don't generally need to deal with issues such 
as "function X is supported for long double in some glibc configurations 
but not others" (and thus to a large extent the headers don't need to know 
details of e.g. which format long double has; such information is kept as 
local as possible in a few places in the installed headers).  So it's 
important to understand how things would be complicated by having such 
functions.  Also note when adding new functions to glibc that there are a 
lot of different variations between floating-point support on different 
architectures to allow for, including:

* before-rounding and after-rounding tininess detection;

* architectures may or may not have hardware fma support;

* architectures might have just round-to-nearest, or just a subset of C99 
rounding modes, or all the C99 rounding modes;

* architectures might or might not support floating-point exceptions.

So some important questions to consider for understanding how glibc might 
be complicated by adding these functions include:

* How will you arrange for installed headers to declare the supported 
functions for those types with formats for which the correctly rounded 
functions are supported but not for types for which they are not supported 
- what will the conditionals in bits/mathcalls.h (for example) look like?

* Likewise, what will things look like in the makefiles to arrange for the 
functions to be built and tested when supported but not otherwise?

* When you have correctly rounded functions for x86 "extended" (ldbl-96 in 
glibc), do those also work properly for the m68k variant of that format 
(which handles biased exponent 0 like other biased exponents, so has 
normal and subnormal exponent ranges going one exponent smaller), or do 
you avoid providing the cr_* functions for that format variant?

* Are there any dependencies on features of the architecture (for example, 
do any functions depend on fma support, or on the architecture supporting 
exceptions or all the IEEE rounding modes)?

* How do you arrange for test inputs in auto-libm-test-in to be shared for 
the cr_* functions (while presumably needing separate auto-libm-test-out-* 
and lib-test-*.inc files for them, since gen-auto-libm-tests does some 
things differently for functions required to be correctly rounded)?

* What configurations are you doing execution testing on?  (Build testing 
with build-many-glibcs.py is also generally a good idea for new libm 
functions to e.g. make sure you got all the ABI baseline updates right, 
given all the variations in supported types between different 
architectures, but execution testing should also be done for enough 
architectures to cover all the main variations in what code is used on 
what architectures.)

* Where applicable, do you handle both before-rounding and after-rounding 
tininess detection (that matters for e.g. cr_sin (DBL_MIN), which is tiny 
with before-rounding tininess detection but not with after-rounding 
tininess detection)?  The semantics of cr_* include getting this right - 
raising the correct exceptions, including this case where it's 
architecture-specific what exceptions should be raised.  (If explicit 
conditionals in the code are needed, use glibc's tininess.h 
TININESS_AFTER_ROUNDING macro; see sysdeps/ieee754/dbl-64/s_fma.c for 
example.)

* Does the analysis that shows functions to be correctly rounded cover 
FE_TONEARESTFROMZERO as well as the four C99 rounding modes (i.e., does it 
cover all the current IEEE rounding modes)?  We don't currently have any 
support for FE_TONEARESTFROMZERO, but the RISC-V architecture supports it, 
and we don't want these functions to be an obstacle to adding support for 
FE_TONEARESTFROMZERO on RISC-V in future.

-- 
Joseph S. Myers
joseph@codesourcery.com