From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from esa3.mentor.iphmx.com (esa3.mentor.iphmx.com [68.232.137.180]) by sourceware.org (Postfix) with ESMTPS id 54AB83858C2C for ; Tue, 4 Jan 2022 19:15:25 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 54AB83858C2C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com IronPort-SDR: 5W9P7zbTjmP9o/PymOHMyM5WAxfy4oEa8y47mCgRJQpViY3W2Q5M4cPW8qFRML90DSqr5+NTft dp7SW92OGK3LR4oRwF0nbPSEsROLAkaWfra/XVIsX2QFuThRFMRSMNnCmg/fs/6AXKVuxhfdPX WVYDu6f8DbfM/9VMQnV/XbocHDLFHDe+DPxtJ8lJTdx0lzwda8AGvEgpifuzoyDxH+4TU4zacC ufi+ORKL6ZXGkceGe5v38SgJb9AdIo3f3hfc7d26w3JBJ/VzXucUInQ7NbiJU03k+hgXL4k4zT 1DqfMkTQ3YuWBhzvsklzv++Z X-IronPort-AV: E=Sophos;i="5.88,261,1635235200"; d="scan'208";a="70242017" Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 04 Jan 2022 11:15:25 -0800 IronPort-SDR: gPWlF0hEzq30/WbaN5F67YPi6cfRBtRoz7Al/aaYX2MpQeP4SjFqHK5AyRlYYygiU0I0/HiIMl EekfScztbE2/VoKfRHxjuUv8+4dt0LcfV+acvUhsap9s5afwnxqiba/gkrWsJJmmFWrK5OhIQ/ qbTYVm5QGXGZf3fZVDB5Cu7Uwy+iKUb6jJkZDf0KDReQuhOs4IWwv4c0VgkrpqXbEk80w+jdIs DlMN0pWzWofjhgTCZMd/hn46bwJcCUM4BvFAb1k4rvceuH/4Ak0XsdlAJ+95LmOui/ZrhgjA41 uEM= Date: Tue, 4 Jan 2022 19:15:19 +0000 From: Joseph Myers X-X-Sender: jsm28@digraph.polyomino.org.uk To: Paul Zimmermann CC: Subject: Re: correctly rounded mathematical functions In-Reply-To: Message-ID: References: User-Agent: Alpine 2.22 (DEB 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset="US-ASCII" X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: svr-ies-mbx-05.mgc.mentorg.com (139.181.222.5) To svr-ies-mbx-01.mgc.mentorg.com (139.181.222.1) X-Spam-Status: No, score=-3115.8 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=no autolearn_force=no version=3.4.4 X-Spam-Checker-Version: SpamAssassin 3.4.4 (2020-01-24) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 04 Jan 2022 19:15:27 -0000 On Mon, 3 Jan 2022, Paul Zimmermann wrote: > We propose to add such correctly rounded functions to the GNU libc, > for the three IEEE formats (binary32, binary64, binary128) and the > "extended double" format (long double on x86_64). Do you have some way of determining the worst cases for correct rounding for binary128 with feasible resource usage, for functions where it's necessary to determine such worst cases by exhaustive search (which is most of them)? > cr_exp/cr_log/cr_sin/... functions. Our goal is nevertheless to get the > best performance possible, and in some cases our implementation outperforms > the GNU libc one. If the correctly rounded implementation is faster than the current one, it might make sense to build it twice for glibc, so the non-correctly-rounded function names get the same implementation as the correctly-rounded names but without any additional slower-case checks or computations that are only needed for being correctly rounded and not for results within 1ulp of the correctly rounded value. (In general, I expect that many of the less accurate libm functions could be reimplemented to get results within 1ulp of the correctly rounded results while becoming faster at the same time, even if correctly rounded versions of those functions are hard.) > We have started to work on two functions (cbrt and acos), for which we > provide presumably correctly rounded implementations (up to the knowledge > of hard-to-round cases) [2]. [These implementations do replace the current For cbrt I suppose you can produce functions that are correctly rounded, without needing exhaustive search, based on elementary results on rational approximation that imply you don't need more than about three times the precision to determine the correctly rounded result - is that what you're doing here? (Unless there are any effective versions of Thue-Siegel-Roth that give a bound that's actually better for cbrt for the specific precisions relevant here, rather than just asymptotically.) Note that cr_cbrt is not actually one of the names reserved in the current C23 draft (N2731) (presumably because the cr_* names are for the recommended correctly rounded operations in IEEE 754, and cbrt isn't one of those). So there may be two separate questions here: do we want cr_* functions reserved in C23, and do we want other cr_* variants of math.h functions for which C23 doesn't reserve the name (such as cr_cbrt or cr_tgamma)? A key question for adding such functions is what impact they have on glibc maintainability. At present libm functions are generally provided across all supported types - the headers and build system need to know which types are supported, and sometimes which types have the same format as which other types, but they don't generally need to deal with issues such as "function X is supported for long double in some glibc configurations but not others" (and thus to a large extent the headers don't need to know details of e.g. which format long double has; such information is kept as local as possible in a few places in the installed headers). So it's important to understand how things would be complicated by having such functions. Also note when adding new functions to glibc that there are a lot of different variations between floating-point support on different architectures to allow for, including: * before-rounding and after-rounding tininess detection; * architectures may or may not have hardware fma support; * architectures might have just round-to-nearest, or just a subset of C99 rounding modes, or all the C99 rounding modes; * architectures might or might not support floating-point exceptions. So some important questions to consider for understanding how glibc might be complicated by adding these functions include: * How will you arrange for installed headers to declare the supported functions for those types with formats for which the correctly rounded functions are supported but not for types for which they are not supported - what will the conditionals in bits/mathcalls.h (for example) look like? * Likewise, what will things look like in the makefiles to arrange for the functions to be built and tested when supported but not otherwise? * When you have correctly rounded functions for x86 "extended" (ldbl-96 in glibc), do those also work properly for the m68k variant of that format (which handles biased exponent 0 like other biased exponents, so has normal and subnormal exponent ranges going one exponent smaller), or do you avoid providing the cr_* functions for that format variant? * Are there any dependencies on features of the architecture (for example, do any functions depend on fma support, or on the architecture supporting exceptions or all the IEEE rounding modes)? * How do you arrange for test inputs in auto-libm-test-in to be shared for the cr_* functions (while presumably needing separate auto-libm-test-out-* and lib-test-*.inc files for them, since gen-auto-libm-tests does some things differently for functions required to be correctly rounded)? * What configurations are you doing execution testing on? (Build testing with build-many-glibcs.py is also generally a good idea for new libm functions to e.g. make sure you got all the ABI baseline updates right, given all the variations in supported types between different architectures, but execution testing should also be done for enough architectures to cover all the main variations in what code is used on what architectures.) * Where applicable, do you handle both before-rounding and after-rounding tininess detection (that matters for e.g. cr_sin (DBL_MIN), which is tiny with before-rounding tininess detection but not with after-rounding tininess detection)? The semantics of cr_* include getting this right - raising the correct exceptions, including this case where it's architecture-specific what exceptions should be raised. (If explicit conditionals in the code are needed, use glibc's tininess.h TININESS_AFTER_ROUNDING macro; see sysdeps/ieee754/dbl-64/s_fma.c for example.) * Does the analysis that shows functions to be correctly rounded cover FE_TONEARESTFROMZERO as well as the four C99 rounding modes (i.e., does it cover all the current IEEE rounding modes)? We don't currently have any support for FE_TONEARESTFROMZERO, but the RISC-V architecture supports it, and we don't want these functions to be an obstacle to adding support for FE_TONEARESTFROMZERO on RISC-V in future. -- Joseph S. Myers joseph@codesourcery.com