From: "joseph at codesourcery dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug libquadmath/105101] incorrect rounding for sqrtq
Date: Fri, 10 Jun 2022 17:40:25 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105101

--- Comment #18 from joseph at codesourcery dot com ---
libquadmath is essentially legacy code. People working directly in C
should be using the C23 _Float128 interfaces and *f128 functions, as in
current glibc, rather than libquadmath interfaces (unless their code needs
to support old glibc or non-glibc C libraries that don't support _Float128
in C23 Annex H). It would be desirable to make GCC generate *f128 calls
when appropriate from Fortran code using this format as well; see
[...] for more discussion of the different cases involved.

Most of libquadmath is derived from code in glibc - some of it can now be
updated from the glibc code automatically (see update-quadmath.py), other
parts can't (although it would certainly be desirable to extend
update-quadmath.py to cover that other code as well). See the commit
message for commit 4239f144ce50c94f2c6cc232028f167b6ebfd506 for a more
detailed discussion of what code comes from glibc and what is / is not
automatically handled by update-quadmath.py. Since update-quadmath.py
hasn't been run for a while, it might need changes to work with more
recent changes to the glibc code.

sqrtq.c is one of the files not based on glibc code. That's probably
because glibc didn't have a convenient generic implementation of binary128
sqrt to use when libquadmath was added - it has soft-fp implementations
used for various architectures, but those require sfp-machine.h for each
architecture (which maybe we do in fact have in libgcc for each relevant
architecture, but it's an extra complication). Certainly making it
possible to use code from glibc for binary128 sqrt would be a good idea,
but while we aren't doing that, it should also be OK to improve sqrtq
locally in libquadmath.

The glibc functions for this format are generally *not* optimized for
speed yet (this includes the soft-fp-based versions of sqrt). Note that
what's best for speed may depend a lot on whether the architecture has
hardware support for binary128 arithmetic; if it has such support, it's
more likely an implementation based on binary128 floating-point operations
is efficient; if it doesn't, direct use of integer arithmetic, without
lots of intermediate packing / unpacking into the binary128 format, is
likely to be more efficient. See the discussion starting at
[...]
for more on this - glibc is a better place for working on most optimized
function implementations than GCC.

See also [...] - those functions are aiming to
be correctly rounding, which is *not* a goal for most glibc libm
functions, but are still quite likely to be faster than the existing
non-optimized functions in glibc.

fma is a particularly tricky case because it *is* required to be correctly
rounding, in all rounding modes, and correct rounding implies correct
exceptions, *and* correct exceptions for fma includes getting right the
architecture-specific choice of whether tininess is detected before or
after rounding.

Correct exceptions for sqrt are simpler, but to be correct for glibc it
still needs to avoid spurious "inexact" exceptions - for example, from the
use of double in intermediate computations in your version (see the
optimized feholdexcept / fesetenv operations used in glibc for cases where
exceptions from intermediate computations are to be discarded).

For functions that aren't required to be correctly rounding, the glibc
manual discusses the accuracy goals (including on exceptions, e.g.
avoiding spurious "underflow" exceptions from intermediate computations
for results where the rounded result returned is not consistent with
rounding a tiny, inexact value).