From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 1020A3858C52; Sat, 2 Apr 2022 19:33:59 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 1020A3858C52 From: "already5chosen at yahoo dot com" To: gcc-bugs@gcc.gnu.org Subject: [Bug libquadmath/105101] incorrect rounding for sqrtq Date: Sat, 02 Apr 2022 19:33:58 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: libquadmath X-Bugzilla-Version: unknown X-Bugzilla-Keywords: wrong-code X-Bugzilla-Severity: normal X-Bugzilla-Who: already5chosen at yahoo dot com X-Bugzilla-Status: NEW X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: cc Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 X-BeenThere: gcc-bugs@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-bugs mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 02 Apr 2022 19:33:59 -0000 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D105101 Michael_S changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |already5chosen at yahoo do= t com --- Comment #4 from Michael_S --- If you want quick fix for immediate shipment then you can take that: #include #include __float128 quick_and_dirty_sqrtq(__float128 x) { if (isnanq(x)) return x; if (x=3D=3D0) return x; if (x < 0) return nanq(""); if (isinfq(x)) return x; int xExp; x =3D frexpq(x, &xExp); if (xExp & 1) x *=3D 2.0; // x in [0.5:2.0) __float128 r =3D (__float128)(1.0/sqrt((double)x)); // r=3Drsqrt(x) estim= ate (53 bits) r *=3D 1.5 - r*r*x*0.5; // NR iteration improves precision of r to 105.4 = bit __float128 y =3D x*r; // y=3Dsqrt(x) estimate (105.4 bits) // extended-precision NR iteration __float128 yH =3D (double)y; __float128 yL =3D y - yH; __float128 deltaX =3D x - yH*yH; deltaX -=3D yH*yL*2; deltaX -=3D yL*yL; y +=3D deltaX*r*0.5; // improve precision of y to ~210.2 bits. Not enough= for perfect rounding, but not too bad return ldexpq(y, xExp >> 1); } It is very slow, even slower than what you have now, which by itself is qui= te astonishingly slow. It is also not sufficiently precise for correct rounding in all cases. But, at least, the worst error is something like (0.5+2**-98) ULP, so you a= re unlikely to be ever caught by black box type of testing. It's biggest advantage is extreme portability. Should run on all platforms where double=3D=3DIEEE binary64 and __float128 = =3D=3D IEEE binary128. May be, few days later I'll have better variant for "good" 64-bit platforms i.e. for those where we have __int128. It would be 15-25 times faster than the variant above and rounding would be mathematically correct rather than just "impossible to be caught" like abov= e. But it would not run everywhere. Also, I want to give it away under MIT or BSD license, rather than under GP= L.=