From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 48) id 96AD6385DC3F; Fri, 5 Apr 2024 01:08:50 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 96AD6385DC3F DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1712279330; bh=kL13oaRYcmMofb+zARZqy/a42OJ7FJiIdsExZr/X8Ks=; h=From:To:Subject:Date:In-Reply-To:References:From; b=AR4AnemUu9+g8VNd61l4zoGk1QPhNvLtyLMsTy2DcS9TpM2ONsNJQIzZzIYUzjLt4 zn+l9SGKr+tySBYS7AIdaQedL3IJUgrf27p6G2tBdSZAbUEgLwNm6ZV9IgKREeIfIf jWvwfnmTgYPmmRkgyfgARiJnO2LekzvhfmDBADSM= From: "g.peterhoff@t-online.de" To: gcc-bugs@gcc.gnu.org Subject: [Bug libstdc++/77776] C++17 std::hypot implementation is poor Date: Fri, 05 Apr 2024 01:08:47 +0000 X-Bugzilla-Reason: CC X-Bugzilla-Type: changed X-Bugzilla-Watch-Reason: None X-Bugzilla-Product: gcc X-Bugzilla-Component: libstdc++ X-Bugzilla-Version: 7.0 X-Bugzilla-Keywords: X-Bugzilla-Severity: normal X-Bugzilla-Who: g.peterhoff@t-online.de X-Bugzilla-Status: ASSIGNED X-Bugzilla-Resolution: X-Bugzilla-Priority: P3 X-Bugzilla-Assigned-To: emsr at gcc dot gnu.org X-Bugzilla-Target-Milestone: --- X-Bugzilla-Flags: X-Bugzilla-Changed-Fields: Message-ID: In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/ Auto-Submitted: auto-generated MIME-Version: 1.0 List-Id: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=3D77776 --- Comment #25 from g.peterhoff@t-online.de --- Hi Matthias, to get good results on average (all FP-types: (B)FP16..FP128, scalar/vectorized(SIMD)/parallel/...) this algorithm seems to me (so far) t= o be suitable: template inline constexpr Type hypot_gp(Type x, Type y, Type z) noexcept { using limits =3D std::numeric_limits; constexpr Type sqrt_min =3D std::sqrt(limits::min()), sqrt_max =3D std::sqrt(limits::max()), scale_up =3D std::exp2( Type(limits::max_exponent/2)= ), scale_down =3D std::exp2(-Type(limits::max_exponent/2)= ), zero =3D 0; x =3D std::abs(x); y =3D std::abs(y); z =3D std::abs(z); if (!(std::isnormal(x) && std::isnormal(y) && std::isnormal(z))) [[unlikely]] { if (std::isinf(x) | std::isinf(y) | std::isinf(z)) ret= urn limits::infinity(); else if (std::isnan(x) | std::isnan(y) | std::isnan(z)) ret= urn limits::quiet_NaN(); else { const bool xz{x =3D=3D zero}, yz{y =3D=3D zero}, zz{z =3D=3D zero}; if (xz) { if (yz) return zz ? zero : z; else if (zz) return y; } else if (yz && zz) return x; } } if (x > z) std::swap(x, z); if (y > z) std::swap(y, z); if ((z >=3D sqrt_min) && (z <=3D sqrt_max)) [[likely]] { // no scale return std::sqrt(__builtin_assoc_barrier(x*x + y*y) + z*z); } else { const Type scale =3D (z >=3D sqrt_min) ? scale_down : scale_up; x *=3D scale; y *=3D scale; z *=3D scale; return std::sqrt(__builtin_assoc_barrier(x*x + y*y) + z*z); } }=