From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 48)
  id D877D385B50F; Sun, 26 Feb 2023 06:39:36 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D877D385B50F
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
  s=default; t=1677393576;
  bh=xIhopBdJfYjWPxFQ1r8F7knEhuMiwwLds6rZul2haz0=;
  h=From:To:Subject:Date:In-Reply-To:References:From;
  b=FzZEBGsJzQxpUsSz3k7Cj36GMYPn24Wpy47uIRAR/yHhl/cwnZNfyQSAEsTgsBZ1m
  yIJdRjUKKGMpmindVS2rVePVcTyooiBzeqRZiNNDe3bsS/rl3EJoV/S8H8DLpz0dkL
  oUwqO2tSW8MQ2IaHS+ACsvSskiuph7l8o0yFq6as=
From: "jkratochvil at azul dot com"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug target/108922] fmod() 13x slowdown in gcc4.9 dropping "fprem" and calling fmod()
Date: Sun, 26 Feb 2023 06:39:35 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: target
X-Bugzilla-Version: 12.2.1
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: jkratochvil at azul dot com
X-Bugzilla-Status: WAITING
X-Bugzilla-Resolution:
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags:
X-Bugzilla-Changed-Fields:
Message-ID:
In-Reply-To:
References:
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108922

--- Comment #8 from Jan Kratochvil ---
(In reply to Andrew Pinski from comment #2)
> So the simple test is run the full GCC bootstrap/test with all languages and
> check if the testcase fails or not. I suspect it will.

It does not. Tested on Fedora 36 x86-64. I tested only a revert of:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=93ba85fdd253b4b9cf2b9e54e8e5969b1a3db098

The revert makes it 13x faster.
But the produced code still falls back to calling glibc fmod(), as shown in the
disassembly in Comment 0. If I use the "fprem" instruction directly it gets 15x
faster, but I have not found an (easy) way to patch GCC so that it no longer
emits the call to fmod() at all and emits only the "fprem" instruction.

(In reply to Alexander Monakov from comment #4)
> Plus, Glibc does use fprem/fprem1 for fmodl/remainderl on x86_64,

It is true that replacing fmod() with fmodl() makes it 5x faster (but only 5x).
There is still some infinity check, and I haven't found any real justification
for it in the glibc sources:

  if (__builtin_expect (isinf (x) || y == 0.0L, 0)
      && _LIB_VERSION != _IEEE_ && !isnan (y) && !isnan (x))
    /* fmod(+-Inf,y) or fmod(x,0) */
    return __kernel_standard_l (x, y, 227);

> The ieee_2.f90 testcase attempts to change rounding mode. It 2014 it
> probably just was "miscompiled".

The testsuite run did include "gfortran.dg/ieee/ieee_2.f90" and it has no
regression.
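
For readers who want to try it, here is a minimal sketch of what using "fprem" directly could look like. It assumes GCC-style inline asm on x86; the name fprem_fmod is hypothetical, and this is not the code that was benchmarked above. "fprem" only computes a partial remainder, so it has to be repeated until the C2 flag in the x87 status word is clear.

```c
#if defined(__x86_64__) || defined(__i386__)

/* Hypothetical helper: fmod() via the x87 "fprem" instruction. */
static double fprem_fmod(double x, double y)
{
    double r = x;          /* dividend, kept in st(0) */
    unsigned short sw;     /* x87 status word, read through %ax */

    do {
        /* One partial-remainder step: st(0) = st(0) rem st(1). */
        __asm__ ("fprem\n\t"
                 "fnstsw %1"
                 : "+t" (r), "=a" (sw)
                 : "u" (y)
                 : "cc");
    } while (sw & 0x0400); /* C2 set: reduction not finished yet */

    return r;
}

#else

#include <math.h>

/* Assumption: on other targets simply defer to the library fmod(). */
static double fprem_fmod(double x, double y)
{
    return fmod(x, y);
}

#endif
```

Note that this sketch does none of the errno/_LIB_VERSION handling that the glibc wrapper quoted above performs.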