From: "glisse at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/107520] New: Optimize std::lerp(d, d, 0.5)
Date: Thu, 03 Nov 2022 21:15:50 +0000

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107520

            Bug ID: 107520
           Summary: Optimize std::lerp(d, d, 0.5)
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: enhancement
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: glisse at gcc dot gnu.org
  Target Milestone: ---

In some C++ code I have, it would be convenient if the compiler, possibly with the help of
the standard library, could make the following function cheap, ideally just
the identity. I'll probably end up wrapping lerp with a function that first
checks with __builtin_constant_p if the 2 bounds are equal, but I'll post this
in case people have ideas how to improve things.

#include <cmath>
double f(double d){
  return std::lerp(d, d, .5);
}

Currently, with -O3, we generate

        movapd  %xmm0, %xmm1
        pxor    %xmm0, %xmm0
        comisd  %xmm1, %xmm0
        jnb     .L7
        comisd  %xmm0, %xmm1
        jb      .L6
.L7:
        pxor    %xmm0, %xmm0
        ucomisd %xmm0, %xmm1
        jp      .L6
        je      .L11
.L6:
        movapd  %xmm1, %xmm0
        subsd   %xmm1, %xmm0
        mulsd   .LC1(%rip), %xmm0
        addsd   %xmm1, %xmm0
        maxsd   %xmm1, %xmm0
        ret
        .p2align 4,,10
        .p2align 3
.L11:
        mulsd   .LC1(%rip), %xmm1
        movapd  %xmm1, %xmm0
        addsd   %xmm1, %xmm0
        ret

(clang is better at avoiding the redundant comparison)

With -fno-trapping-math to help a bit, I see at the beginning

  if (d_2(D) == 0.0)
    goto ; [34.00%]
  else
    goto ; [66.00%]

   [local count: 475287355]:
  _7 = d_2(D) * 5.0e-1;
  _10 = _7 * 2.0e+0;

I think that even with the default -fsigned-zeros, simplifying to
_10 = d_2(D) is valid.

Adding -fno-signed-zeros

   [local count: 1073741824]:
  if (d_2(D) == 0.0)
    goto ; [34.00%]
  else
    goto ; [66.00%]

   [local count: 598454470]:
  _13 = d_2(D) - d_2(D);
  _14 = _13 * 5.0e-1;
  __x_15 = d_2(D) + _14;
  if (d_2(D) u>= __x_15)
    goto ; [50.00%]
  else
    goto ; [50.00%]

   [local count: 299227235]:

   [local count: 1073741825]:
  # _12 = PHI
  return _12;

_13 is 0 or NaN, which doesn't change for _14, and __x_15 is just d_2, so we
always return d_2.
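
For anyone who wants to check the two floating-point claims above by hand, here is a minimal sketch in plain C++. It is a hand-written transcription of the quoted GIMPLE fragments, not libstdc++'s actual source. The names zero_path, general_path, same_value and check_claims are hypothetical, and the final select in general_path is an assumption read off the maxsd in the -O3 assembly, since the PHI arguments are not shown in the dump. It should be built without -ffast-math so the compiler keeps IEEE semantics.

```cpp
#include <cassert>
#include <cmath>
#include <initializer_list>
#include <limits>

// Hypothetical transcription of the block taken when d == 0.0:
//   _7 = d * 0.5;  _10 = _7 * 2.0;
double zero_path(double d) {
    double t7 = d * 0.5;
    return t7 * 2.0;
}

// Hypothetical transcription of the other block. The final select is an
// assumption based on the maxsd instruction in the -O3 assembly.
double general_path(double d) {
    double t13 = d - d;      // 0 for finite d, NaN for infinities and NaN
    double t14 = t13 * 0.5;  // still 0 or NaN
    double x = d + t14;      // d for finite d, NaN otherwise
    return d < x ? x : d;    // a NaN x compares false, so d is returned
}

// True when a and b are the same value, treating +0.0 and -0.0 as
// different and any two NaNs as the same.
bool same_value(double a, double b) {
    if (std::isnan(a) || std::isnan(b)) return std::isnan(a) && std::isnan(b);
    return a == b && std::signbit(a) == std::signbit(b);
}

// Spot checks only; this is not a proof over all doubles.
void check_claims() {
    const double inf = std::numeric_limits<double>::infinity();
    const double qnan = std::numeric_limits<double>::quiet_NaN();
    // Zero path: the sign of zero survives the multiply by 0.5 and by 2.0.
    assert(same_value(zero_path(0.0), 0.0));
    assert(same_value(zero_path(-0.0), -0.0));
    // General path: the result is d for finite, infinite and NaN inputs.
    for (double d : {1.5, -3.0, 1e308, inf, -inf, qnan})
        assert(same_value(general_path(d), d));
}
```

Calling check_claims() from a small main is enough to run it. Note that for infinite or NaN d the intermediate x is NaN rather than d, but the comparison then fails and d is still what gets returned, so the conclusion is unchanged.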