From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/109154] [13/14 regression] jump threading de-optimizes nested floating point comparisons
Date: Mon, 10 Jul 2023 10:46:45 +0000
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 13.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P2
X-Bugzilla-Assigned-To: tnfchris at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 13.2
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109154

--- Comment #62 from Richard Biener ---
(In reply to Tamar Christina from comment #61)
> (In reply to Richard Biener from comment #60)
> > (In reply to Tamar Christina from comment #59)
> > > after ifcvt we end up with:
> > >
> > > _162 = chrg_init_70 * iftmp.8_76;
> > > _164 = ABS_EXPR <_162>;
> > > _167 = -_164;
> > > _ifc__166 = distbb_74 < iftmp.0_97 ? _167 : 0.0;
> > > prephitmp_169 = distbb_74 >= 0.0 ? _ifc__166 : _168;
> > >
> > > instead of
> > >
> > > _160 = chrg_init_75 * iftmp.8_80;
> > > prephitmp_161 = distbb_79 < 0.0 ? chrg_init_75 : _160;
> > > _164 = ABS_EXPR ;
> > > _166 = -_164;
> > > prephitmp_167 = distbb_79 < iftmp.0_96 ? _166 : 0.0;
> > >
> > > previously we'd make COND_MUL and COND_NEG and so don't need a VCOND in the
> > > end,
> > > now we select after the multiplication, so we only have a COND_NEG followed
> > > by a VCOND.
> > >
> > > This is obviously worse, but I have no idea how to recover it. Any ideas?
> >
> > None. This is with -O3, right? Can you try selectively disabling parts
> > of PRE with -fno-tree-partial-pre -fno-code-hoisting? But I suspect it's
> > the improvement for general PRE that we hit here.
> >
>
> Those don't seem to make a difference sadly.
>
> > One idea that was always floating around was to move PRE after loop opts
> > like we did with predcom. But the no PRE before loop will likely hurt as
> > well
> > so we might instead want to limit PRE when it involves generating
> > constants in PHIs and schedule another PRE after loop opts (at some cost
> > then). It's something to experiment with ...
>
> It looks like `-fno-tree-pre` does the trick, but then of course, messes up
> elsewhere. The conditional statement seem to stay in the most complicated
> form possible in scalar code.
>
> I'll try to track down what to turn off and experiment with a pre2 after
> vect.
> Is before predcom a good place?

I would avoid putting it into the loop pipeline. Instead I'd turn the
FRE pass that runs after tracer into PRE. Maybe conditional on whether
there are any loops.

Note it's not so easy to "tame" PRE, the existing things happen at
elimination time in eliminate_dom_walker::eliminate_stmt. I would
experiment with restricting the use of inserted PHIs in innermost(!)
loops containing invariants, maybe only if the number of PHI args is
more than two ... (but that's somewhat artificial).

That said, I'm not really convinced this is a good idea.
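
For anyone reading along, the two scalar shapes quoted from comment #59 can be sketched in C roughly as follows. This is only an illustration, not code from the bug or from the testcase: the function and parameter names are made up (`limit` stands in for iftmp.0, `scale` for iftmp.8), and `_168` is not defined in the quoted dump, so the `other` value below is an assumption chosen so that both shapes agree on ordinary (non-NaN) inputs.

```c
#include <math.h>

/* Earlier shape: select the operand first, then run a single
   abs/negate chain guarded by one comparison. */
double select_before_mul(double distbb, double limit,
                         double chrg_init, double scale)
{
    double prod = chrg_init * scale;               /* _160 */
    double sel  = distbb < 0.0 ? chrg_init : prod; /* prephitmp_161 */
    double neg  = -fabs(sel);                      /* _164, _166 */
    return distbb < limit ? neg : 0.0;             /* prephitmp_167 */
}

/* Later shape: the abs/negate chain runs on the product and the
   select on distbb >= 0.0 happens at the very end. */
double select_after_mul(double distbb, double limit,
                        double chrg_init, double scale)
{
    double prod  = chrg_init * scale;              /* _162 */
    double neg   = -fabs(prod);                    /* _164, _167 */
    double inner = distbb < limit ? neg : 0.0;     /* _ifc__166 */
    /* ASSUMPTION: hypothetical stand-in for _168, which the quoted
       dump does not define. */
    double other = distbb < limit ? -fabs(chrg_init) : 0.0;
    return distbb >= 0.0 ? inner : other;          /* prephitmp_169 */
}
```

With that assumption, a call such as `select_before_mul(1.0, 2.0, 3.0, 0.5)` and the matching `select_after_mul` call both yield -1.5. The difference discussed above is only in where the select sits relative to the multiplication, which is what decides whether the vectorizer can use conditional operations or needs a trailing VCOND.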