From: "rguenth at gcc dot gnu.org"
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since r14-9193
Date: Thu, 07 Mar 2024 08:04:55 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151

--- Comment #14 from Richard Biener ---
(In reply to Andrew Macleod from comment #13)
> Created attachment 57638 [details]
> patch
>
> Ok, there were 2 issues with simply invoking range_of_stmt, which this new
> patch resolves.  IF we aren't looking to fix this in GCC 14 right now
> anyway, this is the way to go.
>
> 1) The cache has always tried to provide a global range by pre-folding a
> stmt for an estimate using global values.  This is a bad idea for PHIs when
> SCEV is invoked AND SCEV is calling ranger.  This changes it to not
> pre-evaluate PHIs, which also saves time when functions have a lot of edges.
> It's mostly pointless for PHIs anyway since we're about to do a real
> evaluation.
>
> 2) The cache's entry range propagator was not re-entrant.  We didn't
> previously need this, but with SCEV (and possibly other places) invoking
> range_of_expr without context and having range_of_stmt being called, we can
> occasionally get layered calls for cache filling (of different ssa-names)
>
> With those 2 changes, we can now safely invoke range_of_stmt from a
> contextless range_of_expr call.
>
> We would have tripped over this earlier if SCEV or one of those other places
> using range_of_expr without context had instead invoked range_of_stmt.  That
> would have been perfectly reasonable, and would have resulted in these same
> issues.  We never tripped over it because range_of_stmt is not used much
> outside of ranger.  That is the primary reason I wanted to track this down.
> There were alternative paths to the same end result that would have
> triggered these issues.

It sounds like this part is a bugfix?

> Give this patch a try.  It also bootstraps with no regressions.  I will queue
> it up for stage 1 instead assuming all is good.

It seems to work well; it now computes a lot of additional ranges and causes a
minor code generation change on the testcase (it doesn't fix the observed
regression though).

Thanks for working on this.

One thing still unexplored is whether, with better range info, we can lift the
constraint on the folding some more.
We're turning (A + i * B) * C into (A * C + i * (B * C)) and need to avoid
any additional intermediate undefined overflow with this association for i in
[0, n] (with n being the number of iterations of the loop where i varies).

As said, if the regression is too important to ignore, we could choose to leave
the bug unfixed for all but the case with A, B and C constant, which was the
case for the testcase in the original PR.
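
To make the overflow constraint on that association concrete, here is a
minimal sketch, assuming 32-bit signed operands and a small trip count n.  It
is a brute-force model for illustration only, not what the folding code or
ranger actually does, and the helper names (fits_i32, original_defined,
reassoc_defined, reassoc_is_safe) are hypothetical.  It checks every
intermediate value of both forms in 64-bit arithmetic and reports whether the
reassociated form overflows somewhere the original form does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v is representable as a 32-bit signed integer. */
static bool fits_i32(int64_t v)
{
    return v >= INT32_MIN && v <= INT32_MAX;
}

/* Original form (A + i * B) * C: true if no intermediate overflows.
   Each step is checked before the next, so the 64-bit math cannot wrap. */
static bool original_defined(int32_t A, int32_t B, int32_t C, int32_t i)
{
    int64_t iB = (int64_t)i * B;
    if (!fits_i32(iB))
        return false;
    int64_t sum = (int64_t)A + iB;
    if (!fits_i32(sum))
        return false;
    return fits_i32(sum * C);
}

/* Reassociated form A * C + i * (B * C): true if no intermediate overflows. */
static bool reassoc_defined(int32_t A, int32_t B, int32_t C, int32_t i)
{
    int64_t AC = (int64_t)A * C;
    if (!fits_i32(AC))
        return false;
    int64_t BC = (int64_t)B * C;
    if (!fits_i32(BC))
        return false;
    int64_t iBC = (int64_t)i * BC;
    if (!fits_i32(iBC))
        return false;
    return fits_i32(AC + iBC);
}

/* Hypothetical brute-force check over i in [0, n]: the association is
   "safe" if it never overflows where the original form was well defined. */
static bool reassoc_is_safe(int32_t A, int32_t B, int32_t C, int32_t n)
{
    assert(n >= 0);
    for (int64_t i = 0; i <= n; i++) {
        int32_t iv = (int32_t)i;
        if (original_defined(A, B, C, iv) && !reassoc_defined(A, B, C, iv))
            return false;
    }
    return true;
}
```

For example, with A = -1000000, B = 2000000, C = 2000 and n = 1, the original
expression stays within 32-bit range for both iterations (-2000000000 and
2000000000), but B * C = 4000000000 does not, so reassoc_is_safe returns
false.  With A = 1, B = 2, C = 3 and n = 10 it returns true.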