From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id 96E963858D39; Thu,  7 Mar 2024 08:04:57 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 96E963858D39
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1709798697;
	bh=PJn9PGndDuiht7qCk3INcD/00LAYUdSZEoey428LXek=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=s1U0U3tFPufDnUN05Arwtl243qn+gmVMFsooyi5GBXx2gliE7NtY6aF7wyLdK0X18
	 Z4S9VDKBIY5DHi9PRxA8j56549FRlAEXWOGw9T8L6NMZ9h1jL7B15aP674lnCrhHQD
	 ZgJZqChSE3lQMXH2labk77CkW/cbgANB7xm5m4/g=
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/114151] [14 Regression] weird and inefficient
 codegen and addressing modes since r14-9193
Date: Thu, 07 Mar 2024 08:04:55 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-114151-4-FKMM20XxBx@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-114151-4@http.gcc.gnu.org/bugzilla/>
References: <bug-114151-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151
--- Comment #14 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Andrew Macleod from comment #13)
> Created attachment 57638 [details]
> patch
>
> Ok, there were 2 issues with simply invoking range_of_stmt, which this new
> patch resolves.  If we aren't looking to fix this in GCC 14 right now
> anyway, this is the way to go.
>
> 1) The cache has always tried to provide a global range by pre-folding a
> stmt for an estimate using global values.  This is a bad idea for PHIs when
> SCEV is invoked AND SCEV is calling ranger.  This changes it to not
> pre-evaluate PHIs, which also saves time when functions have a lot of edges.
> It's mostly pointless for PHIs anyway since we're about to do a real
> evaluation.
>
> 2) The cache's entry range propagator was not re-entrant.  We didn't
> previously need this, but with SCEV (and possibly other places) invoking
> range_of_expr without context and having range_of_stmt being called, we can
> occasionally get layered calls for cache filling (of different ssa-names).
>
> With those 2 changes, we can now safely invoke range_of_stmt from a
> contextless range_of_expr call.
>
> We would have tripped over this earlier if SCEV or one of those other places
> using range_of_expr without context had instead invoked range_of_stmt.  That
> would have been perfectly reasonable, and would have resulted in these same
> issues.  We never tripped over it because range_of_stmt is not used much
> outside of ranger.  That is the primary reason I wanted to track this down.
> There were alternative paths to the same end result that would have
> triggered these issues.

It sounds like this part is a bugfix?

> Give this patch a try.  It also bootstraps with no regressions.  I will queue
> it up for stage 1 instead, assuming all is good.

It seems to work well: it now computes a lot of additional ranges and
causes a minor code generation change on the testcase (it doesn't fix the
observed regression though).

Thanks for working on this.

One thing still unexplored is whether better range info would let us lift
the constraint on the folding some more.  We're turning (A + i * B) * C into
(A * C + i * (B * C)) and need to avoid any additional intermediate undefined
overflow with this association for i in [0, n] (with n being the number of
iterations of the loop where i varies).
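
To make the constraint concrete, here is a minimal sketch of such a check
for the all-constant case, written as plain C rather than as GCC internals.
The helper name reassoc_is_safe and the choice of int as the type are
assumptions for illustration only; this is not what the folding code
actually does.  It relies on the reassociated form being linear in i, so
only the endpoints i = 0 and i = n need checking.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper, not GCC code: return true when A*C + i*(B*C)
   can be evaluated in int without signed overflow for every i in
   [0, n].  Uses the GCC/Clang overflow-checking builtins.  */
static bool
reassoc_is_safe (int a, int b, int c, int n)
{
  int ac, bc, step, last;

  if (n < 0)
    return false;

  /* The two new constants A*C and B*C must be representable.  */
  if (__builtin_mul_overflow (a, c, &ac)
      || __builtin_mul_overflow (b, c, &bc))
    return false;

  /* i * (B*C) is monotonic in i, so i == n is the extreme case.  */
  if (__builtin_mul_overflow (n, bc, &step))
    return false;

  /* A*C + i*(B*C) is linear in i; i == 0 yields A*C, which was
     checked above, so only i == n remains.  */
  if (__builtin_add_overflow (ac, step, &last))
    return false;

  return true;
}
```

For example, reassoc_is_safe (1, 2, 3, 10) holds, while
reassoc_is_safe (INT_MAX, 1, 2, 10) does not because A * C already
overflows.  With non-constant A, B or C the same endpoints would have to be
bounded from range info instead.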

As said, if the regression is too important to ignore, we could choose to
leave the bug unfixed for all but the case with A, B and C constant, which
was the case for the testcase in the original PR.