From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <gcc-bugzilla@gcc.gnu.org>
Received: by sourceware.org (Postfix, from userid 48)
	id E191438582A8; Wed,  6 Mar 2024 07:31:59 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org E191438582A8
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1709710319;
	bh=15QchhMKngl5fSvVOTZk3QR/8VMfkgUl+aGVk5wGdjI=;
	h=From:To:Subject:Date:In-Reply-To:References:From;
	b=sNVkFU3tzoMxNaPKLQoc10nt4u5qj6Yio5xfwMgsDSLhob4EHZDuYxigDc7Kh9vFv
	 Zo2afZ3DXalIcp7mt2goCvh2P7KgAN3B91y61uxUNPA+KUntmT1OpUyB5ag9PtbnPe
	 r6aEB1vn9HDMul7iP0Qy0K3CS8ESjlU8E54C1w/8=
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/114151] [14 Regression] weird and inefficient
 codegen and addressing modes since r14-9193
Date: Wed, 06 Mar 2024 07:31:58 +0000
X-Bugzilla-Reason: CC
X-Bugzilla-Type: changed
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: gcc
X-Bugzilla-Component: tree-optimization
X-Bugzilla-Version: 14.0
X-Bugzilla-Keywords: missed-optimization
X-Bugzilla-Severity: normal
X-Bugzilla-Who: rguenth at gcc dot gnu.org
X-Bugzilla-Status: NEW
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: P3
X-Bugzilla-Assigned-To: unassigned at gcc dot gnu.org
X-Bugzilla-Target-Milestone: 14.0
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: 
Message-ID: <bug-114151-4-3o8v6zbj2f@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-114151-4@http.gcc.gnu.org/bugzilla/>
References: <bug-114151-4@http.gcc.gnu.org/bugzilla/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://gcc.gnu.org/bugzilla/
Auto-Submitted: auto-generated
MIME-Version: 1.0
List-Id: <gcc-bugs.sourceware.org>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151

--- Comment #11 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #10)
> (In reply to Andrew Macleod from comment #9)
> > Created attachment 57620 [details]
> > proposed patch
> >
> > Does this solve your problem if there is an active ranger?  it bootstraps
> > with no regressions
>
> I'll check what it does.

So it does seem to help, not with the testcase's ultimate outcome, but with
the important bits of the analysis.  I added an active ranger around IVOPTs
with
diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
index 7cae5bdefea..626fc5bf5d7 100644
--- a/gcc/tree-ssa-loop-ivopts.cc
+++ b/gcc/tree-ssa-loop-ivopts.cc
@@ -132,6 +132,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-vectorizer.h"
 #include "dbgcnt.h"
 #include "cfganal.h"
+#include "gimple-range.h"

 /* For lang_hooks.types.type_for_mode.  */
 #include "langhooks.h"
@@ -8280,6 +8281,8 @@ tree_ssa_iv_optimize (void)
   tree_ssa_iv_optimize_init (&data);
   mark_ssa_maybe_undefs ();

+  enable_ranger (cfun);
+
   /* Optimize the loops starting with the innermost ones.  */
   for (auto loop : loops_list (cfun, LI_FROM_INNERMOST))
     {
@@ -8292,6 +8295,8 @@ tree_ssa_iv_optimize (void)
       tree_ssa_iv_optimize_loop (&data, loop, toremove);
     }

+  disable_ranger (cfun);
+
   /* Remove eliminated IV defs.  */
   release_defs_bitset (toremove);


I then see the following difference with a ranger-debug dump during IVOPTs:

 11       range_of_expr(_12)
-         TRUE : (11) range_of_expr (_12) [irange] int VARYING
+         TRUE : (11) range_of_expr (_12) [irange] int [0, +INF]
...
   Base:        (long unsigned int) (int) ((unsigned int) _12 + 1) * 2
   Step:        2
   Biv: N
-  Overflowness wrto loop niter:        Overflow
+  Overflowness wrto loop niter:        No-overflow
...
-74       range_of_expr(_103)
-         TRUE : (74) range_of_expr (_103) [irange] int VARYING
+64       range_of_expr(_103)
+         TRUE : (64) range_of_expr (_103) [irange] int [-INF, 0]
 Analyzing # of iterations of loop 1
   exit condition [1, + , 1](no_overflow) <= _103
-  bounds on difference of bases: -2147483649 ... 2147483646
+  bounds on difference of bases: -2147483649 ... -1
   result:
     zero if _103 < 0
-    # of iterations (unsigned int) _103, bounded by 2147483647
+    # of iterations (unsigned int) _103, bounded by 0

So the important part is that it now knows that _12 is non-negative.  As
analyzed in earlier comments, I think that's all we can do: we don't know
anything about the other variable involved and thus can't avoid the
unsigned punning during SCEV analysis.
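
As a sanity check on the niter part of the dump, here is a minimal
standalone sketch of the interval arithmetic.  It is not GCC code:
Interval and difference_of_bases are hypothetical names, and it assumes
that the "bounds on difference of bases" line is simply the range of _103
minus the constant IV base 1, computed in a wider type so nothing wraps.

```cpp
#include <cstdint>
#include <limits>

// Closed interval [lo, hi], held in 64 bits so that arithmetic on
// 32-bit bounds cannot wrap.  (Hypothetical helper type, not a GCC one.)
struct Interval {
  std::int64_t lo;
  std::int64_t hi;
};

// Bounds on (bound - base) when the exit bound lies in `bound` and the
// IV base is the constant `base`.  (Assumed model of the dump line.)
constexpr Interval difference_of_bases(Interval bound, std::int64_t base) {
  return Interval{bound.lo - base, bound.hi - base};
}

constexpr std::int64_t kIntMin = std::numeric_limits<std::int32_t>::min();
constexpr std::int64_t kIntMax = std::numeric_limits<std::int32_t>::max();

// _103 is "int VARYING": the difference spans -2147483649 ... 2147483646.
constexpr Interval kVarying = difference_of_bases(Interval{kIntMin, kIntMax}, 1);
static_assert(kVarying.lo == -2147483649LL, "lower bound, VARYING case");
static_assert(kVarying.hi == 2147483646LL, "upper bound, VARYING case");

// _103 is "int [-INF, 0]": the upper bound tightens to -1.
constexpr Interval kNonPositive = difference_of_bases(Interval{kIntMin, 0}, 1);
static_assert(kNonPositive.lo == -2147483649LL, "lower bound, [-INF, 0] case");
static_assert(kNonPositive.hi == -1, "upper bound, [-INF, 0] case");
```

Under that assumption both pairs of bounds match the "-" and "+" lines in
the dump above; only the upper bound changes, from 2147483646 to -1.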

I think it's a good change.  Let's keep it queued for stage1 at this point,
unless we know of a case where it helps avoid a regression with
r14-9193-ga0b1798042d033.

For testing, what's the "easiest" pass/thing to do to recompute global
ranges now?  In the past I'd schedule EVRP, but is there now a ranger
API to do this?  Just to see if a full global range computation before
IVOPTs would help.