public inbox for gcc-bugs@sourceware.org
From: "rguenth at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org>
To: gcc-bugs@gcc.gnu.org
Subject: [Bug tree-optimization/114151] [14 Regression] weird and inefficient codegen and addressing modes since r14-9193
Date: Tue, 12 Mar 2024 09:59:33 +0000
Message-ID: <bug-114151-4-EnXnQMSmvC@http.gcc.gnu.org/bugzilla/>
In-Reply-To: <bug-114151-4@http.gcc.gnu.org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114151

--- Comment #19 from Richard Biener <rguenth at gcc dot gnu.org> ---
So what remains here are differences like

-  (chrec = {(long unsigned int) (col_stride_10 * _105), +, (long unsigned int) col_stride_10}_2)
+  (chrec = (long unsigned int) (int) {(unsigned int) col_stride_10 * (unsigned int) _105, +, (unsigned int) col_stride_10}_2)

where we can't pull the sign-extension inside the CHREC because it might
overflow.
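
To make the difference between the two CHREC forms concrete, here is a small
sketch that evaluates both at iteration n using fixed-width integers.  The
function names are made up for illustration and are not GCC APIs; the wide
form assumes the initial signed product col_stride_10 * _105 fits in int, and
the unsigned-to-int conversion is assumed to wrap modulo 2^32 (guaranteed
since C++20, and what GCC does anyway).

```cpp
#include <cstdint>

// Hypothetical helper: evaluate the wide form
//   {(long unsigned int) (stride * start), +, (long unsigned int) stride}
// at iteration N.  Assumes stride * start fits in a 32-bit int.
std::uint64_t
wide_chrec_at (std::int32_t stride, std::int32_t start, std::uint32_t n)
{
  std::uint64_t base
    = static_cast<std::uint64_t> (static_cast<std::int64_t> (stride) * start);
  std::uint64_t step
    = static_cast<std::uint64_t> (static_cast<std::int64_t> (stride));
  return base + n * step;
}

// Hypothetical helper: evaluate the narrow form
//   (long unsigned int) (int) {(unsigned int) stride * (unsigned int) start,
//                              +, (unsigned int) stride}
// at iteration N.  The evolution wraps in 32 bits and is sign-extended
// only afterwards.
std::uint64_t
narrow_chrec_at (std::int32_t stride, std::int32_t start, std::uint32_t n)
{
  std::uint32_t ustride = static_cast<std::uint32_t> (stride);
  std::uint32_t value
    = ustride * static_cast<std::uint32_t> (start) + n * ustride;
  std::int32_t as_int = static_cast<std::int32_t> (value);
  return static_cast<std::uint64_t> (static_cast<std::int64_t> (as_int));
}
```

The two agree whenever the 32-bit evolution stays within int range, for
example stride 3 or -3 with start 5 at n == 2.  They differ for stride 1 << 30
with start 1 at n == 1, where the 32-bit value wraps to INT_MIN before the
sign-extension, so the rewrite is only valid when that overflow can be ruled
out.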

And

 (set_scalar_evolution 
   instantiated_below = 22 
   (scalar = _59)
-  (scalar_evolution = {(long unsigned int) (col_stride_10 * _105) * 2, +, (long unsigned int) col_stride_10 * 2}_2))
+  (scalar_evolution = _59))
+)

which is a failure to analyze at all.  This one looks like

  <bb 4> [local count: 118111600]:
  # col_stride_10 = PHI <size_15(D)(11), 1(2)>
  if (size_15(D) > 0)
    goto <bb 21>; [89.00%]
  else
    goto <bb 5>; [11.00%]

  <bb 5> [local count: 118111600]:
  return;
...
  <bb 15> [local count: 343854870]:
  # RANGE [irange] int [0, 2147483646]
  # j_73 = PHI <_105(22), _68(19)>
...
  col_i_61 = col_stride_10 * j_73;
  # RANGE [irange] long unsigned int [0, 2147483647][18446744071562067968, +INF]
  _60 = (long unsigned int) col_i_61;
  # RANGE [irange] long unsigned int [0, 4294967294][18446744069414584320, 18446744073709551614] MASK 0xfffffffffffffffe VALUE 0x0
  _59 = _60 * 2;

j_73 is {_105, +, 1}_2
col_i_61 is (int) {(unsigned int) col_stride_10 * (unsigned int) _105, +, (unsigned int) col_stride_10}_2
_60 is (long unsigned int) (int) {(unsigned int) col_stride_10 * (unsigned int) _105, +, (unsigned int) col_stride_10}_2

and on the _60 * 2 multiply we fail.  Applying Andrew's proposed patch
doesn't help here, since the range of col_stride_10 can only conditionally
be adjusted to positive.

SCEV caches a scalar evolution keyed on the SSA_NAME and the 'instantiated
below' block, which is "block_before_loop": a loop's preheader, or the
function ENTRY block for analyses of scalars in the loop tree root.
A conservative context for analysis of the SCEV might be
 1) the definition stmt of the SSA name
 2) the instantiated-below block (on-exit ranges of it)

Doing 2) by feeding the last stmt of the block as context (when the
block is empty that won't work :/), the testcase is optimized again when
I discard the SCEV cache at the start of IVOPTs and wrap IVOPTs in a
ranger instance.
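
As a rough illustration of why the cache has to be discarded, here is a toy
model of a cache keyed on (SSA name version, instantiated-below block index).
The class and its members are invented for this sketch and only mimic the
keying described above; evolutions are represented by their printed form.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins: an SSA name is identified by its version number
// and a basic block by its index.
using scev_key = std::pair<unsigned, int>;

class toy_scev_cache
{
public:
  // Return the cached evolution for (NAME_VERSION, INSTANTIATED_BELOW),
  // running ANALYZE only on a cache miss.
  template <typename Analyze>
  const std::string &
  get (unsigned name_version, int instantiated_below, Analyze analyze)
  {
    scev_key key (name_version, instantiated_below);
    auto it = m_entries.find (key);
    if (it == m_entries.end ())
      it = m_entries.emplace (key, analyze ()).first;
    return it->second;
  }

  // Drop all entries so that later queries are analyzed afresh.
  void reset () { m_entries.clear (); }

private:
  std::map<scev_key, std::string> m_entries;
};
```

In this model, a result computed before better range information was
available (say "_59" for scalar _59 below block 22) keeps being returned for
the same key until reset () is called, even if a later analysis could do
better.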

While ranger has a range_on_exit API, as far as I can see it only works on
SSA names and not on GENERIC expressions.  I guess that could be "fixed",
given range_on_exit also looks at the last stmt and eventually defers to
range_of_expr (or range_on_entry), but possibly get_tree_range needs
variants for on_entry/on_exit (it doesn't seem to use its 'stmt' context
very consistently, notably not for SSA_NAMEs ...).

Interestingly enough we somehow still need the

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index c16b776c1e3..c0eda5fc51d 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -102,7 +102,15 @@ gimple_ranger::range_of_expr (vrange &r, tree expr, gimple *stmt)
   if (!stmt)
     {
       Value_Range tmp (TREE_TYPE (expr));
-      m_cache.get_global_range (r, expr);
+      // If there is no global range for EXPR yet, try to evaluate it.
+      // THis call does set R to a global range regardless.
+      if (!m_cache.get_global_range (r, expr))
+       {
+         gimple *s = SSA_NAME_DEF_STMT (expr);
+         // Calculate a range for S if it is safe to do so.
+         if (s && gimple_bb (s) && gimple_get_lhs (s) == expr)
+           return range_of_stmt (r, s);
+       }
       // Pick up implied context information from the on-entry cache
       // if current_bb is set.  Do not attempt any new calculations.
       if (current_bb && m_cache.block_range (tmp, current_bb, expr, false))

hunk of Andrew's patch to do it :/

There's one other detail - the problematical multiply folding is
col_stride_10 * {_105, +, 1}_2
I'm thinking that similar to CHREC_LEFT == 0 we can handle CHREC_RIGHT == 1
without unsigned promotion.  In the second iteration we are replacing
(_105 + 1) * col_stride_10 with _105 * col_stride_10 + col_stride_10
but we know already that _105 * col_stride_10 doesn't overflow as we
computed that in the first iteration.  And 1 * X never overflows.
The third iteration is problematic - we don't know whether 2 * col_stride_10
overflows if _105 was zero, if it was not it might have been -1 which
means the second iteration computed 0 * col_stride_10 originally.  Hmm,
so _105 == -1 is problematic, so no - I don't think we can handle
CHREC_RIGHT == 1 specially.
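
One way to read the _105 == -1 case is sketched below, assuming the CHREC is
evaluated in its closed form base + n * step with signed 32-bit arithmetic.
The helper names are hypothetical; exact values are computed in 64 bits so
the check itself never overflows.

```cpp
#include <cstdint>
#include <limits>

// True when the exact value V is representable as a 32-bit signed int.
bool
fits_int32 (std::int64_t v)
{
  return v >= std::numeric_limits<std::int32_t>::min ()
	 && v <= std::numeric_limits<std::int32_t>::max ();
}

// The original computation in iteration N: stride * (start + N).
bool
original_product_fits (std::int32_t stride, std::int32_t start,
		       std::int32_t n)
{
  std::int64_t j = static_cast<std::int64_t> (start) + n;
  return fits_int32 (j)
	 && fits_int32 (static_cast<std::int64_t> (stride) * j);
}

// The closed form start * stride + N * stride: every subterm and the sum
// must fit for a signed evaluation to be free of overflow.
bool
closed_form_terms_fit (std::int32_t stride, std::int32_t start,
		       std::int32_t n)
{
  std::int64_t base = static_cast<std::int64_t> (stride) * start;
  std::int64_t scaled_step = static_cast<std::int64_t> (n) * stride;
  return fits_int32 (base) && fits_int32 (scaled_step)
	 && fits_int32 (base + scaled_step);
}
```

With stride 1 << 30 and start -1, the original products for n = 0, 1, 2 are
-2^30, 0 and 2^30, which all fit, yet the closed form at n == 2 needs
2 * stride == 2^31, which does not.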

