public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: Aldy Hernandez <aldyh@redhat.com>
Cc: Andrew MacLeod <amacleod@redhat.com>,
	Jeff Law <jeffreyalaw@gmail.com>,
	 gcc-patches <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] Support threading of just the exit edge
Date: Tue, 16 Aug 2022 09:18:31 +0000 (UTC)	[thread overview]
Message-ID: <nycvar.YFH.7.77.849.2208160836000.13569@jbgna.fhfr.qr> (raw)
In-Reply-To: <CAGm3qMUdZ=+cGZFin3BpzdPYftGmowGc-TXFcvip147b0YBU8Q@mail.gmail.com>

On Mon, 15 Aug 2022, Aldy Hernandez wrote:

> On Mon, Aug 15, 2022 at 9:24 PM Andrew MacLeod <amacleod@redhat.com> wrote:
> >
> > heh. or just
> >
> >
> > +      int_range<2> r;
> > +      if (!fold_range (r, const_cast <gcond *> (cond_stmt))
> > +      || !r.singleton_p (&val))
> >
> >
> > if you do not provide a range_query to any of the fold_using_range code,
> > it defaults to:
> >
> > fur_source::fur_source (range_query *q)
> > {
> >    if (q)
> >      m_query = q;
> >    else if (cfun)
> >      m_query = get_range_query (cfun);
> >    else
> >      m_query = get_global_range_query ();
> >    m_gori = NULL;
> > }
> >
> 
> Sweet.  Even better!

So when I do the following incremental change ontop of the posted
patch then I see that the path query is able to simplify more
"single BB paths" than the global range folding.

diff --git a/gcc/tree-ssa-threadbackward.cc 
b/gcc/tree-ssa-threadbackward.cc
index 669098e4ec3..777e778037f 100644
--- a/gcc/tree-ssa-threadbackward.cc
+++ b/gcc/tree-ssa-threadbackward.cc
@@ -314,6 +314,12 @@ back_threader::find_taken_edge_cond (const 
vec<basic_block> &path,
 {
   int_range_max r;
 
+  int_range<2> rf;
+  if (path.length () == 1)
+    {
+      fold_range (rf, cond);
+    }
+
   m_solver->compute_ranges (path, m_imports);
   m_solver->range_of_stmt (r, cond);
 
@@ -325,6 +331,8 @@ back_threader::find_taken_edge_cond (const 
vec<basic_block> &path,
 
   if (r == true_range || r == false_range)
     {
+      if (path.length () == 1)
+       gcc_assert  (r == rf);
       edge e_true, e_false;
       basic_block bb = gimple_bb (cond);
       extract_true_false_edges_from_block (bb, &e_true, &e_false);

Even doing the following (not sure what's the difference and in
particular expense over the path range query) results in missed
simplifications (checking my set of cc1files).

diff --git a/gcc/tree-ssa-threadbackward.cc 
b/gcc/tree-ssa-threadbackward.cc
index 669098e4ec3..1d43a179d08 100644
--- a/gcc/tree-ssa-threadbackward.cc
+++ b/gcc/tree-ssa-threadbackward.cc
@@ -99,6 +99,7 @@ private:
 
   back_threader_registry m_registry;
   back_threader_profitability m_profit;
+  gimple_ranger *m_ranger;
   path_range_query *m_solver;
 
   // Current path being analyzed.
@@ -146,12 +147,14 @@ back_threader::back_threader (function *fun, 
unsigned flags, bool first)
   // The path solver needs EDGE_DFS_BACK in resolving mode.
   if (flags & BT_RESOLVE)
     mark_dfs_back_edges ();
-  m_solver = new path_range_query (flags & BT_RESOLVE);
+  m_ranger = new gimple_ranger;
+  m_solver = new path_range_query (flags & BT_RESOLVE, m_ranger);
 }
 
 back_threader::~back_threader ()
 {
   delete m_solver;
+  delete m_ranger;
 
   loop_optimizer_finalize ();
 }
@@ -314,6 +317,12 @@ back_threader::find_taken_edge_cond (const 
vec<basic_block> &path,
 {
   int_range_max r;
 
+  int_range<2> rf;
+  if (path.length () == 1)
+    {
+      fold_range (rf, cond, m_ranger);
+    }
+
   m_solver->compute_ranges (path, m_imports);
   m_solver->range_of_stmt (r, cond);
 
@@ -325,6 +334,8 @@ back_threader::find_taken_edge_cond (const 
vec<basic_block> &path,
 
   if (r == true_range || r == false_range)
     {
+      if (path.length () == 1)
+       gcc_assert  (r == rf);
       edge e_true, e_false;
       basic_block bb = gimple_bb (cond);
       extract_true_false_edges_from_block (bb, &e_true, &e_false);

one example is

<bb 176> [local count: 14414059]:
_128 = node_177(D)->typed.type;
pretmp_413 = MEM[(const union tree_node *)_128].base.code;
_431 = pretmp_413 + 65519;
if (_128 == 0B)
  goto <bb 199>; [18.09%]
else
  goto <bb 177>; [81.91%]

where m_imports for the path is just _128 and the range computed is
false while the ranger query returns VARYING.  But
path_range_query::range_defined_in_block does

  if (bb && POINTER_TYPE_P (TREE_TYPE (name)))
    m_ranger->m_cache.m_exit.maybe_adjust_range (r, name, bb);

which adjusts the range to ~[0, 0], probably because of the
dereference in the following stmt.

Why does fold_range not do this when folding the exit test?  Is there
a way to make it do so?  It looks like the only routine using this
in gimple-range.cc is range_on_edge and there it's used for e->src
after calling range_on_exit for e->src (why's it not done in
range_on_exit?).  A testcase for this is

int foo (int **p, int i)
{
  int *q = *p;
  int res = *q + i;
  if (q)
    return res;
  return -1;
}

which we "thread" with the path and with the above ICEs because
fold_range doesn't get that if (q) is always true.  Without the
patch ethread doesn't want to duplicate the block (it's too large)
but threadfull will if you disable evrp (if you remove the increment
by 'i' it again won't since nothing interesting prevails and it
won't go to BB 0 and fails to pick up a thread of length > 1):

Checking profitability of path (backwards):  bb:2 (6 insns) bb:0
  Control statement insns: 2
  Overall: 4 insns
  [1] Registering jump thread: (0, 2) incoming edge;  (2, 3) nocopy;
path: 0->2->3 SUCCESS
Removing basic block 2
;; basic block 2, loop depth 0
;;  pred:
_1 = *p_6(D);
_2 = (long unsigned int) n_7(D);
_3 = _2 * 4;
q_8 = _1 + _3;
res_9 = *q_8;
if (q_8 != 0B)
  goto <bb 3>; [98.33%]
else
  goto <bb 4>; [1.67%]
;;  succ:       3
;;              4

... (it copied BB 2) ...

int foo (int * * p, int n)
{
  int res;
  int * q;
  int * _10;
  long unsigned int _11;
  long unsigned int _12;

  <bb 2> [local count: 1073741824]:
  _10 = *p_6(D);
  _11 = (long unsigned int) n_7(D);
  _12 = _11 * 4;
  q_13 = _10 + _12;
  res_14 = *q_13;
  return res_14;


  reply	other threads:[~2022-08-16  9:18 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-12 12:01 Richard Biener
2022-08-12 16:03 ` Aldy Hernandez
2022-08-15  9:39   ` Richard Biener
2022-08-15 19:09     ` Aldy Hernandez
2022-08-15 19:24       ` Andrew MacLeod
2022-08-15 19:29         ` Aldy Hernandez
2022-08-16  9:18           ` Richard Biener [this message]
2022-08-16 10:06             ` Aldy Hernandez
2022-08-16 11:32               ` Richard Biener
2022-08-16 11:42                 ` Aldy Hernandez
2022-08-16 13:44                 ` Richard Biener
2022-08-16 14:30             ` Andrew MacLeod
2022-08-17  7:42               ` Richard Biener
2022-08-17 14:39                 ` Andrew MacLeod
2022-08-18  7:08                   ` Richard Biener
2022-08-18 13:18                     ` Andrew MacLeod
2022-08-15 15:22   ` Jeff Law

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.YFH.7.77.849.2208160836000.13569@jbgna.fhfr.qr \
    --to=rguenther@suse.de \
    --cc=aldyh@redhat.com \
    --cc=amacleod@redhat.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=jeffreyalaw@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).