public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <rguenther@suse.de>
To: Filip Kastl <fkastl@suse.cz>
Cc: gcc-patches@gcc.gnu.org, hubicka@ucw.cz
Subject: Re: [PATCH v2] A new copy propagation and PHI elimination pass
Date: Thu, 30 Nov 2023 14:04:20 +0100 (CET)	[thread overview]
Message-ID: <39489or3-7659-5306-4133-6p3n2pr58p99@fhfr.qr> (raw)
In-Reply-To: <ZV4OQituWlGc_sox@localhost.localdomain>

On Wed, 22 Nov 2023, Filip Kastl wrote:

> Hi Richard,
> 
> > Can you name the new file gimple-ssa-sccopy.cc please?
> 
> Yes, no problem.
> 
> Btw, I thought that it is standard that gimple ssa passes have the tree-ssa-
> prefix. Do I understand it correctly that this is not true and many
> tree-ssa-*.cc passes should actually be named gimple-ssa-*.cc but remain
> tree-ssa-*.cc for historical reasons?

Yes.  Newer passes are gimple[-ssa]-xyz.cc, but it's just names ...

> >> +   3 A set of PHI statements that only refer to each other or to one other
> >> +     value.
> >> +
> >> +   _8 = PHI <_9, _10>;
> >> +   _9 = PHI <_8, _10>;
> >> +   _10 = PHI <_8, _9, _1>;
> > 
> > this case necessarily involves a cyclic CFG, so maybe say
> > 
> > "This is a lightweight SSA copy propagation pass that is able to handle
> > cycles optimistically, eliminating PHIs within those."
> > 
> > ?  Or is this a mis-characterization?
> 
> I'm not sure what you mean here. Yes, this case always involves a cyclic CFG.
> Is it weird that a lightweight pass is able to handle cyclic CFG and therefore
> you suggest to comment this fact and say that the pass handles cycles
> optimistically?
>
> I'm not sure if optimistic is a good word to characterize the pass. I'd expect
> an "optimistic" pass to make assumptions which may not be true and therefore
> not always all redundancies it can. This pass however should achieve all that
> it sets out to do.

I thought "optimistic" refers to defering processing of the backedge
value, "optimistically" treating it equal until you prove otherwise
when visiting its definition.

Otherwise there would be no difference to the existing copy propagation
pass which handles copies not optimistically (well, for other reasons).
 
> > It might be nice to optimize SCCs of size 1 somehow, not sure how
> > many times these appear - possibly prevent them from even entering
> > the SCC discovery?
> 
> Maybe that could be done. I would have to think about it and make sure it
> doesn't break anything. I'd prefer to get this version into upstream and then
> possibly post this upgrade later.
> 
> Btw, SCCs of size of size 1 appear all the time. Those are the cases 1 and 2
> described in the comment at the beginning of the file.
> 
> > I'll note that while you are working with stmts everywhere that
> > you are really bound to using SSA defs and those would already
> > nicely have numbers (the SSA_NAME_VERSION).  In principle the
> > SCC lattice could be pre-allocated once, indexed by
> > SSA_NAME_VERSION and you could keep a "generation" number
> > indicating what SCC discovery round it belongs to (aka the
> > set_using).
> 
> I see. I could allocate a vertex struct for each statement only once when the
> pass is invoked instead of allocating the structs each time tarjan_compute_sccs
> is called. Will do that.
> 
> I'm not sure if I want to use SSA_NAME_VERSION for indexing an vec/array with
> all those vertex structs. Many SSA names will be defined neither by PHI nor by
> a copy assignment statement. If I created a vertex struct for every SSA name I
> would allocate a lot of extra memory.

Not sure I would characterize it as a lot.

> > There's a old SCC finding algorithm working on the SSA graph
> > in the old SCC based value-numbering, for example on the
> > gcc 7 branch in tree-ssa-sccvn.c:DFS
> 
> > For reading it would be nice to put the SCC finding in its
> > own class.
> 
> Okay, I'll do that.
> 
> > > +	}
> > > +    }
> > > +
> > > +  if (!stack.is_empty ())
> > > +    gcc_unreachable ();
> > > +
> > > +  /* Clear copy stmts' 'using' flags.  */
> > > +  for (vertex v : vs)
> > > +    {
> > > +      gimple *s = v.stmt;
> > > +      tarjan_clear_using (s);
> > > +    }
> > > +
> > > +  return sccs;
> > > +}
> > > +
> > > +/* Could this statement potentially be a copy statement?
> > > +
> > > +   This pass only considers statements for which this function returns 'true'.
> > > +   Those are basically PHI functions and assignment statements similar to
> > > +
> > > +   _2 = _1;
> > > +   or
> > > +   _2 = 5;  */
> > > +
> > > +static bool
> > > +stmt_may_generate_copy (gimple *stmt)
> > > +{
> > > +  if (gimple_code (stmt) == GIMPLE_PHI)
> > > +    {
> > > +      gphi *phi = as_a <gphi *> (stmt);
> > > +
> > > +      /* No OCCURS_IN_ABNORMAL_PHI SSA names in lhs nor rhs.  */
> > > +      if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi)))
> > > +	return false;
> > > +
> > > +      unsigned i;
> > > +      for (i = 0; i < gimple_phi_num_args (phi); i++)
> > > +	{
> > > +	  tree op = gimple_phi_arg_def (phi, i);
> > > +	  if (TREE_CODE (op) == SSA_NAME
> > > +	      && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
> > > +	    return false;
> > > +	}
> > 
> > When there's more than one non-SSA PHI argument and they are not
> > the same then the stmt also cannot be a copy, right?
> > 
> > > +      return true;
> > > +    }
> 
> Do I understand you correctly that you propose to put another check here?
> Something like
> 
> unsigned nonssa_args_num = 0;
> unsigned i;
> for (i = 0; i < gimple_phi_num_args (phi); i++)
>   {
>     tree op = gimple_phi_arg_def (phi, i);
>     if (TREE_CODE (op) == SSA_NAME)
>       {
>         nonssa_args_num++;
> 	if (nonssa_args_num >= 2)
> 	  return false;
> 
> 	if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
> 	  return false;
>       }
>   }


If you do allow constants then maybe

  tree const_arg = NULL_TREE:
  for (...)
   {
     if (TREE_CODE (op) != SSA_NAME)
       {
          if (!const_arg)
            const_arg = op;
          else if (!operand_equal_p (const_arg, op))
            return false;
       }

instead?  You'd then optimistically assume all SSA name args are
equal to the constant by other means.

> > > +
> > > +  if (gimple_code (stmt) != GIMPLE_ASSIGN)
> > > +    return false;
> > > +
> > > +  /* If the statement has volatile operands, it won't generate a
> > > +     useful copy.  */
> > > +  if (gimple_has_volatile_ops (stmt))
> > > +    return false;
> > > +
> > > +  /* Statements with loads and/or stores will never generate a useful copy.  */
> > > +  if (gimple_store_p (stmt) || gimple_assign_load_p (stmt))
> > > +    return false;
> > > +
> > > +  if (!gimple_assign_single_p (stmt))
> > > +    return false;
> > > +
> > > +  tree lhs = gimple_assign_lhs (stmt);
> > > +  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
> > > +    return false;
> > > +
> > > +  /* If the assignment is from a constant it generates a useful copy.  */
> > > +  if (is_gimple_min_invariant (gimple_assign_rhs1 (stmt)))
> > > +    return true;
> > > +
> > > +  tree rhs = single_ssa_tree_operand (stmt, SSA_OP_USE);
> > > +
> > > +  if (!is_gimple_val (gimple_assign_rhs1 (stmt)))
> > > +    return false;
> > > +
> > > +  /* rhs shouldn't flow through any abnormal edges.  */
> > > +  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs))
> > > +    return false;
> > > +
> > > +  if (!rhs)
> > > +    return false;
> > 
> > so this means
> > 
> >  _1 = 0;
> > 
> > doesn't generate a copy?  But you allowed those in PHIs?
> > 
> > Generally I wanted to mention there's gimple_assign_ssa_name_copy_p
> > which exactly matches _1 = _2 (and nothing more).  There cannot
> > be volatiles or side-effects in those, no stores or loads
> > (but SSA_NAME_OCCURS_IN_ABNORMAL_PHI can appear).
> > 
> > Can you use that and remove the unneeded checks above?
> 
> I'm pretty sure that _1 = 0; generates a copy. The lines
> 
> if (is_gimple_min_invariant (gimple_assign_rhs1 (stmt)))
>   return true;
> 
> ensure that. I think I have even considered using gimple_assign_ssa_name_copy_p
> at some point but decided to instead write more elementary checks. I think I
> did that so that I am able to mark assignments of constants like _1 = 0; as
> generating copy statements too.

I see.  But then you simply want

     if (!(TREE_CODE (gimple_assign_rhs1 (stmt)) == SSA_NAME
           || is_gimple_min_invariant (gimple_assign_rhs1 (stmt))))
       return false;

and check the LHS to be a SSA_NAME.  

gimple_store_p (stmt) || gimple_assign_load_p (stmt)

isn't really "elementary" here.

> 
> > > +  /* If rhs and lhs are pointers, alignment of lhs and rhs should be the same.
> > > +     Usage of __builtin_assume_aligned can cause alignment of lhs to be greater
> > > +     than alignment of rhs.  In that case we don't want to propagate rhs since
> > > +     we would lose the alignment information.  */
> > 
> > The same might be true for value-ranges.  I think it would be better to
> > do it like copy-prop does and in cycles propagate the info to the
> > copy-of variable we substitute with.  For non-cyclic cases we cannot
> > generally copy flow-sensitive info.
> 
> This sounds complicated. I already thought about propagating the alignment
> info. But detecting if it is part of a cycle or not... right now I'm not sure
> how I would go about doing that. Thanks for the note that also value-range info
> needs to be handled.
> 
> > A more conservative test would be to compare
> > SSA_NAME_PTR_INFO for pointers and SSA_NAME_RANGE_INFO for non-pointers.
> 
> Let's do this more conservative check. Then later I could upgrade the pass to
> do the aligment and value-range info propagation.

OK.

> > > +static auto_vec<gimple *>
> > > +get_all_stmt_may_generate_copy (void)
> > > +{
> > > +  auto_vec<gimple *> result;
> > > +
> > > +  basic_block bb;
> > > +  FOR_EACH_BB_FN (bb, cfun)
> > > +    {
> > > +      gimple_stmt_iterator gsi;
> > > +      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> > > +	{
> > > +	  gimple *s = gsi_stmt (gsi);
> > > +	  if (stmt_may_generate_copy (s))
> > > +	    result.safe_push (s);
> > > +	}
> > > +
> > > +      gphi_iterator pi;
> > > +      for (pi = gsi_start_phis (bb); !gsi_end_p (pi); gsi_next (&pi))
> > > +	{
> > > +	  gimple *s = pi.phi ();
> > > +	  if (stmt_may_generate_copy (s))
> > > +	    result.safe_push (s);
> > > +	}
> > > +    }
> > 
> > I wonder if this can do stmt-to-vertices on-the-fly, thus not
> > build the vector of gimple *.  That is, you build very many
> > vec<>s (probably not many in practice, but ...)
> 
> If I understand correctly, you're suggesting to check if a statement may
> generate copy during runtime of the sccopy algorithm. I guess that could be
> done but I think that it would make the code less readable. And it wouldn't
> reduce the number of used vec<>s that much -- get_all_stmt_may_generate_copy is
> run just once during the execution of the pass so it only creates one vec<>.
> Correct me if I didn't get what you meant here.

Yes, you'd simply only consider copy edges during SCC finding rather than
first collecting copy stmts.

> 
> > If I understand correctly sofar you'll never replace a SSA name by
> > a constant, so REPLACE_BY is always an SSA_NAME.
> 
> No, I certainly do replace SSA names by constants.
> 
> > I wonder if you ever hit the need to do any of the fancy cleanups.
> 
> I originally didn't have them then I ran the GCC testsuite and some tests were
> failing so I added the cleanups.

I see.

> > In fact I suggest you use the existing replace_uses_by API instead
> > of re-inventing it.
> 
> Oh nice. I didn't know that replace_uses_by API already exists in GCC. Will
> look into that.
> 
> > > +  /* Remove all propagated statements.  */
> > > +  FOR_EACH_BB_FN (bb, cfun)
> > > +    {
> > > +      gimple_stmt_iterator gsi;
> > > +      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
> 
> > You generally should remove stmts back-to-front to make
> > debug stmt generation work.  To not re-invent the wheel
> > it might be simplest to use simple_dce_from_worklist with
> > the dead stmts defs as seeds (just remove set/is_stmt_dead
> > and track dead SSA defs in a bitmap).
> > 
> > > +	{
> > > +	  gimple *stmt = gsi_stmt (gsi);
> > > +	  if (is_stmt_dead (stmt))
> > > +	    gsi_remove (&gsi, true);
> > > +	  else
> > > +	    gsi_next (&gsi);
> > 
> > this also fails to release the SSA def in the stmt.
> > 
> >   release_defs (stmt);
> > 
> > would do that (or simple_dce_from_worklist).
> 
> Interesting. Wouldn't expect that the order of removing statements would have
> influence on debug information.
> 
> Alright. Will rewrite this part of the pass to use simple_dce_from_worklist.
> 
> > > +	}
> > > +
> > > +      gphi_iterator pi;
> > > +      for (pi = gsi_start_phis (bb); !gsi_end_p (pi);)
> > > +	{
> > > +	  gphi *stmt = pi.phi ();
> > > +
> > > +	  if (is_stmt_dead (stmt))
> > > +	    remove_phi_node (&pi, true);
> > > +	  else
> > > +	    gsi_next (&pi);
> > > +	}
> > > +    }
> > > +
> > > +  /* More cleanup.  */
> > > +  FOR_EACH_BB_FN (bb, cfun)
> > > +    gimple_purge_dead_eh_edges (bb);
> > 
> > You only removed copies, I'm sure this should neve do anything?
> 
> I think some testsuite testcases fail if I don't include this eh cleanup in my
> pass. I don't understand eh edges very well but my explanation for this is that
> I may replace an SSA name with a constant and ensure that a statement won't
> throw any exceptions anymore. For example in a x = 10 / y; statement I may
> replace y by 5 which ensures that we won't have to handle zero division
> exception for this statement.

Yes, when you replace with constants that can happen.

> > As a general comment I wonder if we can simplify most of the
> > pass by starting SSA based SCC discovery from a copy stmt and
> > whether we can reduce the number of vec<>s that are allocated
> > and possibly re-allocated many times?
> 
> I'm not sure what you mean by "starting SCC discovery from a copy stmt" here.

When you currently collect copy stmts in the vec<> you could do SCC
discovery with that as the starting stmt instead (if SCC discovery
would handle identifying copy stmts itself).

Thanks and sorry for the late reply.
Richard.

> Thanks for the review!
> Filip
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

  reply	other threads:[~2023-11-30 13:08 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-11-02 13:00 Filip Kastl
2023-11-17 13:01 ` Richard Biener
2023-11-22 14:20   ` Filip Kastl
2023-11-30 13:04     ` Richard Biener [this message]
2023-12-08 15:15       ` [PATCH v3] " Filip Kastl
2023-12-13 11:35         ` Richard Biener
2023-12-13 16:12           ` [PATCH v4] " Filip Kastl
2023-12-13 16:15             ` Richard Biener
2023-12-14 10:26               ` Filip Kastl
2023-12-14 13:22             ` In 'gcc/gimple-ssa-sccopy.cc', '#define INCLUDE_ALGORITHM' instead of '#include <algorithm>' (was: [PATCH v4] A new copy propagation and PHI elimination pass) Thomas Schwinge

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=39489or3-7659-5306-4133-6p3n2pr58p99@fhfr.qr \
    --to=rguenther@suse.de \
    --cc=fkastl@suse.cz \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=hubicka@ucw.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).