public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: Richard Biener <richard.guenther@gmail.com>
To: Di Zhao OS <dizhao@os.amperecomputing.com>
Cc: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction
Date: Wed, 28 Jul 2021 11:38:52 +0200	[thread overview]
Message-ID: <CAFiYyc3c2KgJVFGC-iOobvn97E5qAwAi7fAzYB42zvhxitdCAA@mail.gmail.com> (raw)
In-Reply-To: <SN6PR01MB4240F51DE5A515BEE7419F91E8E09@SN6PR01MB4240.prod.exchangelabs.com>

On Sun, Jul 18, 2021 at 9:25 PM Di Zhao OS
<dizhao@os.amperecomputing.com> wrote:
>
>
> I tried to improve the patch following your advices and to catch more
> opportunities. Hope it'll be helpful.

Sorry for the late reply.

> On 6/24/21 8:29 AM, Richard Biener wrote:
> > On Thu, Jun 24, 2021 at 11:55 AM Di Zhao via Gcc-patches <gcc-
> > patches@gcc.gnu.org> wrote:
> >
> > I have some reservations about extending the ad-hoc "predicated value" code.
> >
> > Some comments on the patch:
> >
> > +/* hashtable & helpers to record equivalences at given bb.  */
> > +
> > +typedef struct val_equiv_s
> > +{
> > +  val_equiv_s *next;
> > +  val_equiv_s *unwind_to;
> > +  hashval_t hashcode;
> > +  /* SSA name this val_equiv_s is associated with.  */
> > +  tree name;
> > +  /* RESULT in a vn_pval entry is SSA name of a equivalence.  */
> > +  vn_pval *values;
> > +} * val_equiv_t;
> >
> > all of this (and using a hashtable for recording) is IMHO a bit overkill.
> > Since you only ever record equivalences for values the more natural place to
> > hook those in is the vn_ssa_aux structure where we also record the availability
> > chain.
>
> I tried to store the equivalences in the vn_ssa_aux structure, but I didn't
> optimize the second case successfully: I need to record the equivalence
> of a PHI expression's result and arguments, but their value number results will
> become VARYING first, so they won't be changed. Maybe I'm missing something, or
> can I force change a VARYING result?

But VARYING still has a value-number - it's the result itself?

> Besides, the way currently used, equivalences only need to be "predictable"
> rather than available, maybe availability chains do not represent them very
> well?

Sure, they are a different beast - I'm only commenting on the place you store
them as being not too efficient.

> > There's little commentary in the new code, in particular function-level
> > comments are missing everywhere.
>
> Added more comments.
>
> > There's complexity issues, like I see val_equiv_insert has a "recurse"
> > feature but also find_predicated_value_by_equiv is quadratic in the number of
> > equivalences of the lhs/rhs.  Without knowing what the recursion on the
> > former is for - nothing tells me - I suspect either of both should be redundant.
>
> The intention was, given {A==B, B==C, X==Y, Y==Z} and a previous result of
> "C opcode Z", to find the result of "A opcode Y". I removed the "recurse"
> feature and modified the searching logic so solve the issue. Now a temporary
> hash_set is used to record the equivalences that are visited when searching.

OK, so you're covering transitivity at query time - that looks
expensive to me.  As
said I wonder if there's a more efficient way to store equivalences here.

> > You seem to record equivalences at possible use points which looks odd at best
> > - I'd expected equivalences being recorded at the same point we record
> > predicated values and for the current condition, not the one determining some
> > other predication.
> > What was the motivation to do it the way you do it?
>
> The purpose is to "bring down" what can be known from a previous basic-block
> that effectively dominates current block, but not actually does so (in the
> example it is because jump threading is hindered by a loop). For example in
> this case:
>
>   if (a != 0)
>       // Nothing useful can be recorded here, because this BB doesn't dominate
>       // the BB that we want to simplify.
>       c = b;
>   for (unsigned i = 0; i < c; i++)
>     {
>       if (a != 0)  // The recording is triggered here.
>        {
>          // c == b will be recorded here, so it can be used for simplification.
>          // In gimple it is the equivalence of a PHI's result and argument.
>          if (i >= b)
>            foo ();
>
> These requires finding a previous condition that is identical with current
> one, so it is convenient to do this in FRE. Besides, as FRE records derived
> predicate, so for relative conditions there also might be opportunities for
> optimization. In the new patch code this is included.

I still don't quite understand why you cannot record the c = b equivalence
when processing its block.  You'd record "c == b if a != 0" and later
you look for equivalences on b and see if they are valid at the use site?
That's how predicated values work.

I'd like to see this equivalence stuff more naturally integrated with the
value lattice, not a collection of bolted-on hashmaps.

> Besides, to find more opportunities, added a hashmap to store mappings from
> immediate dominators to basic-blocks with PHIs of interest.
>
> > Why is the code conditional on 'iterate'?
>
> I haven't worked it out to fit the non-iterate mode, so it now breaks the
> if_conversion pass. I think this is because some of the equivalence-recordings
> are too optimistic for non-iterate mode.

Huh.  The non-iterative mode should be easier to deal with, in fact if you
are running into correctness issues this hints at latent issues even with the
iterative mode.

> > You've probably realized that there's no "nice" way to handle temporary
> > equivalences in a VN scheme using hashing for expressions (unless you degrade
> > hashes a lot).
>
> I modified the code to use TREE_HASH on ssa names. Would that be better?
>
> > You quote opportunities that are catched with this like
> >
> > +  if (a != 0)
> > +    {
> > +      c = b;
> > +    }
> > +  for (unsigned i = 0; i < c; i++)
> > +    {
> > +      if (a != 0)
> > +       {
> > +         if (i >= b)
> > +           /* Should be eliminated.
> > +            */
> > +           foo ();
> >
> > but say other "obvious" cases like
> >
> > +  if (a != 0)
> > +    {
> > +      c = b;
> > +    }
> > +  for (unsigned i = 0; i < c; i++)
> > +    {
> > +      if (a != 0)
> > +       {
> > +           /* Should be zero.  */
> >              return b - c;
> >
> > are not handled.  That's in line with the current "value predication"
> > which mainly aims at catching simple jump threading opportunities; you only
> > simplify conditions with the recorded equivalences.  But then the complexity of
> > handling equivalences does probably not outweight the opportunities catched -
> > can you share some numbers on how many branches are newly known taken
> > during VN with this patch during say bootstrap or build of SPEC CPU?
>
> I extended the code a little to cover the cases like "A - A" and "A xor A".
>
> Here are some results on one bootstrap step:
>            | values found | more bb removed | values found in all
>            | in fre1      | at fre1         | fre & pre passes
> -----------------------------------------------------------------
>  bootstrap | 592          | 40              | 1272
>
> As the code is different for bootstrap, the "more bb removed" metric is not
> precise. I also tested on SPEC CPU 2017 integer cases:
>                 | values found | more bb removed | values found in all
>                 | in fre1      | at fre1         | fre & pre passes
> -----------------------------------------------------------------
>  500.perlbench_r| 3            |  0              | 9
>  502.gcc_r      | 25           |  39             | 241
>  520.omnetpp    | 9            |  6              | 34
>  523.xalancbmk_r| 12           |  0              | 35
>  541.leela_r    | 2            |  0              | 2
>
> In cases not listed above there's no value found by equivalences. Benefits
> after fre1 are not counted as CGF may be different from here (for
> 523.xalancbmk_r there're 8 more basic-blocks removed at fre3). Although the
> chances are not plenty, there might be potential benefits, such as making a
> function pure.

That's very little improvements for this large pice of code :/

Richard.

> > I've hoped to ditch the current "value predication" code by eventually using the
> > relation oracle from ranger but did not yet have the chance to look into that.
> > Now, the predicated equivalences are likely not something that infrastructure
> > can handle?
> >
> > In the end I think we should research into maintaining an alternate expression
> > table for conditions (those we like to simplify with
> > equivalences) and use a data structure that more easily allows to introduce
> > (temporary) equivalences.  Like by maintaining back-references of values we've
> > recorded a condition for and a way to quickly re-canonicalize conditions.  Well -
> > it requires some research, as said.
> >
> > Richard.
> >
> > > Regards,
> > > Di Zhao
> > >
> > > Extend FRE with an "equivalence map" for condition prediction.
> > >
> > > 2021-06-24  Di Zhao  <dizhao@os.amperecomputing.com>
> > >
>
> Thanks,
> Di Zhao
>
> --------
> Extend FRE with an "equivalence map" for condition prediction.
>
> 2021-07-18  Di Zhao  <dizhao@os.amperecomputing.com>
>
> gcc/ChangeLog:
>         PR tree-optimization/101186
>         * tree-ssa-sccvn.c (vn_tracking_edge): Extracted utility function.
>         (dominated_by_p_w_unex): Moved upward, no change.
>         (vn_nary_op_get_predicated_value): Moved upward, no change.
>         (struct val_equiv_hasher): Hasher for the "equivalence map".
>         (is_vn_valid_at_bb): Check if vn_pval is valid at BB.
>         (val_equiv_insert): Insert into "equivalence map".
>         (vn_lookup_binary_op_result): Lookup binary expression's result by VN.
>         (iterate_val_equivs): Iterate on equivalences and returns a non-NULL
>         result returned by callback.
>         (find_predicated_binary_by_lhs_equiv): Callback for iterate_val_equivs.
>         Lookup a binary operations result by LHS equivalences.
>         (find_predicated_binary_by_rhs_equiv): Callback for iterate_val_equivs.
>         Lookup a binary operations result by RHS equivalences.
>         (find_predicated_binary_by_equiv): Lookup predicated value of a binary
>         operation by equivalences.
>         (is_relaxed_cond_code): Whether operation code is a relaxed condition
>         code derived from original code.
>         (branch_may_come_from_another): Whether there's a path from the one
>         true or false destination to another.
>         (record_equiv_from_previous_edge): Record equivalence relation from a
>         previous condition on current bb' true and false edges.
>         (record_equiv_from_previous_cond): Record equivalences generated by
>         previous conditions on current BB's true and false edges.
>         (vn_nary_op_insert_pieces_predicated): Extract utility function. Insert
>         into the "equivalence map" for predicate like "x==y is true".
>         (record_dom_to_phi_bb): Record mappings from immediate dominator to
>         basic_block with PHIs.
>         (vn_phi_lookup): Record mappings from immediate dominator to PHIs.
>         (visit_nary_op): Add lookup predicated values of binaries by
>         equivalences.
>         (free_rpo_vn): Free the "equivalence map".
>         (process_bb): Insert into & lookup from the "equivalence map".
>         (struct unwind_state): Add "equivalence map" unwind state.
>         (do_unwind): Unwind the "equivalence map".
>         (do_rpo_vn): Update "equivalence map" unwind state.
>
> gcc/testsuite/ChangeLog:
>         PR tree-optimization/101186
>         * gcc.dg/tree-ssa/pr71947-1.c: Disable fre.
>         * gcc.dg/tree-ssa/pr71947-2.c: Disable fre.
>         * gcc.dg/tree-ssa/pr71947-3.c: Disable fre.
>         * gcc.dg/tree-ssa/pr71947-5.c: Disable fre.
>         * gcc.dg/tree-ssa/vrp03.c: Disable fre.
>         * gcc.dg/tree-ssa/ssa-fre-95.c: New test.
>         * gcc.dg/tree-ssa/ssa-fre-96.c: New test.

      reply	other threads:[~2021-07-28  9:39 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-24  9:54 Di Zhao
2021-06-24 12:29 ` Richard Biener
2021-06-24 13:25   ` Andrew MacLeod
2021-06-24 15:01     ` Andrew MacLeod
2021-06-25  7:38       ` Richard Biener
2021-06-27 15:46         ` Aldy Hernandez
2021-06-28  8:12           ` Richard Biener
2021-06-28 13:15           ` Andrew MacLeod
2021-06-29 10:54             ` Richard Biener
2021-07-18 19:25   ` Di Zhao OS
2021-07-28  9:38     ` Richard Biener [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAFiYyc3c2KgJVFGC-iOobvn97E5qAwAi7fAzYB42zvhxitdCAA@mail.gmail.com \
    --to=richard.guenther@gmail.com \
    --cc=dizhao@os.amperecomputing.com \
    --cc=gcc-patches@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).