public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PING] [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction
@ 2021-11-08  3:21 Di Zhao OS
  0 siblings, 0 replies; only message in thread
From: Di Zhao OS @ 2021-11-08  3:21 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 16094 bytes --]

Hi,

Gentle ping on this.

Di Zhao

-----Original Message-----
From: Di Zhao OS 
Sent: Monday, October 25, 2021 3:03 AM
To: Richard Biener <richard.guenther@gmail.com>
Cc: gcc-patches@gcc.gnu.org
Subject: RE: [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction

Hi,

Attached is a new version of the patch, mainly for improving performance
and simplifying the code.

First, regarding the comments:

> -----Original Message-----
> From: Richard Biener <richard.guenther@gmail.com>
> Sent: Friday, October 1, 2021 9:00 PM
> To: Di Zhao OS <dizhao@os.amperecomputing.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v2] tree-optimization/101186 - extend FRE with
> "equivalence map" for condition prediction
> 
> On Thu, Sep 16, 2021 at 8:13 PM Di Zhao OS
> <dizhao@os.amperecomputing.com> wrote:
> >
> > Sorry about updating on this after so long. It took me much time to work out a
> > new plan and pass the tests.
> >
> > The new idea is to use one variable to represent a set of equal variables at
> > some basic-block. This variable is called a "equivalence head" or "equiv-head"
> > in the code. (There's no-longer a "equivalence map".)
> >
> > - Initially an SSA_NAME's "equivalence head" is its value number. Temporary
> >   equivalence heads are recorded as unary NOP_EXPR results in the vn_nary_op_t
> >   map. Besides, when inserting into vn_nary_op_t map, make the new result at
> >   front of the vn_pval list, so that when searching for a variable's
> >   equivalence head, the first result represents the largest equivalence set at
> >   current location.
> > - In vn_ssa_aux_t, maintain a list of references to valid_info->nary entry.
> >   For recorded equivalences, the reference is result->entry; for normal N-ary
> >   operations, the reference is operand->entry.
> > - When recording equivalences, if one side A is constant or has more refs, make
> >   it the new equivalence head of the other side B. Traverse B's ref-list, if a
> >   variable C's previous equiv-head is B, update to A. And re-insert B's n-ary
> >   operations by replacing B with A.
> > - When inserting and looking for the results of n-ary operations, insert and
> >   lookup by the operands' equiv-heads.
> > ...
> >
> > Thanks,
> > Di Zhao
> >
> > --------
> > Extend FRE with temporary equivalences.
> 
> Comments on the patch:
> 
> +  /* nary_ref count.  */
> +  unsigned num_nary_ref;
> +
> 
> I think a unsigned short should be enough and that would nicely
> pack after value_id together with the bitfield (maybe change that
> to unsigned short :1 then).

Changed num_nary_ref to unsigned short and moved after value_id.

> @@ -7307,17 +7839,23 @@ process_bb (rpo_elim &avail, basic_block bb,
>             tree val = gimple_simplify (gimple_cond_code (last),
>                                         boolean_type_node, lhs, rhs,
>                                         NULL, vn_valueize);
> +           vn_nary_op_t vnresult = NULL;
>             /* If the condition didn't simplfy see if we have recorded
>                an expression from sofar taken edges.  */
>             if (! val || TREE_CODE (val) != INTEGER_CST)
>               {
> -               vn_nary_op_t vnresult;
> 
> looks like you don't need vnresult outside of the if()?

vnresult is reused later to record equivalences generated by PHI nodes.

> +/* Find predicated value of vn_nary_op by the operands' equivalences.  Return
> + * NULL_TREE if no known result is found.  */
> +
> +static tree
> +find_predicated_value_by_equivs (vn_nary_op_t vno, basic_block bb,
> +                                vn_nary_op_t *vnresult)
> +{
> +  lookup_equiv_heads (vno->length, vno->op, vno->op, bb);
> +  tree result
> +    = simplify_nary_op (vno->length, vno->opcode, vno->op, vno->type);
> 
> why is it necessary to simplify here?  It looks like the caller
> already does this.

In the new patch, changed the code a little to remove redundant calculation.

> I wonder whether it's valid to always perform find_predicated_value_by_equivs
> from inside vn_nary_op_get_predicated_value instead of bolting it to only
> a single place?

Removed function find_predicated_value_by_equivs and inlined the code.

Because lookup_equiv_head uses vn_nary_op_get_predicated_value, so I left
vn_nary_op_get_predicated_value unchanged. Instead, operands are set to
equiv-heads in init_vn_nary_op_from_stmt. So altogether, predicates are always
inserted and searched by equiv-heads.

> +
> +static vn_nary_op_t
> +val_equiv_insert (tree op1, tree op2, edge e)
> +{
> 
> +  if (is_gimple_min_invariant (lhs))
> +    std::swap (lhs, rhs);
> +  if (is_gimple_min_invariant (lhs) || TREE_CODE (lhs) != SSA_NAME)
> +    /* Possible if invoked from record_equiv_from_previous_cond.  */
> +    return NULL;
> 
> Better formulate all of the above in terms of only SSA_NAME checks since...
> 
> +  /* Make the hand-side with more recorded n-ary expressions new
> +   * equivalence-head, to make fewer re-insertions.  */
> +  if (TREE_CODE (rhs) == SSA_NAME
> +      && VN_INFO (rhs)->num_nary_ref < VN_INFO (lhs)->num_nary_ref)
> +    std::swap (lhs, rhs);
> 
> here LHS needs to be an SSA_NAME.

Tried to fix this in the new patch.

> +  /* Record equivalence as unary NOP_EXPR.  */
> +  vn_nary_op_t val
> +    = vn_nary_op_insert_pieces_predicated_1 (1, NOP_EXPR, TREE_TYPE (lhs),
> +                                            &lhs, rhs, 0, bb);
> 
> so you are recording an equivalence of lhs to rhs as (NOP_EXPR)lhs
> with value rhs?
> Presumably you think that's good enough to not overlap with entries
> for "real" IL?

By checking related codes, it seems to me there's no overlap as long as
equivalences are inserted and searched as predicated_values. It should be easy
to replace this with other preferred opcode.

> In particular why did you choose to _not_ use (2, EQ_EXPR, boolean_type_node,
> &[lhs, rhs], boolean_true_node, ...) like I think what would be
> inserted for by the
> existing code in process_bb via vn_nary_op_insert_pieces_predicated?
> It would probably felt more natural if the "new" code bolted on the case where
> that is called with an equivalence?

The "equivalence head"s actually implement Disjoint-sets. So by a hashtable
search in the nary-map, the set of equivalences of a variable can be found.
Also it's easy to find out whether two variables are equal. And the 
disjoint-sets are path-compressed by updating "equivalence head" and
re-inserting predicates.

In the new attached patch, predicates like "A==B is true" and "A!=B is false"
are not inserted, as they can be computed from equivalences. Same with the
predicates derived from them. This reduces insertions of predicates by a lot (
counting the equivalences themselves and re-inserted predicates). Below are
test results on SPEC intrate, comparing with trunk (f45610a4, Oct 19):

              |more bb |more bb |more stmt|more stmt|less      |less
              |removed |removed |removed  |removed  |predicates|predicates
              |at fre1 |at fre1 |at fre1  |at fre1  |inserted  |inserted
--------------------------------------------------------------------------
500.perlbench_r| 64    | 1.56%  | 102    | 0.18%   | 51928    | 53.44%
502.gcc_r      | 679   | 4.88%  | 290    | 0.21%   | 149968   | 65.5%
505.mcf_r      | 5     | 35.71% | 9      | 1.40%   | 349      | 27.48%
520.omnetpp    | 132   | 5.36%  | 48     | 0.14%   | 28192    | 58.38%
523.xalancbmk_r| 231   | 3.15%  | 308    | 0.35%   | 65131    | 58.95%
525.x264_r     | 4     | 1.36%  | 27     | 0.11%   | 10401    | 40.68%
531.deepsjeng_r| 1     | 3.45%  | 2      | 0.14%   | 1412     | 53.81%
541.leela_r    | 2     | 0.62%  | 5      | 0.06%   | 3874     | 48.56%
548.exchange2_r| 0     | 0.00%  | 3      | 0.04%   | 159      | 3.33%
557.xz_r       | 0     | 0.00%  | 3      | 0.07%   | 1894     | 52.39%

Considering the copying and unwinding work saved, this patch could reduce
compile time rather than increase it.

> Btw, I didn't find any condition that prevents equivalences being used for
> non-integer types?  Are you sure this is all valid for FP compares without
> further restrictions?

I think this approach follows sccvn's previous decisions on FP comparisons,
for equivalences are inserted only when EQ_EXPRs were evaluated to be true.
The patch is tested on regression tests and SPEC fprate cases.

> 
> @@ -4224,36 +4404,345 @@ vn_nary_op_insert_pieces_predicated (unsigned
> int length, enum tree_code code,
>        fprintf (dump_file, " == %s\n",
>                integer_zerop (result) ? "false" : "true");
>      }
> -  vn_nary_op_t vno1 = alloc_vn_nary_op (length, NULL_TREE, value_id);
> -  init_vn_nary_op_from_pieces (vno1, length, code, type, ops);
> -  vno1->predicated_values = 1;
> -  vno1->u.values = (vn_pval *) obstack_alloc (&vn_tables_obstack,
> -                                             sizeof (vn_pval));
> -  vno1->u.values->next = NULL;
> -  vno1->u.values->result = result;
> -  vno1->u.values->n = 1;
> -  vno1->u.values->valid_dominated_by_p[0] = pred_e->dest->index;
> -  return vn_nary_op_insert_into (vno1, valid_info->nary, true);
> +  /* Insert by operands' equiv-heads.  */
> +  tree new_ops[3];
> +  lookup_equiv_heads (length, ops, new_ops, pred_e->dest);
> +  /* Skip if the result is known.  */
> +  tree simplified = simplify_nary_op (length, code, new_ops, type);
> +  if (simplified)
> +    {
> +      if (dump_file && (dump_flags & TDF_DETAILS))
> +       {
> +         fprintf (dump_file, "Result is known: ");
> +         print_generic_expr (dump_file, simplified, TDF_NONE);
> +         fprintf (dump_file, ", skipped.\n");
> +       }
> +      return NULL;
> +    }
> 
> soo - how do the existing predicated expressions change to be handled here?
> Why would they ever be simplified?  Should this simplification not better
> happen
> upthread when visiting the conditional?  Maybe that needs to use the equivs
> already?

This happens when updating variable A's equiv_head from B to C, and 
re-inserting former predicates of B with C. For example, "x < 0" is recorded,
then x is assigned with new equiv_head -1 at some location, then -1 < 0 can
simplify. It shouldn't happen when inserting new predicates.

> 
> You re-use the predicated value list which I think is great, can you explain
> how "latest" is meant in the following?
> 
> @@ -4145,7 +4264,12 @@ vn_nary_op_insert_into (vn_nary_op_t vno,
> vn_nary_op_table_type *table,
>               next = &(*next)->next;
>             }
>           if (!found)
> -           *next = nval;
> +           {
> +             /* Insert new value at the front, so that lookup_equiv_head can
> +              * find the latest result.  */
> +             nval->next = vno->u.values;
> +             vno->u.values = nval;
> +           }

It meant the "most general" result. (I picked the wrong word. The description
is changed in the new patch file. )

There can be multiple valid results at the same time. As we traverse in RPO and
update equiv-heads (i.e. unite disjoint-sets), the last inserted result
represents the most complete set of equivalences of the variable.

(In the last patch version, when "Appending predicate to value", the new result
is not moved to the front. That violates the property of equiv-heads and some
opportunities will be lost. Fixed this in the new patch file.)

> 
> OK, I have to stop now but will try to understand and review further next week.
> Sorry for the delays but the patch is quite complex to understand :/
> 
> Richard.
> 
> > 2021-09-16  Di Zhao  <dizhao@os.amperecomputing.com>
> >

Other changes:
- Fixed some errs in comments.
- On recording temporary equivalences generated by PHI nodes, altered the
  logic, hope it can be a bit more clear.
  1. When a conditional expression's true/false edge E is assessed to be 
     executable, search for conflicting predicates for E to be taken. If a
     conflicting predicate is found at L, and L's single predecessor P 
     dominates current BB, then we know P->L cannot be taken if E is taken.
  2. Process phi_bb (immediately dominated by P, and dominates BB), rule out
     PHI arguments (x2) that can only come from P->L. If there's a single 
     PHI argument (x1) left, record the equivalence of x1 and PHI result x
     at edge E.
          -----
          | P | 
          -----
           |  \
           |   -----
           |   | L |  <== conflicting predicate's location
           |   -----
           |    /
  -------------------------
  | x = PHI<x1(P), x2(L)> | phi_bb
  -------------------------
             |
            ....
             |
           -----
           | BB | <== current bb
           -----
          /     \ edge E
        ...     -------
                | dest |
                -------

- Fixed a correctness problem when recording equivalences from PHI in previous
  version. Also test case ssa-fre-95.c:k in previous version was incorrect,
  fixed that.

Thanks,
Di Zhao

---
Extend FRE with temporary equivalences.

2021-10-24  Di Zhao  <dizhao@os.amperecomputing.com>

gcc/ChangeLog:
        PR tree-optimization/101186
        * tree-ssa-sccvn.c (VN_INFO): remove assertions (there could be a
        predicate already).
        (dominated_by_p_w_unex): Moved upward.
        (vn_nary_op_get_predicated_value): Moved upward.
        (is_vn_valid_at_bb): Check if vn_pval is valid at BB.
        (lookup_equiv_head): Lookup the "equivalence head" of given node.
        (lookup_equiv_heads): Lookup the "equivalence head"s of given nodes.
        (vn_tracking_edge): Extracted utility function.
        (init_vn_nary_op_from_stmt): Insert and lookup by "equivalence head"s.
        (vn_nary_op_insert_into): Insert new value at the front.
        (vn_nary_op_insert_pieces_predicated_1): Insert as predicated values
        from pieces.
        (fold_const_from_equiv_heads): Fold N-ary expression of equiv-heads.
        (push_new_nary_ref): Insert a back-reference to vn_nary_op_t.
        (val_equiv_insert): Record temporary equivalence.
        (vn_nary_op_insert_pieces_predicated): Record equivalences instead of
        some predicates; insert back-refs.
        (record_equiv_from_prev_phi_1): Record temporary equivalences generated
        by PHI nodes.
        (record_equiv_from_prev_phi): Given an outgoing edge of a conditional
        expression taken, record equivalences generated by PHI nodes.
        (visit_nary_op): Add lookup previous results of N-ary operations by
        equivalences.
        (insert_related_predicates_on_edge): Some predicates can be computed
        from equivalences, no need to insert them.
        (process_bb): Add lookup predicated values by equivalenc~/tmp/time-frees.
        (struct unwind_state): Unwind state of back-refs to vn_nary_op_t.
        (do_unwind): Unwind the back-refs to vn_nary_op_t.
        (do_rpo_vn): Update back-reference unwind state.
        * tree-ssa-sccvn.h (struct nary_ref): hold a lists of references to the
        nary map entries.

gcc/testsuite/ChangeLog:

        * gcc.dg/tree-ssa/pr68619-2.c: Disable fre.
        * gcc.dg/tree-ssa/pr71947-1.c: Disable fre.
        * gcc.dg/tree-ssa/pr71947-2.c: Disable fre.
        * gcc.dg/tree-ssa/pr71947-3.c: Disable fre.
        * gcc.dg/tree-ssa/pr71947-5.c: Disable fre.
        * gcc.dg/tree-ssa/pr71947-7.c: Disable fre.
        * gcc.dg/tree-ssa/pr71947-8.c: Disable fre.
        * gcc.dg/tree-ssa/pr71947-9.c: Disable fre.
        * gcc.dg/tree-ssa/vrp03.c: Disable fre.
        * gcc.dg/tree-ssa/ssa-fre-100.c: New test.
        * gcc.dg/tree-ssa/ssa-fre-101.c: New test.
        * gcc.dg/tree-ssa/ssa-fre-102.c: New test.
        * gcc.dg/tree-ssa/ssa-pre-34.c: New test.


[-- Attachment #2: v3-tree-optimization-101186.patch --]
[-- Type: application/octet-stream, Size: 37568 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr68619-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr68619-2.c
index cca706e0c4f..0a99c3c620e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr68619-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr68619-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dom2-details -w" } */
+/* { dg-options "-O2 -fno-tree-fre -fdump-tree-dom2-details -w" } */
 
 typedef union tree_node *tree;
 struct gcc_options
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
index ac8271cc574..e9a797e82b0 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */ 
-/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+/* { dg-options "-O2 -fno-tree-vrp -fno-tree-fre -fdump-tree-dom-details" } */
 
 
 int f(int x, int y)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
index b2c09cbb021..bcbc419575c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+/* { dg-options "-O2 -fno-tree-vrp -fno-tree-fre -fdump-tree-dom-details" } */
 
 
 int f(int x, int y)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
index 2316f7e1c04..d63291da440 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+/* { dg-options "-O2 -fno-tree-vrp -fno-tree-fre -fdump-tree-dom-details" } */
 
 int f(int x, int y)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-5.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-5.c
index e7038d0237f..c1e5667b45a 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-5.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-5.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+/* { dg-options "-O2 -fno-tree-vrp -fno-tree-fre -fdump-tree-dom-details" } */
 
 
 static inline long load(long *p)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-7.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-7.c
index b44c064aa23..069e4a106c3 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+/* { dg-options "-O2 -fno-tree-vrp -fno-tree-fre -fdump-tree-dom-details" } */
 
 
 int f(int x, int y)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-8.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-8.c
index 26e5abbdc29..26b303e896e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-8.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-8.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+/* { dg-options "-O2 -fno-tree-vrp -fno-tree-fre -fdump-tree-dom-details" } */
 
 
 int f(int x, int y)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-9.c b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-9.c
index 22b68d5ede0..3ed3bf6ad29 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr71947-9.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr71947-9.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fno-tree-vrp -fdump-tree-dom-details" } */
+/* { dg-options "-O2 -fno-tree-vrp -fno-tree-fre -fdump-tree-dom-details" } */
 
 
 int f(int x, int y)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c
new file mode 100644
index 00000000000..ccb0fdaa698
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-100.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fre1" } */
+
+extern void bar(void);
+extern void boo(void);
+extern void foo(void);
+
+/* Test temporary equivalences: basic.  */
+
+void f (unsigned int a, unsigned int b)
+{
+  if (a == b)
+    {
+      for (unsigned i = 0; i < a; i++)
+	{
+	  bar ();
+	  if (i >= b)
+	    /* Unreachable.  */
+	    foo ();
+	}
+    }
+}
+
+/* Test temporary equivalences: transitive.  */
+
+void g (unsigned int a, unsigned int b, unsigned int c)
+{
+  for (unsigned i = 0; i < c; i++)
+    {
+      if (a == b)
+	{
+	  if (b == c)
+	    {
+	      boo ();
+	      if (i >= a)
+		/* Unreachable.  */
+		foo ();
+	    }
+	}
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "foo" "fre1" } } */
+/* { dg-final { scan-tree-dump "bar" "fre1" } } */
+/* { dg-final { scan-tree-dump "boo" "fre1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-101.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-101.c
new file mode 100644
index 00000000000..48954d50f2e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-101.c
@@ -0,0 +1,74 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fre1" } */
+
+extern void bar(void);
+extern void bag(void);
+extern void bas(void);
+extern void foo(void);
+
+void f (unsigned int a, unsigned int b)
+{
+  if (a == b)
+    {
+      for (unsigned i = a; i < 100; i++)
+	{
+	  if (i > b)
+	    /* Should not be eliminated.  */
+	    bar ();
+	}
+    }
+}
+
+/* Test temporary equivalences: derived condition.  */
+
+void g (unsigned int a, unsigned int b, unsigned int d)
+{
+  unsigned int c = 100;
+  if (a > 0)
+    c = b;
+  else
+    c = d;
+  for (unsigned i = 0; i < c; i++)
+    {
+      if (a >= 0)
+	{
+	  if (i >= b)
+	    /* Should not be eliminated. ("a >= 0" cannot derive "a > 0".)  */
+	    bas ();
+	}
+      else
+	{
+	  if (i >= d)
+	    /* Should be eliminated, as "a < 0" can derive "a <= 0".  */
+	    foo ();
+	}
+      i++;
+    }
+}
+
+/* Test temporary equivalences: no domination.  */
+
+void h (unsigned int a, unsigned int b)
+{
+  unsigned int c = 100;
+  if (b % 2)
+    {
+      if (a != 0)
+	c = b;
+    }
+  for (unsigned i = 0; i < c; i++)
+    {
+      if (a != 0)
+	{
+	  if (i >= b)
+	    /* Should not be eliminated.  */
+	    bag ();
+	}
+      i++;
+    }
+}
+
+/* { dg-final { scan-tree-dump "bar" "fre1" } } */
+/* { dg-final { scan-tree-dump "bas" "fre1" } } */
+/* { dg-final { scan-tree-dump "bag" "fre1" } } */
+/* { dg-final { scan-tree-dump-not "foo" "fre1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-102.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-102.c
new file mode 100644
index 00000000000..02f12929cbb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-102.c
@@ -0,0 +1,45 @@
+
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-fre1" } */
+
+extern void bar(void);
+extern void bag(void);
+extern void foo(void);
+int g;
+
+/* Test temporary equivalences: optimize N-ary expressions.  */
+
+void
+f (unsigned int a, unsigned int b, unsigned int x, unsigned int y)
+{
+  if (a == b)
+    {
+      g = 100;
+      if (a + x == 99)
+	g = x + b; /* should be simplified to 99 */
+    }
+  else
+    // a!= b
+    {
+      if (a == 100 - x)
+	{
+	  g++;
+	  if (b == 100 - x)
+	    foo (); /* should be removed */
+	}
+      else
+	{
+	  g--;
+	  if (b == 100 - x)
+	    bag (); /* should not be removed */
+	  else
+	    bar (); /* should not be removed */
+	}
+    }
+}
+
+/* { dg-final { scan-tree-dump "bag" "fre1" } } */
+/* { dg-final { scan-tree-dump "bar" "fre1" } } */
+/* { dg-final { scan-tree-dump-not "foo" "fre1" } } */
+/* { dg-final { scan-tree-dump-not "= b_* + x_" "fre1" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-34.c b/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-34.c
new file mode 100644
index 00000000000..dba9aed338a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-34.c
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-pre" } */
+
+extern void foo(void);
+extern void bag(void);
+extern void bas(void);
+
+/* Test temporary equivalences: identical condition.  */
+
+void h (int a, int b, int x, int y)
+{
+  int c = y;
+  if (a != 0)
+    c = x;
+  while (b < 1000)
+    {
+      if (a != 0)
+	{
+	  if (c > x)
+	    /* Unreachable.  */
+	    foo ();
+	}
+      else
+	bas ();
+      b++;
+    }
+}
+
+/* Test temporary equivalences: derived condition.  */
+
+void k (int a, int b, int x, int y)
+{
+  int c = y;
+  if (a != 0)
+    c = x;
+  while (b < 1000)
+    {
+      if (a > 0)
+	/* All paths reaching "a>0 is true" come from "a!=0 is true".  */
+	{
+	  if (c > x)
+	    /* Unreachable.  */
+	    foo ();
+	}
+      else
+	bag ();
+      b++;
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "foo" "fre1" } } */
+/* { dg-final { scan-tree-dump "bag" "fre1" } } */
+/* { dg-final { scan-tree-dump "bas" "fre1" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c b/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
index 1d7ea4e8ffb..87df31d4f0d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/vrp03.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdisable-tree-evrp -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
+/* { dg-options "-O2 -fdisable-tree-evrp -fno-tree-fre -fdump-tree-vrp1 -fdisable-tree-ethread -fdisable-tree-thread1" } */
 
 struct A
 {
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 893b1d0ddaa..8e3cf28b937 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -353,6 +353,8 @@ static vn_reference_t last_inserted_ref;
 static vn_phi_t last_inserted_phi;
 static vn_nary_op_t last_inserted_nary;
 static vn_ssa_aux_t last_pushed_avail;
+static vn_ssa_aux_t last_pushed_nary_ref;
+static nary_ref* nary_ref_freelist;
 
 /* Valid hashtables storing information we have proven to be
    correct.  */
@@ -491,9 +493,9 @@ VN_INFO (tree name)
 	    nary->predicated_values = 0;
 	    nary->u.result = boolean_true_node;
 	    vn_nary_op_insert_into (nary, valid_info->nary, true);
-	    gcc_assert (nary->unwind_to == NULL);
 	    /* Also do not link it into the undo chain.  */
 	    last_inserted_nary = nary->next;
+	    /* There could be a predicate already.  */
 	    nary->next = (vn_nary_op_t)(void *)-1;
 	    nary = alloc_vn_nary_op_noinit (2, &vn_tables_insert_obstack);
 	    init_vn_nary_op_from_pieces (nary, 2, EQ_EXPR,
@@ -501,7 +503,6 @@ VN_INFO (tree name)
 	    nary->predicated_values = 0;
 	    nary->u.result = boolean_false_node;
 	    vn_nary_op_insert_into (nary, valid_info->nary, true);
-	    gcc_assert (nary->unwind_to == NULL);
 	    last_inserted_nary = nary->next;
 	    nary->next = (vn_nary_op_t)(void *)-1;
 	    if (dump_file && (dump_flags & TDF_DETAILS))
@@ -3847,6 +3848,114 @@ vn_reference_insert_pieces (tree vuse, alias_set_type set,
   return vr1;
 }
 
+static bool
+dominated_by_p_w_unex (basic_block bb1, basic_block bb2, bool);
+
+static tree
+vn_nary_op_get_predicated_value (vn_nary_op_t vno, basic_block bb)
+{
+  if (! vno->predicated_values)
+    return vno->u.result;
+  for (vn_pval *val = vno->u.values; val; val = val->next)
+    for (unsigned i = 0; i < val->n; ++i)
+      {
+	/* Do not handle backedge executability optimistically since
+	 when figuring out whether to iterate we do not consider
+	 changed predication.  */
+	if (dominated_by_p_w_unex (
+	      bb, BASIC_BLOCK_FOR_FN (cfun, val->valid_dominated_by_p[i]),
+	      false))
+	return val->result;
+      }
+  return NULL_TREE;
+}
+
+/* Whether a predicated value VAL is valid at BB.  */
+
+static inline bool
+is_vn_valid_at_bb (vn_pval *val, basic_block bb)
+{
+  for (unsigned i = 0; i < val->n; ++i)
+    {
+      basic_block val_bb
+	= BASIC_BLOCK_FOR_FN (cfun, val->valid_dominated_by_p[i]);
+      if (dominated_by_p_w_unex (bb, val_bb, false))
+	return true;
+    }
+  return false;
+}
+
+/* Lookup the variable that represents a set of equivalences of NAME at
+   basic-block BB.  This variable is called the "equivalence head" of NAME.  */
+
+static tree
+lookup_equiv_head (tree name, basic_block bb)
+{
+  if (TREE_CODE (name) != SSA_NAME)
+    return name;
+  tree result = VN_INFO (name)->valnum;
+  if (result == VN_TOP)
+    /* VN_TOP is not allowed, otherwise it will be too optimistic for
+       non-iterating mode.  */
+    return name;
+  vn_nary_op_t vnresult;
+  /* As we insert new value at front in vn_nary_op_insert_into, we don't need
+     search iteratively for the most general equiv-head.  */
+  vn_nary_op_lookup_pieces (1, NOP_EXPR, TREE_TYPE (result), &result,
+			    &vnresult);
+  if (vnresult && vnresult->predicated_values)
+    {
+      tree eq = vn_nary_op_get_predicated_value (vnresult, bb);
+      if (eq)
+	return eq;
+    }
+  return result;
+}
+
+/* Lookup the "equivalence head"s of OPS at BB, and store them in TO.  N is the
+   size of OPS.  */
+
+static void
+lookup_equiv_heads (unsigned n, tree *ops, tree *to, basic_block bb)
+{
+  for (unsigned i = 0; i < n; i++)
+    {
+      tree equiv_head = lookup_equiv_head (ops[i], bb);
+      if (dump_file && (dump_flags & TDF_DETAILS) && equiv_head != ops[i])
+	{
+	  fprintf (dump_file, "Use equiv-head ");
+	  print_generic_expr (dump_file, equiv_head, TDF_NONE);
+	  fprintf (dump_file, " for ");
+	  print_generic_expr (dump_file, ops[i], TDF_NONE);
+	  fprintf (dump_file, "\n");
+	}
+      to[i] = equiv_head;
+    }
+}
+
+/* Returns whether edge PRED_E is currently being tracked.  */
+
+static inline bool
+vn_tracking_edge (edge pred_e)
+{
+  if (!single_pred_p (pred_e->dest))
+    {
+      /* Never record for backedges.  */
+      if (pred_e->flags & EDGE_DFS_BACK)
+	return false;
+      edge_iterator ei;
+      edge e;
+      int cnt = 0;
+      /* Ignore backedges.  */
+      FOR_EACH_EDGE (e, ei, pred_e->dest->preds)
+	if (!dominated_by_p (CDI_DOMINATORS, e->src, e->dest))
+	  cnt++;
+      if (cnt != 1)
+	return false;
+    }
+  return true;
+}
+
 /* Compute and return the hash value for nary operation VBO1.  */
 
 static hashval_t
@@ -3984,6 +4093,9 @@ init_vn_nary_op_from_stmt (vn_nary_op_t vno, gassign *stmt)
       for (i = 0; i < vno->length; ++i)
 	vno->op[i] = gimple_op (stmt, i + 1);
     }
+  /* Insert and lookup N-ary results by the operands' equivalence heads.  */
+  if (gimple_bb (stmt))
+    lookup_equiv_heads (vno->length, vno->op, vno->op, gimple_bb (stmt));
 }
 
 /* Compute the hashcode for VNO and look for it in the hash table;
@@ -4126,12 +4238,10 @@ vn_nary_op_insert_into (vn_nary_op_t vno, vn_nary_op_table_type *table,
 	    = BASIC_BLOCK_FOR_FN (cfun, vno->u.values->valid_dominated_by_p[0]);
 	  vn_pval *nval = vno->u.values;
 	  vn_pval **next = &vno->u.values;
-	  bool found = false;
 	  for (vn_pval *val = (*slot)->u.values; val; val = val->next)
 	    {
 	      if (expressions_equal_p (val->result, nval->result))
 		{
-		  found = true;
 		  for (unsigned i = 0; i < val->n; ++i)
 		    {
 		      basic_block val_bb
@@ -4145,17 +4255,15 @@ vn_nary_op_insert_into (vn_nary_op_t vno, vn_nary_op_table_type *table,
 			gcc_unreachable ();
 		    }
 		  /* Append value.  */
-		  *next = (vn_pval *) obstack_alloc (&vn_tables_obstack,
-						     sizeof (vn_pval)
-						     + val->n * sizeof (int));
-		  (*next)->next = NULL;
-		  (*next)->result = val->result;
-		  (*next)->n = val->n + 1;
-		  memcpy ((*next)->valid_dominated_by_p,
-			  val->valid_dominated_by_p,
+		  nval = (vn_pval *) obstack_alloc (&vn_tables_obstack,
+						    sizeof (vn_pval)
+						      + val->n * sizeof (int));
+		  nval->next = NULL;
+		  nval->result = val->result;
+		  nval->n = val->n + 1;
+		  memcpy (nval->valid_dominated_by_p, val->valid_dominated_by_p,
 			  val->n * sizeof (int));
-		  (*next)->valid_dominated_by_p[val->n] = vno_bb->index;
-		  next = &(*next)->next;
+		  nval->valid_dominated_by_p[val->n] = vno_bb->index;
 		  if (dump_file && (dump_flags & TDF_DETAILS))
 		    fprintf (dump_file, "Appending predicate to value.\n");
 		  continue;
@@ -4168,8 +4276,10 @@ vn_nary_op_insert_into (vn_nary_op_t vno, vn_nary_op_table_type *table,
 	      (*next)->next = NULL;
 	      next = &(*next)->next;
 	    }
-	  if (!found)
-	    *next = nval;
+	  /* Put new result at front, so lookup_equiv_head can find the last
+	     inserted (most general) result.  */
+	  nval->next = vno->u.values;
+	  vno->u.values = nval;
 
 	  *slot = vno;
 	  vno->next = last_inserted_nary;
@@ -4214,6 +4324,86 @@ vn_nary_op_insert_pieces (unsigned int length, enum tree_code code,
   return vn_nary_op_insert_into (vno1, valid_info->nary, true);
 }
 
+/* Similar with vn_nary_op_insert_pieces, but insert as predicated values.  */
+
+static vn_nary_op_t
+vn_nary_op_insert_pieces_predicated_1 (unsigned int length, enum tree_code code,
+				       tree type, tree *ops, tree result,
+				       unsigned int value_id, basic_block bb)
+{
+  vn_nary_op_t vno1 = alloc_vn_nary_op (length, NULL_TREE, value_id);
+  init_vn_nary_op_from_pieces (vno1, length, code, type, ops);
+  vno1->predicated_values = 1;
+  vno1->u.values
+    = (vn_pval *) obstack_alloc (&vn_tables_obstack, sizeof (vn_pval));
+  vno1->u.values->next = NULL;
+  vno1->u.values->result = result;
+  vno1->u.values->n = 1;
+  vno1->u.values->valid_dominated_by_p[0] = bb->index;
+  return vn_nary_op_insert_into (vno1, valid_info->nary, true);
+}
+
+/* Fold N-ary expression of equiv-heads, return NULL_TREE if not folded.  */
+
+static tree
+fold_const_from_equiv_heads (unsigned length, tree_code opcode, tree *op,
+			     tree type)
+{
+  tree result = NULL_TREE;
+  /* We're using equiv-heads, i.e. the most general operands, shortcut for
+     situations that can fold.  */
+  switch (length)
+    {
+    case 1:
+      if (CONSTANT_CLASS_P (op[0]))
+	result
+	  = gimple_simplify (opcode, type, op[0], NULL, no_follow_ssa_edges);
+      break;
+    case 2:
+      if ((CONSTANT_CLASS_P (op[0]) && CONSTANT_CLASS_P (op[1]))
+	  || op[0] == op[1])
+	result = gimple_simplify (opcode, type, op[0], op[1], NULL,
+				  no_follow_ssa_edges);
+      break;
+    case 3:
+      if (CONSTANT_CLASS_P (op[0]) && CONSTANT_CLASS_P (op[1])
+	  && CONSTANT_CLASS_P (op[2]))
+	result = gimple_simplify (opcode, type, op[0], op[1], op[2], NULL,
+				  no_follow_ssa_edges);
+      break;
+    default:
+      break;
+    }
+  if (result && !useless_type_conversion_p (type, TREE_TYPE (result)))
+    result = fold_convert (type, result);
+  return result;
+}
+
+/* Insert a back-reference to vn_nary_op_t in vn_ssa_aux.  */
+
+static inline void
+push_new_nary_ref (vn_nary_op_t slot, vn_ssa_aux_t info)
+{
+  nary_ref *ref;
+  if (nary_ref_freelist)
+    {
+      ref = nary_ref_freelist;
+      nary_ref_freelist = nary_ref_freelist->next;
+    }
+  else
+    ref = XOBNEW (&vn_ssa_aux_obstack, nary_ref);
+  ref->next = info->ref;
+  ref->next_undo = last_pushed_nary_ref;
+  ref->slot = slot;
+  last_pushed_nary_ref = info;
+  info->ref = ref;
+  ++info->num_nary_ref;
+  gcc_assert (info->num_nary_ref > 0);
+}
+
+static vn_nary_op_t
+val_equiv_insert (tree op1, tree op2, edge e);
+
 static vn_nary_op_t
 vn_nary_op_insert_pieces_predicated (unsigned int length, enum tree_code code,
 				     tree type, tree *ops,
@@ -4221,21 +4411,15 @@ vn_nary_op_insert_pieces_predicated (unsigned int length, enum tree_code code,
 				     edge pred_e)
 {
   /* ???  Currently tracking BBs.  */
-  if (! single_pred_p (pred_e->dest))
-    {
-      /* Never record for backedges.  */
-      if (pred_e->flags & EDGE_DFS_BACK)
-	return NULL;
-      edge_iterator ei;
-      edge e;
-      int cnt = 0;
-      /* Ignore backedges.  */
-      FOR_EACH_EDGE (e, ei, pred_e->dest->preds)
-	if (! dominated_by_p (CDI_DOMINATORS, e->src, e->dest))
-	  cnt++;
-      if (cnt != 1)
-	return NULL;
-    }
+  if (!vn_tracking_edge (pred_e))
+    return NULL;
+  /* Equivalence can represent this.  */
+  if (code == EQ_EXPR && result == boolean_true_node)
+    return val_equiv_insert (ops[0], ops[1], pred_e);
+  /* Leave this for inverted condition.  */
+  if (code == NE_EXPR && result == boolean_false_node)
+    return NULL;
+
   if (dump_file && (dump_flags & TDF_DETAILS)
       /* ???  Fix dumping, but currently we only get comparisons.  */
       && TREE_CODE_CLASS (code) == tcc_comparison)
@@ -4248,36 +4432,252 @@ vn_nary_op_insert_pieces_predicated (unsigned int length, enum tree_code code,
       fprintf (dump_file, " == %s\n",
 	       integer_zerop (result) ? "false" : "true");
     }
-  vn_nary_op_t vno1 = alloc_vn_nary_op (length, NULL_TREE, value_id);
-  init_vn_nary_op_from_pieces (vno1, length, code, type, ops);
-  vno1->predicated_values = 1;
-  vno1->u.values = (vn_pval *) obstack_alloc (&vn_tables_obstack,
-					      sizeof (vn_pval));
-  vno1->u.values->next = NULL;
-  vno1->u.values->result = result;
-  vno1->u.values->n = 1;
-  vno1->u.values->valid_dominated_by_p[0] = pred_e->dest->index;
-  return vn_nary_op_insert_into (vno1, valid_info->nary, true);
+  for (unsigned i = 0; i < length; i++)
+    gcc_checking_assert (lookup_equiv_head (ops[i], pred_e->dest) == ops[i]);
+
+  /* Skip if the result is known. (Caused by re-inserting predicates.)  */
+  tree simplified = fold_const_from_equiv_heads (length, code, ops, type);
+  if (simplified)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	{
+	  fprintf (dump_file, "Result is known: ");
+	  print_generic_expr (dump_file, simplified, TDF_NONE);
+	  fprintf (dump_file, ", skipped.\n");
+	}
+      return NULL;
+    }
+
+  vn_nary_op_t val
+    = vn_nary_op_insert_pieces_predicated_1 (length, code, type, ops,
+					     result, value_id, pred_e->dest);
+  /* Insert back-refs (operand->slot) to vn_ssa_aux structure.  */
+  for (unsigned i = 0; i < length; i++)
+    if (TREE_CODE (ops[i]) == SSA_NAME)
+      {
+	vn_ssa_aux_t info = VN_INFO (ops[i]);
+	if (info->valnum == ops[i])
+	  push_new_nary_ref (val, info);
+      }
+  return val;
 }
 
-static bool
-dominated_by_p_w_unex (basic_block bb1, basic_block bb2, bool);
+/* Record the equivalence of OP1 and OP2 on edge E.  */
 
-static tree
-vn_nary_op_get_predicated_value (vn_nary_op_t vno, basic_block bb)
+static vn_nary_op_t
+val_equiv_insert (tree op1, tree op2, edge e)
 {
-  if (! vno->predicated_values)
-    return vno->u.result;
+  /* Record by equivalence heads.  */
+  basic_block bb = e->dest;
+  tree lhs = lookup_equiv_head (op1, bb);
+  tree rhs = lookup_equiv_head (op2, bb);
+  if (lhs == rhs)
+    return NULL;
+  /* Set RHS to be new equiv-head for LHS.  LHS must be SSA_NAME.  */
+  if (TREE_CODE (lhs) != SSA_NAME)
+    {
+      if (TREE_CODE (rhs) == SSA_NAME)
+	std::swap (lhs, rhs);
+      else
+	return NULL;
+    }
+  vn_ssa_aux_t rhs_info = NULL, lhs_info = VN_INFO (lhs);
+
+  /* Choose the hand-side with more recorded n-ary expressions as new
+     equivalence head, to make fewer re-insertions.  */
+  if (TREE_CODE (rhs) == SSA_NAME)
+    {
+      rhs_info = VN_INFO (rhs);
+      if (rhs_info->num_nary_ref < lhs_info->num_nary_ref)
+	{
+	  std::swap (lhs, rhs);
+	  std::swap (rhs_info, lhs_info);
+	}
+    }
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "Recording equivalence of ");
+      print_generic_expr (dump_file, lhs);
+      fprintf (dump_file, " and ");
+      print_generic_expr (dump_file, rhs);
+      fprintf (dump_file, " at BB%d\n", bb->index);
+    }
+  /* Record equivalence as unary NOP_EXPR.  */
+  vn_nary_op_t val
+    = vn_nary_op_insert_pieces_predicated_1 (1, NOP_EXPR, TREE_TYPE (lhs),
+					     &lhs, rhs, 0, bb);
+  /* Back-ref for equivalences is result->slot rather than op->slot, so that
+     when new equiv-head is appended, we can update them.  */
+  if (TREE_CODE (rhs) == SSA_NAME)
+    push_new_nary_ref (val, rhs_info);
+
+  /* Iterate on LHS's references, first update equiv-heads, then re-insert
+     former predicates, so that the result will be latest.  */
+  auto_vec<vn_nary_op_t> predicates;
+  auto_vec<tree> results;
+  vn_nary_op_t slot;
+  tree old_head = lhs_info->valnum;
+  nary_ref *ref = lhs_info->ref;
+  for (; ref; ref = ref->next)
+    {
+      slot = ref->slot;
+      if (!slot->predicated_values)
+	continue;
+      /* Update recorded equivalences with new equiv-head.  */
+      if (slot->opcode == NOP_EXPR && slot->length == 1)
+	{
+	  if (slot->op[0] == rhs)
+	    continue;
+	  for (vn_pval *pval = slot->u.values; pval; pval = pval->next)
+	    if (pval->result == old_head && is_vn_valid_at_bb (pval, bb))
+	      {
+		if (dump_file && (dump_flags & TDF_DETAILS))
+		  {
+		    fprintf (dump_file, "Updating equiv-head of ");
+		    print_generic_expr (dump_file, slot->op[0], TDF_SLIM);
+		    fprintf (dump_file, " to ");
+		    print_generic_expr (dump_file, rhs, TDF_SLIM);
+		    fprintf (dump_file, " at BB%d\n", bb->index);
+		  }
+		vn_nary_op_t new_val
+		  = vn_nary_op_insert_pieces_predicated_1 (1, NOP_EXPR,
+							   slot->type, slot->op,
+							   rhs, 0, bb);
+		/* Push back-ref (result->slot) for new equivalence.  */
+		if (TREE_CODE (rhs) == SSA_NAME)
+		  push_new_nary_ref (new_val, rhs_info);
+	      }
+	}
+      else
+	/* Predicated values are collected, to be re-inserted later.  */
+	for (vn_pval *pval = slot->u.values; pval; pval = pval->next)
+	  {
+	    if (!is_vn_valid_at_bb (pval, bb))
+	      continue;
+	    predicates.safe_push (slot);
+	    results.safe_push (pval->result);
+	  }
+    }
+  unsigned i;
+  if (predicates.length () > 0 && dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "Re-inserting %d predicates...\n",
+	     predicates.length ());
+  tree new_ops[3];
+  FOR_EACH_VEC_ELT (predicates, i, slot)
+    {
+      lookup_equiv_heads (slot->length, slot->op, new_ops, e->dest);
+      vn_nary_op_insert_pieces_predicated (slot->length, slot->opcode,
+					   slot->type, new_ops, results[i],
+					   slot->value_id, e);
+    }
+  return val;
+}
+
+/* Record temporary equivalences generated by PHI nodes, and are valid when
+   BB's outgoing edge E is taken.  VNO contains the interested expression.  If
+   INVERSE is true, rule out VNO's results that are different with RESULT;
+   otherwise rule out RESULT itself.  */
+
+static void
+record_equiv_from_prev_phi_1 (vn_nary_op_t vno, edge e, basic_block bb,
+			      tree result, bool inverse)
+{
+  if (!(vno && vno->predicated_values))
+    return;
+
+  /* Process predicated values in VNO, looking for a conflicting result, and
+     the location's single_pred dominates BB.  */
   for (vn_pval *val = vno->u.values; val; val = val->next)
-    for (unsigned i = 0; i < val->n; ++i)
-      /* Do not handle backedge executability optimistically since
-	 when figuring out whether to iterate we do not consider
-	 changed predication.  */
-      if (dominated_by_p_w_unex
-	    (bb, BASIC_BLOCK_FOR_FN (cfun, val->valid_dominated_by_p[i]),
-	     false))
-	return val->result;
-  return NULL_TREE;
+    {
+      if (inverse ? val->result == result : val->result != result)
+	continue;
+      /* Found a result to rule out, process each location.  */
+      for (unsigned i = 0; i < val->n; ++i)
+	{
+	  basic_block bb1
+	    = BASIC_BLOCK_FOR_FN (cfun, val->valid_dominated_by_p[i]);
+	  if (!single_pred_p (bb1))
+	    continue;
+	  basic_block p = single_pred (bb1);
+	  if (!dominated_by_p_w_unex (bb, p, false))
+	    continue;
+	  gcc_assert (EDGE_COUNT (p->succs) == 2);
+	  /* Iterate on possible phi_bbs (can be BB itself). Considering
+	     non-executable edges, there can be multiple phi_bbs.  */
+	  for (basic_block phi_bb : get_dominated_by (CDI_DOMINATORS, p))
+	    {
+	      if (!dominated_by_p_w_unex (bb, phi_bb, false))
+		continue;
+	      /* Process PHIs, rule out incoming edges that definitely come
+		 from p->bb1.  If there's a single PHI argument X left, record
+		 the equivalence of X and PHI result at edge e.  */
+	      for (gphi_iterator gsi = gsi_start_nonvirtual_phis (phi_bb);
+		   !gsi_end_p (gsi); gsi_next_nonvirtual_phi (&gsi))
+		{
+		  gphi *phi = gsi.phi ();
+		  tree equiv = NULL_TREE;
+		  for (unsigned j = 0; j < gimple_phi_num_args (phi); ++j)
+		    {
+		      edge in = gimple_phi_arg_edge (phi, j);
+		      if (in->src == bb1
+			  || dominated_by_p_w_unex (in->src, bb1, false))
+			/* Definitely comes from p->bb1.  */
+			continue;
+		      tree arg = vn_valueize (gimple_phi_arg_def (phi, j));
+		      if (equiv && equiv != arg)
+			{
+			  equiv = NULL_TREE;
+			  break;
+			}
+		      equiv = arg;
+		    }
+		  if (equiv)
+		    val_equiv_insert (equiv, PHI_RESULT (phi), e);
+		}
+	    }
+	}
+    }
+}
+
+/* Record temporary equivalences generated by PHI nodes, and are valid when
+   BB's outgoing edge E is taken.  CODE, OPS and RESULT are the condition and
+   result for E to be taken.  VNO contains the condition's results recorded in
+   nary map.
+
+   The idea is to find a conflicting predicate at some location L, where L's
+   predecessor P dominates BB, so we know path P->L will not be taken if E is
+   taken.  If by ruling out P->L can generate some equivalences, record them at
+   E->dest.  */
+
+static void
+record_equiv_from_prev_phi (vn_nary_op_t vno, edge e, basic_block bb,
+			    tree_code code, tree *ops, tree result)
+{
+  if (!vn_tracking_edge (e))
+    return;
+  record_equiv_from_prev_phi_1 (vno, e, bb, result, true);
+
+  /* For predicates like "a==b is false" or "a!=b is true", equivalences are
+     conflicting results, i.e. "NO_OP (a) is b" or "NO_OP (b) is a".  */
+  if ((result == boolean_true_node
+       && (code == NE_EXPR || code == GT_EXPR || code == LT_EXPR))
+      || (result == boolean_false_node && code == EQ_EXPR))
+    {
+      vn_nary_op_t vnresult;
+      if (TREE_CODE (ops[0]) == SSA_NAME)
+	{
+	  vn_nary_op_lookup_pieces (1, NOP_EXPR, TREE_TYPE (ops[0]), ops,
+				    &vnresult);
+	  record_equiv_from_prev_phi_1 (vnresult, e, bb, ops[1], false);
+	}
+      if (TREE_CODE (ops[1]) == SSA_NAME)
+	{
+	  vn_nary_op_lookup_pieces (1, NOP_EXPR, TREE_TYPE (ops[1]), &ops[1],
+				    &vnresult);
+	  record_equiv_from_prev_phi_1 (vnresult, e, bb, ops[0], false);
+	}
+    }
 }
 
 /* Insert the rhs of STMT into the current hash table with a value number of
@@ -4896,8 +5296,25 @@ static bool
 visit_nary_op (tree lhs, gassign *stmt)
 {
   vn_nary_op_t vnresult;
-  tree result = vn_nary_op_lookup_stmt (stmt, &vnresult);
-  if (! result && vnresult)
+  unsigned length = vn_nary_length_from_stmt (stmt);
+  vn_nary_op_t vno
+    = XALLOCAVAR (struct vn_nary_op_s, sizeof_vn_nary_op (length));
+  init_vn_nary_op_from_stmt (vno, stmt);
+  tree result = NULL_TREE;
+  /* Try to get a simplified result.  */
+  /* Do not simplify variable used in PHI at loop exit, or
+     simplify_peeled_chrec/constant_after_peeling may miss the loop.  */
+  gimple *use_stmt;
+  use_operand_p use_p;
+  if (!(single_imm_use (lhs, &use_p, &use_stmt)
+	&& gimple_code (use_stmt) == GIMPLE_PHI
+	&& single_succ_p (gimple_bb (use_stmt))
+	&& (single_succ_edge (gimple_bb (use_stmt))->flags & EDGE_DFS_BACK)))
+    result = fold_const_from_equiv_heads (vno->length, vno->opcode, vno->op,
+					  vno->type);
+  if (!result)
+    result = vn_nary_op_lookup_1 (vno, &vnresult);
+  if (!result && vnresult)
     result = vn_nary_op_get_predicated_value (vnresult, gimple_bb (stmt));
   if (result)
     return set_ssa_val_to (lhs, result);
@@ -7170,12 +7587,8 @@ insert_related_predicates_on_edge (enum tree_code code, tree *ops, edge pred_e)
 					   ops, boolean_false_node, 0, pred_e);
       break;
     case EQ_EXPR:
-      /* a == b -> ! a {<,>} b */
-      vn_nary_op_insert_pieces_predicated (2, LT_EXPR, boolean_type_node,
-					   ops, boolean_false_node, 0, pred_e);
-      vn_nary_op_insert_pieces_predicated (2, GT_EXPR, boolean_type_node,
-					   ops, boolean_false_node, 0, pred_e);
-      break;
+      /* No need to insert derived predicates for '==', as they can be computed
+	 by equiv-heads & fold_const_from_equiv_heads.  */
     case LE_EXPR:
     case GE_EXPR:
     case NE_EXPR:
@@ -7342,24 +7755,31 @@ process_bb (rpo_elim &avail, basic_block bb,
       switch (gimple_code (last))
 	{
 	case GIMPLE_SWITCH:
+	  /* TODO: utilize temporary equivalences.  */
 	  e = find_taken_edge (bb, vn_valueize (gimple_switch_index
 						(as_a <gswitch *> (last))));
 	  break;
 	case GIMPLE_COND:
 	  {
-	    tree lhs = vn_valueize (gimple_cond_lhs (last));
-	    tree rhs = vn_valueize (gimple_cond_rhs (last));
-	    tree val = gimple_simplify (gimple_cond_code (last),
-					boolean_type_node, lhs, rhs,
-					NULL, vn_valueize);
+	    tree lhs = lookup_equiv_head (gimple_cond_lhs (last), bb);
+	    tree rhs = lookup_equiv_head (gimple_cond_rhs (last), bb);
+	    tree ops[2];
+	    ops[0] = lhs;
+	    ops[1] = rhs;
+	    tree val = fold_const_from_equiv_heads (2, gimple_cond_code (last),
+						    ops, boolean_type_node);
+	    if (val && (dump_flags & TDF_DETAILS))
+	      {
+		fprintf (dump_file, "Got simplified result ");
+		print_generic_expr (dump_file, val, TDF_NONE);
+		fprintf (dump_file, " for ");
+		print_gimple_stmt (dump_file, last, TDF_SLIM);
+	      }
+	    vn_nary_op_t vnresult = NULL;
 	    /* If the condition didn't simplfy see if we have recorded
 	       an expression from sofar taken edges.  */
 	    if (! val || TREE_CODE (val) != INTEGER_CST)
 	      {
-		vn_nary_op_t vnresult;
-		tree ops[2];
-		ops[0] = lhs;
-		ops[1] = rhs;
 		val = vn_nary_op_lookup_pieces (2, gimple_cond_code (last),
 						boolean_type_node, ops,
 						&vnresult);
@@ -7427,6 +7847,15 @@ process_bb (rpo_elim &avail, basic_block bb,
 		    if (false_e)
 		      insert_related_predicates_on_edge (icode, ops, false_e);
 		  }
+
+		/* Try to record equivalences generated by previous PHI nodes
+		    on current true & false edges.  */
+		if (true_e)
+		  record_equiv_from_prev_phi (vnresult, true_e, bb, code, ops,
+					      boolean_true_node);
+		if (false_e)
+		  record_equiv_from_prev_phi (vnresult, false_e, bb, code, ops,
+					      boolean_false_node);
 	      }
 	    break;
 	  }
@@ -7549,6 +7978,7 @@ struct unwind_state
   vn_phi_t phi_top;
   vn_nary_op_t nary_top;
   vn_avail *avail_top;
+  nary_ref *nary_ref_top;
 };
 
 /* Unwind the RPO VN state for iteration.  */
@@ -7598,6 +8028,17 @@ do_unwind (unwind_state *to, rpo_elim &avail)
       av->next = avail.m_avail_freelist;
       avail.m_avail_freelist = av;
     }
+  /* Drain nary_refs to freelist, to reuse the memory.  */
+  for (; last_pushed_nary_ref && last_pushed_nary_ref->ref != to->nary_ref_top;)
+    {
+      vn_ssa_aux_t val = last_pushed_nary_ref;
+      nary_ref *ref = val->ref;
+      val->ref = ref->next;
+      last_pushed_nary_ref = ref->next_undo;
+      ref->next = nary_ref_freelist;
+      nary_ref_freelist = ref;
+      --val->num_nary_ref;
+    }
 }
 
 /* Do VN on a SEME region specified by ENTRY and EXIT_BBS in FN.
@@ -7711,6 +8152,8 @@ do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
   last_inserted_phi = NULL;
   last_inserted_nary = NULL;
   last_pushed_avail = NULL;
+  last_pushed_nary_ref = NULL;
+  nary_ref_freelist = NULL;
 
   vn_valueize = rpo_vn_valueize;
 
@@ -7805,6 +8248,8 @@ do_rpo_vn (function *fn, edge entry, bitmap exit_bbs,
 	    rpo_state[idx].nary_top = last_inserted_nary;
 	    rpo_state[idx].avail_top
 	      = last_pushed_avail ? last_pushed_avail->avail : NULL;
+	    rpo_state[idx].nary_ref_top
+	      = last_pushed_nary_ref ? last_pushed_nary_ref->ref : NULL;
 	  }
 
 	if (!(bb->flags & BB_EXECUTABLE))
diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
index 8a1b649c726..65240c9bf25 100644
--- a/gcc/tree-ssa-sccvn.h
+++ b/gcc/tree-ssa-sccvn.h
@@ -217,6 +217,16 @@ struct vn_avail
   struct vn_ssa_aux *next_undo;
 };
 
+/* In vn_ssa_aux structure, hold a lists of references to the nary map entries,
+   so that when recording new equivalence to the value number, we can re-insert
+   expressions' results based on this.  */
+struct nary_ref
+{
+  nary_ref *next;
+  vn_nary_op_t slot;
+  struct vn_ssa_aux *next_undo;
+};
+
 typedef struct vn_ssa_aux
 {
   /* SSA name this vn_ssa_aux is associated with in the lattice.  */
@@ -230,9 +240,15 @@ typedef struct vn_ssa_aux
      for SSA names also serving as values (NAME == VALNUM).  */
   vn_avail *avail;
 
+  /* References to slots in the nary map, with VALNUM as operand.  */
+  nary_ref *ref;
+
   /* Unique identifier that all expressions with the same value have. */
   unsigned int value_id;
 
+  /* nary_ref count.  */
+  unsigned short num_nary_ref;
+
   /* Whether the SSA_NAME has been processed at least once.  */
   unsigned visited : 1;
 

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2021-11-08  3:21 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-08  3:21 [PING] [PATCH v2] tree-optimization/101186 - extend FRE with "equivalence map" for condition prediction Di Zhao OS

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).