public inbox for gcc-patches@gcc.gnu.org
* [PATCH v3 0/3] reassoc: Propagate PHI_LOOP_BIAS along single uses
@ 2021-09-26 11:09 Ilya Leoshkevich
  2021-09-26 11:09 ` [PATCH v3 1/3] reassoc: Do not bias loop-carried PHIs early Ilya Leoshkevich
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Ilya Leoshkevich @ 2021-09-26 11:09 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Andreas Krebbel, Ilya Leoshkevich

v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579976.html
Changes in v3:
* Do not propagate bias along tcc_references.
* Call get_rank () before checking biased_names.
* Add loop-carried phis to biased_names.
* Move the propagate_bias_p () call outside of the loop.
* Test with -ftree-vectorize, adjust expectations.

Ilya Leoshkevich (3):
  reassoc: Do not bias loop-carried PHIs early
  reassoc: Propagate PHI_LOOP_BIAS along single uses
  reassoc: Test rank biasing

 gcc/passes.def                             |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c |   7 ++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h |  33 ++++++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c |   9 ++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c |   9 ++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c |  11 ++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c |  10 ++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c |  11 ++
 gcc/tree-ssa-reassoc.c                     | 125 +++++++++++++++------
 9 files changed, 180 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c

-- 
2.31.1



* [PATCH v3 1/3] reassoc: Do not bias loop-carried PHIs early
  2021-09-26 11:09 [PATCH v3 0/3] reassoc: Propagate PHI_LOOP_BIAS along single uses Ilya Leoshkevich
@ 2021-09-26 11:09 ` Ilya Leoshkevich
  2021-09-28 11:21   ` Richard Biener
  2021-09-26 11:09 ` [PATCH v3 2/3] reassoc: Propagate PHI_LOOP_BIAS along single uses Ilya Leoshkevich
  2021-09-26 11:09 ` [PATCH v3 3/3] reassoc: Test rank biasing Ilya Leoshkevich
  2 siblings, 1 reply; 9+ messages in thread
From: Ilya Leoshkevich @ 2021-09-26 11:09 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Andreas Krebbel, Ilya Leoshkevich

Biasing loop-carried PHIs during the 1st reassociation pass interferes
with reduction chains and does not bring measurable benefits, so do it
only during the 2nd reassociation pass.
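
For illustration, a minimal made-up sketch (in the spirit of the new
tests added later in this series, not taken from any benchmark) of the
kind of reduction chain the vectorizer detects and that the early pass
should therefore leave alone:

  unsigned int
  reduction_chain (const unsigned int *a, const unsigned int *b, int n)
  {
    unsigned int sum = 0;
    for (int i = 0; i < n; i++)
      {
        /* Two serial additions into the accumulator form a reduction
           chain; with this change only the late reassoc pass biases
           sum's rank.  */
        sum += a[i];
        sum += b[i];
      }
    return sum;
  }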

gcc/ChangeLog:

            * passes.def (pass_reassoc): Rename parameter to early_p.
            * tree-ssa-reassoc.c (reassoc_bias_loop_carried_phi_ranks_p):
            New variable.
            (phi_rank): Don't bias loop-carried phi ranks
            before vectorization pass.
            (execute_reassoc): Add bias_loop_carried_phi_ranks_p parameter.
            (pass_reassoc::pass_reassoc): Add bias_loop_carried_phi_ranks_p
            initializer.
            (pass_reassoc::set_param): Set bias_loop_carried_phi_ranks_p
            value.
            (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p to
            execute_reassoc.
            (pass_reassoc::bias_loop_carried_phi_ranks_p): New member.
---
 gcc/passes.def         |  4 ++--
 gcc/tree-ssa-reassoc.c | 16 ++++++++++++++--
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/gcc/passes.def b/gcc/passes.def
index d7a1f8c97a6..c5f915d04c6 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -242,7 +242,7 @@ along with GCC; see the file COPYING3.  If not see
       /* Identify paths that should never be executed in a conforming
 	 program and isolate those paths.  */
       NEXT_PASS (pass_isolate_erroneous_paths);
-      NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
+      NEXT_PASS (pass_reassoc, true /* early_p */);
       NEXT_PASS (pass_dce);
       NEXT_PASS (pass_forwprop);
       NEXT_PASS (pass_phiopt, false /* early_p */);
@@ -325,7 +325,7 @@ along with GCC; see the file COPYING3.  If not see
       NEXT_PASS (pass_lower_vector_ssa);
       NEXT_PASS (pass_lower_switch);
       NEXT_PASS (pass_cse_reciprocals);
-      NEXT_PASS (pass_reassoc, false /* insert_powi_p */);
+      NEXT_PASS (pass_reassoc, false /* early_p */);
       NEXT_PASS (pass_strength_reduction);
       NEXT_PASS (pass_split_paths);
       NEXT_PASS (pass_tracer);
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 8498cfc7aa8..420c14e8cf5 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -180,6 +180,10 @@ along with GCC; see the file COPYING3.  If not see
    point 3a in the pass header comment.  */
 static bool reassoc_insert_powi_p;
 
+/* Enable biasing ranks of loop accumulators.  We don't want this before
+   vectorization, since it interferes with reduction chains.  */
+static bool reassoc_bias_loop_carried_phi_ranks_p;
+
 /* Statistics */
 static struct
 {
@@ -269,6 +273,9 @@ phi_rank (gimple *stmt)
   use_operand_p use;
   gimple *use_stmt;
 
+  if (!reassoc_bias_loop_carried_phi_ranks_p)
+    return bb_rank[bb->index];
+
   /* We only care about real loops (those with a latch).  */
   if (!father->latch)
     return bb_rank[bb->index];
@@ -6940,9 +6947,10 @@ fini_reassoc (void)
    optimization of a gimple conditional.  Otherwise returns zero.  */
 
 static unsigned int
-execute_reassoc (bool insert_powi_p)
+execute_reassoc (bool insert_powi_p, bool bias_loop_carried_phi_ranks_p)
 {
   reassoc_insert_powi_p = insert_powi_p;
+  reassoc_bias_loop_carried_phi_ranks_p = bias_loop_carried_phi_ranks_p;
 
   init_reassoc ();
 
@@ -6983,15 +6991,19 @@ public:
     {
       gcc_assert (n == 0);
       insert_powi_p = param;
+      bias_loop_carried_phi_ranks_p = !param;
     }
   virtual bool gate (function *) { return flag_tree_reassoc != 0; }
   virtual unsigned int execute (function *)
-    { return execute_reassoc (insert_powi_p); }
+  {
+    return execute_reassoc (insert_powi_p, bias_loop_carried_phi_ranks_p);
+  }
 
  private:
   /* Enable insertion of __builtin_powi calls during execute_reassoc.  See
      point 3a in the pass header comment.  */
   bool insert_powi_p;
+  bool bias_loop_carried_phi_ranks_p;
 }; // class pass_reassoc
 
 } // anon namespace
-- 
2.31.1



* [PATCH v3 2/3] reassoc: Propagate PHI_LOOP_BIAS along single uses
  2021-09-26 11:09 [PATCH v3 0/3] reassoc: Propagate PHI_LOOP_BIAS along single uses Ilya Leoshkevich
  2021-09-26 11:09 ` [PATCH v3 1/3] reassoc: Do not bias loop-carried PHIs early Ilya Leoshkevich
@ 2021-09-26 11:09 ` Ilya Leoshkevich
  2021-09-28 11:24   ` Richard Biener
  2021-09-26 11:09 ` [PATCH v3 3/3] reassoc: Test rank biasing Ilya Leoshkevich
  2 siblings, 1 reply; 9+ messages in thread
From: Ilya Leoshkevich @ 2021-09-26 11:09 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Andreas Krebbel, Ilya Leoshkevich

PR tree-optimization/49749 introduced code that shortens dependency
chains containing loop accumulators by placing them last on operand
lists of associative operations.

The 456.hmmer benchmark on s390 could benefit from this; however, the
code that needs it modifies the loop accumulator before using it, and
since only so-called loop-carried phis are treated as loop
accumulators, the code in its present form doesn't really help.
According to Bill Schmidt - the original author - such a conservative
approach was chosen so as to avoid unnecessarily swapping operands,
which might cause unpredictable effects.  However, giving special
treatment to forms of loop accumulators is acceptable.

A loop-carried phi is defined as a single-use phi that is used in the
same innermost loop it is defined in and that has at least one
argument defined in that same innermost loop.  Given this, it seems
natural to treat single uses of such phis as phis themselves.
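
As an illustration (a made-up sketch in the spirit of the new tests,
not the actual hmmer code), the accumulator below is transformed
through a single-use chain before the additions, and with this patch
the bias is propagated along that chain:

  unsigned int
  modified_accumulator (const unsigned int *a, const unsigned int *b, int n)
  {
    unsigned int sum = 0;
    for (int i = 0; i < n; i++)
      {
        /* Single-use chain hanging off the loop-carried phi of sum;
           its result is now still biased so that it is added last.  */
        sum |= 1;
        sum ^= 2;
        sum += a[i];
        sum += b[i];
      }
    return sum;
  }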

gcc/ChangeLog:

	* tree-ssa-reassoc.c (biased_names): New global.
	(propagate_bias_p): New function.
	(loop_carried_phi): Remove.
	(propagate_rank): Propagate bias along single uses.
	(get_rank): Update biased_names when needed.
---
 gcc/tree-ssa-reassoc.c | 109 ++++++++++++++++++++++++++++-------------
 1 file changed, 74 insertions(+), 35 deletions(-)

diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 420c14e8cf5..db9fb4e1cac 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -211,6 +211,10 @@ static int64_t *bb_rank;
 /* Operand->rank hashtable.  */
 static hash_map<tree, int64_t> *operand_rank;
 
+/* SSA_NAMEs that are forms of loop accumulators and whose ranks need to be
+   biased.  */
+static auto_bitmap biased_names;
+
 /* Vector of SSA_NAMEs on which after reassociate_bb is done with
    all basic blocks the CFG should be adjusted - basic blocks
    split right after that SSA_NAME's definition statement and before
@@ -256,6 +260,53 @@ reassoc_remove_stmt (gimple_stmt_iterator *gsi)
    the rank difference between two blocks.  */
 #define PHI_LOOP_BIAS (1 << 15)
 
+/* Return TRUE iff PHI_LOOP_BIAS should be propagated from one of the STMT's
+   operands to the STMT's left-hand side.  The goal is to preserve bias in code
+   like this:
+
+     x_1 = phi(x_0, x_2)
+     a = x_1 | 1
+     b = a ^ 2
+     .MEM = b
+     c = b + d
+     x_2 = c + e
+
+   That is, we need to preserve bias along single-use chains originating from
+   loop-carried phis.  Only GIMPLE_ASSIGNs to SSA_NAMEs are considered to be
+   uses, because only they participate in rank propagation.  */
+static bool
+propagate_bias_p (gimple *stmt)
+{
+  use_operand_p use;
+  imm_use_iterator use_iter;
+  gimple *single_use_stmt = NULL;
+
+  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_reference)
+    return false;
+
+  FOR_EACH_IMM_USE_FAST (use, use_iter, gimple_assign_lhs (stmt))
+    {
+      gimple *current_use_stmt = USE_STMT (use);
+
+      if (is_gimple_assign (current_use_stmt)
+	  && TREE_CODE (gimple_assign_lhs (current_use_stmt)) == SSA_NAME)
+	{
+	  if (single_use_stmt != NULL && single_use_stmt != current_use_stmt)
+	    return false;
+	  single_use_stmt = current_use_stmt;
+	}
+    }
+
+  if (single_use_stmt == NULL)
+    return false;
+
+  if (gimple_bb (stmt)->loop_father
+      != gimple_bb (single_use_stmt)->loop_father)
+    return false;
+
+  return true;
+}
+
 /* Rank assigned to a phi statement.  If STMT is a loop-carried phi of
    an innermost loop, and the phi has only a single use which is inside
    the loop, then the rank is the block rank of the loop latch plus an
@@ -313,49 +364,27 @@ phi_rank (gimple *stmt)
   return bb_rank[bb->index];
 }
 
-/* If EXP is an SSA_NAME defined by a PHI statement that represents a
-   loop-carried dependence of an innermost loop, return TRUE; else
-   return FALSE.  */
-static bool
-loop_carried_phi (tree exp)
-{
-  gimple *phi_stmt;
-  int64_t block_rank;
-
-  if (TREE_CODE (exp) != SSA_NAME
-      || SSA_NAME_IS_DEFAULT_DEF (exp))
-    return false;
-
-  phi_stmt = SSA_NAME_DEF_STMT (exp);
-
-  if (gimple_code (SSA_NAME_DEF_STMT (exp)) != GIMPLE_PHI)
-    return false;
-
-  /* Non-loop-carried phis have block rank.  Loop-carried phis have
-     an additional bias added in.  If this phi doesn't have block rank,
-     it's biased and should not be propagated.  */
-  block_rank = bb_rank[gimple_bb (phi_stmt)->index];
-
-  if (phi_rank (phi_stmt) != block_rank)
-    return true;
-
-  return false;
-}
-
 /* Return the maximum of RANK and the rank that should be propagated
    from expression OP.  For most operands, this is just the rank of OP.
    For loop-carried phis, the value is zero to avoid undoing the bias
    in favor of the phi.  */
 static int64_t
-propagate_rank (int64_t rank, tree op)
+propagate_rank (int64_t rank, tree op, bool *maybe_biased_p)
 {
   int64_t op_rank;
 
-  if (loop_carried_phi (op))
-    return rank;
-
   op_rank = get_rank (op);
 
+  /* Check whether op is biased after the get_rank () call, since it might have
+     updated biased_names.  */
+  if (TREE_CODE (op) == SSA_NAME
+      && bitmap_bit_p (biased_names, SSA_NAME_VERSION (op)))
+    {
+      if (maybe_biased_p == NULL)
+	return rank;
+      *maybe_biased_p = true;
+    }
+
   return MAX (rank, op_rank);
 }
 
@@ -433,13 +462,20 @@ get_rank (tree e)
 
       stmt = SSA_NAME_DEF_STMT (e);
       if (gimple_code (stmt) == GIMPLE_PHI)
-	rank = phi_rank (stmt);
+	{
+	  rank = phi_rank (stmt);
+	  if (rank != bb_rank[gimple_bb (stmt)->index])
+	    bitmap_set_bit (biased_names, SSA_NAME_VERSION (e));
+	}
 
       else if (!is_gimple_assign (stmt))
 	rank = bb_rank[gimple_bb (stmt)->index];
 
       else
 	{
+	  bool biased_p = false;
+	  bool *maybe_biased_p = propagate_bias_p (stmt) ? &biased_p : NULL;
+
 	  /* Otherwise, find the maximum rank for the operands.  As an
 	     exception, remove the bias from loop-carried phis when propagating
 	     the rank so that dependent operations are not also biased.  */
@@ -448,9 +484,11 @@ get_rank (tree e)
 	     thus have rank 0.  */
 	  rank = 0;
 	  FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
-	    rank = propagate_rank (rank, op);
+	    rank = propagate_rank (rank, op, maybe_biased_p);
 
 	  rank += 1;
+	  if (biased_p)
+	    bitmap_set_bit (biased_names, SSA_NAME_VERSION (e));
 	}
 
       if (dump_file && (dump_flags & TDF_DETAILS))
@@ -6933,6 +6971,7 @@ fini_reassoc (void)
 			    reassociate_stats.pows_created);
 
   delete operand_rank;
+  bitmap_clear (biased_names);
   operand_entry_pool.release ();
   free (bb_rank);
   plus_negates.release ();
-- 
2.31.1



* [PATCH v3 3/3] reassoc: Test rank biasing
  2021-09-26 11:09 [PATCH v3 0/3] reassoc: Propagate PHI_LOOP_BIAS along single uses Ilya Leoshkevich
  2021-09-26 11:09 ` [PATCH v3 1/3] reassoc: Do not bias loop-carried PHIs early Ilya Leoshkevich
  2021-09-26 11:09 ` [PATCH v3 2/3] reassoc: Propagate PHI_LOOP_BIAS along single uses Ilya Leoshkevich
@ 2021-09-26 11:09 ` Ilya Leoshkevich
  2021-09-28 11:28   ` Richard Biener
  2 siblings, 1 reply; 9+ messages in thread
From: Ilya Leoshkevich @ 2021-09-26 11:09 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Andreas Krebbel, Ilya Leoshkevich

Add both positive and negative tests.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/reassoc-46.c: New test.
	* gcc.dg/tree-ssa/reassoc-46.h: Common code for new tests.
	* gcc.dg/tree-ssa/reassoc-47.c: New test.
	* gcc.dg/tree-ssa/reassoc-48.c: New test.
	* gcc.dg/tree-ssa/reassoc-49.c: New test.
	* gcc.dg/tree-ssa/reassoc-50.c: New test.
	* gcc.dg/tree-ssa/reassoc-51.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c |  7 +++++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h | 33 ++++++++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c |  9 ++++++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c |  9 ++++++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c | 11 ++++++++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c | 10 +++++++
 gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c | 11 ++++++++
 7 files changed, 90 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c
new file mode 100644
index 00000000000..97563dd929f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
+
+#include "reassoc-46.h"
+
+/* Check that the loop accumulator is added last.  */
+/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h
new file mode 100644
index 00000000000..e60b490ea0d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h
@@ -0,0 +1,33 @@
+#define M 1024
+unsigned int arr1[M];
+unsigned int arr2[M];
+volatile unsigned int sink;
+
+unsigned int
+test (void)
+{
+  unsigned int sum = 0;
+  for (int i = 0; i < M; i++)
+    {
+#ifdef MODIFY
+      /* Modify the loop accumulator using a chain of operations - this should
+         not affect its rank biasing.  */
+      sum |= 1;
+      sum ^= 2;
+#endif
+#ifdef STORE
+      /* Save the loop accumulator into a global variable - this should not
+         affect its rank biasing.  */
+      sink = sum;
+#endif
+#ifdef USE
+      /* Add a tricky use of the loop accumulator - this should prevent its
+         rank biasing.  */
+      i = (i + sum) % M;
+#endif
+      /* Use addends with different ranks.  */
+      sum += arr1[i];
+      sum += arr2[((i ^ 1) + 1) % M];
+    }
+  return sum;
+}
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
new file mode 100644
index 00000000000..1b0f0fdabe1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
+
+#define MODIFY
+#include "reassoc-46.h"
+
+/* Check that if the loop accumulator is modified using a chain of operations
+   other than addition, its new value is still added last.  */
+/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
new file mode 100644
index 00000000000..13836ebe8e6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
+
+#define STORE
+#include "reassoc-46.h"
+
+/* Check that if the loop accumulator is saved into a global variable, it's
+   still added last.  */
+/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c
new file mode 100644
index 00000000000..c1136a447a2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
+
+#define MODIFY
+#define STORE
+#include "reassoc-46.h"
+
+/* Check that if the loop accumulator is both modified using a chain of
+   operations other than addition and stored into a global variable, its new
+   value is still added last.  */
+/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 1 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c
new file mode 100644
index 00000000000..e35a4ff1278
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
+
+#define MODIFY
+#define USE
+#include "reassoc-46.h"
+
+/* Check that if the loop accumulator has multiple uses inside the loop, it's
+   not forced to the end of the reassociation chain.  */
+/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 2 "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c
new file mode 100644
index 00000000000..07175675e24
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
+
+#define MODIFY
+#define STORE
+#define USE
+#include "reassoc-46.h"
+
+/* Check that if the loop accumulator has multiple uses inside the loop, it's
+   not forced to the end of the reassociation chain.  */
+/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 2 "optimized" } } */
-- 
2.31.1



* Re: [PATCH v3 1/3] reassoc: Do not bias loop-carried PHIs early
  2021-09-26 11:09 ` [PATCH v3 1/3] reassoc: Do not bias loop-carried PHIs early Ilya Leoshkevich
@ 2021-09-28 11:21   ` Richard Biener
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Biener @ 2021-09-28 11:21 UTC (permalink / raw)
  To: Ilya Leoshkevich; +Cc: gcc-patches, Andreas Krebbel

On Sun, 26 Sep 2021, Ilya Leoshkevich wrote:

> Biasing loop-carried PHIs during the 1st reassociation pass interferes
> with reduction chains and does not bring measurable benefits, so do it
> only during the 2nd reassociation pass.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
> 
>             * passes.def (pass_reassoc): Rename parameter to early_p.
>             * tree-ssa-reassoc.c (reassoc_bias_loop_carried_phi_ranks_p):
>             New variable.
>             (phi_rank): Don't bias loop-carried phi ranks
>             before vectorization pass.
>             (execute_reassoc): Add bias_loop_carried_phi_ranks_p parameter.
>             (pass_reassoc::pass_reassoc): Add bias_loop_carried_phi_ranks_p
>             initializer.
>             (pass_reassoc::set_param): Set bias_loop_carried_phi_ranks_p
>             value.
>             (pass_reassoc::execute): Pass bias_loop_carried_phi_ranks_p to
>             execute_reassoc.
>             (pass_reassoc::bias_loop_carried_phi_ranks_p): New member.
> ---
>  gcc/passes.def         |  4 ++--
>  gcc/tree-ssa-reassoc.c | 16 ++++++++++++++--
>  2 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/passes.def b/gcc/passes.def
> index d7a1f8c97a6..c5f915d04c6 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -242,7 +242,7 @@ along with GCC; see the file COPYING3.  If not see
>        /* Identify paths that should never be executed in a conforming
>  	 program and isolate those paths.  */
>        NEXT_PASS (pass_isolate_erroneous_paths);
> -      NEXT_PASS (pass_reassoc, true /* insert_powi_p */);
> +      NEXT_PASS (pass_reassoc, true /* early_p */);
>        NEXT_PASS (pass_dce);
>        NEXT_PASS (pass_forwprop);
>        NEXT_PASS (pass_phiopt, false /* early_p */);
> @@ -325,7 +325,7 @@ along with GCC; see the file COPYING3.  If not see
>        NEXT_PASS (pass_lower_vector_ssa);
>        NEXT_PASS (pass_lower_switch);
>        NEXT_PASS (pass_cse_reciprocals);
> -      NEXT_PASS (pass_reassoc, false /* insert_powi_p */);
> +      NEXT_PASS (pass_reassoc, false /* early_p */);
>        NEXT_PASS (pass_strength_reduction);
>        NEXT_PASS (pass_split_paths);
>        NEXT_PASS (pass_tracer);
> diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
> index 8498cfc7aa8..420c14e8cf5 100644
> --- a/gcc/tree-ssa-reassoc.c
> +++ b/gcc/tree-ssa-reassoc.c
> @@ -180,6 +180,10 @@ along with GCC; see the file COPYING3.  If not see
>     point 3a in the pass header comment.  */
>  static bool reassoc_insert_powi_p;
>  
> +/* Enable biasing ranks of loop accumulators.  We don't want this before
> +   vectorization, since it interferes with reduction chains.  */
> +static bool reassoc_bias_loop_carried_phi_ranks_p;
> +
>  /* Statistics */
>  static struct
>  {
> @@ -269,6 +273,9 @@ phi_rank (gimple *stmt)
>    use_operand_p use;
>    gimple *use_stmt;
>  
> +  if (!reassoc_bias_loop_carried_phi_ranks_p)
> +    return bb_rank[bb->index];
> +
>    /* We only care about real loops (those with a latch).  */
>    if (!father->latch)
>      return bb_rank[bb->index];
> @@ -6940,9 +6947,10 @@ fini_reassoc (void)
>     optimization of a gimple conditional.  Otherwise returns zero.  */
>  
>  static unsigned int
> -execute_reassoc (bool insert_powi_p)
> +execute_reassoc (bool insert_powi_p, bool bias_loop_carried_phi_ranks_p)
>  {
>    reassoc_insert_powi_p = insert_powi_p;
> +  reassoc_bias_loop_carried_phi_ranks_p = bias_loop_carried_phi_ranks_p;
>  
>    init_reassoc ();
>  
> @@ -6983,15 +6991,19 @@ public:
>      {
>        gcc_assert (n == 0);
>        insert_powi_p = param;
> +      bias_loop_carried_phi_ranks_p = !param;
>      }
>    virtual bool gate (function *) { return flag_tree_reassoc != 0; }
>    virtual unsigned int execute (function *)
> -    { return execute_reassoc (insert_powi_p); }
> +  {
> +    return execute_reassoc (insert_powi_p, bias_loop_carried_phi_ranks_p);
> +  }
>  
>   private:
>    /* Enable insertion of __builtin_powi calls during execute_reassoc.  See
>       point 3a in the pass header comment.  */
>    bool insert_powi_p;
> +  bool bias_loop_carried_phi_ranks_p;
>  }; // class pass_reassoc
>  
>  } // anon namespace
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


* Re: [PATCH v3 2/3] reassoc: Propagate PHI_LOOP_BIAS along single uses
  2021-09-26 11:09 ` [PATCH v3 2/3] reassoc: Propagate PHI_LOOP_BIAS along single uses Ilya Leoshkevich
@ 2021-09-28 11:24   ` Richard Biener
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Biener @ 2021-09-28 11:24 UTC (permalink / raw)
  To: Ilya Leoshkevich; +Cc: gcc-patches, Andreas Krebbel

On Sun, 26 Sep 2021, Ilya Leoshkevich wrote:

> PR tree-optimization/49749 introduced code that shortens dependency
> chains containing loop accumulators by placing them last on operand
> lists of associative operations.
> 
> The 456.hmmer benchmark on s390 could benefit from this; however, the
> code that needs it modifies the loop accumulator before using it, and
> since only so-called loop-carried phis are treated as loop
> accumulators, the code in its present form doesn't really help.
> According to Bill Schmidt - the original author - such a conservative
> approach was chosen so as to avoid unnecessarily swapping operands,
> which might cause unpredictable effects.  However, giving special
> treatment to forms of loop accumulators is acceptable.
> 
> A loop-carried phi is defined as a single-use phi that is used in the
> same innermost loop it is defined in and that has at least one
> argument defined in that same innermost loop.  Given this, it seems
> natural to treat single uses of such phis as phis themselves.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
> 
> 	* tree-ssa-reassoc.c (biased_names): New global.
> 	(propagate_bias_p): New function.
> 	(loop_carried_phi): Remove.
> 	(propagate_rank): Propagate bias along single uses.
> 	(get_rank): Update biased_names when needed.
> ---
>  gcc/tree-ssa-reassoc.c | 109 ++++++++++++++++++++++++++++-------------
>  1 file changed, 74 insertions(+), 35 deletions(-)
> 
> diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
> index 420c14e8cf5..db9fb4e1cac 100644
> --- a/gcc/tree-ssa-reassoc.c
> +++ b/gcc/tree-ssa-reassoc.c
> @@ -211,6 +211,10 @@ static int64_t *bb_rank;
>  /* Operand->rank hashtable.  */
>  static hash_map<tree, int64_t> *operand_rank;
>  
> +/* SSA_NAMEs that are forms of loop accumulators and whose ranks need to be
> +   biased.  */
> +static auto_bitmap biased_names;
> +
>  /* Vector of SSA_NAMEs on which after reassociate_bb is done with
>     all basic blocks the CFG should be adjusted - basic blocks
>     split right after that SSA_NAME's definition statement and before
> @@ -256,6 +260,53 @@ reassoc_remove_stmt (gimple_stmt_iterator *gsi)
>     the rank difference between two blocks.  */
>  #define PHI_LOOP_BIAS (1 << 15)
>  
> +/* Return TRUE iff PHI_LOOP_BIAS should be propagated from one of the STMT's
> +   operands to the STMT's left-hand side.  The goal is to preserve bias in code
> +   like this:
> +
> +     x_1 = phi(x_0, x_2)
> +     a = x_1 | 1
> +     b = a ^ 2
> +     .MEM = b
> +     c = b + d
> +     x_2 = c + e
> +
> +   That is, we need to preserve bias along single-use chains originating from
> +   loop-carried phis.  Only GIMPLE_ASSIGNs to SSA_NAMEs are considered to be
> +   uses, because only they participate in rank propagation.  */
> +static bool
> +propagate_bias_p (gimple *stmt)
> +{
> +  use_operand_p use;
> +  imm_use_iterator use_iter;
> +  gimple *single_use_stmt = NULL;
> +
> +  if (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_reference)
> +    return false;
> +
> +  FOR_EACH_IMM_USE_FAST (use, use_iter, gimple_assign_lhs (stmt))
> +    {
> +      gimple *current_use_stmt = USE_STMT (use);
> +
> +      if (is_gimple_assign (current_use_stmt)
> +	  && TREE_CODE (gimple_assign_lhs (current_use_stmt)) == SSA_NAME)
> +	{
> +	  if (single_use_stmt != NULL && single_use_stmt != current_use_stmt)
> +	    return false;
> +	  single_use_stmt = current_use_stmt;
> +	}
> +    }
> +
> +  if (single_use_stmt == NULL)
> +    return false;
> +
> +  if (gimple_bb (stmt)->loop_father
> +      != gimple_bb (single_use_stmt)->loop_father)
> +    return false;
> +
> +  return true;
> +}
> +
>  /* Rank assigned to a phi statement.  If STMT is a loop-carried phi of
>     an innermost loop, and the phi has only a single use which is inside
>     the loop, then the rank is the block rank of the loop latch plus an
> @@ -313,49 +364,27 @@ phi_rank (gimple *stmt)
>    return bb_rank[bb->index];
>  }
>  
> -/* If EXP is an SSA_NAME defined by a PHI statement that represents a
> -   loop-carried dependence of an innermost loop, return TRUE; else
> -   return FALSE.  */
> -static bool
> -loop_carried_phi (tree exp)
> -{
> -  gimple *phi_stmt;
> -  int64_t block_rank;
> -
> -  if (TREE_CODE (exp) != SSA_NAME
> -      || SSA_NAME_IS_DEFAULT_DEF (exp))
> -    return false;
> -
> -  phi_stmt = SSA_NAME_DEF_STMT (exp);
> -
> -  if (gimple_code (SSA_NAME_DEF_STMT (exp)) != GIMPLE_PHI)
> -    return false;
> -
> -  /* Non-loop-carried phis have block rank.  Loop-carried phis have
> -     an additional bias added in.  If this phi doesn't have block rank,
> -     it's biased and should not be propagated.  */
> -  block_rank = bb_rank[gimple_bb (phi_stmt)->index];
> -
> -  if (phi_rank (phi_stmt) != block_rank)
> -    return true;
> -
> -  return false;
> -}
> -
>  /* Return the maximum of RANK and the rank that should be propagated
>     from expression OP.  For most operands, this is just the rank of OP.
>     For loop-carried phis, the value is zero to avoid undoing the bias
>     in favor of the phi.  */
>  static int64_t
> -propagate_rank (int64_t rank, tree op)
> +propagate_rank (int64_t rank, tree op, bool *maybe_biased_p)
>  {
>    int64_t op_rank;
>  
> -  if (loop_carried_phi (op))
> -    return rank;
> -
>    op_rank = get_rank (op);
>  
> +  /* Check whether op is biased after the get_rank () call, since it might have
> +     updated biased_names.  */
> +  if (TREE_CODE (op) == SSA_NAME
> +      && bitmap_bit_p (biased_names, SSA_NAME_VERSION (op)))
> +    {
> +      if (maybe_biased_p == NULL)
> +	return rank;
> +      *maybe_biased_p = true;
> +    }
> +
>    return MAX (rank, op_rank);
>  }
>  
> @@ -433,13 +462,20 @@ get_rank (tree e)
>  
>        stmt = SSA_NAME_DEF_STMT (e);
>        if (gimple_code (stmt) == GIMPLE_PHI)
> -	rank = phi_rank (stmt);
> +	{
> +	  rank = phi_rank (stmt);
> +	  if (rank != bb_rank[gimple_bb (stmt)->index])
> +	    bitmap_set_bit (biased_names, SSA_NAME_VERSION (e));
> +	}
>  
>        else if (!is_gimple_assign (stmt))
>  	rank = bb_rank[gimple_bb (stmt)->index];
>  
>        else
>  	{
> +	  bool biased_p = false;
> +	  bool *maybe_biased_p = propagate_bias_p (stmt) ? &biased_p : NULL;
> +
>  	  /* Otherwise, find the maximum rank for the operands.  As an
>  	     exception, remove the bias from loop-carried phis when propagating
>  	     the rank so that dependent operations are not also biased.  */
> @@ -448,9 +484,11 @@ get_rank (tree e)
>  	     thus have rank 0.  */
>  	  rank = 0;
>  	  FOR_EACH_SSA_TREE_OPERAND (op, stmt, iter, SSA_OP_USE)
> -	    rank = propagate_rank (rank, op);
> +	    rank = propagate_rank (rank, op, maybe_biased_p);
>  
>  	  rank += 1;
> +	  if (biased_p)
> +	    bitmap_set_bit (biased_names, SSA_NAME_VERSION (e));
>  	}
>  
>        if (dump_file && (dump_flags & TDF_DETAILS))
> @@ -6933,6 +6971,7 @@ fini_reassoc (void)
>  			    reassociate_stats.pows_created);
>  
>    delete operand_rank;
> +  bitmap_clear (biased_names);
>    operand_entry_pool.release ();
>    free (bb_rank);
>    plus_negates.release ();
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


* Re: [PATCH v3 3/3] reassoc: Test rank biasing
  2021-09-26 11:09 ` [PATCH v3 3/3] reassoc: Test rank biasing Ilya Leoshkevich
@ 2021-09-28 11:28   ` Richard Biener
  2021-09-28 11:46     ` Ilya Leoshkevich
  0 siblings, 1 reply; 9+ messages in thread
From: Richard Biener @ 2021-09-28 11:28 UTC (permalink / raw)
  To: Ilya Leoshkevich; +Cc: gcc-patches, Andreas Krebbel

On Sun, 26 Sep 2021, Ilya Leoshkevich wrote:

> Add both positive and negative tests.

The tests will likely be quite fragile with respect to what is
actually vectorized on which target.  If you move the tests
to gcc.dg/vect/ you could at least do

/* { dg-require-effective-target vect_int } */

do you need to look for the exact GIMPLE IL or is it enough to
verify we are vectorizing the reduction?

Thanks,
Richard.


> gcc/testsuite/ChangeLog:
> 
> 	* gcc.dg/tree-ssa/reassoc-46.c: New test.
> 	* gcc.dg/tree-ssa/reassoc-46.h: Common code for new tests.
> 	* gcc.dg/tree-ssa/reassoc-47.c: New test.
> 	* gcc.dg/tree-ssa/reassoc-48.c: New test.
> 	* gcc.dg/tree-ssa/reassoc-49.c: New test.
> 	* gcc.dg/tree-ssa/reassoc-50.c: New test.
> 	* gcc.dg/tree-ssa/reassoc-51.c: New test.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c |  7 +++++
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h | 33 ++++++++++++++++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c |  9 ++++++
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c |  9 ++++++
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c | 11 ++++++++
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c | 10 +++++++
>  gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c | 11 ++++++++
>  7 files changed, 90 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c
> 
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c
> new file mode 100644
> index 00000000000..97563dd929f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.c
> @@ -0,0 +1,7 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
> +
> +#include "reassoc-46.h"
> +
> +/* Check that the loop accumulator is added last.  */
> +/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h
> new file mode 100644
> index 00000000000..e60b490ea0d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-46.h
> @@ -0,0 +1,33 @@
> +#define M 1024
> +unsigned int arr1[M];
> +unsigned int arr2[M];
> +volatile unsigned int sink;
> +
> +unsigned int
> +test (void)
> +{
> +  unsigned int sum = 0;
> +  for (int i = 0; i < M; i++)
> +    {
> +#ifdef MODIFY
> +      /* Modify the loop accumulator using a chain of operations - this should
> +         not affect its rank biasing.  */
> +      sum |= 1;
> +      sum ^= 2;
> +#endif
> +#ifdef STORE
> +      /* Save the loop accumulator into a global variable - this should not
> +         affect its rank biasing.  */
> +      sink = sum;
> +#endif
> +#ifdef USE
> +      /* Add a tricky use of the loop accumulator - this should prevent its
> +         rank biasing.  */
> +      i = (i + sum) % M;
> +#endif
> +      /* Use addends with different ranks.  */
> +      sum += arr1[i];
> +      sum += arr2[((i ^ 1) + 1) % M];
> +    }
> +  return sum;
> +}
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
> new file mode 100644
> index 00000000000..1b0f0fdabe1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-47.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
> +
> +#define MODIFY
> +#include "reassoc-46.h"
> +
> +/* Check that if the loop accumulator is modified using a chain of operations
> +   other than addition, its new value is still added last.  */
> +/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
> new file mode 100644
> index 00000000000..13836ebe8e6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-48.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
> +
> +#define STORE
> +#include "reassoc-46.h"
> +
> +/* Check that if the loop accumulator is saved into a global variable, it's
> +   still added last.  */
> +/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c
> new file mode 100644
> index 00000000000..c1136a447a2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-49.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
> +
> +#define MODIFY
> +#define STORE
> +#include "reassoc-46.h"
> +
> +/* Check that if the loop accumulator is both modified using a chain of
> +   operations other than addition and stored into a global variable, its new
> +   value is still added last.  */
> +/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 1 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c
> new file mode 100644
> index 00000000000..e35a4ff1278
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-50.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
> +
> +#define MODIFY
> +#define USE
> +#include "reassoc-46.h"
> +
> +/* Check that if the loop accumulator has multiple uses inside the loop, it's
> +   not forced to the end of the reassociation chain.  */
> +/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 2 "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c
> new file mode 100644
> index 00000000000..07175675e24
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-51.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -ftree-vectorize" } */
> +
> +#define MODIFY
> +#define STORE
> +#define USE
> +#include "reassoc-46.h"
> +
> +/* Check that if the loop accumulator has multiple uses inside the loop, it's
> +   not forced to the end of the reassociation chain.  */
> +/* { dg-final { scan-tree-dump-times {(?:vect_)?sum_[\d._]+ = (?:(?:vect_)?_[\d._]+ \+ (?:vect_)?sum_[\d._]+|(?:vect_)?sum_[\d._]+ \+ (?:vect_)?_[\d._]+)} 2 "optimized" } } */


* Re: [PATCH v3 3/3] reassoc: Test rank biasing
  2021-09-28 11:28   ` Richard Biener
@ 2021-09-28 11:46     ` Ilya Leoshkevich
  2021-09-28 12:38       ` Richard Biener
  0 siblings, 1 reply; 9+ messages in thread
From: Ilya Leoshkevich @ 2021-09-28 11:46 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, Andreas Krebbel

On Tue, 2021-09-28 at 13:28 +0200, Richard Biener wrote:
> On Sun, 26 Sep 2021, Ilya Leoshkevich wrote:
> 
> > Add both positive and negative tests.
> 
> The tests will likely be quite fragile with respect to what is
> actually vectorized on which target.  If you move the tests
> to gcc.dg/vect/ you could at least do
> 
> /* { dg-require-effective-target vect_int } */
> 
> do you need to look for the exact GIMPLE IL or is it enough to
> verify we are vectorizing the reduction?

Actually I don't think vectorization is that important here, and I
only check how many times sum_x = sum_y + _z appears.  So I use
(?:vect_)?, which may or may not be there.

An alternative I considered was to use -fno-tree-vectorize to get
smaller regexes, but I thought it would be nice to know that
vectorization does not mess up reassociation results.
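
For illustration (SSA version numbers made up, only the shape
matters), the two forms the pattern has to match look roughly like:

  sum_11 = _9 + sum_10;                                  (scalar)
  vect_sum_11.7_25 = vect__9.6_24 + vect_sum_10.5_23;    (vectorized)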

Best regards,
Ilya



* Re: [PATCH v3 3/3] reassoc: Test rank biasing
  2021-09-28 11:46     ` Ilya Leoshkevich
@ 2021-09-28 12:38       ` Richard Biener
  0 siblings, 0 replies; 9+ messages in thread
From: Richard Biener @ 2021-09-28 12:38 UTC (permalink / raw)
  To: Ilya Leoshkevich; +Cc: gcc-patches, Andreas Krebbel

On Tue, 28 Sep 2021, Ilya Leoshkevich wrote:

> On Tue, 2021-09-28 at 13:28 +0200, Richard Biener wrote:
> > On Sun, 26 Sep 2021, Ilya Leoshkevich wrote:
> > 
> > > Add both positive and negative tests.
> > 
> > The tests will likely be quite fragile with respect to what is
> > actually vectorized on which target.  If you move the tests
> > to gcc.dg/vect/ you could at least do
> > 
> > /* { dg-require-effective-target vect_int } */
> > 
> > do you need to look for the exact GIMPLE IL or is it enough to
> > verify we are vectorizing the reduction?
> 
> Actually I don't think vectorization is that important here, and I
> only check how many times sum_x = sum_y + _z appears.  So I use
> (?:vect_)?, which may or may not be there.
> 
> An alternative I considered was to use -fno-tree-vectorize to get
> smaller regexes, but I thought it would be nice to know that
> vectorization does not mess up reassociation results.

Ah, OK.  So let's go ahead with the patch unchanged, but be prepared
to deal with possible fallout here on weird targets.

Thanks,
Richard.


end of thread, other threads:[~2021-09-28 12:38 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
2021-09-26 11:09 [PATCH v3 0/3] reassoc: Propagate PHI_LOOP_BIAS along single uses Ilya Leoshkevich
2021-09-26 11:09 ` [PATCH v3 1/3] reassoc: Do not bias loop-carried PHIs early Ilya Leoshkevich
2021-09-28 11:21   ` Richard Biener
2021-09-26 11:09 ` [PATCH v3 2/3] reassoc: Propagate PHI_LOOP_BIAS along single uses Ilya Leoshkevich
2021-09-28 11:24   ` Richard Biener
2021-09-26 11:09 ` [PATCH v3 3/3] reassoc: Test rank biasing Ilya Leoshkevich
2021-09-28 11:28   ` Richard Biener
2021-09-28 11:46     ` Ilya Leoshkevich
2021-09-28 12:38       ` Richard Biener
