public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS
@ 2017-06-12 17:02 Bin Cheng
  2017-06-23 10:10 ` Bin.Cheng
  0 siblings, 1 reply; 10+ messages in thread
From: Bin Cheng @ 2017-06-12 17:02 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 1150 bytes --]

Hi,
I was asked by upstream to split the loop distribution patch into small ones.
It is hard because data structure and algorithm are closely coupled together.
Anyway, this is the patch series with smaller patches.  Basically I tried to
separate data structure and bug-fix changes apart with one as the main patch.
Note I only made necessary code refactoring in order to separate patch, apart
from that, there is no change against the last version.

This is the first patch introducing new internal function IFN_LOOP_DIST_ALIAS.
GCC will distribute loops under condition of this function call.

Bootstrap and test on x86_64 and AArch64.  Is it OK?

Thanks,
bin
2017-06-07  Bin Cheng  <bin.cheng@arm.com>

	* cfgloop.h (struct loop): New field ldist_alias_id.
	* cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change.
	* internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
	* internal-fn.def (LOOP_DIST_ALIAS): New.
	* tree-vectorizer.c (vect_loop_dist_alias_call): New function.
	(fold_loop_dist_alias_call): New function.
	(vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending on
	successful vectorization or not.

[-- Attachment #2: 0001-ifn_loop_dist_alias-20170607.txt --]
[-- Type: text/plain, Size: 6503 bytes --]

From 3598491598e0b425f1cfa4b7bb4c180886a08bef Mon Sep 17 00:00:00 2001
From: Bin Cheng <binche01@e108451-lin.cambridge.arm.com>
Date: Wed, 7 Jun 2017 13:04:03 +0100
Subject: [PATCH 01/14] ifn_loop_dist_alias-20170607.txt

---
 gcc/cfgloop.h         |  9 +++++++
 gcc/cfgloopmanip.c    |  3 ++-
 gcc/internal-fn.c     |  8 ++++++
 gcc/internal-fn.def   |  1 +
 gcc/tree-vectorizer.c | 75 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 94 insertions(+), 2 deletions(-)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index a8bec1d..be4187a 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -225,6 +225,15 @@ struct GTY ((chain_next ("%h.next"))) loop {
      builtins.  */
   tree simduid;
 
+  /* For loops generated by distribution with runtime alias checks, this
+     is a unique identifier of the original distributed loop.  Generally
+     it is the number of the original loop.  IFN_LOOP_DIST_ALIAS builtin
+     uses this id as its first argument.  Give a loop with an id, we can
+     look upward in dominance tree for the corresponding IFN_LOOP_DIST_ALIAS
+     buildin.  Note this id has no meanling after IFN_LOOP_DIST_ALIAS is
+     folded and eliminated.  */
+  int ldist_alias_id;
+
   /* Upper bound on number of iterations of a loop.  */
   struct nb_iter_bound *bounds;
 
diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index d764ab9..adb2f65 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1653,7 +1653,8 @@ force_single_succ_latches (void)
 
   THEN_PROB is the probability of then branch of the condition.
   ELSE_PROB is the probability of else branch. Note that they may be both
-  REG_BR_PROB_BASE when condition is IFN_LOOP_VECTORIZED.  */
+  REG_BR_PROB_BASE when condition is IFN_LOOP_VECTORIZED or
+  IFN_LOOP_DIST_ALIAS.  */
 
 static basic_block
 lv_adjust_loop_entry_edge (basic_block first_head, basic_block second_head,
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 75fe027..96e40cb 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2250,6 +2250,14 @@ expand_LOOP_VECTORIZED (internal_fn, gcall *)
   gcc_unreachable ();
 }
 
+/* This should get folded in tree-vectorizer.c.  */
+
+static void
+expand_LOOP_DIST_ALIAS (internal_fn, gcall *)
+{
+  gcc_unreachable ();
+}
+
 /* Expand MASK_LOAD call STMT using optab OPTAB.  */
 
 static void
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index e162d81..79c19fb 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -158,6 +158,7 @@ DEF_INTERNAL_FN (GOMP_SIMD_LAST_LANE, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_ORDERED_START, ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_ORDERED_END, ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (LOOP_VECTORIZED, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (LOOP_DIST_ALIAS, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (ANNOTATE,  ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (UBSAN_NULL, ECF_LEAF | ECF_NOTHROW, ".R.")
 DEF_INTERNAL_FN (UBSAN_BOUNDS, ECF_LEAF | ECF_NOTHROW, NULL)
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 1bef2e4..0d83d33 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -469,6 +469,63 @@ fold_loop_vectorized_call (gimple *g, tree value)
     }
 }
 
+/* If LOOP has been versioned during loop distribution, return the internal
+   call guarding it.  */
+
+static gimple *
+vect_loop_dist_alias_call (struct loop *loop)
+{
+  gimple_stmt_iterator gsi;
+  gimple *g;
+  basic_block bb = loop_preheader_edge (loop)->src;
+  struct loop *outer_loop = bb->loop_father;
+
+  /* Look upward in dominance tree.  */
+  for (; bb != ENTRY_BLOCK_PTR_FOR_FN (cfun) && bb->loop_father == outer_loop;
+       bb = get_immediate_dominator (CDI_DOMINATORS, bb))
+    {
+      g = last_stmt (bb);
+      if (g == NULL || gimple_code (g) != GIMPLE_COND)
+	continue;
+
+      gsi = gsi_for_stmt (g);
+      gsi_prev (&gsi);
+      if (gsi_end_p (gsi))
+	continue;
+
+      g = gsi_stmt (gsi);
+      /* The guarding internal function call must have the same distribution
+	 alias id.  */
+      if (gimple_call_internal_p (g, IFN_LOOP_DIST_ALIAS)
+	  && (tree_to_shwi (gimple_call_arg (g, 0)) == loop->ldist_alias_id))
+	return g;
+    }
+  return NULL;
+}
+
+/* Fold LOOP_DIST_ALIAS internal call stmt according to KEEP_P and update
+   any immediate uses of it's LHS.  Stmt is folded to its second argument
+   if KEEP_P is true, otherwise to boolean_false_node.  */
+
+static void
+fold_loop_dist_alias_call (gimple *g, bool keep_p)
+{
+  tree lhs = gimple_call_lhs (g);
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  gimple *use_stmt;
+  gimple_stmt_iterator gsi = gsi_for_stmt (g);
+  tree folded_value = keep_p ? gimple_call_arg (g, 1) : boolean_false_node;
+
+  update_call_from_tree (&gsi, folded_value);
+  FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs)
+    {
+      FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
+	SET_USE (use_p, folded_value);
+      update_stmt (use_stmt);
+    }
+}
+
 /* Set the uids of all the statements in basic blocks inside loop
    represented by LOOP_VINFO. LOOP_VECTORIZED_CALL is the internal
    call guarding the loop which has been if converted.  */
@@ -595,7 +652,7 @@ vectorize_loops (void)
     else
       {
 	loop_vec_info loop_vinfo, orig_loop_vinfo;
-	gimple *loop_vectorized_call;
+	gimple *loop_vectorized_call, *loop_dist_alias_call;
        try_vectorize:
 	if (!((flag_tree_loop_vectorize
 	       && optimize_loop_nest_for_speed_p (loop))
@@ -603,6 +660,7 @@ vectorize_loops (void)
 	  continue;
 	orig_loop_vinfo = NULL;
 	loop_vectorized_call = vect_loop_vectorized_call (loop);
+	loop_dist_alias_call = vect_loop_dist_alias_call (loop);
        vectorize_epilogue:
 	vect_location = find_loop_location (loop);
         if (LOCATION_LOCUS (vect_location) != UNKNOWN_LOCATION
@@ -710,6 +768,12 @@ vectorize_loops (void)
 	    loop_vectorized_call = NULL;
 	    ret |= TODO_cleanup_cfg;
 	  }
+	if (loop_dist_alias_call)
+	  {
+	    fold_loop_dist_alias_call (loop_dist_alias_call, true);
+	    loop_dist_alias_call = NULL;
+	    ret |= TODO_cleanup_cfg;
+	  }
 
 	if (new_loop)
 	  {
@@ -743,6 +807,15 @@ vectorize_loops (void)
 	      {
 		fold_loop_vectorized_call (g, boolean_false_node);
 		ret |= TODO_cleanup_cfg;
+		g = NULL;
+	      }
+	    else
+	      g = vect_loop_dist_alias_call (loop);
+
+	    if (g)
+	      {
+		fold_loop_dist_alias_call (g, false);
+		ret |= TODO_cleanup_cfg;
 	      }
 	  }
       }
-- 
1.9.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS
  2017-06-12 17:02 [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS Bin Cheng
@ 2017-06-23 10:10 ` Bin.Cheng
  2017-06-26 10:23   ` Richard Sandiford
  2017-06-27 12:58   ` Richard Biener
  0 siblings, 2 replies; 10+ messages in thread
From: Bin.Cheng @ 2017-06-23 10:10 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1498 bytes --]

On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> I was asked by upstream to split the loop distribution patch into small ones.
> It is hard because data structure and algorithm are closely coupled together.
> Anyway, this is the patch series with smaller patches.  Basically I tried to
> separate data structure and bug-fix changes apart with one as the main patch.
> Note I only made necessary code refactoring in order to separate patch, apart
> from that, there is no change against the last version.
>
> This is the first patch introducing new internal function IFN_LOOP_DIST_ALIAS.
> GCC will distribute loops under condition of this function call.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK?
Hi,
I need to update this patch fixing an issue in
vect_loop_dist_alias_call.  The previous patch fails to find some
IFN_LOOP_DIST_ALIAS calls.

Bootstrap and test in series.  Is it OK?

Thanks,
bin
>
> Thanks,
> bin
> 2017-06-07  Bin Cheng  <bin.cheng@arm.com>
>
>         * cfgloop.h (struct loop): New field ldist_alias_id.
>         * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change.
>         * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
>         * internal-fn.def (LOOP_DIST_ALIAS): New.
>         * tree-vectorizer.c (vect_loop_dist_alias_call): New function.
>         (fold_loop_dist_alias_call): New function.
>         (vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending on
>         successful vectorization or not.

[-- Attachment #2: 0001-ifn_loop_dist_alias-20170608.txt.patch --]
[-- Type: text/x-patch, Size: 6596 bytes --]

From ab8334c5f109c593610df3efcf1aa5a2edcf6be9 Mon Sep 17 00:00:00 2001
From: Bin Cheng <binche01@e108451-lin.cambridge.arm.com>
Date: Wed, 7 Jun 2017 13:04:03 +0100
Subject: [PATCH 01/13] ifn_loop_dist_alias-20170608.txt

---
 gcc/cfgloop.h         |  9 ++++++
 gcc/cfgloopmanip.c    |  3 +-
 gcc/internal-fn.c     |  8 ++++++
 gcc/internal-fn.def   |  1 +
 gcc/tree-vectorizer.c | 79 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 5 files changed, 98 insertions(+), 2 deletions(-)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index a8bec1d..be4187a 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -225,6 +225,15 @@ struct GTY ((chain_next ("%h.next"))) loop {
      builtins.  */
   tree simduid;
 
+  /* For loops generated by distribution with runtime alias checks, this
+     is a unique identifier of the original distributed loop.  Generally
+     it is the number of the original loop.  IFN_LOOP_DIST_ALIAS builtin
+     uses this id as its first argument.  Give a loop with an id, we can
+     look upward in dominance tree for the corresponding IFN_LOOP_DIST_ALIAS
+     buildin.  Note this id has no meanling after IFN_LOOP_DIST_ALIAS is
+     folded and eliminated.  */
+  int ldist_alias_id;
+
   /* Upper bound on number of iterations of a loop.  */
   struct nb_iter_bound *bounds;
 
diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index d764ab9..adb2f65 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1653,7 +1653,8 @@ force_single_succ_latches (void)
 
   THEN_PROB is the probability of then branch of the condition.
   ELSE_PROB is the probability of else branch. Note that they may be both
-  REG_BR_PROB_BASE when condition is IFN_LOOP_VECTORIZED.  */
+  REG_BR_PROB_BASE when condition is IFN_LOOP_VECTORIZED or
+  IFN_LOOP_DIST_ALIAS.  */
 
 static basic_block
 lv_adjust_loop_entry_edge (basic_block first_head, basic_block second_head,
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 75fe027..96e40cb 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2250,6 +2250,14 @@ expand_LOOP_VECTORIZED (internal_fn, gcall *)
   gcc_unreachable ();
 }
 
+/* This should get folded in tree-vectorizer.c.  */
+
+static void
+expand_LOOP_DIST_ALIAS (internal_fn, gcall *)
+{
+  gcc_unreachable ();
+}
+
 /* Expand MASK_LOAD call STMT using optab OPTAB.  */
 
 static void
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index e162d81..79c19fb 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -158,6 +158,7 @@ DEF_INTERNAL_FN (GOMP_SIMD_LAST_LANE, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_ORDERED_START, ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_ORDERED_END, ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (LOOP_VECTORIZED, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (LOOP_DIST_ALIAS, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (ANNOTATE,  ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (UBSAN_NULL, ECF_LEAF | ECF_NOTHROW, ".R.")
 DEF_INTERNAL_FN (UBSAN_BOUNDS, ECF_LEAF | ECF_NOTHROW, NULL)
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 1bef2e4..05e9f84 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -469,6 +469,67 @@ fold_loop_vectorized_call (gimple *g, tree value)
     }
 }
 
+/* If LOOP has been versioned during loop distribution, return the internal
+   call guarding it.  */
+
+static gimple *
+vect_loop_dist_alias_call (struct loop *loop)
+{
+  gimple_stmt_iterator gsi;
+  gimple *g;
+  basic_block bb = loop_preheader_edge (loop)->src;
+  basic_block entry = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+  struct loop *outer_loop = bb->loop_father;
+
+  /* Look upward in dominance tree.  */
+  for (; bb != entry && flow_bb_inside_loop_p (outer_loop, bb);
+       bb = get_immediate_dominator (CDI_DOMINATORS, bb))
+    {
+      if (bb->loop_father != outer_loop)
+	continue;
+
+      g = last_stmt (bb);
+      if (g == NULL || gimple_code (g) != GIMPLE_COND)
+	continue;
+
+      gsi = gsi_for_stmt (g);
+      gsi_prev (&gsi);
+      if (gsi_end_p (gsi))
+	continue;
+
+      g = gsi_stmt (gsi);
+      /* The guarding internal function call must have the same distribution
+	 alias id.  */
+      if (gimple_call_internal_p (g, IFN_LOOP_DIST_ALIAS)
+	  && (tree_to_shwi (gimple_call_arg (g, 0)) == loop->ldist_alias_id))
+	return g;
+    }
+  return NULL;
+}
+
+/* Fold LOOP_DIST_ALIAS internal call stmt according to KEEP_P and update
+   any immediate uses of it's LHS.  Stmt is folded to its second argument
+   if KEEP_P is true, otherwise to boolean_false_node.  */
+
+static void
+fold_loop_dist_alias_call (gimple *g, bool keep_p)
+{
+  tree lhs = gimple_call_lhs (g);
+  use_operand_p use_p;
+  imm_use_iterator iter;
+  gimple *use_stmt;
+  gimple_stmt_iterator gsi = gsi_for_stmt (g);
+  tree folded_value = keep_p ? gimple_call_arg (g, 1) : boolean_false_node;
+
+  update_call_from_tree (&gsi, folded_value);
+  FOR_EACH_IMM_USE_STMT (use_stmt, iter, lhs)
+    {
+      FOR_EACH_IMM_USE_ON_STMT (use_p, iter)
+	SET_USE (use_p, folded_value);
+      update_stmt (use_stmt);
+    }
+}
+
 /* Set the uids of all the statements in basic blocks inside loop
    represented by LOOP_VINFO. LOOP_VECTORIZED_CALL is the internal
    call guarding the loop which has been if converted.  */
@@ -595,7 +656,7 @@ vectorize_loops (void)
     else
       {
 	loop_vec_info loop_vinfo, orig_loop_vinfo;
-	gimple *loop_vectorized_call;
+	gimple *loop_vectorized_call, *loop_dist_alias_call;
        try_vectorize:
 	if (!((flag_tree_loop_vectorize
 	       && optimize_loop_nest_for_speed_p (loop))
@@ -603,6 +664,7 @@ vectorize_loops (void)
 	  continue;
 	orig_loop_vinfo = NULL;
 	loop_vectorized_call = vect_loop_vectorized_call (loop);
+	loop_dist_alias_call = vect_loop_dist_alias_call (loop);
        vectorize_epilogue:
 	vect_location = find_loop_location (loop);
         if (LOCATION_LOCUS (vect_location) != UNKNOWN_LOCATION
@@ -710,6 +772,12 @@ vectorize_loops (void)
 	    loop_vectorized_call = NULL;
 	    ret |= TODO_cleanup_cfg;
 	  }
+	if (loop_dist_alias_call)
+	  {
+	    fold_loop_dist_alias_call (loop_dist_alias_call, true);
+	    loop_dist_alias_call = NULL;
+	    ret |= TODO_cleanup_cfg;
+	  }
 
 	if (new_loop)
 	  {
@@ -743,6 +811,15 @@ vectorize_loops (void)
 	      {
 		fold_loop_vectorized_call (g, boolean_false_node);
 		ret |= TODO_cleanup_cfg;
+		g = NULL;
+	      }
+	    else
+	      g = vect_loop_dist_alias_call (loop);
+
+	    if (g)
+	      {
+		fold_loop_dist_alias_call (g, false);
+		ret |= TODO_cleanup_cfg;
 	      }
 	  }
       }
-- 
1.9.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS
  2017-06-23 10:10 ` Bin.Cheng
@ 2017-06-26 10:23   ` Richard Sandiford
  2017-06-27 12:58   ` Richard Biener
  1 sibling, 0 replies; 10+ messages in thread
From: Richard Sandiford @ 2017-06-26 10:23 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: gcc-patches

Just a couple of cosmetic things:

"Bin.Cheng" <amker.cheng@gmail.com> writes:
> @@ -225,6 +225,15 @@ struct GTY ((chain_next ("%h.next"))) loop {
>       builtins.  */
>    tree simduid;
>  
> +  /* For loops generated by distribution with runtime alias checks, this
> +     is a unique identifier of the original distributed loop.  Generally
> +     it is the number of the original loop.  IFN_LOOP_DIST_ALIAS builtin
> +     uses this id as its first argument.  Give a loop with an id, we can
> +     look upward in dominance tree for the corresponding IFN_LOOP_DIST_ALIAS
> +     buildin.  Note this id has no meanling after IFN_LOOP_DIST_ALIAS is

s/meanling/meaning/

> +/* Fold LOOP_DIST_ALIAS internal call stmt according to KEEP_P and update
> +   any immediate uses of it's LHS.  Stmt is folded to its second argument

s/it's/its/

Thanks,
Richard

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS
  2017-06-23 10:10 ` Bin.Cheng
  2017-06-26 10:23   ` Richard Sandiford
@ 2017-06-27 12:58   ` Richard Biener
  2017-06-27 14:27     ` Bin.Cheng
  1 sibling, 1 reply; 10+ messages in thread
From: Richard Biener @ 2017-06-27 12:58 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: gcc-patches

On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>> Hi,
>> I was asked by upstream to split the loop distribution patch into small ones.
>> It is hard because data structure and algorithm are closely coupled together.
>> Anyway, this is the patch series with smaller patches.  Basically I tried to
>> separate data structure and bug-fix changes apart with one as the main patch.
>> Note I only made necessary code refactoring in order to separate patch, apart
>> from that, there is no change against the last version.
>>
>> This is the first patch introducing new internal function IFN_LOOP_DIST_ALIAS.
>> GCC will distribute loops under condition of this function call.
>>
>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
> Hi,
> I need to update this patch fixing an issue in
> vect_loop_dist_alias_call.  The previous patch fails to find some
> IFN_LOOP_DIST_ALIAS calls.
>
> Bootstrap and test in series.  Is it OK?

So I wonder if we really need to track ldist_alias_id or if we can do sth
more "general", like tracking a copy_of or origin and then directly
go to nearest_common_dominator (loop->header, copy_of->header)
to find the controlling condition?

That said "ldist_alias_id" is a bit too narrow of purpose to "waste"
an int inside struct loop?  I'd set copy_of/origi in loop_version for example.
'origin' would probably be better given the ldist cases aren't really
full "copies".

fold_loop_dist_alias_call should re-use / rename fold_loop_vectorized_call
by just passing folded_value to it.

Richard.

> Thanks,
> bin
>>
>> Thanks,
>> bin
>> 2017-06-07  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * cfgloop.h (struct loop): New field ldist_alias_id.
>>         * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change.
>>         * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
>>         * internal-fn.def (LOOP_DIST_ALIAS): New.
>>         * tree-vectorizer.c (vect_loop_dist_alias_call): New function.
>>         (fold_loop_dist_alias_call): New function.
>>         (vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending on
>>         successful vectorization or not.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS
  2017-06-27 12:58   ` Richard Biener
@ 2017-06-27 14:27     ` Bin.Cheng
  2017-06-27 14:59       ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Bin.Cheng @ 2017-06-27 14:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>> Hi,
>>> I was asked by upstream to split the loop distribution patch into small ones.
>>> It is hard because data structure and algorithm are closely coupled together.
>>> Anyway, this is the patch series with smaller patches.  Basically I tried to
>>> separate data structure and bug-fix changes apart with one as the main patch.
>>> Note I only made necessary code refactoring in order to separate patch, apart
>>> from that, there is no change against the last version.
>>>
>>> This is the first patch introducing new internal function IFN_LOOP_DIST_ALIAS.
>>> GCC will distribute loops under condition of this function call.
>>>
>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>> Hi,
>> I need to update this patch fixing an issue in
>> vect_loop_dist_alias_call.  The previous patch fails to find some
>> IFN_LOOP_DIST_ALIAS calls.
>>
>> Bootstrap and test in series.  Is it OK?
>
> So I wonder if we really need to track ldist_alias_id or if we can do sth
Yes, it is needed because otherwise we probably falsely trying to
search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
loop.

> more "general", like tracking a copy_of or origin and then directly
> go to nearest_common_dominator (loop->header, copy_of->header)
> to find the controlling condition?
I tend to not record any pointer in loop structure, it can easily go
dangling for a across passes data structure.  As far as memory usage
is concerned.  I actually don't need a whole integer to record the
loop num.  I can simply restrict number of distributions in one
function to at most 256, and record such id in a char field in struct
loop?  Does this sounds better?

Thanks,
bin
>
> That said "ldist_alias_id" is a bit too narrow of purpose to "waste"
> an int inside struct loop?  I'd set copy_of/origi in loop_version for example.
> 'origin' would probably be better given the ldist cases aren't really
> full "copies".
>
> fold_loop_dist_alias_call should re-use / rename fold_loop_vectorized_call
> by just passing folded_value to it.
>
> Richard.
>
>> Thanks,
>> bin
>>>
>>> Thanks,
>>> bin
>>> 2017-06-07  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         * cfgloop.h (struct loop): New field ldist_alias_id.
>>>         * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change.
>>>         * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
>>>         * internal-fn.def (LOOP_DIST_ALIAS): New.
>>>         * tree-vectorizer.c (vect_loop_dist_alias_call): New function.
>>>         (fold_loop_dist_alias_call): New function.
>>>         (vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending on
>>>         successful vectorization or not.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS
  2017-06-27 14:27     ` Bin.Cheng
@ 2017-06-27 14:59       ` Richard Biener
  2017-06-27 16:46         ` Bin.Cheng
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2017-06-27 14:59 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: gcc-patches

On June 27, 2017 4:27:17 PM GMT+02:00, "Bin.Cheng" <amker.cheng@gmail.com> wrote:
>On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
><richard.guenther@gmail.com> wrote:
>> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng <amker.cheng@gmail.com>
>wrote:
>>> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng <Bin.Cheng@arm.com>
>wrote:
>>>> Hi,
>>>> I was asked by upstream to split the loop distribution patch into
>small ones.
>>>> It is hard because data structure and algorithm are closely coupled
>together.
>>>> Anyway, this is the patch series with smaller patches.  Basically I
>tried to
>>>> separate data structure and bug-fix changes apart with one as the
>main patch.
>>>> Note I only made necessary code refactoring in order to separate
>patch, apart
>>>> from that, there is no change against the last version.
>>>>
>>>> This is the first patch introducing new internal function
>IFN_LOOP_DIST_ALIAS.
>>>> GCC will distribute loops under condition of this function call.
>>>>
>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>> Hi,
>>> I need to update this patch fixing an issue in
>>> vect_loop_dist_alias_call.  The previous patch fails to find some
>>> IFN_LOOP_DIST_ALIAS calls.
>>>
>>> Bootstrap and test in series.  Is it OK?
>>
>> So I wonder if we really need to track ldist_alias_id or if we can do
>sth
>Yes, it is needed because otherwise we probably falsely trying to
>search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
>loop.
>
>> more "general", like tracking a copy_of or origin and then directly
>> go to nearest_common_dominator (loop->header, copy_of->header)
>> to find the controlling condition?
>I tend to not record any pointer in loop structure, it can easily go
>dangling for a across passes data structure.  

I didn't mean to record a pointer, just rename your field and make it more general.  The common dominator thing shod still work, no?

As far as memory usage
>is concerned.  I actually don't need a whole integer to record the
>loop num.  I can simply restrict number of distributions in one
>function to at most 256, and record such id in a char field in struct
>loop?  Does this sounds better?

As said, tracking loop origin sounds useful anyway so I'd rather add and use that somehow.

>Thanks,
>bin
>>
>> That said "ldist_alias_id" is a bit too narrow of purpose to "waste"
>> an int inside struct loop?  I'd set copy_of/origi in loop_version for
>example.
>> 'origin' would probably be better given the ldist cases aren't really
>> full "copies".
>>
>> fold_loop_dist_alias_call should re-use / rename
>fold_loop_vectorized_call
>> by just passing folded_value to it.
>>
>> Richard.
>>
>>> Thanks,
>>> bin
>>>>
>>>> Thanks,
>>>> bin
>>>> 2017-06-07  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         * cfgloop.h (struct loop): New field ldist_alias_id.
>>>>         * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment
>change.
>>>>         * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
>>>>         * internal-fn.def (LOOP_DIST_ALIAS): New.
>>>>         * tree-vectorizer.c (vect_loop_dist_alias_call): New
>function.
>>>>         (fold_loop_dist_alias_call): New function.
>>>>         (vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending
>on
>>>>         successful vectorization or not.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS
  2017-06-27 14:59       ` Richard Biener
@ 2017-06-27 16:46         ` Bin.Cheng
  2017-06-28  7:29           ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Bin.Cheng @ 2017-06-27 16:46 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Tue, Jun 27, 2017 at 3:59 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On June 27, 2017 4:27:17 PM GMT+02:00, "Bin.Cheng" <amker.cheng@gmail.com> wrote:
>>On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
>><richard.guenther@gmail.com> wrote:
>>> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng <amker.cheng@gmail.com>
>>wrote:
>>>> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng <Bin.Cheng@arm.com>
>>wrote:
>>>>> Hi,
>>>>> I was asked by upstream to split the loop distribution patch into
>>small ones.
>>>>> It is hard because data structure and algorithm are closely coupled
>>together.
>>>>> Anyway, this is the patch series with smaller patches.  Basically I
>>tried to
>>>>> separate data structure and bug-fix changes apart with one as the
>>main patch.
>>>>> Note I only made necessary code refactoring in order to separate
>>patch, apart
>>>>> from that, there is no change against the last version.
>>>>>
>>>>> This is the first patch introducing new internal function
>>IFN_LOOP_DIST_ALIAS.
>>>>> GCC will distribute loops under condition of this function call.
>>>>>
>>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>> Hi,
>>>> I need to update this patch fixing an issue in
>>>> vect_loop_dist_alias_call.  The previous patch fails to find some
>>>> IFN_LOOP_DIST_ALIAS calls.
>>>>
>>>> Bootstrap and test in series.  Is it OK?
>>>
>>> So I wonder if we really need to track ldist_alias_id or if we can do
>>sth
>>Yes, it is needed because otherwise we probably falsely trying to
>>search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
>>loop.
>>
>>> more "general", like tracking a copy_of or origin and then directly
>>> go to nearest_common_dominator (loop->header, copy_of->header)
>>> to find the controlling condition?
>>I tend to not record any pointer in loop structure, it can easily go
>>dangling for a across passes data structure.
>
> I didn't mean to record a pointer, just rename your field and make it more general.  The common dominator thing shod still work, no?
I might not be following.  If we record the original loop->num in the
renamed field, nearest_common_dominator can't work because we don't
have basic blocks to start the call?  The original loop could be
eliminated at several points, for example, instantly after
distribution, or folded in vectorizer for other loops distributed from
the original loop.
BTW, setting the copy_of/origin field in loop_version is not enough
for this use case, all generated loops (actually, except the versioned
loop) from distribution need to be set.
>
> As far as memory usage
>>is concerned.  I actually don't need a whole integer to record the
>>loop num.  I can simply restrict number of distributions in one
>>function to at most 256, and record such id in a char field in struct
>>loop?  Does this sounds better?
>
> As said, tracking loop origin sounds useful anyway so I'd rather add and use that somehow.
To be honest, I don't know.  the current field works like a unique
index of distribution operation.  The original loop could be destroyed
at different points thus no longer exists, this makes the recorded
copy_of/origin less meaningful?

Thanks,
bin
>
>>Thanks,
>>bin
>>>
>>> That said "ldist_alias_id" is a bit too narrow of purpose to "waste"
>>> an int inside struct loop?  I'd set copy_of/origi in loop_version for
>>example.
>>> 'origin' would probably be better given the ldist cases aren't really
>>> full "copies".
>>>
>>> fold_loop_dist_alias_call should re-use / rename
>>fold_loop_vectorized_call
>>> by just passing folded_value to it.
>>>
>>> Richard.
>>>
>>>> Thanks,
>>>> bin
>>>>>
>>>>> Thanks,
>>>>> bin
>>>>> 2017-06-07  Bin Cheng  <bin.cheng@arm.com>
>>>>>
>>>>>         * cfgloop.h (struct loop): New field ldist_alias_id.
>>>>>         * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment
>>change.
>>>>>         * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
>>>>>         * internal-fn.def (LOOP_DIST_ALIAS): New.
>>>>>         * tree-vectorizer.c (vect_loop_dist_alias_call): New
>>function.
>>>>>         (fold_loop_dist_alias_call): New function.
>>>>>         (vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending
>>on
>>>>>         successful vectorization or not.
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS
  2017-06-27 16:46         ` Bin.Cheng
@ 2017-06-28  7:29           ` Richard Biener
  2017-06-30 10:37             ` Bin.Cheng
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2017-06-28  7:29 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: gcc-patches

On Tue, Jun 27, 2017 at 6:46 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Jun 27, 2017 at 3:59 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On June 27, 2017 4:27:17 PM GMT+02:00, "Bin.Cheng" <amker.cheng@gmail.com> wrote:
>>>On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
>>><richard.guenther@gmail.com> wrote:
>>>> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng <amker.cheng@gmail.com>
>>>wrote:
>>>>> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng <Bin.Cheng@arm.com>
>>>wrote:
>>>>>> Hi,
>>>>>> I was asked by upstream to split the loop distribution patch into
>>>small ones.
>>>>>> It is hard because data structure and algorithm are closely coupled
>>>together.
>>>>>> Anyway, this is the patch series with smaller patches.  Basically I
>>>tried to
>>>>>> separate data structure and bug-fix changes apart with one as the
>>>main patch.
>>>>>> Note I only made necessary code refactoring in order to separate
>>>patch, apart
>>>>>> from that, there is no change against the last version.
>>>>>>
>>>>>> This is the first patch introducing new internal function
>>>IFN_LOOP_DIST_ALIAS.
>>>>>> GCC will distribute loops under condition of this function call.
>>>>>>
>>>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>>> Hi,
>>>>> I need to update this patch fixing an issue in
>>>>> vect_loop_dist_alias_call.  The previous patch fails to find some
>>>>> IFN_LOOP_DIST_ALIAS calls.
>>>>>
>>>>> Bootstrap and test in series.  Is it OK?
>>>>
>>>> So I wonder if we really need to track ldist_alias_id or if we can do
>>>sth
>>>Yes, it is needed because otherwise we probably falsely trying to
>>>search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
>>>loop.
>>>
>>>> more "general", like tracking a copy_of or origin and then directly
>>>> go to nearest_common_dominator (loop->header, copy_of->header)
>>>> to find the controlling condition?
>>>I tend to not record any pointer in loop structure, it can easily go
>>>dangling for a across passes data structure.
>>
>> I didn't mean to record a pointer, just rename your field and make it more general.  The common dominator thing shod still work, no?
> I might not be following.  If we record the original loop->num in the
> renamed field, nearest_common_dominator can't work because we don't
> have basic blocks to start the call?  The original loop could be
> eliminated at several points, for example, instantly after
> distribution, or folded in vectorizer for other loops distributed from
> the original loop.
> BTW, setting the copy_of/origin field in loop_version is not enough
> for this use case, all generated loops (actually, except the versioned
> loop) from distribution need to be set.

Of course it would need to be set for all distributed loops.

I'm not sure "loop vanishes" is the case to optimize for though.  If the loop
is still available then origin->header should work as BB.  If not then can't
we conclude, for the purpose of IFN_LOOP_DIST_ALIAS, that the whole
region must be dead?  We still have to identify it of course, but it means
we can fold stray IFN_LOOP_DIST_ALIAS calls left over after vectorization
to whatever we like?

>>
>> As far as memory usage
>>>is concerned.  I actually don't need a whole integer to record the
>>>loop num.  I can simply restrict number of distributions in one
>>>function to at most 256, and record such id in a char field in struct
>>>loop?  Does this sounds better?
>>
>> As said, tracking loop origin sounds useful anyway so I'd rather add and use that somehow.
> To be honest, I don't know.  the current field works like a unique
> index of distribution operation.  The original loop could be destroyed
> at different points thus no longer exists, this makes the recorded
> copy_of/origin less meaningful?

I think we talked about prologue and epilogue loops to be easier identifiable
as so (and as to what "main" loop).  So lets say we have one "origin" field
and accompaning flags "origin_is_loop_dist_alias_version",
"origin_is_main_loop_of_prologue", etc.?  I can't think of the case where
origin would be two things at the same time (you can always walk up
the origin tree).

Vanishing origins could also be cleared btw, or we can make the new origin
the origin of the vanishing origin...

Richard.

> Thanks,
> bin
>>
>>>Thanks,
>>>bin
>>>>
>>>> That said "ldist_alias_id" is a bit too narrow of purpose to "waste"
>>>> an int inside struct loop?  I'd set copy_of/origi in loop_version for
>>>example.
>>>> 'origin' would probably be better given the ldist cases aren't really
>>>> full "copies".
>>>>
>>>> fold_loop_dist_alias_call should re-use / rename
>>>fold_loop_vectorized_call
>>>> by just passing folded_value to it.
>>>>
>>>> Richard.
>>>>
>>>>> Thanks,
>>>>> bin
>>>>>>
>>>>>> Thanks,
>>>>>> bin
>>>>>> 2017-06-07  Bin Cheng  <bin.cheng@arm.com>
>>>>>>
>>>>>>         * cfgloop.h (struct loop): New field ldist_alias_id.
>>>>>>         * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment
>>>change.
>>>>>>         * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
>>>>>>         * internal-fn.def (LOOP_DIST_ALIAS): New.
>>>>>>         * tree-vectorizer.c (vect_loop_dist_alias_call): New
>>>function.
>>>>>>         (fold_loop_dist_alias_call): New function.
>>>>>>         (vectorize_loops): Fold IFN_LOOP_DIST_ALIAS call depending
>>>on
>>>>>>         successful vectorization or not.
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS
  2017-06-28  7:29           ` Richard Biener
@ 2017-06-30 10:37             ` Bin.Cheng
  2017-07-03 11:48               ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Bin.Cheng @ 2017-06-30 10:37 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5663 bytes --]

On Wed, Jun 28, 2017 at 8:29 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Jun 27, 2017 at 6:46 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Tue, Jun 27, 2017 at 3:59 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On June 27, 2017 4:27:17 PM GMT+02:00, "Bin.Cheng" <amker.cheng@gmail.com> wrote:
>>>>On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
>>>><richard.guenther@gmail.com> wrote:
>>>>> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng <amker.cheng@gmail.com>
>>>>wrote:
>>>>>> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng <Bin.Cheng@arm.com>
>>>>wrote:
>>>>>>> Hi,
>>>>>>> I was asked by upstream to split the loop distribution patch into
>>>>small ones.
>>>>>>> It is hard because data structure and algorithm are closely coupled
>>>>together.
>>>>>>> Anyway, this is the patch series with smaller patches.  Basically I
>>>>tried to
>>>>>>> separate data structure and bug-fix changes apart with one as the
>>>>main patch.
>>>>>>> Note I only made necessary code refactoring in order to separate
>>>>patch, apart
>>>>>>> from that, there is no change against the last version.
>>>>>>>
>>>>>>> This is the first patch introducing new internal function
>>>>IFN_LOOP_DIST_ALIAS.
>>>>>>> GCC will distribute loops under condition of this function call.
>>>>>>>
>>>>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>>>> Hi,
>>>>>> I need to update this patch fixing an issue in
>>>>>> vect_loop_dist_alias_call.  The previous patch fails to find some
>>>>>> IFN_LOOP_DIST_ALIAS calls.
>>>>>>
>>>>>> Bootstrap and test in series.  Is it OK?
>>>>>
>>>>> So I wonder if we really need to track ldist_alias_id or if we can do
>>>>sth
>>>>Yes, it is needed because otherwise we probably falsely trying to
>>>>search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
>>>>loop.
>>>>
>>>>> more "general", like tracking a copy_of or origin and then directly
>>>>> go to nearest_common_dominator (loop->header, copy_of->header)
>>>>> to find the controlling condition?
>>>>I tend to not record any pointer in loop structure, it can easily go
>>>>dangling for a across passes data structure.
>>>
>>> I didn't mean to record a pointer, just rename your field and make it more general.  The common dominator thing shod still work, no?
>> I might not be following.  If we record the original loop->num in the
>> renamed field, nearest_common_dominator can't work because we don't
>> have basic blocks to start the call?  The original loop could be
>> eliminated at several points, for example, instantly after
>> distribution, or folded in vectorizer for other loops distributed from
>> the original loop.
>> BTW, setting the copy_of/origin field in loop_version is not enough
>> for this use case, all generated loops (actually, except the versioned
>> loop) from distribution need to be set.
>
> Of course it would need to be set for all distributed loops.
>
> I'm not sure "loop vanishes" is the case to optimize for though.  If the loop
> is still available then origin->header should work as BB.  If not then can't
> we conclude, for the purpose of IFN_LOOP_DIST_ALIAS, that the whole
> region must be dead?  We still have to identify it of course, but it means
> we can fold stray IFN_LOOP_DIST_ALIAS calls left over after vectorization
> to whatever we like?
>
>>>
>>> As far as memory usage
>>>>is concerned.  I actually don't need a whole integer to record the
>>>>loop num.  I can simply restrict number of distributions in one
>>>>function to at most 256, and record such id in a char field in struct
>>>>loop?  Does this sounds better?
>>>
>>> As said, tracking loop origin sounds useful anyway so I'd rather add and use that somehow.
>> To be honest, I don't know.  the current field works like a unique
>> index of distribution operation.  The original loop could be destroyed
>> at different points thus no longer exists, this makes the recorded
>> copy_of/origin less meaningful?
>
> I think we talked about prologue and epilogue loops to be easier identifiable
> as so (and as to what "main" loop).  So lets say we have one "origin" field
> and accompaning flags "origin_is_loop_dist_alias_version",
> "origin_is_main_loop_of_prologue", etc.?  I can't think of the case where
> origin would be two things at the same time (you can always walk up
> the origin tree).
Hi,
Here is the updated patch working in this way.  There is still one
problem with this method.  Considering one distributed loop is
if-converted later, the orig loop for if-converted distributed loop is
different.  Though we can update orig_loop_num, it's inaccurate and
one consequence is we need to walk up dominance tree till entry_block.
Note if orig_loop_num is not shared, we can stop once basic block goes
beyond outer loop.
I didn't introduce flags in this patch, which can be done along with
solving the above problem.

>
> Vanishing origins could also be cleared btw, or we can make the new origin
> the origin of the vanishing origin...
Yes, I did this in distribution patch, so that dominance walking
routine can be simplified.

Bootstrap and test.  Is it OK?

Thanks,
bin

2017-06-27  Bin Cheng  <bin.cheng@arm.com>

* cfgloop.h (struct loop): Add comment.  New field orig_loop_num.
* cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change.
* internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
* internal-fn.def (LOOP_DIST_ALIAS): New.
* tree-vectorizer.c (fold_loop_vectorized_call): Rename to ...
(fold_loop_internal_call): ... this.
(vect_loop_dist_alias_call): New function.
(set_uid_loop_bbs): Call fold_loop_internal_call.
(vectorize_loops): Fold IFN_LOOP_VECTORIZED and IFN_LOOP_DIST_ALIAS
internal calls.

[-- Attachment #2: 0001-ifn_loop_dist_alias-20170621.txt.patch --]
[-- Type: text/x-patch, Size: 8267 bytes --]

From 8c3631f383a53653a40bd30c1133ca03f9cc6973 Mon Sep 17 00:00:00 2001
From: Bin Cheng <binche01@e108451-lin.cambridge.arm.com>
Date: Wed, 7 Jun 2017 13:04:03 +0100
Subject: [PATCH 01/13] ifn_loop_dist_alias-20170621.txt

---
 gcc/cfgloop.h         | 13 +++++++-
 gcc/cfgloopmanip.c    |  3 +-
 gcc/internal-fn.c     |  8 +++++
 gcc/internal-fn.def   |  1 +
 gcc/tree-vectorizer.c | 89 +++++++++++++++++++++++++++++++++++++++++++++------
 5 files changed, 103 insertions(+), 11 deletions(-)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index a8bec1d..e7ffa23 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -114,7 +114,8 @@ struct GTY ((chain_next ("%h.next"))) control_iv {
 
 /* Structure to hold information for each natural loop.  */
 struct GTY ((chain_next ("%h.next"))) loop {
-  /* Index into loops array.  */
+  /* Index into loops array.  Note indices will never be reused after loop
+     is destroyed.  */
   int num;
 
   /* Number of loop insns.  */
@@ -225,6 +226,16 @@ struct GTY ((chain_next ("%h.next"))) loop {
      builtins.  */
   tree simduid;
 
+  /* In loop optimization, it's common to generate loops from the original
+     loop.  This field records the index of the original loop which can be
+     used to track the original loop from newly generated loops.  This can
+     be done by calling function get_loop (cfun, orig_loop_num).  Note the
+     original loop could be destroyed for various reasons thus no longer
+     exists, as a result, function call to get_loop returns NULL pointer.
+     In this case, this field should not be used and needs to be cleared
+     whenever possible.  */
+  int orig_loop_num;
+
   /* Upper bound on number of iterations of a loop.  */
   struct nb_iter_bound *bounds;
 
diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index d764ab9..adb2f65 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1653,7 +1653,8 @@ force_single_succ_latches (void)
 
   THEN_PROB is the probability of then branch of the condition.
   ELSE_PROB is the probability of else branch. Note that they may be both
-  REG_BR_PROB_BASE when condition is IFN_LOOP_VECTORIZED.  */
+  REG_BR_PROB_BASE when condition is IFN_LOOP_VECTORIZED or
+  IFN_LOOP_DIST_ALIAS.  */
 
 static basic_block
 lv_adjust_loop_entry_edge (basic_block first_head, basic_block second_head,
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index 75fe027..96e40cb 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2250,6 +2250,14 @@ expand_LOOP_VECTORIZED (internal_fn, gcall *)
   gcc_unreachable ();
 }
 
+/* This should get folded in tree-vectorizer.c.  */
+
+static void
+expand_LOOP_DIST_ALIAS (internal_fn, gcall *)
+{
+  gcc_unreachable ();
+}
+
 /* Expand MASK_LOAD call STMT using optab OPTAB.  */
 
 static void
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index e162d81..79c19fb 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -158,6 +158,7 @@ DEF_INTERNAL_FN (GOMP_SIMD_LAST_LANE, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_ORDERED_START, ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (GOMP_SIMD_ORDERED_END, ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (LOOP_VECTORIZED, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
+DEF_INTERNAL_FN (LOOP_DIST_ALIAS, ECF_NOVOPS | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (ANNOTATE,  ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)
 DEF_INTERNAL_FN (UBSAN_NULL, ECF_LEAF | ECF_NOTHROW, ".R.")
 DEF_INTERNAL_FN (UBSAN_BOUNDS, ECF_LEAF | ECF_NOTHROW, NULL)
diff --git a/gcc/tree-vectorizer.c b/gcc/tree-vectorizer.c
index 0d62c82..33a1f58 100644
--- a/gcc/tree-vectorizer.c
+++ b/gcc/tree-vectorizer.c
@@ -448,11 +448,11 @@ vect_loop_vectorized_call (struct loop *loop)
   return NULL;
 }
 
-/* Fold LOOP_VECTORIZED internal call G to VALUE and
-   update any immediate uses of it's LHS.  */
+/* Fold loop internal call G like IFN_LOOP_VECTORIZED/IFN_LOOP_DIST_ALIAS
+   to VALUE and update any immediate uses of it's LHS.  */
 
 static void
-fold_loop_vectorized_call (gimple *g, tree value)
+fold_loop_internal_call (gimple *g, tree value)
 {
   tree lhs = gimple_call_lhs (g);
   use_operand_p use_p;
@@ -469,6 +469,60 @@ fold_loop_vectorized_call (gimple *g, tree value)
     }
 }
 
+/* If LOOP has been versioned during loop distribution, return the gurading
+   internal call.  */
+
+static gimple *
+vect_loop_dist_alias_call (struct loop *loop)
+{
+  basic_block bb;
+  basic_block entry;
+  struct loop *outer, *orig;
+  gimple_stmt_iterator gsi;
+  gimple *g;
+
+  if (loop->orig_loop_num == 0)
+    return NULL;
+
+  orig = get_loop (cfun, loop->orig_loop_num);
+  if (orig == NULL)
+    {
+      /* The original loop is somehow destroyed.  Clear the information.  */
+      loop->orig_loop_num = 0;
+      return NULL;
+    }
+
+  if (loop != orig)
+    bb = nearest_common_dominator (CDI_DOMINATORS, loop->header, orig->header);
+  else
+    bb = loop_preheader_edge (loop)->src;
+
+  outer = bb->loop_father;
+  entry = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+
+  /* Look upward in dominance tree.  */
+  for (; bb != entry && flow_bb_inside_loop_p (outer, bb);
+       bb = get_immediate_dominator (CDI_DOMINATORS, bb))
+    {
+      g = last_stmt (bb);
+      if (g == NULL || gimple_code (g) != GIMPLE_COND)
+	continue;
+
+      gsi = gsi_for_stmt (g);
+      gsi_prev (&gsi);
+      if (gsi_end_p (gsi))
+	continue;
+
+      g = gsi_stmt (gsi);
+      /* The guarding internal function call must have the same distribution
+	 alias id.  */
+      if (gimple_call_internal_p (g, IFN_LOOP_DIST_ALIAS)
+	  && (tree_to_shwi (gimple_call_arg (g, 0)) == loop->orig_loop_num))
+	return g;
+    }
+  return NULL;
+}
+
 /* Set the uids of all the statements in basic blocks inside loop
    represented by LOOP_VINFO. LOOP_VECTORIZED_CALL is the internal
    call guarding the loop which has been if converted.  */
@@ -494,7 +548,7 @@ set_uid_loop_bbs (loop_vec_info loop_vinfo, gimple *loop_vectorized_call)
 	{
 	  arg = gimple_call_arg (g, 0);
 	  get_loop (cfun, tree_to_shwi (arg))->dont_vectorize = true;
-	  fold_loop_vectorized_call (g, boolean_false_node);
+	  fold_loop_internal_call (g, boolean_false_node);
 	}
     }
   bbs = get_loop_body (scalar_loop);
@@ -595,7 +649,7 @@ vectorize_loops (void)
     else
       {
 	loop_vec_info loop_vinfo, orig_loop_vinfo;
-	gimple *loop_vectorized_call;
+	gimple *loop_vectorized_call, *loop_dist_alias_call;
        try_vectorize:
 	if (!((flag_tree_loop_vectorize
 	       && optimize_loop_nest_for_speed_p (loop))
@@ -603,6 +657,7 @@ vectorize_loops (void)
 	  continue;
 	orig_loop_vinfo = NULL;
 	loop_vectorized_call = vect_loop_vectorized_call (loop);
+	loop_dist_alias_call = vect_loop_dist_alias_call (loop);
        vectorize_epilogue:
 	vect_location = find_loop_location (loop);
         if (LOCATION_LOCUS (vect_location) != UNKNOWN_LOCATION
@@ -652,8 +707,8 @@ vectorize_loops (void)
 		  {
 		    dump_printf_loc (MSG_OPTIMIZED_LOCATIONS, vect_location,
 				     "basic block vectorized\n");
-		    fold_loop_vectorized_call (loop_vectorized_call,
-					       boolean_true_node);
+		    fold_loop_internal_call (loop_vectorized_call,
+					     boolean_true_node);
 		    loop_vectorized_call = NULL;
 		    ret |= TODO_cleanup_cfg;
 		  }
@@ -706,10 +761,17 @@ vectorize_loops (void)
 
 	if (loop_vectorized_call)
 	  {
-	    fold_loop_vectorized_call (loop_vectorized_call, boolean_true_node);
+	    fold_loop_internal_call (loop_vectorized_call, boolean_true_node);
 	    loop_vectorized_call = NULL;
 	    ret |= TODO_cleanup_cfg;
 	  }
+	if (loop_dist_alias_call)
+	  {
+	    tree value = gimple_call_arg (loop_dist_alias_call, 1);
+	    fold_loop_internal_call (loop_dist_alias_call, value);
+	    loop_dist_alias_call = NULL;
+	    ret |= TODO_cleanup_cfg;
+	  }
 
 	if (new_loop)
 	  {
@@ -741,7 +803,16 @@ vectorize_loops (void)
 	    gimple *g = vect_loop_vectorized_call (loop);
 	    if (g)
 	      {
-		fold_loop_vectorized_call (g, boolean_false_node);
+		fold_loop_internal_call (g, boolean_false_node);
+		ret |= TODO_cleanup_cfg;
+		g = NULL;
+	      }
+	    else
+	      g = vect_loop_dist_alias_call (loop);
+
+	    if (g)
+	      {
+		fold_loop_internal_call (g, boolean_false_node);
 		ret |= TODO_cleanup_cfg;
 	      }
 	  }
-- 
1.9.1


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS
  2017-06-30 10:37             ` Bin.Cheng
@ 2017-07-03 11:48               ` Richard Biener
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Biener @ 2017-07-03 11:48 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: gcc-patches

On Fri, Jun 30, 2017 at 12:37 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Wed, Jun 28, 2017 at 8:29 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Jun 27, 2017 at 6:46 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Tue, Jun 27, 2017 at 3:59 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On June 27, 2017 4:27:17 PM GMT+02:00, "Bin.Cheng" <amker.cheng@gmail.com> wrote:
>>>>>On Tue, Jun 27, 2017 at 1:58 PM, Richard Biener
>>>>><richard.guenther@gmail.com> wrote:
>>>>>> On Fri, Jun 23, 2017 at 12:10 PM, Bin.Cheng <amker.cheng@gmail.com>
>>>>>wrote:
>>>>>>> On Mon, Jun 12, 2017 at 6:02 PM, Bin Cheng <Bin.Cheng@arm.com>
>>>>>wrote:
>>>>>>>> Hi,
>>>>>>>> I was asked by upstream to split the loop distribution patch into
>>>>>small ones.
>>>>>>>> It is hard because data structure and algorithm are closely coupled
>>>>>together.
>>>>>>>> Anyway, this is the patch series with smaller patches.  Basically I
>>>>>tried to
>>>>>>>> separate data structure and bug-fix changes apart with one as the
>>>>>main patch.
>>>>>>>> Note I only made necessary code refactoring in order to separate
>>>>>patch, apart
>>>>>>>> from that, there is no change against the last version.
>>>>>>>>
>>>>>>>> This is the first patch introducing new internal function
>>>>>IFN_LOOP_DIST_ALIAS.
>>>>>>>> GCC will distribute loops under condition of this function call.
>>>>>>>>
>>>>>>>> Bootstrap and test on x86_64 and AArch64.  Is it OK?
>>>>>>> Hi,
>>>>>>> I need to update this patch fixing an issue in
>>>>>>> vect_loop_dist_alias_call.  The previous patch fails to find some
>>>>>>> IFN_LOOP_DIST_ALIAS calls.
>>>>>>>
>>>>>>> Bootstrap and test in series.  Is it OK?
>>>>>>
>>>>>> So I wonder if we really need to track ldist_alias_id or if we can do
>>>>>sth
>>>>>Yes, it is needed because otherwise we probably falsely trying to
>>>>>search for IFN_LOOP_DIST_ALIAS for a normal (not from distribution)
>>>>>loop.
>>>>>
>>>>>> more "general", like tracking a copy_of or origin and then directly
>>>>>> go to nearest_common_dominator (loop->header, copy_of->header)
>>>>>> to find the controlling condition?
>>>>>I tend to not record any pointer in loop structure, it can easily go
>>>>>dangling for a across passes data structure.
>>>>
>>>> I didn't mean to record a pointer, just rename your field and make it more general.  The common dominator thing shod still work, no?
>>> I might not be following.  If we record the original loop->num in the
>>> renamed field, nearest_common_dominator can't work because we don't
>>> have basic blocks to start the call?  The original loop could be
>>> eliminated at several points, for example, instantly after
>>> distribution, or folded in vectorizer for other loops distributed from
>>> the original loop.
>>> BTW, setting the copy_of/origin field in loop_version is not enough
>>> for this use case, all generated loops (actually, except the versioned
>>> loop) from distribution need to be set.
>>
>> Of course it would need to be set for all distributed loops.
>>
>> I'm not sure "loop vanishes" is the case to optimize for though.  If the loop
>> is still available then origin->header should work as BB.  If not then can't
>> we conclude, for the purpose of IFN_LOOP_DIST_ALIAS, that the whole
>> region must be dead?  We still have to identify it of course, but it means
>> we can fold stray IFN_LOOP_DIST_ALIAS calls left over after vectorization
>> to whatever we like?
>>
>>>>
>>>> As far as memory usage
>>>>>is concerned.  I actually don't need a whole integer to record the
>>>>>loop num.  I can simply restrict number of distributions in one
>>>>>function to at most 256, and record such id in a char field in struct
>>>>>loop?  Does this sounds better?
>>>>
>>>> As said, tracking loop origin sounds useful anyway so I'd rather add and use that somehow.
>>> To be honest, I don't know.  the current field works like a unique
>>> index of distribution operation.  The original loop could be destroyed
>>> at different points thus no longer exists, this makes the recorded
>>> copy_of/origin less meaningful?
>>
>> I think we talked about prologue and epilogue loops to be easier identifiable
>> as so (and as to what "main" loop).  So lets say we have one "origin" field
>> and accompaning flags "origin_is_loop_dist_alias_version",
>> "origin_is_main_loop_of_prologue", etc.?  I can't think of the case where
>> origin would be two things at the same time (you can always walk up
>> the origin tree).
> Hi,
> Here is the updated patch working in this way.  There is still one
> problem with this method.  Considering one distributed loop is
> if-converted later, the orig loop for if-converted distributed loop is
> different.  Though we can update orig_loop_num, it's inaccurate and
> one consequence is we need to walk up dominance tree till entry_block.
> Note if orig_loop_num is not shared, we can stop once basic block goes
> beyond outer loop.
> I didn't introduce flags in this patch, which can be done along with
> solving the above problem.
>
>>
>> Vanishing origins could also be cleared btw, or we can make the new origin
>> the origin of the vanishing origin...
> Yes, I did this in distribution patch, so that dominance walking
> routine can be simplified.
>
> Bootstrap and test.  Is it OK?

Ok.

Thanks,
Richard.

> Thanks,
> bin
>
> 2017-06-27  Bin Cheng  <bin.cheng@arm.com>
>
> * cfgloop.h (struct loop): Add comment.  New field orig_loop_num.
> * cfgloopmanip.c (lv_adjust_loop_entry_edge): Comment change.
> * internal-fn.c (expand_LOOP_DIST_ALIAS): New function.
> * internal-fn.def (LOOP_DIST_ALIAS): New.
> * tree-vectorizer.c (fold_loop_vectorized_call): Rename to ...
> (fold_loop_internal_call): ... this.
> (vect_loop_dist_alias_call): New function.
> (set_uid_loop_bbs): Call fold_loop_internal_call.
> (vectorize_loops): Fold IFN_LOOP_VECTORIZED and IFN_LOOP_DIST_ALIAS
> internal calls.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-07-03 11:48 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-06-12 17:02 [PATCH GCC][01/13]Introduce internal function IFN_LOOP_DIST_ALIAS Bin Cheng
2017-06-23 10:10 ` Bin.Cheng
2017-06-26 10:23   ` Richard Sandiford
2017-06-27 12:58   ` Richard Biener
2017-06-27 14:27     ` Bin.Cheng
2017-06-27 14:59       ` Richard Biener
2017-06-27 16:46         ` Bin.Cheng
2017-06-28  7:29           ` Richard Biener
2017-06-30 10:37             ` Bin.Cheng
2017-07-03 11:48               ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).