public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH GCC][v2]Simplify alias check code generation in vectorizer
@ 2016-09-20 15:30 Bin Cheng
  2016-09-21 14:21 ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Bin Cheng @ 2016-09-20 15:30 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 3417 bytes --]

Hi,
I originally posted a patch improving code generation for alias check in vectorizer at https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00929.html.  Here it's the 2nd version (better) patch.  It detects data reference pair in which the two references are only different to each other wrto index.  In this case the patch generates intersect range checks based on index of address of reference, rather than the address of reference.  Take example from patch's comment, given below two data references:

                       DR_A                           DR_B
      data-ref         arr[i]                         arr[j]
      base_object      arr                            arr
      index            {i_0, +, 1}_loop               {j_0, +, 1}_loop

The addresses and their index can be depicted as:

        |<- ADDR_A    ->|          |<- ADDR_B    ->|
     ------------------------------------------------------->
        |   |   |   |   |          |   |   |   |   |
     ------------------------------------------------------->
        i_0 ...         i_0+4      j_0 ...         j_0+4

We can create expression based on index rather than address:  (i_0 + 4 < j_0 || j_0 + 4 < i_0).

Also take example from spec2k/200.sixtrack/DACOP, gcc needs to generate alias check for three pairs:
//pair 1
dr_a:
(Data Ref:
  bb: 8
  stmt: _92 = da.cc[_27];
  ref: da.cc[_27];
)
dr_b:
(Data Ref:
  bb: 8
  stmt: da.cc[_93] = _92;
  ref: da.cc[_93];
)
//pair 2
dr_a:
(Data Ref:
  bb: 8
  stmt: pretmp_29 = da.i2[_27];
  ref: da.i2[_27];
)
dr_b:
(Data Ref:
  bb: 8
  stmt: da.i2[_93] = pretmp_29;
  ref: da.i2[_93];
)
//pair 3
dr_a:
(Data Ref:
  bb: 8
  stmt: pretmp_28 = da.i1[_27];
  ref: da.i1[_27];
)
dr_b:
(Data Ref:
  bb: 8
  stmt: da.i1[_93] = pretmp_28;
  ref: da.i1[_93];
)

With this patch, code can be improved to:

  <bb 7>:
  _37 = (unsigned int) _6;
  _106 = (unsigned int) _52;
  _107 = _37 - _106;
  _108 = _107 > 7;
  _109 = (integer(kind=8)) _2;
  _110 = _109 + 3;
  _111 = (integer(kind=8)) _52;
  _112 = _111 + -1;
  _113 = _110 < _112;
  _114 = (integer(kind=8)) _52;
  _115 = _114 + 2;
  _116 = (integer(kind=8)) _2;
  _117 = _115 < _116;
  _118 = _113 | _117;
  _119 = _108 & _118;
  _120 = (integer(kind=8)) _2;
  _121 = _120 + 3;
  _122 = (integer(kind=8)) _52;
  _123 = _122 + -1;
  _124 = _121 < _123;
  _125 = (integer(kind=8)) _52;
  _126 = _125 + 2;
  _127 = (integer(kind=8)) _2;
  _128 = _126 < _127;
  _129 = _124 | _128;
  _130 = _119 & _129;
  _131 = (integer(kind=8)) _2;
  _132 = _131 + 3;
  _133 = (integer(kind=8)) _52;
  _134 = _133 + -1;
  _135 = _132 < _134;
  _136 = (integer(kind=8)) _52;
  _137 = _136 + 2;
  _138 = (integer(kind=8)) _2;
  _139 = _137 < _138;
  _140 = _135 | _139;
  _141 = _130 & _140;
  if (_141 != 0)
    goto <bb 8>;
  else
    goto <bb 20>;

Note this patch doesn't do local CSE, and common sub-expressions will be optimized by later passes.  I think this is OK because the redundancy is far away from loops thus won't have bad effect (there is an opposite example/PR).
Bootstrap and test on x86_64 and AArch64, is it OK?

Thanks,
bin

2016-09-19  Bin Cheng  <bin.cheng@arm.com>

	* tree-vect-loop-manip.c (create_intersect_range_checks_index): New.
	(create_intersect_range_checks): New.
	(vect_create_cond_for_alias_checks): Call above function.

[-- Attachment #2: check-alias-on-index-20160918.txt --]
[-- Type: text/plain, Size: 10343 bytes --]

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 01d6bb1..c7d130e 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -2263,6 +2263,179 @@ vect_create_cond_for_align_checks (loop_vec_info loop_vinfo,
     *cond_expr = part_cond_expr;
 }
 
+/* Given two data references and segment lengths described by DR_A and DR_B,
+   create expression checking if the two addresses ranges intersect with
+   each other based on index of the two addresses.  This can only be done
+   if DR_A and DR_B referring to the same (array) object and the index is
+   the only difference.  For example:
+
+                       DR_A                           DR_B
+      data-ref         arr[i]                         arr[j]
+      base_object      arr                            arr
+      index            {i_0, +, 1}_loop               {j_0, +, 1}_loop
+
+   The addresses and their index are like:
+
+        |<- ADDR_A    ->|          |<- ADDR_B    ->|
+     ------------------------------------------------------->
+        |   |   |   |   |          |   |   |   |   |
+     ------------------------------------------------------->
+        i_0 ...         i_0+4      j_0 ...         j_0+4
+
+   We can create expression based on index rather than address:
+     (i_0 + 4 < j_0 || j_0 + 4 < i_0).  */
+
+static bool
+create_intersect_range_checks_index (loop_vec_info loop_vinfo, tree *cond_expr,
+				     const dr_with_seg_len& dr_a,
+				     const dr_with_seg_len& dr_b)
+{
+  if (integer_zerop (DR_STEP (dr_a.dr))
+      || integer_zerop (DR_STEP (dr_b.dr))
+      || DR_NUM_DIMENSIONS (dr_a.dr) != DR_NUM_DIMENSIONS (dr_b.dr))
+    return false;
+
+  if (!tree_fits_uhwi_p (dr_a.seg_len) || !tree_fits_uhwi_p (dr_b.seg_len))
+    return false;
+
+  if (!operand_equal_p (DR_BASE_OBJECT (dr_a.dr), DR_BASE_OBJECT (dr_b.dr), 0))
+    return false;
+
+  if (!operand_equal_p (DR_STEP (dr_a.dr), DR_STEP (dr_b.dr), 0))
+    return false;
+
+  gcc_assert (TREE_CODE (DR_STEP (dr_a.dr)) == INTEGER_CST);
+
+  bool neg_step = tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0;
+  unsigned HOST_WIDE_INT abs_step = tree_to_uhwi (DR_STEP (dr_a.dr));
+  if (neg_step)
+    abs_step = -abs_step;
+
+  unsigned HOST_WIDE_INT seg_len1 = tree_to_uhwi (dr_a.seg_len);
+  unsigned HOST_WIDE_INT seg_len2 = tree_to_uhwi (dr_b.seg_len);
+  /* Infer index length from segment length.  */
+  unsigned HOST_WIDE_INT idx_len1 = (seg_len1 + abs_step - 1) / abs_step;
+  unsigned HOST_WIDE_INT idx_len2 = (seg_len2 + abs_step - 1) / abs_step;
+
+  unsigned int i;
+  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+  for (i = 0; i < DR_NUM_DIMENSIONS (dr_a.dr); i++)
+    {
+      tree access1 = DR_ACCESS_FN (dr_a.dr, i);
+      tree access2 = DR_ACCESS_FN (dr_b.dr, i);
+      /* Two index must be the same if they are not scev.  */
+      if (TREE_CODE (access1) != POLYNOMIAL_CHREC
+	  || TREE_CODE (access2) != POLYNOMIAL_CHREC)
+	{
+	  if (operand_equal_p (access1, access2, 0))
+	    continue;
+
+	  return false;
+	}
+      /* Two index must be the same if they are not scev wrto current loop.  */
+      if (CHREC_VARIABLE (access1) != (unsigned)loop->num
+	  || CHREC_VARIABLE (access2) != (unsigned)loop->num)
+	{
+	  if (operand_equal_p (access1, access2, 0))
+	    continue;
+
+	  return false;
+	}
+      /* Two index must have the same step.  */
+      if (!operand_equal_p (CHREC_RIGHT (access1), CHREC_RIGHT (access2), 0))
+	return false;
+
+      tree min1 = CHREC_LEFT (access1);
+      tree min2 = CHREC_LEFT (access2);
+      if (!types_compatible_p (TREE_TYPE (min1), TREE_TYPE (min2)))
+	return false;
+
+      tree max1 = fold_build2 (PLUS_EXPR, TREE_TYPE (min1), min1,
+			       build_int_cst (TREE_TYPE (min1), idx_len1));
+      tree max2 = fold_build2 (PLUS_EXPR, TREE_TYPE (min2), min2,
+			       build_int_cst (TREE_TYPE (min2), idx_len2));
+      /* Adjust ranges for negative step.  */
+      if (neg_step)
+	{
+	  min1 = fold_build2 (MINUS_EXPR, TREE_TYPE (min1), max1,
+			      build_int_cst (TREE_TYPE (min1), 1));
+	  max1 = fold_build2 (PLUS_EXPR, TREE_TYPE (min1),
+			      CHREC_LEFT (access1),
+			      build_int_cst (TREE_TYPE (min1), 1));
+	  min2 = fold_build2 (MINUS_EXPR, TREE_TYPE (min2), max2,
+			      build_int_cst (TREE_TYPE (min2), 1));
+	  max2 = fold_build2 (PLUS_EXPR, TREE_TYPE (min2),
+			      CHREC_LEFT (access2),
+			      build_int_cst (TREE_TYPE (min2), 1));
+	}
+      tree part_cond_expr
+	= fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
+	    fold_build2 (LE_EXPR, boolean_type_node, max1, min2),
+	    fold_build2 (LE_EXPR, boolean_type_node, max2, min1));
+      if (*cond_expr)
+	*cond_expr = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
+				  *cond_expr, part_cond_expr);
+      else
+	*cond_expr = part_cond_expr;
+    }
+  return true;
+}
+
+/* Given two data references and segment lengths described by DR_A and DR_B,
+   create expression checking if the two addresses ranges intersect with
+   each other:
+
+     ((DR_A_addr_0 + DR_A_segment_length_0) <= DR_B_addr_0)
+     || (DR_B_addr_0 + DER_B_segment_length_0) <= DR_A_addr_0))  */
+
+static void
+create_intersect_range_checks (loop_vec_info loop_vinfo, tree *cond_expr,
+			       const dr_with_seg_len& dr_a,
+			       const dr_with_seg_len& dr_b)
+{
+  *cond_expr = NULL_TREE;
+  if (create_intersect_range_checks_index (loop_vinfo, cond_expr, dr_a, dr_b))
+    return;
+
+  tree segment_length_a = dr_a.seg_len;
+  tree segment_length_b = dr_b.seg_len;
+  tree addr_base_a = DR_BASE_ADDRESS (dr_a.dr);
+  tree addr_base_b = DR_BASE_ADDRESS (dr_b.dr);
+  tree offset_a = DR_OFFSET (dr_a.dr), offset_b = DR_OFFSET (dr_b.dr);
+
+  offset_a = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_a),
+			  offset_a, DR_INIT (dr_a.dr));
+  offset_b = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_b),
+			  offset_b, DR_INIT (dr_b.dr));
+  addr_base_a = fold_build_pointer_plus (addr_base_a, offset_a);
+  addr_base_b = fold_build_pointer_plus (addr_base_b, offset_b);
+
+  tree seg_a_min = addr_base_a;
+  tree seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
+  /* For negative step, we need to adjust address range by TYPE_SIZE_UNIT
+     bytes, e.g., int a[3] -> a[1] range is [a+4, a+16) instead of
+     [a, a+12) */
+  if (tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0)
+    {
+      tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr_a.dr)));
+      seg_a_min = fold_build_pointer_plus (seg_a_max, unit_size);
+      seg_a_max = fold_build_pointer_plus (addr_base_a, unit_size);
+    }
+
+  tree seg_b_min = addr_base_b;
+  tree seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
+  if (tree_int_cst_compare (DR_STEP (dr_b.dr), size_zero_node) < 0)
+    {
+      tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr_b.dr)));
+      seg_b_min = fold_build_pointer_plus (seg_b_max, unit_size);
+      seg_b_max = fold_build_pointer_plus (addr_base_b, unit_size);
+    }
+  *cond_expr
+    = fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
+	fold_build2 (LE_EXPR, boolean_type_node, seg_a_max, seg_b_min),
+	fold_build2 (LE_EXPR, boolean_type_node, seg_b_max, seg_a_min));
+}
+
 /* Function vect_create_cond_for_alias_checks.
 
    Create a conditional expression that represents the run-time checks for
@@ -2290,15 +2463,6 @@ vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr)
     LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo);
   tree part_cond_expr;
 
-  /* Create expression
-     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
-     || (load_ptr_0 + load_segment_length_0) <= store_ptr_0))
-     &&
-     ...
-     &&
-     ((store_ptr_n + store_segment_length_n) <= load_ptr_n)
-     || (load_ptr_n + load_segment_length_n) <= store_ptr_n))  */
-
   if (comp_alias_ddrs.is_empty ())
     return;
 
@@ -2306,18 +2470,6 @@ vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr)
     {
       const dr_with_seg_len& dr_a = comp_alias_ddrs[i].first;
       const dr_with_seg_len& dr_b = comp_alias_ddrs[i].second;
-      tree segment_length_a = dr_a.seg_len;
-      tree segment_length_b = dr_b.seg_len;
-      tree addr_base_a = DR_BASE_ADDRESS (dr_a.dr);
-      tree addr_base_b = DR_BASE_ADDRESS (dr_b.dr);
-      tree offset_a = DR_OFFSET (dr_a.dr), offset_b = DR_OFFSET (dr_b.dr);
-
-      offset_a = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_a),
-			      offset_a, DR_INIT (dr_a.dr));
-      offset_b = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_b),
-			      offset_b, DR_INIT (dr_b.dr));
-      addr_base_a = fold_build_pointer_plus (addr_base_a, offset_a);
-      addr_base_b = fold_build_pointer_plus (addr_base_b, offset_b);
 
       if (dump_enabled_p ())
 	{
@@ -2329,32 +2481,8 @@ vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr)
 	  dump_printf (MSG_NOTE, "\n");
 	}
 
-      tree seg_a_min = addr_base_a;
-      tree seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
-      /* For negative step, we need to adjust address range by TYPE_SIZE_UNIT
-	 bytes, e.g., int a[3] -> a[1] range is [a+4, a+16) instead of
-	 [a, a+12) */
-      if (tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0)
-	{
-	  tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr_a.dr)));
-	  seg_a_min = fold_build_pointer_plus (seg_a_max, unit_size);
-	  seg_a_max = fold_build_pointer_plus (addr_base_a, unit_size);
-	}
-
-      tree seg_b_min = addr_base_b;
-      tree seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
-      if (tree_int_cst_compare (DR_STEP (dr_b.dr), size_zero_node) < 0)
-	{
-	  tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr_b.dr)));
-	  seg_b_min = fold_build_pointer_plus (seg_b_max, unit_size);
-	  seg_b_max = fold_build_pointer_plus (addr_base_b, unit_size);
-	}
-
-      part_cond_expr =
-      	fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
-	  fold_build2 (LE_EXPR, boolean_type_node, seg_a_max, seg_b_min),
-	  fold_build2 (LE_EXPR, boolean_type_node, seg_b_max, seg_a_min));
-
+      /* Create condition expression for each pair data references.  */
+      create_intersect_range_checks (loop_vinfo, &part_cond_expr, dr_a, dr_b);
       if (*cond_expr)
 	*cond_expr = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
 				  *cond_expr, part_cond_expr);

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-20 15:30 [PATCH GCC][v2]Simplify alias check code generation in vectorizer Bin Cheng
@ 2016-09-21 14:21 ` Richard Biener
  2016-09-22 10:06   ` Bin.Cheng
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-09-21 14:21 UTC (permalink / raw)
  To: Bin Cheng; +Cc: gcc-patches, nd

On Tue, Sep 20, 2016 at 5:25 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> I originally posted a patch improving code generation for alias check in vectorizer at https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00929.html.  Here it's the 2nd version (better) patch.  It detects data reference pair in which the two references are only different to each other wrto index.  In this case the patch generates intersect range checks based on index of address of reference, rather than the address of reference.  Take example from patch's comment, given below two data references:
>
>                        DR_A                           DR_B
>       data-ref         arr[i]                         arr[j]
>       base_object      arr                            arr
>       index            {i_0, +, 1}_loop               {j_0, +, 1}_loop
>
> The addresses and their index can be depicted as:
>
>         |<- ADDR_A    ->|          |<- ADDR_B    ->|
>      ------------------------------------------------------->
>         |   |   |   |   |          |   |   |   |   |
>      ------------------------------------------------------->
>         i_0 ...         i_0+4      j_0 ...         j_0+4
>
> We can create expression based on index rather than address:  (i_0 + 4 < j_0 || j_0 + 4 < i_0).
>
> Also take example from spec2k/200.sixtrack/DACOP, gcc needs to generate alias check for three pairs:
> //pair 1
> dr_a:
> (Data Ref:
>   bb: 8
>   stmt: _92 = da.cc[_27];
>   ref: da.cc[_27];
> )
> dr_b:
> (Data Ref:
>   bb: 8
>   stmt: da.cc[_93] = _92;
>   ref: da.cc[_93];
> )
> //pair 2
> dr_a:
> (Data Ref:
>   bb: 8
>   stmt: pretmp_29 = da.i2[_27];
>   ref: da.i2[_27];
> )
> dr_b:
> (Data Ref:
>   bb: 8
>   stmt: da.i2[_93] = pretmp_29;
>   ref: da.i2[_93];
> )
> //pair 3
> dr_a:
> (Data Ref:
>   bb: 8
>   stmt: pretmp_28 = da.i1[_27];
>   ref: da.i1[_27];
> )
> dr_b:
> (Data Ref:
>   bb: 8
>   stmt: da.i1[_93] = pretmp_28;
>   ref: da.i1[_93];
> )
>
> With this patch, code can be improved to:
>
>   <bb 7>:
>   _37 = (unsigned int) _6;
>   _106 = (unsigned int) _52;
>   _107 = _37 - _106;
>   _108 = _107 > 7;
>   _109 = (integer(kind=8)) _2;
>   _110 = _109 + 3;
>   _111 = (integer(kind=8)) _52;
>   _112 = _111 + -1;
>   _113 = _110 < _112;
>   _114 = (integer(kind=8)) _52;
>   _115 = _114 + 2;
>   _116 = (integer(kind=8)) _2;
>   _117 = _115 < _116;
>   _118 = _113 | _117;
>   _119 = _108 & _118;
>   _120 = (integer(kind=8)) _2;
>   _121 = _120 + 3;
>   _122 = (integer(kind=8)) _52;
>   _123 = _122 + -1;
>   _124 = _121 < _123;
>   _125 = (integer(kind=8)) _52;
>   _126 = _125 + 2;
>   _127 = (integer(kind=8)) _2;
>   _128 = _126 < _127;
>   _129 = _124 | _128;
>   _130 = _119 & _129;
>   _131 = (integer(kind=8)) _2;
>   _132 = _131 + 3;
>   _133 = (integer(kind=8)) _52;
>   _134 = _133 + -1;
>   _135 = _132 < _134;
>   _136 = (integer(kind=8)) _52;
>   _137 = _136 + 2;
>   _138 = (integer(kind=8)) _2;
>   _139 = _137 < _138;
>   _140 = _135 | _139;
>   _141 = _130 & _140;
>   if (_141 != 0)
>     goto <bb 8>;
>   else
>     goto <bb 20>;
>
> Note this patch doesn't do local CSE, and common sub-expressions will be optimized by later passes.  I think this is OK because the redundancy is far away from loops thus won't have bad effect (there is an opposite example/PR).
> Bootstrap and test on x86_64 and AArch64, is it OK?

Seeing

+  /* Infer index length from segment length.  */
+  unsigned HOST_WIDE_INT idx_len1 = (seg_len1 + abs_step - 1) / abs_step;
+  unsigned HOST_WIDE_INT idx_len2 = (seg_len2 + abs_step - 1) / abs_step;

Does this really work for DR_STEP that is bigger than 1 in index space?  The
segment length is generally vectorization-factor * DR_STEP but index space
is -- if you are talking about array-ref indexes -- the segment length divided
by the array element size.  Note that what the index space refers to for
a given DR_ACCESS_FN depends on whether this is from an ARRAY_REF
(element size) or the base pointer evolution (bytes).  I suppose looking at
the access functions type might distinguish the pointer vs. array index case.

What I'm looking for is a big comment on why the above "translation" is
correct.

Thanks,
Richard.

> Thanks,
> bin
>
> 2016-09-19  Bin Cheng  <bin.cheng@arm.com>
>
>         * tree-vect-loop-manip.c (create_intersect_range_checks_index): New.
>         (create_intersect_range_checks): New.
>         (vect_create_cond_for_alias_checks): Call above function.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-21 14:21 ` Richard Biener
@ 2016-09-22 10:06   ` Bin.Cheng
  2016-09-22 10:28     ` Richard Biener
                       ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Bin.Cheng @ 2016-09-22 10:06 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 5552 bytes --]

On Wed, Sep 21, 2016 at 3:01 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Sep 20, 2016 at 5:25 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>> Hi,
>> I originally posted a patch improving code generation for alias check in vectorizer at https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00929.html.  Here it's the 2nd version (better) patch.  It detects data reference pair in which the two references are only different to each other wrto index.  In this case the patch generates intersect range checks based on index of address of reference, rather than the address of reference.  Take example from patch's comment, given below two data references:
>>
>>                        DR_A                           DR_B
>>       data-ref         arr[i]                         arr[j]
>>       base_object      arr                            arr
>>       index            {i_0, +, 1}_loop               {j_0, +, 1}_loop
>>
>> The addresses and their index can be depicted as:
>>
>>         |<- ADDR_A    ->|          |<- ADDR_B    ->|
>>      ------------------------------------------------------->
>>         |   |   |   |   |          |   |   |   |   |
>>      ------------------------------------------------------->
>>         i_0 ...         i_0+4      j_0 ...         j_0+4
>>
>> We can create expression based on index rather than address:  (i_0 + 4 < j_0 || j_0 + 4 < i_0).
>>
>> Also take example from spec2k/200.sixtrack/DACOP, gcc needs to generate alias check for three pairs:
>> //pair 1
>> dr_a:
>> (Data Ref:
>>   bb: 8
>>   stmt: _92 = da.cc[_27];
>>   ref: da.cc[_27];
>> )
>> dr_b:
>> (Data Ref:
>>   bb: 8
>>   stmt: da.cc[_93] = _92;
>>   ref: da.cc[_93];
>> )
>> //pair 2
>> dr_a:
>> (Data Ref:
>>   bb: 8
>>   stmt: pretmp_29 = da.i2[_27];
>>   ref: da.i2[_27];
>> )
>> dr_b:
>> (Data Ref:
>>   bb: 8
>>   stmt: da.i2[_93] = pretmp_29;
>>   ref: da.i2[_93];
>> )
>> //pair 3
>> dr_a:
>> (Data Ref:
>>   bb: 8
>>   stmt: pretmp_28 = da.i1[_27];
>>   ref: da.i1[_27];
>> )
>> dr_b:
>> (Data Ref:
>>   bb: 8
>>   stmt: da.i1[_93] = pretmp_28;
>>   ref: da.i1[_93];
>> )
>>
>> With this patch, code can be improved to:
>>
>>   <bb 7>:
>>   _37 = (unsigned int) _6;
>>   _106 = (unsigned int) _52;
>>   _107 = _37 - _106;
>>   _108 = _107 > 7;
>>   _109 = (integer(kind=8)) _2;
>>   _110 = _109 + 3;
>>   _111 = (integer(kind=8)) _52;
>>   _112 = _111 + -1;
>>   _113 = _110 < _112;
>>   _114 = (integer(kind=8)) _52;
>>   _115 = _114 + 2;
>>   _116 = (integer(kind=8)) _2;
>>   _117 = _115 < _116;
>>   _118 = _113 | _117;
>>   _119 = _108 & _118;
>>   _120 = (integer(kind=8)) _2;
>>   _121 = _120 + 3;
>>   _122 = (integer(kind=8)) _52;
>>   _123 = _122 + -1;
>>   _124 = _121 < _123;
>>   _125 = (integer(kind=8)) _52;
>>   _126 = _125 + 2;
>>   _127 = (integer(kind=8)) _2;
>>   _128 = _126 < _127;
>>   _129 = _124 | _128;
>>   _130 = _119 & _129;
>>   _131 = (integer(kind=8)) _2;
>>   _132 = _131 + 3;
>>   _133 = (integer(kind=8)) _52;
>>   _134 = _133 + -1;
>>   _135 = _132 < _134;
>>   _136 = (integer(kind=8)) _52;
>>   _137 = _136 + 2;
>>   _138 = (integer(kind=8)) _2;
>>   _139 = _137 < _138;
>>   _140 = _135 | _139;
>>   _141 = _130 & _140;
>>   if (_141 != 0)
>>     goto <bb 8>;
>>   else
>>     goto <bb 20>;
>>
>> Note this patch doesn't do local CSE, and common sub-expressions will be optimized by later passes.  I think this is OK because the redundancy is far away from loops thus won't have bad effect (there is an opposite example/PR).
>> Bootstrap and test on x86_64 and AArch64, is it OK?
>
> Seeing
>
> +  /* Infer index length from segment length.  */
> +  unsigned HOST_WIDE_INT idx_len1 = (seg_len1 + abs_step - 1) / abs_step;
> +  unsigned HOST_WIDE_INT idx_len2 = (seg_len2 + abs_step - 1) / abs_step;
>
> Does this really work for DR_STEP that is bigger than 1 in index space?  The
I think it works.  Since both segment length of memory accessed and
index length evaluated are induction variable wrto loop iterations,
they are linear functions of the niters, with DR_STEP and IDX_STEP as
their coefficient, like
  segment_length = niters * DR_STEP
  index_length = niters * IDX_STEP
The affine relation between segment_length and index_length is
guaranteed by SCEV, otherwise, the loop won't be eligible for
vectorization.  So I think it's equal between check segment and index
here.
Follow your reviews, I did find a problem in the original patch
because I forgot to multiply niters with IDX_STEP in computing
index_length.  This is fixed now.  I also added explaining comment.

> segment length is generally vectorization-factor * DR_STEP but index space
> is -- if you are talking about array-ref indexes -- the segment length divided
> by the array element size.  Note that what the index space refers to for
> a given DR_ACCESS_FN depends on whether this is from an ARRAY_REF
> (element size) or the base pointer evolution (bytes).  I suppose looking at
> the access functions type might distinguish the pointer vs. array index case.
? Looking at access functions is the reason we can't distinguish
between pointer and array reference because their access function are
totally different.  On the contrary, we need to look into (innermost)
evolution behavior in this case I think?

>
> What I'm looking for is a big comment on why the above "translation" is
> correct.
Comments added.  Bootstrap and test, is it reasonable?

Thanks,
bin

[-- Attachment #2: check-alias-on-index-20160921.txt --]
[-- Type: text/plain, Size: 11397 bytes --]

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 01d6bb1..8203040 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -2263,6 +2263,194 @@ vect_create_cond_for_align_checks (loop_vec_info loop_vinfo,
     *cond_expr = part_cond_expr;
 }
 
+/* Given two data references and segment lengths described by DR_A and DR_B,
+   create expression checking if the two addresses ranges intersect with
+   each other based on index of the two addresses.  This can only be done
+   if DR_A and DR_B referring to the same (array) object and the index is
+   the only difference.  For example:
+
+                       DR_A                           DR_B
+      data-ref         arr[i]                         arr[j]
+      base_object      arr                            arr
+      index            {i_0, +, 1}_loop               {j_0, +, 1}_loop
+
+   The addresses and their index are like:
+
+        |<- ADDR_A    ->|          |<- ADDR_B    ->|
+     ------------------------------------------------------->
+        |   |   |   |   |          |   |   |   |   |
+     ------------------------------------------------------->
+        i_0 ...         i_0+4      j_0 ...         j_0+4
+
+   We can create expression based on index rather than address:
+
+     (i_0 + 4 < j_0 || j_0 + 4 < i_0)
+
+   Note evolution step of index needs to be considered in comparison.  */
+
+static bool
+create_intersect_range_checks_index (loop_vec_info loop_vinfo, tree *cond_expr,
+				     const dr_with_seg_len& dr_a,
+				     const dr_with_seg_len& dr_b)
+{
+  if (integer_zerop (DR_STEP (dr_a.dr))
+      || integer_zerop (DR_STEP (dr_b.dr))
+      || DR_NUM_DIMENSIONS (dr_a.dr) != DR_NUM_DIMENSIONS (dr_b.dr))
+    return false;
+
+  if (!tree_fits_uhwi_p (dr_a.seg_len) || !tree_fits_uhwi_p (dr_b.seg_len))
+    return false;
+
+  if (!operand_equal_p (DR_BASE_OBJECT (dr_a.dr), DR_BASE_OBJECT (dr_b.dr), 0))
+    return false;
+
+  if (!operand_equal_p (DR_STEP (dr_a.dr), DR_STEP (dr_b.dr), 0))
+    return false;
+
+  gcc_assert (TREE_CODE (DR_STEP (dr_a.dr)) == INTEGER_CST);
+
+  bool neg_step = tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0;
+  unsigned HOST_WIDE_INT abs_step = tree_to_uhwi (DR_STEP (dr_a.dr));
+  if (neg_step)
+    abs_step = -abs_step;
+
+  unsigned HOST_WIDE_INT seg_len1 = tree_to_uhwi (dr_a.seg_len);
+  unsigned HOST_WIDE_INT seg_len2 = tree_to_uhwi (dr_b.seg_len);
+  /* Infer the number of iterations with which the memory segment is accessed
+     by DR.  In other words, alias is checked if memory segment accessed by
+     DR_A in some iterations intersect with memory segment accessed by DR_B
+     in the same amount iterations.
+     Note segnment length is a linear function of number of iterations with
+     DR_STEP as the coefficient.  */
+  unsigned HOST_WIDE_INT niter_len1 = (seg_len1 + abs_step - 1) / abs_step;
+  unsigned HOST_WIDE_INT niter_len2 = (seg_len2 + abs_step - 1) / abs_step;
+
+  unsigned int i;
+  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+  for (i = 0; i < DR_NUM_DIMENSIONS (dr_a.dr); i++)
+    {
+      tree access1 = DR_ACCESS_FN (dr_a.dr, i);
+      tree access2 = DR_ACCESS_FN (dr_b.dr, i);
+      /* Two index must be the same if they are not scev, or not scev wrto
+	 current loop being vecorized.  */
+      if (TREE_CODE (access1) != POLYNOMIAL_CHREC
+	  || TREE_CODE (access2) != POLYNOMIAL_CHREC
+	  || CHREC_VARIABLE (access1) != (unsigned)loop->num
+	  || CHREC_VARIABLE (access2) != (unsigned)loop->num)
+	{
+	  if (operand_equal_p (access1, access2, 0))
+	    continue;
+
+	  return false;
+	}
+      /* Two index must have the same step.  */
+      if (!operand_equal_p (CHREC_RIGHT (access1), CHREC_RIGHT (access2), 0))
+	return false;
+
+      tree idx_step = CHREC_RIGHT (access1);
+      /* Index must have const step, otherwise DR_STEP won't be constant.  */
+      gcc_assert (TREE_CODE (idx_step) == INTEGER_CST);
+      /* Index must evaluate in the same direction as DR.  */
+      gcc_assert (!neg_step
+		  || tree_int_cst_compare (idx_step, size_zero_node) < 0);
+
+      tree min1 = CHREC_LEFT (access1);
+      tree min2 = CHREC_LEFT (access2);
+      if (!types_compatible_p (TREE_TYPE (min1), TREE_TYPE (min2)))
+	return false;
+
+      /* Ideally, alias can be checked against loop's control IV, but we
+	 need to prove linear mapping between control IV and reference
+	 index.  Although that should be true, we check against (array)
+	 index of data reference.  Like segment length, index length is
+	 linear function of the number of iterations with index_step as
+	 the coefficient, i.e, niter_len * idx_step.  */
+      tree idx_len1 = fold_build2 (MULT_EXPR, TREE_TYPE (min1), idx_step,
+				   build_int_cst (TREE_TYPE (min1),
+						  niter_len1));
+      tree idx_len2 = fold_build2 (MULT_EXPR, TREE_TYPE (min2), idx_step,
+				   build_int_cst (TREE_TYPE (min2),
+						  niter_len2));
+      tree max1 = fold_build2 (PLUS_EXPR, TREE_TYPE (min1), min1, idx_len1);
+      tree max2 = fold_build2 (PLUS_EXPR, TREE_TYPE (min2), min2, idx_len2);
+      /* Adjust ranges for negative step.  */
+      if (neg_step)
+	{
+	  min1 = fold_build2 (MINUS_EXPR, TREE_TYPE (min1), max1, idx_step);
+	  max1 = fold_build2 (MINUS_EXPR, TREE_TYPE (min1),
+			      CHREC_LEFT (access1), idx_step);
+	  min2 = fold_build2 (MINUS_EXPR, TREE_TYPE (min2), max2, idx_step);
+	  max2 = fold_build2 (MINUS_EXPR, TREE_TYPE (min2),
+			      CHREC_LEFT (access2), idx_step);
+	}
+      tree part_cond_expr
+	= fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
+	    fold_build2 (LE_EXPR, boolean_type_node, max1, min2),
+	    fold_build2 (LE_EXPR, boolean_type_node, max2, min1));
+      if (*cond_expr)
+	*cond_expr = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
+				  *cond_expr, part_cond_expr);
+      else
+	*cond_expr = part_cond_expr;
+    }
+  return true;
+}
+
+/* Given two data references and segment lengths described by DR_A and DR_B,
+   create expression checking if the two addresses ranges intersect with
+   each other:
+
+     ((DR_A_addr_0 + DR_A_segment_length_0) <= DR_B_addr_0)
+     || (DR_B_addr_0 + DER_B_segment_length_0) <= DR_A_addr_0))  */
+
+static void
+create_intersect_range_checks (loop_vec_info loop_vinfo, tree *cond_expr,
+			       const dr_with_seg_len& dr_a,
+			       const dr_with_seg_len& dr_b)
+{
+  *cond_expr = NULL_TREE;
+  if (create_intersect_range_checks_index (loop_vinfo, cond_expr, dr_a, dr_b))
+    return;
+
+  tree segment_length_a = dr_a.seg_len;
+  tree segment_length_b = dr_b.seg_len;
+  tree addr_base_a = DR_BASE_ADDRESS (dr_a.dr);
+  tree addr_base_b = DR_BASE_ADDRESS (dr_b.dr);
+  tree offset_a = DR_OFFSET (dr_a.dr), offset_b = DR_OFFSET (dr_b.dr);
+
+  offset_a = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_a),
+			  offset_a, DR_INIT (dr_a.dr));
+  offset_b = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_b),
+			  offset_b, DR_INIT (dr_b.dr));
+  addr_base_a = fold_build_pointer_plus (addr_base_a, offset_a);
+  addr_base_b = fold_build_pointer_plus (addr_base_b, offset_b);
+
+  tree seg_a_min = addr_base_a;
+  tree seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
+  /* For negative step, we need to adjust address range by TYPE_SIZE_UNIT
+     bytes, e.g., int a[3] -> a[1] range is [a+4, a+16) instead of
+     [a, a+12) */
+  if (tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0)
+    {
+      tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr_a.dr)));
+      seg_a_min = fold_build_pointer_plus (seg_a_max, unit_size);
+      seg_a_max = fold_build_pointer_plus (addr_base_a, unit_size);
+    }
+
+  tree seg_b_min = addr_base_b;
+  tree seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
+  if (tree_int_cst_compare (DR_STEP (dr_b.dr), size_zero_node) < 0)
+    {
+      tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr_b.dr)));
+      seg_b_min = fold_build_pointer_plus (seg_b_max, unit_size);
+      seg_b_max = fold_build_pointer_plus (addr_base_b, unit_size);
+    }
+  *cond_expr
+    = fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
+	fold_build2 (LE_EXPR, boolean_type_node, seg_a_max, seg_b_min),
+	fold_build2 (LE_EXPR, boolean_type_node, seg_b_max, seg_a_min));
+}
+
 /* Function vect_create_cond_for_alias_checks.
 
    Create a conditional expression that represents the run-time checks for
@@ -2290,15 +2478,6 @@ vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr)
     LOOP_VINFO_COMP_ALIAS_DDRS (loop_vinfo);
   tree part_cond_expr;
 
-  /* Create expression
-     ((store_ptr_0 + store_segment_length_0) <= load_ptr_0)
-     || (load_ptr_0 + load_segment_length_0) <= store_ptr_0))
-     &&
-     ...
-     &&
-     ((store_ptr_n + store_segment_length_n) <= load_ptr_n)
-     || (load_ptr_n + load_segment_length_n) <= store_ptr_n))  */
-
   if (comp_alias_ddrs.is_empty ())
     return;
 
@@ -2306,18 +2485,6 @@ vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr)
     {
       const dr_with_seg_len& dr_a = comp_alias_ddrs[i].first;
       const dr_with_seg_len& dr_b = comp_alias_ddrs[i].second;
-      tree segment_length_a = dr_a.seg_len;
-      tree segment_length_b = dr_b.seg_len;
-      tree addr_base_a = DR_BASE_ADDRESS (dr_a.dr);
-      tree addr_base_b = DR_BASE_ADDRESS (dr_b.dr);
-      tree offset_a = DR_OFFSET (dr_a.dr), offset_b = DR_OFFSET (dr_b.dr);
-
-      offset_a = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_a),
-			      offset_a, DR_INIT (dr_a.dr));
-      offset_b = fold_build2 (PLUS_EXPR, TREE_TYPE (offset_b),
-			      offset_b, DR_INIT (dr_b.dr));
-      addr_base_a = fold_build_pointer_plus (addr_base_a, offset_a);
-      addr_base_b = fold_build_pointer_plus (addr_base_b, offset_b);
 
       if (dump_enabled_p ())
 	{
@@ -2329,32 +2496,8 @@ vect_create_cond_for_alias_checks (loop_vec_info loop_vinfo, tree * cond_expr)
 	  dump_printf (MSG_NOTE, "\n");
 	}
 
-      tree seg_a_min = addr_base_a;
-      tree seg_a_max = fold_build_pointer_plus (addr_base_a, segment_length_a);
-      /* For negative step, we need to adjust address range by TYPE_SIZE_UNIT
-	 bytes, e.g., int a[3] -> a[1] range is [a+4, a+16) instead of
-	 [a, a+12) */
-      if (tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0)
-	{
-	  tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr_a.dr)));
-	  seg_a_min = fold_build_pointer_plus (seg_a_max, unit_size);
-	  seg_a_max = fold_build_pointer_plus (addr_base_a, unit_size);
-	}
-
-      tree seg_b_min = addr_base_b;
-      tree seg_b_max = fold_build_pointer_plus (addr_base_b, segment_length_b);
-      if (tree_int_cst_compare (DR_STEP (dr_b.dr), size_zero_node) < 0)
-	{
-	  tree unit_size = TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (dr_b.dr)));
-	  seg_b_min = fold_build_pointer_plus (seg_b_max, unit_size);
-	  seg_b_max = fold_build_pointer_plus (addr_base_b, unit_size);
-	}
-
-      part_cond_expr =
-      	fold_build2 (TRUTH_OR_EXPR, boolean_type_node,
-	  fold_build2 (LE_EXPR, boolean_type_node, seg_a_max, seg_b_min),
-	  fold_build2 (LE_EXPR, boolean_type_node, seg_b_max, seg_a_min));
-
+      /* Create condition expression for each pair data references.  */
+      create_intersect_range_checks (loop_vinfo, &part_cond_expr, dr_a, dr_b);
       if (*cond_expr)
 	*cond_expr = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
 				  *cond_expr, part_cond_expr);

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-22 10:06   ` Bin.Cheng
@ 2016-09-22 10:28     ` Richard Biener
  2016-09-23 22:12     ` Richard Sandiford
  2016-09-26  9:38     ` Robin Dapp
  2 siblings, 0 replies; 16+ messages in thread
From: Richard Biener @ 2016-09-22 10:28 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: gcc-patches

On Thu, Sep 22, 2016 at 11:43 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Wed, Sep 21, 2016 at 3:01 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, Sep 20, 2016 at 5:25 PM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>>> Hi,
>>> I originally posted a patch improving code generation for alias check in vectorizer at https://gcc.gnu.org/ml/gcc-patches/2016-06/msg00929.html.  Here it's the 2nd version (better) patch.  It detects data reference pair in which the two references are only different to each other wrto index.  In this case the patch generates intersect range checks based on index of address of reference, rather than the address of reference.  Take example from patch's comment, given below two data references:
>>>
>>>                        DR_A                           DR_B
>>>       data-ref         arr[i]                         arr[j]
>>>       base_object      arr                            arr
>>>       index            {i_0, +, 1}_loop               {j_0, +, 1}_loop
>>>
>>> The addresses and their index can be depicted as:
>>>
>>>         |<- ADDR_A    ->|          |<- ADDR_B    ->|
>>>      ------------------------------------------------------->
>>>         |   |   |   |   |          |   |   |   |   |
>>>      ------------------------------------------------------->
>>>         i_0 ...         i_0+4      j_0 ...         j_0+4
>>>
>>> We can create expression based on index rather than address:  (i_0 + 4 < j_0 || j_0 + 4 < i_0).
>>>
>>> Also take example from spec2k/200.sixtrack/DACOP, gcc needs to generate alias check for three pairs:
>>> //pair 1
>>> dr_a:
>>> (Data Ref:
>>>   bb: 8
>>>   stmt: _92 = da.cc[_27];
>>>   ref: da.cc[_27];
>>> )
>>> dr_b:
>>> (Data Ref:
>>>   bb: 8
>>>   stmt: da.cc[_93] = _92;
>>>   ref: da.cc[_93];
>>> )
>>> //pair 2
>>> dr_a:
>>> (Data Ref:
>>>   bb: 8
>>>   stmt: pretmp_29 = da.i2[_27];
>>>   ref: da.i2[_27];
>>> )
>>> dr_b:
>>> (Data Ref:
>>>   bb: 8
>>>   stmt: da.i2[_93] = pretmp_29;
>>>   ref: da.i2[_93];
>>> )
>>> //pair 3
>>> dr_a:
>>> (Data Ref:
>>>   bb: 8
>>>   stmt: pretmp_28 = da.i1[_27];
>>>   ref: da.i1[_27];
>>> )
>>> dr_b:
>>> (Data Ref:
>>>   bb: 8
>>>   stmt: da.i1[_93] = pretmp_28;
>>>   ref: da.i1[_93];
>>> )
>>>
>>> With this patch, code can be improved to:
>>>
>>>   <bb 7>:
>>>   _37 = (unsigned int) _6;
>>>   _106 = (unsigned int) _52;
>>>   _107 = _37 - _106;
>>>   _108 = _107 > 7;
>>>   _109 = (integer(kind=8)) _2;
>>>   _110 = _109 + 3;
>>>   _111 = (integer(kind=8)) _52;
>>>   _112 = _111 + -1;
>>>   _113 = _110 < _112;
>>>   _114 = (integer(kind=8)) _52;
>>>   _115 = _114 + 2;
>>>   _116 = (integer(kind=8)) _2;
>>>   _117 = _115 < _116;
>>>   _118 = _113 | _117;
>>>   _119 = _108 & _118;
>>>   _120 = (integer(kind=8)) _2;
>>>   _121 = _120 + 3;
>>>   _122 = (integer(kind=8)) _52;
>>>   _123 = _122 + -1;
>>>   _124 = _121 < _123;
>>>   _125 = (integer(kind=8)) _52;
>>>   _126 = _125 + 2;
>>>   _127 = (integer(kind=8)) _2;
>>>   _128 = _126 < _127;
>>>   _129 = _124 | _128;
>>>   _130 = _119 & _129;
>>>   _131 = (integer(kind=8)) _2;
>>>   _132 = _131 + 3;
>>>   _133 = (integer(kind=8)) _52;
>>>   _134 = _133 + -1;
>>>   _135 = _132 < _134;
>>>   _136 = (integer(kind=8)) _52;
>>>   _137 = _136 + 2;
>>>   _138 = (integer(kind=8)) _2;
>>>   _139 = _137 < _138;
>>>   _140 = _135 | _139;
>>>   _141 = _130 & _140;
>>>   if (_141 != 0)
>>>     goto <bb 8>;
>>>   else
>>>     goto <bb 20>;
>>>
>>> Note this patch doesn't do local CSE, and common sub-expressions will be optimized by later passes.  I think this is OK because the redundancy is far away from loops thus won't have bad effect (there is an opposite example/PR).
>>> Bootstrap and test on x86_64 and AArch64, is it OK?
>>
>> Seeing
>>
>> +  /* Infer index length from segment length.  */
>> +  unsigned HOST_WIDE_INT idx_len1 = (seg_len1 + abs_step - 1) / abs_step;
>> +  unsigned HOST_WIDE_INT idx_len2 = (seg_len2 + abs_step - 1) / abs_step;
>>
>> Does this really work for DR_STEP that is bigger than 1 in index space?  The
> I think it works.  Since both segment length of memory accessed and
> index length evaluated are induction variable wrto loop iterations,
> they are linear functions of the niters, with DR_STEP and IDX_STEP as
> their coefficient, like
>   segment_length = niters * DR_STEP
>   index_length = niters * IDX_STEP
> The affine relation between segment_length and index_length is
> guaranteed by SCEV, otherwise, the loop won't be eligible for
> vectorization.  So I think it's equal between check segment and index
> here.
> Follow your reviews, I did find a problem in the original patch
> because I forgot to multiply niters with IDX_STEP in computing
> index_length.  This is fixed now.  I also added explaining comment.

Ok, now it makes sense.

>> segment length is generally vectorization-factor * DR_STEP but index space
>> is -- if you are talking about array-ref indexes -- the segment length divided
>> by the array element size.  Note that what the index space refers to for
>> a given DR_ACCESS_FN depends on whether this is from an ARRAY_REF
>> (element size) or the base pointer evolution (bytes).  I suppose looking at
>> the access functions type might distinguish the pointer vs. array index case.
> ? Looking at access functions is the reason we can't distinguish
> between pointer and array reference because their access function are
> totally different.  On the contrary, we need to look into (innermost)
> evolution behavior in this case I think?

My comment is moot as you translate into niter space which you can of course
do for both pointer and index IVs.

Note that we do have access functions for pointer IVs.  Having the same
base makes sure access functions are "compatible" for two DRs.

>>
>> What I'm looking for is a big comment on why the above "translation" is
>> correct.
> Comments added.  Bootstrap and test, is it reasonable?

Yes.

Thanks,
Richard.

> Thanks,
> bin

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-22 10:06   ` Bin.Cheng
  2016-09-22 10:28     ` Richard Biener
@ 2016-09-23 22:12     ` Richard Sandiford
  2016-09-26  9:38     ` Robin Dapp
  2 siblings, 0 replies; 16+ messages in thread
From: Richard Sandiford @ 2016-09-23 22:12 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Richard Biener, gcc-patches

"Bin.Cheng" <amker.cheng@gmail.com> writes:
> +  gcc_assert (TREE_CODE (DR_STEP (dr_a.dr)) == INTEGER_CST);
> +
> +  bool neg_step = tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0;
> +  unsigned HOST_WIDE_INT abs_step = tree_to_uhwi (DR_STEP (dr_a.dr));
> +  if (neg_step)
> +    abs_step = -abs_step;

I might be wrong, but it looks like this would assert for negative steps,
since the tree passed to tree_to_uhwi would be out of range.

Thansk
Richard

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-22 10:06   ` Bin.Cheng
  2016-09-22 10:28     ` Richard Biener
  2016-09-23 22:12     ` Richard Sandiford
@ 2016-09-26  9:38     ` Robin Dapp
  2016-09-26 11:17       ` Richard Biener
  2016-09-26 12:52       ` Bin.Cheng
  2 siblings, 2 replies; 16+ messages in thread
From: Robin Dapp @ 2016-09-26  9:38 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 494 bytes --]

> Comments added.  Bootstrap and test, is it reasonable?

This introduces an ICE on s390x with a Fortran testcase, because
of an assertion failure in tree_to_uhwi (DR_STEP (dr_a.dr)) for
DR_STEP (dr_a.dr) = -8(OVF).

The attached patch fixes this by introducing an additional
tree_fits_uhwi_p().

ok to commit?

Regards
 Robin

--

gcc/ChangeLog:

2016-09-26  Robin Dapp  <rdapp@linux.vnet.ibm.com>

	* tree-vect-loop-manip.c (create_intersect_range_checks_index):
	Add tree_fits_uhwi_p check.

[-- Attachment #2: gcc-tree-vect-loop-manip-fix.diff --]
[-- Type: text/x-patch, Size: 898 bytes --]

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 8203040..8be0c17 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -2301,6 +2301,9 @@ create_intersect_range_checks_index (loop_vec_info loop_vinfo, tree *cond_expr,
   if (!tree_fits_uhwi_p (dr_a.seg_len) || !tree_fits_uhwi_p (dr_b.seg_len))
     return false;
 
+  if (!tree_fits_uhwi_p (DR_STEP (dr_a.dr)))
+    return false;
+
   if (!operand_equal_p (DR_BASE_OBJECT (dr_a.dr), DR_BASE_OBJECT (dr_b.dr), 0))
     return false;
 
@@ -2310,6 +2313,7 @@ create_intersect_range_checks_index (loop_vec_info loop_vinfo, tree *cond_expr,
   gcc_assert (TREE_CODE (DR_STEP (dr_a.dr)) == INTEGER_CST);
 
   bool neg_step = tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0;
+
   unsigned HOST_WIDE_INT abs_step = tree_to_uhwi (DR_STEP (dr_a.dr));
   if (neg_step)
     abs_step = -abs_step;

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-26  9:38     ` Robin Dapp
@ 2016-09-26 11:17       ` Richard Biener
  2016-09-26 13:03         ` Markus Trippelsdorf
  2016-09-26 12:52       ` Bin.Cheng
  1 sibling, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-09-26 11:17 UTC (permalink / raw)
  To: Robin Dapp; +Cc: Bin.Cheng, gcc-patches

On Mon, Sep 26, 2016 at 11:21 AM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote:
>> Comments added.  Bootstrap and test, is it reasonable?
>
> This introduces an ICE on s390x with a Fortran testcase, because
> of an assertion failure in tree_to_uhwi (DR_STEP (dr_a.dr)) for
> DR_STEP (dr_a.dr) = -8(OVF).
>
> The attached patch fixes this by introducing an additional
> tree_fits_uhwi_p().
>
> ok to commit?

no, this disables handling of negative step.  I think it should
check for tree_fits_shwi_p () instead and adjust the code to do

  unsigned HOST_WIDE_INT abs_step = absu_hwi (tree_to_shwi (DR_STEP (dr_a.dr)));

Richard.

> Regards
>  Robin
>
> --
>
> gcc/ChangeLog:
>
> 2016-09-26  Robin Dapp  <rdapp@linux.vnet.ibm.com>
>
>         * tree-vect-loop-manip.c (create_intersect_range_checks_index):
>         Add tree_fits_uhwi_p check.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-26  9:38     ` Robin Dapp
  2016-09-26 11:17       ` Richard Biener
@ 2016-09-26 12:52       ` Bin.Cheng
  1 sibling, 0 replies; 16+ messages in thread
From: Bin.Cheng @ 2016-09-26 12:52 UTC (permalink / raw)
  To: Robin Dapp; +Cc: gcc-patches

On Mon, Sep 26, 2016 at 10:21 AM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote:
>> Comments added.  Bootstrap and test, is it reasonable?
>
> This introduces an ICE on s390x with a Fortran testcase, because
> of an assertion failure in tree_to_uhwi (DR_STEP (dr_a.dr)) for
> DR_STEP (dr_a.dr) = -8(OVF).
>
> The attached patch fixes this by introducing an additional
> tree_fits_uhwi_p().
>
> ok to commit?
Hi Robin,
Unfortunately I won't have code access now, could you please work out
a patch as suggested by Richard?  Thanks very much for fixing this.

Thanks,
bin
>
> Regards
>  Robin
>
> --
>
> gcc/ChangeLog:
>
> 2016-09-26  Robin Dapp  <rdapp@linux.vnet.ibm.com>
>
>         * tree-vect-loop-manip.c (create_intersect_range_checks_index):
>         Add tree_fits_uhwi_p check.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-26 11:17       ` Richard Biener
@ 2016-09-26 13:03         ` Markus Trippelsdorf
  2016-09-26 15:43           ` Robin Dapp
  0 siblings, 1 reply; 16+ messages in thread
From: Markus Trippelsdorf @ 2016-09-26 13:03 UTC (permalink / raw)
  To: Richard Biener; +Cc: Robin Dapp, Bin.Cheng, gcc-patches

On 2016.09.26 at 13:14 +0200, Richard Biener wrote:
> On Mon, Sep 26, 2016 at 11:21 AM, Robin Dapp <rdapp@linux.vnet.ibm.com> wrote:
> >> Comments added.  Bootstrap and test, is it reasonable?
> >
> > This introduces an ICE on s390x with a Fortran testcase, because
> > of an assertion failure in tree_to_uhwi (DR_STEP (dr_a.dr)) for
> > DR_STEP (dr_a.dr) = -8(OVF).
> >
> > The attached patch fixes this by introducing an additional
> > tree_fits_uhwi_p().
> >
> > ok to commit?
> 
> no, this disables handling of negative step.  I think it should
> check for tree_fits_shwi_p () instead and adjust the code to do
> 
>   unsigned HOST_WIDE_INT abs_step = absu_hwi (tree_to_shwi (DR_STEP (dr_a.dr)));
> 

And also please mention https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77724 and perhaps
add its testcase, too.

-- 
Markus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-26 13:03         ` Markus Trippelsdorf
@ 2016-09-26 15:43           ` Robin Dapp
  0 siblings, 0 replies; 16+ messages in thread
From: Robin Dapp @ 2016-09-26 15:43 UTC (permalink / raw)
  To: Markus Trippelsdorf, Richard Biener; +Cc: Bin.Cheng, gcc-patches

[-- Attachment #1: Type: text/plain, Size: 695 bytes --]

> And also please mention https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77724 and perhaps
> add its testcase, too.
> 

Patch according to Richard's suggestion, included test case. The test
case works unpatched on x86-64 but fails on s390. Is gcc.dg/vect the
proper place for it? (I didn't manage to run it independently in this
directory via RUNTESTFLAGS=vect.exp=... or otherwise)

Bootstrapped on x86 and s390.

--

gcc/ChangeLog:

2016-09-26  Robin Dapp  <rdapp@linux.vnet.ibm.com>

	* tree-vect-loop-manip.c (create_intersect_range_checks_index):
	Add tree_fits_shwi_p check.


gcc/testsuite/ChangeLog:

2016-09-26  Robin Dapp  <rdapp@linux.vnet.ibm.com>

	* gcc.dg/vect/pr77724.c: New test.

[-- Attachment #2: gcc-tree-vect-loop-manip-fix.diff --]
[-- Type: text/x-patch, Size: 1449 bytes --]

diff --git a/gcc/testsuite/gcc.dg/vect/pr77724.c b/gcc/testsuite/gcc.dg/vect/pr77724.c
new file mode 100644
index 0000000..3039b5a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr77724.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" { target { { i?86-*-* x86_64-*-* } && ia32 } } } */
+/* { dg-options "-O3 -march=z13" { target s390*-*-* } } */
+
+int a[81];
+int b, c;
+
+void
+fn1()
+{
+  int d = b;
+  for (; c; --c)
+    a[c + d] = a[c];
+}
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 8203040..147a90f 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -2301,6 +2301,9 @@ create_intersect_range_checks_index (loop_vec_info loop_vinfo, tree *cond_expr,
   if (!tree_fits_uhwi_p (dr_a.seg_len) || !tree_fits_uhwi_p (dr_b.seg_len))
     return false;
 
+  if (!tree_fits_shwi_p (DR_STEP (dr_a.dr)))
+    return false;
+
   if (!operand_equal_p (DR_BASE_OBJECT (dr_a.dr), DR_BASE_OBJECT (dr_b.dr), 0))
     return false;
 
@@ -2310,7 +2313,8 @@ create_intersect_range_checks_index (loop_vec_info loop_vinfo, tree *cond_expr,
   gcc_assert (TREE_CODE (DR_STEP (dr_a.dr)) == INTEGER_CST);
 
   bool neg_step = tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0;
-  unsigned HOST_WIDE_INT abs_step = tree_to_uhwi (DR_STEP (dr_a.dr));
+  unsigned HOST_WIDE_INT abs_step =
+    absu_hwi (tree_to_shwi (DR_STEP (dr_a.dr)));
   if (neg_step)
     abs_step = -abs_step;
 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-27 13:24       ` Richard Biener
@ 2016-09-28  7:32         ` Markus Trippelsdorf
  0 siblings, 0 replies; 16+ messages in thread
From: Markus Trippelsdorf @ 2016-09-28  7:32 UTC (permalink / raw)
  To: Richard Biener
  Cc: Robin Dapp, Richard Biener, Bernd Edlinger, GCC Patches, Bin.Cheng

On 2016.09.27 at 15:22 +0200, Richard Biener wrote:
> On Tue, 27 Sep 2016, Robin Dapp wrote:
> 
> > > Also the '=' in the split line goes to the next line according to
> > > coding conventions.
> > 
> > fixed, I had only looked at an instance one function above which had it
> > wrong as well. Also changed comment grammar slightly.
> 
> Ok.

Since you are not listed in MAINTAINERS I've committed your patch.

-- 
Markus

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-27 13:22     ` Robin Dapp
@ 2016-09-27 13:24       ` Richard Biener
  2016-09-28  7:32         ` Markus Trippelsdorf
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-09-27 13:24 UTC (permalink / raw)
  To: Robin Dapp
  Cc: Richard Biener, Bernd Edlinger, Markus Trippelsdorf, GCC Patches,
	Bin.Cheng

On Tue, 27 Sep 2016, Robin Dapp wrote:

> > Also the '=' in the split line goes to the next line according to
> > coding conventions.
> 
> fixed, I had only looked at an instance one function above which had it
> wrong as well. Also changed comment grammar slightly.

Ok.

Thanks,
Richard.

> 
> Regards
>  Robin
> 
> --
> 
> gcc/ChangeLog:
> 
> 2016-09-27  Robin Dapp  <rdapp@linux.vnet.ibm.com>
> 
> 	* tree-vect-loop-manip.c (create_intersect_range_checks_index):
> 	Add tree_fits_shwi_p check.
> 
> 
> gcc/testsuite/ChangeLog:
> 
> 2016-09-27  Robin Dapp  <rdapp@linux.vnet.ibm.com>
> 
> 	* gcc.dg/vect/pr77724.c: New test.
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-27  7:55   ` Richard Biener
@ 2016-09-27 13:22     ` Robin Dapp
  2016-09-27 13:24       ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Robin Dapp @ 2016-09-27 13:22 UTC (permalink / raw)
  To: Richard Biener, Bernd Edlinger
  Cc: Markus Trippelsdorf, Richard Biener, GCC Patches, Bin.Cheng

[-- Attachment #1: Type: text/plain, Size: 517 bytes --]

> Also the '=' in the split line goes to the next line according to
> coding conventions.

fixed, I had only looked at an instance one function above which had it
wrong as well. Also changed comment grammar slightly.


Regards
 Robin

--

gcc/ChangeLog:

2016-09-27  Robin Dapp  <rdapp@linux.vnet.ibm.com>

	* tree-vect-loop-manip.c (create_intersect_range_checks_index):
	Add tree_fits_shwi_p check.


gcc/testsuite/ChangeLog:

2016-09-27  Robin Dapp  <rdapp@linux.vnet.ibm.com>

	* gcc.dg/vect/pr77724.c: New test.

[-- Attachment #2: gcc-tree-vect-loop-manip-fix.diff --]
[-- Type: text/x-patch, Size: 2269 bytes --]

diff --git a/gcc/testsuite/gcc.dg/vect/pr77724.c b/gcc/testsuite/gcc.dg/vect/pr77724.c
new file mode 100644
index 0000000..b3411d6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr77724.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+
+int a[81];
+int b, c;
+
+void
+fn1()
+{
+  int d = b;
+  for (; c; --c)
+    a[c + d] = a[c];
+}
diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 8203040..a715fd9 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -2301,6 +2301,9 @@ create_intersect_range_checks_index (loop_vec_info loop_vinfo, tree *cond_expr,
   if (!tree_fits_uhwi_p (dr_a.seg_len) || !tree_fits_uhwi_p (dr_b.seg_len))
     return false;
 
+  if (!tree_fits_shwi_p (DR_STEP (dr_a.dr)))
+    return false;
+
   if (!operand_equal_p (DR_BASE_OBJECT (dr_a.dr), DR_BASE_OBJECT (dr_b.dr), 0))
     return false;
 
@@ -2310,9 +2313,8 @@ create_intersect_range_checks_index (loop_vec_info loop_vinfo, tree *cond_expr,
   gcc_assert (TREE_CODE (DR_STEP (dr_a.dr)) == INTEGER_CST);
 
   bool neg_step = tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0;
-  unsigned HOST_WIDE_INT abs_step = tree_to_uhwi (DR_STEP (dr_a.dr));
-  if (neg_step)
-    abs_step = -abs_step;
+  unsigned HOST_WIDE_INT abs_step
+    = absu_hwi (tree_to_shwi (DR_STEP (dr_a.dr)));
 
   unsigned HOST_WIDE_INT seg_len1 = tree_to_uhwi (dr_a.seg_len);
   unsigned HOST_WIDE_INT seg_len2 = tree_to_uhwi (dr_b.seg_len);
@@ -2331,7 +2333,7 @@ create_intersect_range_checks_index (loop_vec_info loop_vinfo, tree *cond_expr,
     {
       tree access1 = DR_ACCESS_FN (dr_a.dr, i);
       tree access2 = DR_ACCESS_FN (dr_b.dr, i);
-      /* Two index must be the same if they are not scev, or not scev wrto
+      /* Two indices must be the same if they are not scev, or not scev wrto
 	 current loop being vecorized.  */
       if (TREE_CODE (access1) != POLYNOMIAL_CHREC
 	  || TREE_CODE (access2) != POLYNOMIAL_CHREC
@@ -2343,7 +2345,7 @@ create_intersect_range_checks_index (loop_vec_info loop_vinfo, tree *cond_expr,
 
 	  return false;
 	}
-      /* Two index must have the same step.  */
+      /* The two indices must have the same step.  */
       if (!operand_equal_p (CHREC_RIGHT (access1), CHREC_RIGHT (access2), 0))
 	return false;
 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-26 18:26 ` Richard Biener
@ 2016-09-27  7:55   ` Richard Biener
  2016-09-27 13:22     ` Robin Dapp
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-09-27  7:55 UTC (permalink / raw)
  To: Bernd Edlinger, Robin Dapp
  Cc: Markus Trippelsdorf, Richard Biener, GCC Patches, Bin.Cheng

On Mon, Sep 26, 2016 at 8:15 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On September 26, 2016 5:46:28 PM GMT+02:00, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:
>>Hi,
>>
>>>@@ -2310,7 +2313,8 @@ create_intersect_range_checks_index
>>(loop_vec_info loop_vinfo, tree *cond_expr,
>>>   gcc_assert (TREE_CODE (DR_STEP (dr_a.dr)) == INTEGER_CST);
>>>
>>>   bool neg_step = tree_int_cst_compare (DR_STEP (dr_a.dr),
>>size_zero_node) < 0;
>>>-  unsigned HOST_WIDE_INT abs_step = tree_to_uhwi (DR_STEP (dr_a.dr));
>>>+  unsigned HOST_WIDE_INT abs_step =
>>>+    absu_hwi (tree_to_shwi (DR_STEP (dr_a.dr)));
>>>   if (neg_step)
>>>     abs_step = -abs_step;
>>
>>Hmmm...
>>
>>but you must remove the neg_step if you do this.
>
> The negation, yes.  Neg_step is used later as well.

Also the '=' in the split line goes to the next line according to
coding conventions.

Richard.

> Richard.
>
>>Right?
>>
>>
>>Bernd.
>
>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
  2016-09-26 15:48 Bernd Edlinger
@ 2016-09-26 18:26 ` Richard Biener
  2016-09-27  7:55   ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2016-09-26 18:26 UTC (permalink / raw)
  To: Bernd Edlinger, Robin Dapp
  Cc: Markus Trippelsdorf, Richard Biener, GCC Patches, Bin.Cheng

On September 26, 2016 5:46:28 PM GMT+02:00, Bernd Edlinger <bernd.edlinger@hotmail.de> wrote:
>Hi,
>
>>@@ -2310,7 +2313,8 @@ create_intersect_range_checks_index
>(loop_vec_info loop_vinfo, tree *cond_expr,
>>   gcc_assert (TREE_CODE (DR_STEP (dr_a.dr)) == INTEGER_CST);
>> 
>>   bool neg_step = tree_int_cst_compare (DR_STEP (dr_a.dr),
>size_zero_node) < 0;
>>-  unsigned HOST_WIDE_INT abs_step = tree_to_uhwi (DR_STEP (dr_a.dr));
>>+  unsigned HOST_WIDE_INT abs_step =
>>+    absu_hwi (tree_to_shwi (DR_STEP (dr_a.dr)));
>>   if (neg_step)
>>     abs_step = -abs_step;
>
>Hmmm...
>
>but you must remove the neg_step if you do this.

The negation, yes.  Neg_step is used later as well.

Richard.

>Right?
>
>
>Bernd.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH GCC][v2]Simplify alias check code generation in vectorizer
@ 2016-09-26 15:48 Bernd Edlinger
  2016-09-26 18:26 ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Bernd Edlinger @ 2016-09-26 15:48 UTC (permalink / raw)
  To: Robin Dapp; +Cc: Markus Trippelsdorf, Richard Biener, GCC Patches, Bin.Cheng

Hi,

>@@ -2310,7 +2313,8 @@ create_intersect_range_checks_index (loop_vec_info loop_vinfo, tree *cond_expr,
>   gcc_assert (TREE_CODE (DR_STEP (dr_a.dr)) == INTEGER_CST);
> 
>   bool neg_step = tree_int_cst_compare (DR_STEP (dr_a.dr), size_zero_node) < 0;
>-  unsigned HOST_WIDE_INT abs_step = tree_to_uhwi (DR_STEP (dr_a.dr));
>+  unsigned HOST_WIDE_INT abs_step =
>+    absu_hwi (tree_to_shwi (DR_STEP (dr_a.dr)));
>   if (neg_step)
>     abs_step = -abs_step;

Hmmm...

but you must remove the neg_step if you do this.

Right?


Bernd.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2016-09-28  5:16 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-09-20 15:30 [PATCH GCC][v2]Simplify alias check code generation in vectorizer Bin Cheng
2016-09-21 14:21 ` Richard Biener
2016-09-22 10:06   ` Bin.Cheng
2016-09-22 10:28     ` Richard Biener
2016-09-23 22:12     ` Richard Sandiford
2016-09-26  9:38     ` Robin Dapp
2016-09-26 11:17       ` Richard Biener
2016-09-26 13:03         ` Markus Trippelsdorf
2016-09-26 15:43           ` Robin Dapp
2016-09-26 12:52       ` Bin.Cheng
2016-09-26 15:48 Bernd Edlinger
2016-09-26 18:26 ` Richard Biener
2016-09-27  7:55   ` Richard Biener
2016-09-27 13:22     ` Robin Dapp
2016-09-27 13:24       ` Richard Biener
2016-09-28  7:32         ` Markus Trippelsdorf

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).