public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
@ 2016-03-16 10:00 Bin Cheng
  2016-03-16 12:20 ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Bin Cheng @ 2016-03-16 10:00 UTC (permalink / raw)
  To: gcc-patches; +Cc: nd

[-- Attachment #1: Type: text/plain, Size: 2683 bytes --]

Hi,
One issue revealed in tree ifcvt is the pass stores/tracks DRs against its memory references in IR.  This causes difficulty in identifying same memory references appearing in different forms.  Given below example:

void foo (int a[], int b[])
{
  int i;
  for (i = 0; i < 100; i++)
    {
      if (a[i] ==0)
	a[i] = b[i]*4;
      else
	a[i] = b[i]*3;
    }
}

The gimple dump before tree ifcvt is as:

  <bb 2>:

  <bb 3>:
  # i_24 = PHI <i_19(7), 0(2)>
  # ivtmp_28 = PHI <ivtmp_23(7), 100(2)>
  _5 = (long unsigned int) i_24;
  _6 = _5 * 4;
  _8 = a_7(D) + _6;
  _9 = *_8;
  if (_9 == 0)
    goto <bb 4>;
  else
    goto <bb 5>;

  <bb 4>:
  _11 = b_10(D) + _6;
  _12 = *_11;
  _13 = _12 * 4;
  goto <bb 6>;

  <bb 5>:
  _15 = b_10(D) + _6;
  _16 = *_15;
  _17 = _16 * 3;

  <bb 6>:
  # cstore_1 = PHI <_13(4), _17(5)>
  *_8 = cstore_1;
  i_19 = i_24 + 1;
  ivtmp_23 = ivtmp_28 - 1;
  if (ivtmp_23 != 0)
    goto <bb 7>;
  else
    goto <bb 8>;

  <bb 7>:
  goto <bb 3>;

  <bb 8>:
  return;

The two memory references "*_11" and "*_15" are actually the same thing, but ifcvt failed to discover this because they are recorded in different forms.  This patch fixes the issue by recording/tracking memory reference against its innermost_loop_behavior if: the memory reference has innermost_loop_behavior and it is not a compound reference.
BTW, PR56625 reported that this case couldn't be vectorized even tree if-conv can handle it.  Interesting thing is at current time, it's tree if-conv that could not handle the case.  Once it's if-converted with this patch, vectorizer are able to handle it too.

Bootstrap and test on x86_64 and AArch64.  Is it OK, not sure if it's GCC 7?

Thanks,
bin


2016-03-11  Bin Cheng  <bin.cheng@arm.com>

	PR tree-optimization/56625
	PR tree-optimization/69489
	* tree-data-ref.h (innermost_loop_behavior_hash): New class for
	hashing struct innermost_loop_behavior.
	(DR_INNERMOST): New macro.
	* tree-if-conv.c (innermost_DR_map): New map.
	(ref_DR_map, baseref_DR_map): Revise the comment.
	(hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR
	to innermost_DR_map if it has innermost loop behavior and is not
	a compound reference.
	(ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map if it has
	innermost loop behavior and is not a compound reference.
	(if_convertible_loop_p_1): Initialize innermost_DR_map.
	(if_convertible_loop_p): Release innermost_DR_map.

gcc/testsuite/ChangeLog
2016-03-11  Bin Cheng  <bin.cheng@arm.com>

	PR tree-optimization/56625
	PR tree-optimization/69489
	* gcc.dg/tree-ssa/ifc-pr69489-1.c: New test.


[-- Attachment #2: pr69489-part1-20160315.txt --]
[-- Type: text/plain, Size: 7233 bytes --]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
new file mode 100644
index 0000000..05db62b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-S -O2 -ftree-vectorize -fdump-tree-ifcvt-stats" { target *-*-* } } */
+
+void foo (int a[], int b[])
+{
+  int i;
+  for (i = 0; i < 100; i++)
+    {
+      if (a[i] ==0)
+	a[i] = b[i]*4;
+      else
+	a[i] = b[i]*3;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */
diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h
index eb24d16..310514a 100644
--- a/gcc/tree-data-ref.h
+++ b/gcc/tree-data-ref.h
@@ -57,6 +57,61 @@ struct innermost_loop_behavior
   tree aligned_to;
 };
 
+/* Hash for struct innermost_loop_behavior.  It depends on the user to
+   free the memory.  */
+
+struct innermost_loop_behavior_hash : nofree_ptr_hash <innermost_loop_behavior>
+{
+  static inline hashval_t hash (const value_type &);
+  static inline bool equal (const value_type &,
+			    const compare_type &);
+};
+
+inline hashval_t
+innermost_loop_behavior_hash::hash (const value_type &e)
+{
+  hashval_t hash;
+
+  hash = iterative_hash_expr (e->base_address, 0);
+  hash = iterative_hash_expr (e->offset, hash);
+  hash = iterative_hash_expr (e->init, hash);
+  hash = iterative_hash_expr (e->step, hash);
+  return iterative_hash_expr (e->aligned_to, hash);
+}
+
+inline bool
+innermost_loop_behavior_hash::equal (const value_type &e1,
+				     const compare_type &e2)
+{
+  bool equal_p;
+
+  if ((e1->base_address && !e2->base_address)
+      || (!e1->base_address && e2->base_address)
+      || (!e1->offset && e2->offset)
+      || (e1->offset && !e2->offset)
+      || (!e1->init && e2->init)
+      || (e1->init && !e2->init)
+      || (!e1->step && e2->step)
+      || (e1->step && !e2->step)
+      || (!e1->aligned_to && e2->aligned_to)
+      || (e1->aligned_to && !e2->aligned_to))
+    return false;
+
+  equal_p = true;
+  if (e1->base_address && e2->base_address)
+    equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
+  if (e1->offset && e2->offset)
+    equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
+  if (e1->init && e2->init)
+    equal_p &= operand_equal_p (e1->init, e2->init, 0);
+  if (e1->step && e2->step)
+    equal_p &= operand_equal_p (e1->step, e2->step, 0);
+  if (e1->aligned_to && e2->aligned_to)
+    equal_p &= operand_equal_p (e1->aligned_to, e2->aligned_to, 0);
+
+  return equal_p;
+}
+
 /* Describes the evolutions of indices of the memory reference.  The indices
    are indices of the ARRAY_REFs, indexes in artificial dimensions
    added for member selection of records and the operands of MEM_REFs.
@@ -144,6 +199,7 @@ struct data_reference
 #define DR_STEP(DR)                (DR)->innermost.step
 #define DR_PTR_INFO(DR)            (DR)->alias.ptr_info
 #define DR_ALIGNED_TO(DR)          (DR)->innermost.aligned_to
+#define DR_INNERMOST(DR)           (DR)->innermost
 
 typedef struct data_reference *data_reference_p;
 
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 9e305c7..1315370 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -120,10 +120,14 @@ static basic_block *ifc_bbs;
 /* Apply more aggressive (extended) if-conversion if true.  */
 static bool aggressive_if_conv;
 
-/* Hash table to store references, DR pairs.  */
+/* Hash table to store <references, DR> pairs.  */
 static hash_map<tree_operand_hash, data_reference_p> *ref_DR_map;
 
-/* Hash table to store base reference, DR pairs.  */
+/* Hash table to store <DR's innermost loop behavior, DR> pairs.  */
+static hash_map<innermost_loop_behavior_hash,
+		data_reference_p> *innermost_DR_map;
+
+/* Hash table to store <base reference, DR> pairs.  */
 static hash_map<tree_operand_hash, data_reference_p> *baseref_DR_map;
 
 /* Structure used to predicate basic blocks.  This is attached to the
@@ -615,15 +619,29 @@ hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a)
   data_reference_p *master_dr, *base_master_dr;
   tree ref = DR_REF (a);
   tree base_ref = DR_BASE_OBJECT (a);
+  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
   tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
   bool exist1, exist2;
 
-  while (TREE_CODE (ref) == COMPONENT_REF
-	 || TREE_CODE (ref) == IMAGPART_EXPR
-	 || TREE_CODE (ref) == REALPART_EXPR)
-    ref = TREE_OPERAND (ref, 0);
+  /* If reference in DR has innermost loop behavior and it is not
+     a compound memory reference, we store it to innermost_DR_map,
+     otherwise to ref_DR_map.  */
+  if (TREE_CODE (ref) == COMPONENT_REF
+      || TREE_CODE (ref) == IMAGPART_EXPR
+      || TREE_CODE (ref) == REALPART_EXPR
+      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
+	   || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
+    {
+      while (TREE_CODE (ref) == COMPONENT_REF
+	     || TREE_CODE (ref) == IMAGPART_EXPR
+	     || TREE_CODE (ref) == REALPART_EXPR)
+	ref = TREE_OPERAND (ref, 0);
+
+      master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
+    }
+  else
+    master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
 
-  master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
   if (!exist1)
     *master_dr = a;
 
@@ -687,15 +705,29 @@ ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> drs)
 
   tree ref_base_a = DR_REF (a);
   tree base = DR_BASE_OBJECT (a);
+  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
 
   gcc_assert (DR_STMT (a) == stmt);
 
-  while (TREE_CODE (ref_base_a) == COMPONENT_REF
-	 || TREE_CODE (ref_base_a) == IMAGPART_EXPR
-	 || TREE_CODE (ref_base_a) == REALPART_EXPR)
-    ref_base_a = TREE_OPERAND (ref_base_a, 0);
+  /* If reference in DR has innermost loop behavior and it is not
+     a compound memory reference, we get it from innermost_DR_map,
+     otherwise from ref_DR_map.  */
+  if (TREE_CODE (ref_base_a) == COMPONENT_REF
+      || TREE_CODE (ref_base_a) == IMAGPART_EXPR
+      || TREE_CODE (ref_base_a) == REALPART_EXPR
+      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
+	   || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
+    {
+      while (TREE_CODE (ref_base_a) == COMPONENT_REF
+	     || TREE_CODE (ref_base_a) == IMAGPART_EXPR
+	     || TREE_CODE (ref_base_a) == REALPART_EXPR)
+	ref_base_a = TREE_OPERAND (ref_base_a, 0);
+
+      master_dr = ref_DR_map->get (ref_base_a);
+    }
+  else
+    master_dr = innermost_DR_map->get (innermost);
 
-  master_dr = ref_DR_map->get (ref_base_a);
   base_master_dr = baseref_DR_map->get (base);
 
   gcc_assert (master_dr != NULL);
@@ -1229,6 +1261,8 @@ if_convertible_loop_p_1 (struct loop *loop,
   data_reference_p dr;
 
   ref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
+  innermost_DR_map
+	  = new hash_map<innermost_loop_behavior_hash, data_reference_p>;
   baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
 
   predicate_bbs (loop);
@@ -1341,6 +1375,9 @@ if_convertible_loop_p (struct loop *loop, bool *any_mask_load_store)
   delete ref_DR_map;
   ref_DR_map = NULL;
 
+  delete innermost_DR_map;
+  innermost_DR_map = NULL;
+
   delete baseref_DR_map;
   baseref_DR_map = NULL;
 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
  2016-03-16 10:00 [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible Bin Cheng
@ 2016-03-16 12:20 ` Richard Biener
  2016-03-16 16:17   ` Bin.Cheng
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2016-03-16 12:20 UTC (permalink / raw)
  To: Bin Cheng; +Cc: gcc-patches, nd

On Wed, Mar 16, 2016 at 10:59 AM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> One issue revealed in tree ifcvt is the pass stores/tracks DRs against its memory references in IR.  This causes difficulty in identifying same memory references appearing in different forms.  Given below example:
>
> void foo (int a[], int b[])
> {
>   int i;
>   for (i = 0; i < 100; i++)
>     {
>       if (a[i] ==0)
>         a[i] = b[i]*4;
>       else
>         a[i] = b[i]*3;
>     }
> }
>
> The gimple dump before tree ifcvt is as:
>
>   <bb 2>:
>
>   <bb 3>:
>   # i_24 = PHI <i_19(7), 0(2)>
>   # ivtmp_28 = PHI <ivtmp_23(7), 100(2)>
>   _5 = (long unsigned int) i_24;
>   _6 = _5 * 4;
>   _8 = a_7(D) + _6;
>   _9 = *_8;
>   if (_9 == 0)
>     goto <bb 4>;
>   else
>     goto <bb 5>;
>
>   <bb 4>:
>   _11 = b_10(D) + _6;
>   _12 = *_11;
>   _13 = _12 * 4;
>   goto <bb 6>;
>
>   <bb 5>:
>   _15 = b_10(D) + _6;
>   _16 = *_15;
>   _17 = _16 * 3;
>
>   <bb 6>:
>   # cstore_1 = PHI <_13(4), _17(5)>
>   *_8 = cstore_1;
>   i_19 = i_24 + 1;
>   ivtmp_23 = ivtmp_28 - 1;
>   if (ivtmp_23 != 0)
>     goto <bb 7>;
>   else
>     goto <bb 8>;
>
>   <bb 7>:
>   goto <bb 3>;
>
>   <bb 8>:
>   return;
>
> The two memory references "*_11" and "*_15" are actually the same thing, but ifcvt failed to discover this because they are recorded in different forms.  This patch fixes the issue by recording/tracking memory reference against its innermost_loop_behavior if: the memory reference has innermost_loop_behavior and it is not a compound reference.
> BTW, PR56625 reported that this case couldn't be vectorized even tree if-conv can handle it.  Interesting thing is at current time, it's tree if-conv that could not handle the case.  Once it's if-converted with this patch, vectorizer are able to handle it too.
>
> Bootstrap and test on x86_64 and AArch64.  Is it OK, not sure if it's GCC 7?

Hmm.

+  equal_p = true;
+  if (e1->base_address && e2->base_address)
+    equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
+  if (e1->offset && e2->offset)
+    equal_p &= operand_equal_p (e1->offset, e2->offset, 0);

surely better to return false early.

I think we don't want this in tree-data-refs.h also because of ...

@@ -615,15 +619,29 @@
hash_memrefs_baserefs_and_store_DRs_read_written_info
(data_reference_p a)
   data_reference_p *master_dr, *base_master_dr;
   tree ref = DR_REF (a);
   tree base_ref = DR_BASE_OBJECT (a);
+  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
   tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
   bool exist1, exist2;

-  while (TREE_CODE (ref) == COMPONENT_REF
-        || TREE_CODE (ref) == IMAGPART_EXPR
-        || TREE_CODE (ref) == REALPART_EXPR)
-    ref = TREE_OPERAND (ref, 0);
+  /* If reference in DR has innermost loop behavior and it is not
+     a compound memory reference, we store it to innermost_DR_map,
+     otherwise to ref_DR_map.  */
+  if (TREE_CODE (ref) == COMPONENT_REF
+      || TREE_CODE (ref) == IMAGPART_EXPR
+      || TREE_CODE (ref) == REALPART_EXPR
+      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
+          || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
+    {
+      while (TREE_CODE (ref) == COMPONENT_REF
+            || TREE_CODE (ref) == IMAGPART_EXPR
+            || TREE_CODE (ref) == REALPART_EXPR)
+       ref = TREE_OPERAND (ref, 0);
+
+      master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
+    }
+  else
+    master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);

we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd need to
strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
and REALPART) before creating the DR (or adjust the equality function
and hashing
to disregard them which means subtracting their offset from DR_INIT.

To adjust the references we collect you'd maybe could use a callback
to get_references_in_stmt
to adjust them.

OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple as

Index: tree-if-conv.c
===================================================================
--- tree-if-conv.c      (revision 234215)
+++ tree-if-conv.c      (working copy)
@@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo

   for (i = 0; refs->iterate (i, &dr); i++)
     {
+      tree *refp = &DR_REF (dr);
+      while ((TREE_CODE (*refp) == COMPONENT_REF
+             && TREE_OPERAND (*refp, 2) == NULL_TREE)
+            || TREE_CODE (*refp) == IMAGPART_EXPR
+            || TREE_CODE (*refp) == REALPART_EXPR)
+       refp = &TREE_OPERAND (*refp, 0);
+      if (refp != &DR_REF (dr))
+       {
+         tree saved_base = *refp;
+         *refp = integer_zero_node;
+
+         if (DR_INIT (dr))
+           {
+             tree poffset;
+             int punsignedp, preversep, pvolatilep;
+             machine_mode pmode;
+             HOST_WIDE_INT pbitsize, pbitpos;
+             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, &poffset,
+                                  &pmode, &punsignedp, &preversep, &pvolatilep,
+                                  false);
+             gcc_assert (poffset == NULL_TREE);
+
+             DR_INIT (dr)
+               = wide_int_to_tree (ssizetype,
+                                   wi::sub (DR_INIT (dr),
+                                            pbitpos / BITS_PER_UNIT));
+           }
+
+         *refp = saved_base;
+         DR_REF (dr) = *refp;
+       }
+
       dr->aux = XNEW (struct ifc_dr);
       DR_BASE_W_UNCONDITIONALLY (dr) = false;
       DR_RW_UNCONDITIONALLY (dr) = false;


Can you add a testcase for the vectorization?

Richard.


> Thanks,
> bin
>
>
> 2016-03-11  Bin Cheng  <bin.cheng@arm.com>
>
>         PR tree-optimization/56625
>         PR tree-optimization/69489
>         * tree-data-ref.h (innermost_loop_behavior_hash): New class for
>         hashing struct innermost_loop_behavior.
>         (DR_INNERMOST): New macro.
>         * tree-if-conv.c (innermost_DR_map): New map.
>         (ref_DR_map, baseref_DR_map): Revise the comment.
>         (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR
>         to innermost_DR_map if it has innermost loop behavior and is not
>         a compound reference.
>         (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map if it has
>         innermost loop behavior and is not a compound reference.
>         (if_convertible_loop_p_1): Initialize innermost_DR_map.
>         (if_convertible_loop_p): Release innermost_DR_map.
>
> gcc/testsuite/ChangeLog
> 2016-03-11  Bin Cheng  <bin.cheng@arm.com>
>
>         PR tree-optimization/56625
>         PR tree-optimization/69489
>         * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test.
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
  2016-03-16 12:20 ` Richard Biener
@ 2016-03-16 16:17   ` Bin.Cheng
  2016-03-17  9:53     ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Bin.Cheng @ 2016-03-16 16:17 UTC (permalink / raw)
  To: Richard Biener; +Cc: Bin Cheng, gcc-patches

On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Wed, Mar 16, 2016 at 10:59 AM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> > Hi,
> > ......
> > Bootstrap and test on x86_64 and AArch64.  Is it OK, not sure if it's GCC 7?
>
> Hmm.
Hi,
Thanks for reviewing.
>
> +  equal_p = true;
> +  if (e1->base_address && e2->base_address)
> +    equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
> +  if (e1->offset && e2->offset)
> +    equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>
> surely better to return false early.
>
> I think we don't want this in tree-data-refs.h also because of ...
>
> @@ -615,15 +619,29 @@
> hash_memrefs_baserefs_and_store_DRs_read_written_info
> (data_reference_p a)
>    data_reference_p *master_dr, *base_master_dr;and REALPART) before creating the DR (or adjust the equality function
and hashing
>    tree ref = DR_REF (a);
>    tree base_ref = DR_BASE_OBJECT (a);
> +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>    tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>    bool exist1, exist2;
>
> -  while (TREE_CODE (ref) == COMPONENT_REF
> -        || TREE_CODE (ref) == IMAGPART_EXPR
> -        || TREE_CODE (ref) == REALPART_EXPR)
> -    ref = TREE_OPERAND (ref, 0);
> +  /* If reference in DR has innermost loop behavior and it is not
> +     a compound memory reference, we store it to innermost_DR_map,
> +     otherwise to ref_DR_map.  */
> +  if (TREE_CODE (ref) == COMPONENT_REF
> +      || TREE_CODE (ref) == IMAGPART_EXPR
> +      || TREE_CODE (ref) == REALPART_EXPR
> +      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
> +          || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
> +    {
> +      while (TREE_CODE (ref) == COMPONENT_REF
> +            || TREE_CODE (ref) == IMAGPART_EXPR
> +            || TREE_CODE (ref) == REALPART_EXPR)
> +       ref = TREE_OPERAND (ref, 0);
> +
> +      master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
> +    }
> +  else
> +    master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>
> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd need to
> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
> and REALPART) before creating the DR (or adjust the equality function
> and hashing
> to disregard them which means subtracting their offset from DR_INIT.
I am not sure if I understand correctly.  But for component reference,
it is the base object that we want to record/track.  For example,

  for (i = 0; i < N; i++) {
    m = *data++;

    m1 = p1->x - m;
    m2 = p2->x + m;

    p3->y = (m1 >= m2) ? p1->y : p2->y;

    p1++;
    p2++;
    p3++;
  }
We want to infer that reads of p1/p2 in condition statement won't trap
because there are unconditional reads of the structures, though the
unconditional reads are actual of other sub-objects.  Here it is the
invariant part of address that we want to track.
Also illustrated by this example, we can't rely on data-ref analyzer
here.  Because in gathering/scattering cases, the address could be not
affine at all.
>
> To adjust the references we collect you'd maybe could use a callback
> to get_references_in_stmt
> to adjust them.
>
> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple as
Is this a part of the method you suggested above, or is it an
alternative one?  If it's the latter, then I have below questions
embedded.
>
> Index: tree-if-conv.c
> ===================================================================
> --- tree-if-conv.c      (revision 234215)
> +++ tree-if-conv.c      (working copy)
> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>
>    for (i = 0; refs->iterate (i, &dr); i++)
>      {
> +      tree *refp = &DR_REF (dr);
> +      while ((TREE_CODE (*refp) == COMPONENT_REF
> +             && TREE_OPERAND (*refp, 2) == NULL_TREE)
> +            || TREE_CODE (*refp) == IMAGPART_EXPR
> +            || TREE_CODE (*refp) == REALPART_EXPR)
> +       refp = &TREE_OPERAND (*refp, 0);
> +      if (refp != &DR_REF (dr))
> +       {
> +         tree saved_base = *refp;
> +         *refp = integer_zero_node;
> +
> +         if (DR_INIT (dr))
> +           {
> +             tree poffset;
> +             int punsignedp, preversep, pvolatilep;
> +             machine_mode pmode;
> +             HOST_WIDE_INT pbitsize, pbitpos;
> +             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, &poffset,
> +                                  &pmode, &punsignedp, &preversep, &pvolatilep,
> +                                  false);
> +             gcc_assert (poffset == NULL_TREE);
> +
> +             DR_INIT (dr)
> +               = wide_int_to_tree (ssizetype,
> +                                   wi::sub (DR_INIT (dr),
> +                                            pbitpos / BITS_PER_UNIT));
> +           }
> +
> +         *refp = saved_base;
> +         DR_REF (dr) = *refp;
> +       }
Looks to me the code is trying to resolve difference between two (or
more) component references, which is DR_INIT in the code.  But DR_INIT
is not the only thing needs to be handled.  For a structure containing
two sub-arrays, DR_OFFSET may be different too.
Actually I did try to replace ref_DR_map.  For memory reference
doesn't have innermost affine behavior, we need to modify DR fields by
populating it with artificial data.  This results in some kind of
over-designed code.  IMHO, modifying some DR structures outside of
data-ref analyzer isn't that good.  After comparing the two methods, I
think may be this one is better, of course with the cost of two
different maps.

I might be misunderstanding your idea, so please correct if I was wrong.

Thanks,
bin
> +
>        dr->aux = XNEW (struct ifc_dr);
>        DR_BASE_W_UNCONDITIONALLY (dr) = false;
>        DR_RW_UNCONDITIONALLY (dr) = false;
>
>
> Can you add a testcase for the vectorization?
>
> Richard.
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
  2016-03-16 16:17   ` Bin.Cheng
@ 2016-03-17  9:53     ` Richard Biener
       [not found]       ` <CAHFci29CHpOBxah4QxrL_JOL6p_NC=r3-e3eoNjHp_4z749PnQ@mail.gmail.com>
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2016-03-17  9:53 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Bin Cheng, gcc-patches

On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>>
>> On Wed, Mar 16, 2016 at 10:59 AM, Bin Cheng <Bin.Cheng@arm.com> wrote:
>> > Hi,
>> > ......
>> > Bootstrap and test on x86_64 and AArch64.  Is it OK, not sure if it's GCC 7?
>>
>> Hmm.
> Hi,
> Thanks for reviewing.
>>
>> +  equal_p = true;
>> +  if (e1->base_address && e2->base_address)
>> +    equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>> +  if (e1->offset && e2->offset)
>> +    equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>
>> surely better to return false early.
>>
>> I think we don't want this in tree-data-refs.h also because of ...
>>
>> @@ -615,15 +619,29 @@
>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>> (data_reference_p a)
>>    data_reference_p *master_dr, *base_master_dr;and REALPART) before creating the DR (or adjust the equality function
> and hashing
>>    tree ref = DR_REF (a);
>>    tree base_ref = DR_BASE_OBJECT (a);
>> +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>>    tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>    bool exist1, exist2;
>>
>> -  while (TREE_CODE (ref) == COMPONENT_REF
>> -        || TREE_CODE (ref) == IMAGPART_EXPR
>> -        || TREE_CODE (ref) == REALPART_EXPR)
>> -    ref = TREE_OPERAND (ref, 0);
>> +  /* If reference in DR has innermost loop behavior and it is not
>> +     a compound memory reference, we store it to innermost_DR_map,
>> +     otherwise to ref_DR_map.  */
>> +  if (TREE_CODE (ref) == COMPONENT_REF
>> +      || TREE_CODE (ref) == IMAGPART_EXPR
>> +      || TREE_CODE (ref) == REALPART_EXPR
>> +      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>> +          || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>> +    {
>> +      while (TREE_CODE (ref) == COMPONENT_REF
>> +            || TREE_CODE (ref) == IMAGPART_EXPR
>> +            || TREE_CODE (ref) == REALPART_EXPR)
>> +       ref = TREE_OPERAND (ref, 0);
>> +
>> +      master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
>> +    }
>> +  else
>> +    master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>>
>> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd need to
>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>> and REALPART) before creating the DR (or adjust the equality function
>> and hashing
>> to disregard them which means subtracting their offset from DR_INIT.
> I am not sure if I understand correctly.  But for component reference,
> it is the base object that we want to record/track.  For example,
>
>   for (i = 0; i < N; i++) {
>     m = *data++;
>
>     m1 = p1->x - m;
>     m2 = p2->x + m;
>
>     p3->y = (m1 >= m2) ? p1->y : p2->y;
>
>     p1++;
>     p2++;
>     p3++;
>   }
> We want to infer that reads of p1/p2 in condition statement won't trap
> because there are unconditional reads of the structures, though the
> unconditional reads are actual of other sub-objects.  Here it is the
> invariant part of address that we want to track.

Well, the variant parts - we want to strip invariant parts as far as we can
(offsetof (x) and offsetof (y))

> Also illustrated by this example, we can't rely on data-ref analyzer
> here.  Because in gathering/scattering cases, the address could be not
> affine at all.

Sure, but that's a different issue.

>>
>> To adjust the references we collect you'd maybe could use a callback
>> to get_references_in_stmt
>> to adjust them.
>>
>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple as
> Is this a part of the method you suggested above, or is it an
> alternative one?  If it's the latter, then I have below questions
> embedded.

It is an alternative to adding a hook to get_references_in_stmt and
probably "easier".

>>
>> Index: tree-if-conv.c
>> ===================================================================
>> --- tree-if-conv.c      (revision 234215)
>> +++ tree-if-conv.c      (working copy)
>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>
>>    for (i = 0; refs->iterate (i, &dr); i++)
>>      {
>> +      tree *refp = &DR_REF (dr);
>> +      while ((TREE_CODE (*refp) == COMPONENT_REF
>> +             && TREE_OPERAND (*refp, 2) == NULL_TREE)
>> +            || TREE_CODE (*refp) == IMAGPART_EXPR
>> +            || TREE_CODE (*refp) == REALPART_EXPR)
>> +       refp = &TREE_OPERAND (*refp, 0);
>> +      if (refp != &DR_REF (dr))
>> +       {
>> +         tree saved_base = *refp;
>> +         *refp = integer_zero_node;
>> +
>> +         if (DR_INIT (dr))
>> +           {
>> +             tree poffset;
>> +             int punsignedp, preversep, pvolatilep;
>> +             machine_mode pmode;
>> +             HOST_WIDE_INT pbitsize, pbitpos;
>> +             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, &poffset,
>> +                                  &pmode, &punsignedp, &preversep, &pvolatilep,
>> +                                  false);
>> +             gcc_assert (poffset == NULL_TREE);
>> +
>> +             DR_INIT (dr)
>> +               = wide_int_to_tree (ssizetype,
>> +                                   wi::sub (DR_INIT (dr),
>> +                                            pbitpos / BITS_PER_UNIT));
>> +           }
>> +
>> +         *refp = saved_base;
>> +         DR_REF (dr) = *refp;
>> +       }
> Looks to me the code is trying to resolve difference between two (or
> more) component references, which is DR_INIT in the code.  But DR_INIT
> is not the only thing needs to be handled.  For a structure containing
> two sub-arrays, DR_OFFSET may be different too.

Yes, but we can't say that if

  a->a[i]

doesn't trap that then

  a->b[i]

doesn't trap either.  We can only "strip" outermost
non-variable-offset components.

But maybe I'm missing what example you are thinking of.

> Actually I did try to replace ref_DR_map.  For memory reference
> doesn't have innermost affine behavior, we need to modify DR fields by
> populating it with artificial data.  This results in some kind of
> over-designed code.  IMHO, modifying some DR structures outside of
> data-ref analyzer isn't that good.  After comparing the two methods, I
> think may be this one is better, of course with the cost of two
> different maps.

IMHO the code is already somewhat awful in using data-ref analysis at all
(for what it does it certainly doesn't need it).  Note that in the case
data-ref analysis cannot analyze innermost behavior we're good as well
with my proposed code - we simply leave DR_INIT NULL (but still strip
outermost components).

Richard.

> I might be misunderstanding your idea, so please correct if I was wrong.
>
> Thanks,
> bin
>> +
>>        dr->aux = XNEW (struct ifc_dr);
>>        DR_BASE_W_UNCONDITIONALLY (dr) = false;
>>        DR_RW_UNCONDITIONALLY (dr) = false;
>>
>>
>> Can you add a testcase for the vectorization?
>>
>> Richard.
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Fwd: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
       [not found]       ` <CAHFci29CHpOBxah4QxrL_JOL6p_NC=r3-e3eoNjHp_4z749PnQ@mail.gmail.com>
@ 2016-03-28 22:04         ` Bin.Cheng
  2016-03-29  8:44           ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Bin.Cheng @ 2016-03-28 22:04 UTC (permalink / raw)
  To: gcc-patches

Sorry, Should have replied to gcc-patches list.

Thanks,
bin

---------- Forwarded message ----------
From: "Bin.Cheng" <amker.cheng@gmail.com>
Date: Tue, 29 Mar 2016 03:55:04 +0800
Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
DR against its innermost loop bahavior if possible
To: Richard Biener <richard.guenther@gmail.com>

On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote:
> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>>
>>> Hmm.
>> Hi,
>> Thanks for reviewing.
>>>
>>> +  equal_p = true;
>>> +  if (e1->base_address && e2->base_address)
>>> +    equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>>> +  if (e1->offset && e2->offset)
>>> +    equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>>
>>> surely better to return false early.
>>>
>>> I think we don't want this in tree-data-refs.h also because of ...
>>>
>>> @@ -615,15 +619,29 @@
>>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>>> (data_reference_p a)
>>>    data_reference_p *master_dr, *base_master_dr;and REALPART) before
>>> creating the DR (or adjust the equality function
>> and hashing
>>>    tree ref = DR_REF (a);
>>>    tree base_ref = DR_BASE_OBJECT (a);
>>> +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>>>    tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>>    bool exist1, exist2;
>>>
>>> -  while (TREE_CODE (ref) == COMPONENT_REF
>>> -        || TREE_CODE (ref) == IMAGPART_EXPR
>>> -        || TREE_CODE (ref) == REALPART_EXPR)
>>> -    ref = TREE_OPERAND (ref, 0);
>>> +  /* If reference in DR has innermost loop behavior and it is not
>>> +     a compound memory reference, we store it to innermost_DR_map,
>>> +     otherwise to ref_DR_map.  */
>>> +  if (TREE_CODE (ref) == COMPONENT_REF
>>> +      || TREE_CODE (ref) == IMAGPART_EXPR
>>> +      || TREE_CODE (ref) == REALPART_EXPR
>>> +      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>>> +          || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>>> +    {
>>> +      while (TREE_CODE (ref) == COMPONENT_REF
>>> +            || TREE_CODE (ref) == IMAGPART_EXPR
>>> +            || TREE_CODE (ref) == REALPART_EXPR)
>>> +       ref = TREE_OPERAND (ref, 0);
>>> +
>>> +      master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
>>> +    }
>>> +  else
>>> +    master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>>>
>>> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd
>>> need to
>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>>> and REALPART) before creating the DR (or adjust the equality function
>>> and hashing
>>> to disregard them which means subtracting their offset from DR_INIT.
>> I am not sure if I understand correctly.  But for component reference,
>> it is the base object that we want to record/track.  For example,
>>
>>   for (i = 0; i < N; i++) {
>>     m = *data++;
>>
>>     m1 = p1->x - m;
>>     m2 = p2->x + m;
>>
>>     p3->y = (m1 >= m2) ? p1->y : p2->y;
>>
>>     p1++;
>>     p2++;
>>     p3++;
>>   }
>> We want to infer that reads of p1/p2 in condition statement won't trap
>> because there are unconditional reads of the structures, though the
>> unconditional reads are actual of other sub-objects.  Here it is the
>> invariant part of address that we want to track.
>
> Well, the variant parts - we want to strip invariant parts as far as we can
> (offsetof (x) and offsetof (y))
>
>> Also illustrated by this example, we can't rely on data-ref analyzer
>> here.  Because in gathering/scattering cases, the address could be not
>> affine at all.
>
> Sure, but that's a different issue.
>
>>>
>>> To adjust the references we collect you'd maybe could use a callback
>>> to get_references_in_stmt
>>> to adjust them.
>>>
>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple
>>> as
>> Is this a part of the method you suggested above, or is it an
>> alternative one?  If it's the latter, then I have below questions
>> embedded.
>
> It is an alternative to adding a hook to get_references_in_stmt and
> probably "easier".
>
>>>
>>> Index: tree-if-conv.c
>>> ===================================================================
>>> --- tree-if-conv.c      (revision 234215)
>>> +++ tree-if-conv.c      (working copy)
>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>
>>>    for (i = 0; refs->iterate (i, &dr); i++)
>>>      {
>>> +      tree *refp = &DR_REF (dr);
>>> +      while ((TREE_CODE (*refp) == COMPONENT_REF
>>> +             && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>> +            || TREE_CODE (*refp) == IMAGPART_EXPR
>>> +            || TREE_CODE (*refp) == REALPART_EXPR)
>>> +       refp = &TREE_OPERAND (*refp, 0);
>>> +      if (refp != &DR_REF (dr))
>>> +       {
>>> +         tree saved_base = *refp;
>>> +         *refp = integer_zero_node;
>>> +
>>> +         if (DR_INIT (dr))
>>> +           {
>>> +             tree poffset;
>>> +             int punsignedp, preversep, pvolatilep;
>>> +             machine_mode pmode;
>>> +             HOST_WIDE_INT pbitsize, pbitpos;
>>> +             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos,
>>> &poffset,
>>> +                                  &pmode, &punsignedp, &preversep,
>>> &pvolatilep,
>>> +                                  false);
>>> +             gcc_assert (poffset == NULL_TREE);
>>> +
>>> +             DR_INIT (dr)
>>> +               = wide_int_to_tree (ssizetype,
>>> +                                   wi::sub (DR_INIT (dr),
>>> +                                            pbitpos / BITS_PER_UNIT));
>>> +           }
>>> +
>>> +         *refp = saved_base;
>>> +         DR_REF (dr) = *refp;
>>> +       }
>> Looks to me the code is trying to resolve difference between two (or
>> more) component references, which is DR_INIT in the code.  But DR_INIT
>> is not the only thing needs to be handled.  For a structure containing
>> two sub-arrays, DR_OFFSET may be different too.
>
> Yes, but we can't say that if
>
>   a->a[i]
>
> doesn't trap that then
>
>   a->b[i]
>
> doesn't trap either.  We can only "strip" outermost
> non-variable-offset components.
>
> But maybe I'm missing what example you are thinking of.
Hmm, this was the case I meant.  What I don't understand is current
code logic does infer trap information for a.b[i] from a.a[i].  Given
below example:
struct str
{
  int a[10];
  int b[20];
  char c;
};

void bar (struct str *);
int foo (int x, int n)
{
  int i;
  struct str s;
  bar (&s);
  for (i = 0; i < n; i++)
    {
      s.a[i] = s.b[i];
      if (x > i)
        s.b[i] = 0;
    }
  bar (&s);
  return 0;
}
The loop is convertible because of below code in function
ifcvt_memrefs_wont_trap:

  /* If a is unconditionally accessed then ... */
  if (DR_RW_UNCONDITIONALLY (*master_dr))
    {
      /* an unconditional read won't trap.  */
      if (DR_IS_READ (a))
	return true;

      /* an unconditionaly write won't trap if the base is written
         to unconditionally.  */
      if (base_master_dr
	  && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
	return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
      else
	{
	  /* or the base is know to be not readonly.  */
	  tree base_tree = get_base_address (DR_REF (a));
	  if (DECL_P (base_tree)
	      && decl_binds_to_current_def_p (base_tree)
	      && ! TREE_READONLY (base_tree))
	    return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
	}
    }
It is the main object '&s' that is recorded in base_master_dr, so
s.b[i] is considered not trap.  Even it's not, I suppose
get_base_address will give same result?

Thanks,
bin



-- 
Best Regards.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
  2016-03-28 22:04         ` Fwd: " Bin.Cheng
@ 2016-03-29  8:44           ` Richard Biener
  2016-03-31 17:08             ` Bin.Cheng
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2016-03-29  8:44 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: GCC Patches

On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> Sorry, Should have replied to gcc-patches list.
>
> Thanks,
> bin
>
> ---------- Forwarded message ----------
> From: "Bin.Cheng" <amker.cheng@gmail.com>
> Date: Tue, 29 Mar 2016 03:55:04 +0800
> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
> DR against its innermost loop bahavior if possible
> To: Richard Biener <richard.guenther@gmail.com>
>
> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote:
>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>>
>>>> Hmm.
>>> Hi,
>>> Thanks for reviewing.
>>>>
>>>> +  equal_p = true;
>>>> +  if (e1->base_address && e2->base_address)
>>>> +    equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>>>> +  if (e1->offset && e2->offset)
>>>> +    equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>>>
>>>> surely better to return false early.
>>>>
>>>> I think we don't want this in tree-data-refs.h also because of ...
>>>>
>>>> @@ -615,15 +619,29 @@
>>>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>>>> (data_reference_p a)
>>>>    data_reference_p *master_dr, *base_master_dr;and REALPART) before
>>>> creating the DR (or adjust the equality function
>>> and hashing
>>>>    tree ref = DR_REF (a);
>>>>    tree base_ref = DR_BASE_OBJECT (a);
>>>> +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>>>>    tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>>>    bool exist1, exist2;
>>>>
>>>> -  while (TREE_CODE (ref) == COMPONENT_REF
>>>> -        || TREE_CODE (ref) == IMAGPART_EXPR
>>>> -        || TREE_CODE (ref) == REALPART_EXPR)
>>>> -    ref = TREE_OPERAND (ref, 0);
>>>> +  /* If reference in DR has innermost loop behavior and it is not
>>>> +     a compound memory reference, we store it to innermost_DR_map,
>>>> +     otherwise to ref_DR_map.  */
>>>> +  if (TREE_CODE (ref) == COMPONENT_REF
>>>> +      || TREE_CODE (ref) == IMAGPART_EXPR
>>>> +      || TREE_CODE (ref) == REALPART_EXPR
>>>> +      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>>>> +          || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>>>> +    {
>>>> +      while (TREE_CODE (ref) == COMPONENT_REF
>>>> +            || TREE_CODE (ref) == IMAGPART_EXPR
>>>> +            || TREE_CODE (ref) == REALPART_EXPR)
>>>> +       ref = TREE_OPERAND (ref, 0);
>>>> +
>>>> +      master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
>>>> +    }
>>>> +  else
>>>> +    master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>>>>
>>>> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd
>>>> need to
>>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>>>> and REALPART) before creating the DR (or adjust the equality function
>>>> and hashing
>>>> to disregard them which means subtracting their offset from DR_INIT.
>>> I am not sure if I understand correctly.  But for component reference,
>>> it is the base object that we want to record/track.  For example,
>>>
>>>   for (i = 0; i < N; i++) {
>>>     m = *data++;
>>>
>>>     m1 = p1->x - m;
>>>     m2 = p2->x + m;
>>>
>>>     p3->y = (m1 >= m2) ? p1->y : p2->y;
>>>
>>>     p1++;
>>>     p2++;
>>>     p3++;
>>>   }
>>> We want to infer that reads of p1/p2 in condition statement won't trap
>>> because there are unconditional reads of the structures, though the
>>> unconditional reads are actual of other sub-objects.  Here it is the
>>> invariant part of address that we want to track.
>>
>> Well, the variant parts - we want to strip invariant parts as far as we can
>> (offsetof (x) and offsetof (y))
>>
>>> Also illustrated by this example, we can't rely on data-ref analyzer
>>> here.  Because in gathering/scattering cases, the address could be not
>>> affine at all.
>>
>> Sure, but that's a different issue.
>>
>>>>
>>>> To adjust the references we collect you'd maybe could use a callback
>>>> to get_references_in_stmt
>>>> to adjust them.
>>>>
>>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple
>>>> as
>>> Is this a part of the method you suggested above, or is it an
>>> alternative one?  If it's the latter, then I have below questions
>>> embedded.
>>
>> It is an alternative to adding a hook to get_references_in_stmt and
>> probably "easier".
>>
>>>>
>>>> Index: tree-if-conv.c
>>>> ===================================================================
>>>> --- tree-if-conv.c      (revision 234215)
>>>> +++ tree-if-conv.c      (working copy)
>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>>
>>>>    for (i = 0; refs->iterate (i, &dr); i++)
>>>>      {
>>>> +      tree *refp = &DR_REF (dr);
>>>> +      while ((TREE_CODE (*refp) == COMPONENT_REF
>>>> +             && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>>> +            || TREE_CODE (*refp) == IMAGPART_EXPR
>>>> +            || TREE_CODE (*refp) == REALPART_EXPR)
>>>> +       refp = &TREE_OPERAND (*refp, 0);
>>>> +      if (refp != &DR_REF (dr))
>>>> +       {
>>>> +         tree saved_base = *refp;
>>>> +         *refp = integer_zero_node;
>>>> +
>>>> +         if (DR_INIT (dr))
>>>> +           {
>>>> +             tree poffset;
>>>> +             int punsignedp, preversep, pvolatilep;
>>>> +             machine_mode pmode;
>>>> +             HOST_WIDE_INT pbitsize, pbitpos;
>>>> +             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos,
>>>> &poffset,
>>>> +                                  &pmode, &punsignedp, &preversep,
>>>> &pvolatilep,
>>>> +                                  false);
>>>> +             gcc_assert (poffset == NULL_TREE);
>>>> +
>>>> +             DR_INIT (dr)
>>>> +               = wide_int_to_tree (ssizetype,
>>>> +                                   wi::sub (DR_INIT (dr),
>>>> +                                            pbitpos / BITS_PER_UNIT));
>>>> +           }
>>>> +
>>>> +         *refp = saved_base;
>>>> +         DR_REF (dr) = *refp;
>>>> +       }
>>> Looks to me the code is trying to resolve difference between two (or
>>> more) component references, which is DR_INIT in the code.  But DR_INIT
>>> is not the only thing needs to be handled.  For a structure containing
>>> two sub-arrays, DR_OFFSET may be different too.
>>
>> Yes, but we can't say that if
>>
>>   a->a[i]
>>
>> doesn't trap that then
>>
>>   a->b[i]
>>
>> doesn't trap either.  We can only "strip" outermost
>> non-variable-offset components.
>>
>> But maybe I'm missing what example you are thinking of.
> Hmm, this was the case I meant.  What I don't understand is current
> code logic does infer trap information for a.b[i] from a.a[i].  Given
> below example:
> struct str
> {
>   int a[10];
>   int b[20];
>   char c;
> };
>
> void bar (struct str *);
> int foo (int x, int n)
> {
>   int i;
>   struct str s;
>   bar (&s);
>   for (i = 0; i < n; i++)
>     {
>       s.a[i] = s.b[i];
>       if (x > i)
>         s.b[i] = 0;
>     }
>   bar (&s);
>   return 0;
> }
> The loop is convertible because of below code in function
> ifcvt_memrefs_wont_trap:
>
>   /* If a is unconditionally accessed then ... */
>   if (DR_RW_UNCONDITIONALLY (*master_dr))
>     {
>       /* an unconditional read won't trap.  */
>       if (DR_IS_READ (a))
>         return true;
>
>       /* an unconditionaly write won't trap if the base is written
>          to unconditionally.  */
>       if (base_master_dr
>           && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
>         return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>       else
>         {
>           /* or the base is know to be not readonly.  */
>           tree base_tree = get_base_address (DR_REF (a));
>           if (DECL_P (base_tree)
>               && decl_binds_to_current_def_p (base_tree)
>               && ! TREE_READONLY (base_tree))
>             return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>         }
>     }
> It is the main object '&s' that is recorded in base_master_dr, so
> s.b[i] is considered not trap.  Even it's not, I suppose
> get_base_address will give same result?

Well, for this case it sees that s.b[i] is read from so it can't be an
out-of-bound
access.  And s.a[i] is written to unconditionally so 's' cannot be a
readonly object.
With both pieces of information we can conclude that s.b[i] = 0 doesn't trap.

Richard.

>
> Thanks,
> bin
>
>
>
> --
> Best Regards.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
  2016-03-29  8:44           ` Richard Biener
@ 2016-03-31 17:08             ` Bin.Cheng
  2016-04-04 13:07               ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Bin.Cheng @ 2016-03-31 17:08 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 10834 bytes --]

On Tue, Mar 29, 2016 at 9:37 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> Sorry, Should have replied to gcc-patches list.
>>
>> Thanks,
>> bin
>>
>> ---------- Forwarded message ----------
>> From: "Bin.Cheng" <amker.cheng@gmail.com>
>> Date: Tue, 29 Mar 2016 03:55:04 +0800
>> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
>> DR against its innermost loop bahavior if possible
>> To: Richard Biener <richard.guenther@gmail.com>
>>
>> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote:
>>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>>> <richard.guenther@gmail.com> wrote:
>>>>>
>>>>> Hmm.
>>>> Hi,
>>>> Thanks for reviewing.
>>>>>
>>>>> +  equal_p = true;
>>>>> +  if (e1->base_address && e2->base_address)
>>>>> +    equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>>>>> +  if (e1->offset && e2->offset)
>>>>> +    equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>>>>
>>>>> surely better to return false early.
>>>>>
>>>>> I think we don't want this in tree-data-refs.h also because of ...
>>>>>
>>>>> @@ -615,15 +619,29 @@
>>>>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>>>>> (data_reference_p a)
>>>>>    data_reference_p *master_dr, *base_master_dr;and REALPART) before
>>>>> creating the DR (or adjust the equality function
>>>> and hashing
>>>>>    tree ref = DR_REF (a);
>>>>>    tree base_ref = DR_BASE_OBJECT (a);
>>>>> +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>>>>>    tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>>>>    bool exist1, exist2;
>>>>>
>>>>> -  while (TREE_CODE (ref) == COMPONENT_REF
>>>>> -        || TREE_CODE (ref) == IMAGPART_EXPR
>>>>> -        || TREE_CODE (ref) == REALPART_EXPR)
>>>>> -    ref = TREE_OPERAND (ref, 0);
>>>>> +  /* If reference in DR has innermost loop behavior and it is not
>>>>> +     a compound memory reference, we store it to innermost_DR_map,
>>>>> +     otherwise to ref_DR_map.  */
>>>>> +  if (TREE_CODE (ref) == COMPONENT_REF
>>>>> +      || TREE_CODE (ref) == IMAGPART_EXPR
>>>>> +      || TREE_CODE (ref) == REALPART_EXPR
>>>>> +      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>>>>> +          || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>>>>> +    {
>>>>> +      while (TREE_CODE (ref) == COMPONENT_REF
>>>>> +            || TREE_CODE (ref) == IMAGPART_EXPR
>>>>> +            || TREE_CODE (ref) == REALPART_EXPR)
>>>>> +       ref = TREE_OPERAND (ref, 0);
>>>>> +
>>>>> +      master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
>>>>> +    }
>>>>> +  else
>>>>> +    master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>>>>>
>>>>> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd
>>>>> need to
>>>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>>>>> and REALPART) before creating the DR (or adjust the equality function
>>>>> and hashing
>>>>> to disregard them which means subtracting their offset from DR_INIT.
>>>> I am not sure if I understand correctly.  But for component reference,
>>>> it is the base object that we want to record/track.  For example,
>>>>
>>>>   for (i = 0; i < N; i++) {
>>>>     m = *data++;
>>>>
>>>>     m1 = p1->x - m;
>>>>     m2 = p2->x + m;
>>>>
>>>>     p3->y = (m1 >= m2) ? p1->y : p2->y;
>>>>
>>>>     p1++;
>>>>     p2++;
>>>>     p3++;
>>>>   }
>>>> We want to infer that reads of p1/p2 in condition statement won't trap
>>>> because there are unconditional reads of the structures, though the
>>>> unconditional reads are actual of other sub-objects.  Here it is the
>>>> invariant part of address that we want to track.
>>>
>>> Well, the variant parts - we want to strip invariant parts as far as we can
>>> (offsetof (x) and offsetof (y))
>>>
>>>> Also illustrated by this example, we can't rely on data-ref analyzer
>>>> here.  Because in gathering/scattering cases, the address could be not
>>>> affine at all.
>>>
>>> Sure, but that's a different issue.
>>>
>>>>>
>>>>> To adjust the references we collect you'd maybe could use a callback
>>>>> to get_references_in_stmt
>>>>> to adjust them.
>>>>>
>>>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple
>>>>> as
>>>> Is this a part of the method you suggested above, or is it an
>>>> alternative one?  If it's the latter, then I have below questions
>>>> embedded.
>>>
>>> It is an alternative to adding a hook to get_references_in_stmt and
>>> probably "easier".
>>>
>>>>>
>>>>> Index: tree-if-conv.c
>>>>> ===================================================================
>>>>> --- tree-if-conv.c      (revision 234215)
>>>>> +++ tree-if-conv.c      (working copy)
>>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>>>
>>>>>    for (i = 0; refs->iterate (i, &dr); i++)
>>>>>      {
>>>>> +      tree *refp = &DR_REF (dr);
>>>>> +      while ((TREE_CODE (*refp) == COMPONENT_REF
>>>>> +             && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>>>> +            || TREE_CODE (*refp) == IMAGPART_EXPR
>>>>> +            || TREE_CODE (*refp) == REALPART_EXPR)
>>>>> +       refp = &TREE_OPERAND (*refp, 0);
>>>>> +      if (refp != &DR_REF (dr))
>>>>> +       {
>>>>> +         tree saved_base = *refp;
>>>>> +         *refp = integer_zero_node;
>>>>> +
>>>>> +         if (DR_INIT (dr))
>>>>> +           {
>>>>> +             tree poffset;
>>>>> +             int punsignedp, preversep, pvolatilep;
>>>>> +             machine_mode pmode;
>>>>> +             HOST_WIDE_INT pbitsize, pbitpos;
>>>>> +             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos,
>>>>> &poffset,
>>>>> +                                  &pmode, &punsignedp, &preversep,
>>>>> &pvolatilep,
>>>>> +                                  false);
>>>>> +             gcc_assert (poffset == NULL_TREE);
>>>>> +
>>>>> +             DR_INIT (dr)
>>>>> +               = wide_int_to_tree (ssizetype,
>>>>> +                                   wi::sub (DR_INIT (dr),
>>>>> +                                            pbitpos / BITS_PER_UNIT));
>>>>> +           }
>>>>> +
>>>>> +         *refp = saved_base;
>>>>> +         DR_REF (dr) = *refp;
>>>>> +       }
>>>> Looks to me the code is trying to resolve difference between two (or
>>>> more) component references, which is DR_INIT in the code.  But DR_INIT
>>>> is not the only thing needs to be handled.  For a structure containing
>>>> two sub-arrays, DR_OFFSET may be different too.
>>>
>>> Yes, but we can't say that if
>>>
>>>   a->a[i]
>>>
>>> doesn't trap that then
>>>
>>>   a->b[i]
>>>
>>> doesn't trap either.  We can only "strip" outermost
>>> non-variable-offset components.
>>>
>>> But maybe I'm missing what example you are thinking of.
>> Hmm, this was the case I meant.  What I don't understand is current
>> code logic does infer trap information for a.b[i] from a.a[i].  Given
>> below example:
>> struct str
>> {
>>   int a[10];
>>   int b[20];
>>   char c;
>> };
>>
>> void bar (struct str *);
>> int foo (int x, int n)
>> {
>>   int i;
>>   struct str s;
>>   bar (&s);
>>   for (i = 0; i < n; i++)
>>     {
>>       s.a[i] = s.b[i];
>>       if (x > i)
>>         s.b[i] = 0;
>>     }
>>   bar (&s);
>>   return 0;
>> }
>> The loop is convertible because of below code in function
>> ifcvt_memrefs_wont_trap:
>>
>>   /* If a is unconditionally accessed then ... */
>>   if (DR_RW_UNCONDITIONALLY (*master_dr))
>>     {
>>       /* an unconditional read won't trap.  */
>>       if (DR_IS_READ (a))
>>         return true;
>>
>>       /* an unconditionaly write won't trap if the base is written
>>          to unconditionally.  */
>>       if (base_master_dr
>>           && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
>>         return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>       else
>>         {
>>           /* or the base is know to be not readonly.  */
>>           tree base_tree = get_base_address (DR_REF (a));
>>           if (DECL_P (base_tree)
>>               && decl_binds_to_current_def_p (base_tree)
>>               && ! TREE_READONLY (base_tree))
>>             return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>         }
>>     }
>> It is the main object '&s' that is recorded in base_master_dr, so
>> s.b[i] is considered not trap.  Even it's not, I suppose
>> get_base_address will give same result?
>
> Well, for this case it sees that s.b[i] is read from so it can't be an
> out-of-bound
> access.  And s.a[i] is written to unconditionally so 's' cannot be a
> readonly object.
> With both pieces of information we can conclude that s.b[i] = 0 doesn't trap.

Hi Richard,
Attachment is the updated patch.  I made below changes wrto your
review comments:
1) Moved hash class for innermost loop behavior from tree-data-ref.h
to tree-if-conv.c.
    I also removed check on innermost_loop_behavior.aligned_to which
seems redundant to me because it's computed from offset and we have
already checked equality for offset.
2) Replaced ref_DR_map with new map innermost_DR_map.
3) Post-processed DR in if_convertible_loop_p_1 for compound reference
or references don't have innermost behavior.  This cleans up following
code using DR information.
4) Added a vectorization test for PR56625.

I didn't incorporate your proposed code because I think it handles
COMPONENT_REF of structure array like struct_arr[i].x/struct_arr[i].y.
Looks to me it is another ifcvt opportunity different from PR69489.
Anyway, fix is easy, I can send another patch in GCC7.

Bootstrap and test on x86_64/AArch64, so any comments on this version?

Thanks,
bin

2016-03-30  Bin Cheng  <bin.cheng@arm.com>

    PR tree-optimization/56625
    PR tree-optimization/69489
    * tree-data-ref.h (DR_INNERMOST): New macro.
    * tree-if-conv.c (innermost_loop_behavior_hash): New class for
    hashing struct innermost_loop_behavior.
    (ref_DR_map): Remove.
    (innermost_DR_map): New map.
    (baseref_DR_map): Revise comment.
    (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR
    to innermost_DR_map accroding to its innermost loop behavior.
    (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map according
    to its innermost loop behavior.
    (if_convertible_loop_p_1): Remove intialization for ref_DR_map.
    Add initialization for innermost_DR_map.  Record memory reference
    in DR_BASE_ADDRESS if the reference is compound one or it doesn't
    have innermost loop behavior.
    (if_convertible_loop_p): Remove release for ref_DR_map.  Release
    innermost_DR_map.

gcc/testsuite/ChangeLog
2016-03-30  Bin Cheng  <bin.cheng@arm.com>

    PR tree-optimization/56625
    PR tree-optimization/69489
    * gcc.dg/vect/pr56625.c: New test.
    * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test.

[-- Attachment #2: pr69489-part1-20160330.txt --]
[-- Type: text/plain, Size: 7637 bytes --]

diff --git a/gcc/tree-data-ref.h b/gcc/tree-data-ref.h
index eb24d16..856dd58 100644
--- a/gcc/tree-data-ref.h
+++ b/gcc/tree-data-ref.h
@@ -144,6 +144,7 @@ struct data_reference
 #define DR_STEP(DR)                (DR)->innermost.step
 #define DR_PTR_INFO(DR)            (DR)->alias.ptr_info
 #define DR_ALIGNED_TO(DR)          (DR)->innermost.aligned_to
+#define DR_INNERMOST(DR)           (DR)->innermost
 
 typedef struct data_reference *data_reference_p;
 
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 9e305c7..884006c 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -114,16 +114,68 @@ along with GCC; see the file COPYING3.  If not see
 #include "builtins.h"
 #include "params.h"
 
+/* Hash for struct innermost_loop_behavior.  It depends on the user to
+   free the memory.  */
+
+struct innermost_loop_behavior_hash : nofree_ptr_hash <innermost_loop_behavior>
+{
+  static inline hashval_t hash (const value_type &);
+  static inline bool equal (const value_type &,
+			    const compare_type &);
+};
+
+inline hashval_t
+innermost_loop_behavior_hash::hash (const value_type &e)
+{
+  hashval_t hash;
+
+  hash = iterative_hash_expr (e->base_address, 0);
+  hash = iterative_hash_expr (e->offset, hash);
+  hash = iterative_hash_expr (e->init, hash);
+  return iterative_hash_expr (e->step, hash);
+}
+
+inline bool
+innermost_loop_behavior_hash::equal (const value_type &e1,
+				     const compare_type &e2)
+{
+  if ((e1->base_address && !e2->base_address)
+      || (!e1->base_address && e2->base_address)
+      || (!e1->offset && e2->offset)
+      || (e1->offset && !e2->offset)
+      || (!e1->init && e2->init)
+      || (e1->init && !e2->init)
+      || (!e1->step && e2->step)
+      || (e1->step && !e2->step))
+    return false;
+
+  if (e1->base_address && e2->base_address
+      && !operand_equal_p (e1->base_address, e2->base_address, 0))
+    return false;
+  if (e1->offset && e2->offset
+      && !operand_equal_p (e1->offset, e2->offset, 0))
+    return false;
+  if (e1->init && e2->init
+      && !operand_equal_p (e1->init, e2->init, 0))
+    return false;
+  if (e1->step && e2->step
+      && !operand_equal_p (e1->step, e2->step, 0))
+    return false;
+
+  return true;
+}
+
 /* List of basic blocks in if-conversion-suitable order.  */
 static basic_block *ifc_bbs;
 
 /* Apply more aggressive (extended) if-conversion if true.  */
 static bool aggressive_if_conv;
 
-/* Hash table to store references, DR pairs.  */
-static hash_map<tree_operand_hash, data_reference_p> *ref_DR_map;
+/* Hash table to store <DR's innermost loop behavior, DR> pairs.  */
+static hash_map<innermost_loop_behavior_hash,
+		data_reference_p> *innermost_DR_map;
 
-/* Hash table to store base reference, DR pairs.  */
+/* Hash table to store <base reference, DR> pairs.  */
 static hash_map<tree_operand_hash, data_reference_p> *baseref_DR_map;
 
 /* Structure used to predicate basic blocks.  This is attached to the
@@ -613,17 +665,12 @@ hash_memrefs_baserefs_and_store_DRs_read_written_info (data_reference_p a)
 {
 
   data_reference_p *master_dr, *base_master_dr;
-  tree ref = DR_REF (a);
   tree base_ref = DR_BASE_OBJECT (a);
+  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
   tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
   bool exist1, exist2;
 
-  while (TREE_CODE (ref) == COMPONENT_REF
-	 || TREE_CODE (ref) == IMAGPART_EXPR
-	 || TREE_CODE (ref) == REALPART_EXPR)
-    ref = TREE_OPERAND (ref, 0);
-
-  master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
+  master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
   if (!exist1)
     *master_dr = a;
 
@@ -685,21 +732,18 @@ ifcvt_memrefs_wont_trap (gimple *stmt, vec<data_reference_p> drs)
   data_reference_p *master_dr, *base_master_dr;
   data_reference_p a = drs[gimple_uid (stmt) - 1];
 
-  tree ref_base_a = DR_REF (a);
   tree base = DR_BASE_OBJECT (a);
+  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
 
   gcc_assert (DR_STMT (a) == stmt);
+  gcc_assert (DR_BASE_ADDRESS (a) || DR_OFFSET (a)
+              || DR_INIT (a) || DR_STEP (a));
 
-  while (TREE_CODE (ref_base_a) == COMPONENT_REF
-	 || TREE_CODE (ref_base_a) == IMAGPART_EXPR
-	 || TREE_CODE (ref_base_a) == REALPART_EXPR)
-    ref_base_a = TREE_OPERAND (ref_base_a, 0);
+  master_dr = innermost_DR_map->get (innermost);
+  gcc_assert (master_dr != NULL);
 
-  master_dr = ref_DR_map->get (ref_base_a);
   base_master_dr = baseref_DR_map->get (base);
 
-  gcc_assert (master_dr != NULL);
-
   /* If a is unconditionally written to it doesn't trap.  */
   if (DR_W_UNCONDITIONALLY (*master_dr))
     return true;
@@ -1228,13 +1272,16 @@ if_convertible_loop_p_1 (struct loop *loop,
 
   data_reference_p dr;
 
-  ref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
+  innermost_DR_map
+	  = new hash_map<innermost_loop_behavior_hash, data_reference_p>;
   baseref_DR_map = new hash_map<tree_operand_hash, data_reference_p>;
 
   predicate_bbs (loop);
 
   for (i = 0; refs->iterate (i, &dr); i++)
     {
+      tree ref = DR_REF (dr);
+
       dr->aux = XNEW (struct ifc_dr);
       DR_BASE_W_UNCONDITIONALLY (dr) = false;
       DR_RW_UNCONDITIONALLY (dr) = false;
@@ -1244,6 +1291,27 @@ if_convertible_loop_p_1 (struct loop *loop,
       IFC_DR (dr)->base_w_predicate = boolean_false_node;
       if (gimple_uid (DR_STMT (dr)) == 0)
 	gimple_set_uid (DR_STMT (dr), i + 1);
+
+      /* If DR doesn't have innermost loop behavior or it's a compound
+         memory reference, we synthesize its innermost loop behavior
+         for hashing.  */
+      if (TREE_CODE (ref) == COMPONENT_REF
+          || TREE_CODE (ref) == IMAGPART_EXPR
+          || TREE_CODE (ref) == REALPART_EXPR
+          || !(DR_BASE_ADDRESS (dr) || DR_OFFSET (dr)
+	       || DR_INIT (dr) || DR_STEP (dr)))
+        {
+          while (TREE_CODE (ref) == COMPONENT_REF
+	         || TREE_CODE (ref) == IMAGPART_EXPR
+	         || TREE_CODE (ref) == REALPART_EXPR)
+	    ref = TREE_OPERAND (ref, 0);
+
+          DR_BASE_ADDRESS (dr) = ref;
+          DR_OFFSET (dr) = NULL;
+          DR_INIT (dr) = NULL;
+          DR_STEP (dr) = NULL;
+          DR_ALIGNED_TO (dr) = NULL;
+        }
       hash_memrefs_baserefs_and_store_DRs_read_written_info (dr);
     }
 
@@ -1338,8 +1406,8 @@ if_convertible_loop_p (struct loop *loop, bool *any_mask_load_store)
 
   free_data_refs (refs);
 
-  delete ref_DR_map;
-  ref_DR_map = NULL;
+  delete innermost_DR_map;
+  innermost_DR_map = NULL;
 
   delete baseref_DR_map;
   baseref_DR_map = NULL;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
new file mode 100644
index 0000000..02a6731
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ifc-pr69489-1.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-ifcvt-stats" { target *-*-* } } */
+
+void foo (int a[], int b[])
+{
+  int i;
+  for (i = 0; i < 100; i++)
+    {
+      if (a[i] == 0)
+	a[i] = b[i]*4;
+      else
+	a[i] = b[i]*3;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "Applying if-conversion" 1 "ifcvt" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/pr56625.c b/gcc/testsuite/gcc.dg/vect/pr56625.c
new file mode 100644
index 0000000..b903be3
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr56625.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+
+void foo (int a[], int b[])
+{
+  int i;
+  for (i = 0; i < 100; i++)
+    {
+      if (a[i] == 0)
+	a[i] = b[i]*4;
+      else
+	a[i] = b[i]*3;
+    }
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
  2016-03-31 17:08             ` Bin.Cheng
@ 2016-04-04 13:07               ` Richard Biener
  2016-04-04 14:14                 ` Bin.Cheng
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2016-04-04 13:07 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: GCC Patches

On Thu, Mar 31, 2016 at 6:43 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, Mar 29, 2016 at 9:37 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> Sorry, Should have replied to gcc-patches list.
>>>
>>> Thanks,
>>> bin
>>>
>>> ---------- Forwarded message ----------
>>> From: "Bin.Cheng" <amker.cheng@gmail.com>
>>> Date: Tue, 29 Mar 2016 03:55:04 +0800
>>> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
>>> DR against its innermost loop bahavior if possible
>>> To: Richard Biener <richard.guenther@gmail.com>
>>>
>>> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote:
>>>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>
>>>>>> Hmm.
>>>>> Hi,
>>>>> Thanks for reviewing.
>>>>>>
>>>>>> +  equal_p = true;
>>>>>> +  if (e1->base_address && e2->base_address)
>>>>>> +    equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>>>>>> +  if (e1->offset && e2->offset)
>>>>>> +    equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>>>>>
>>>>>> surely better to return false early.
>>>>>>
>>>>>> I think we don't want this in tree-data-refs.h also because of ...
>>>>>>
>>>>>> @@ -615,15 +619,29 @@
>>>>>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>>>>>> (data_reference_p a)
>>>>>>    data_reference_p *master_dr, *base_master_dr;and REALPART) before
>>>>>> creating the DR (or adjust the equality function
>>>>> and hashing
>>>>>>    tree ref = DR_REF (a);
>>>>>>    tree base_ref = DR_BASE_OBJECT (a);
>>>>>> +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>>>>>>    tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>>>>>    bool exist1, exist2;
>>>>>>
>>>>>> -  while (TREE_CODE (ref) == COMPONENT_REF
>>>>>> -        || TREE_CODE (ref) == IMAGPART_EXPR
>>>>>> -        || TREE_CODE (ref) == REALPART_EXPR)
>>>>>> -    ref = TREE_OPERAND (ref, 0);
>>>>>> +  /* If reference in DR has innermost loop behavior and it is not
>>>>>> +     a compound memory reference, we store it to innermost_DR_map,
>>>>>> +     otherwise to ref_DR_map.  */
>>>>>> +  if (TREE_CODE (ref) == COMPONENT_REF
>>>>>> +      || TREE_CODE (ref) == IMAGPART_EXPR
>>>>>> +      || TREE_CODE (ref) == REALPART_EXPR
>>>>>> +      || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>>>>>> +          || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>>>>>> +    {
>>>>>> +      while (TREE_CODE (ref) == COMPONENT_REF
>>>>>> +            || TREE_CODE (ref) == IMAGPART_EXPR
>>>>>> +            || TREE_CODE (ref) == REALPART_EXPR)
>>>>>> +       ref = TREE_OPERAND (ref, 0);
>>>>>> +
>>>>>> +      master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
>>>>>> +    }
>>>>>> +  else
>>>>>> +    master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>>>>>>
>>>>>> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd
>>>>>> need to
>>>>>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>>>>>> and REALPART) before creating the DR (or adjust the equality function
>>>>>> and hashing
>>>>>> to disregard them which means subtracting their offset from DR_INIT.
>>>>> I am not sure if I understand correctly.  But for component reference,
>>>>> it is the base object that we want to record/track.  For example,
>>>>>
>>>>>   for (i = 0; i < N; i++) {
>>>>>     m = *data++;
>>>>>
>>>>>     m1 = p1->x - m;
>>>>>     m2 = p2->x + m;
>>>>>
>>>>>     p3->y = (m1 >= m2) ? p1->y : p2->y;
>>>>>
>>>>>     p1++;
>>>>>     p2++;
>>>>>     p3++;
>>>>>   }
>>>>> We want to infer that reads of p1/p2 in condition statement won't trap
>>>>> because there are unconditional reads of the structures, though the
>>>>> unconditional reads are actual of other sub-objects.  Here it is the
>>>>> invariant part of address that we want to track.
>>>>
>>>> Well, the variant parts - we want to strip invariant parts as far as we can
>>>> (offsetof (x) and offsetof (y))
>>>>
>>>>> Also illustrated by this example, we can't rely on data-ref analyzer
>>>>> here.  Because in gathering/scattering cases, the address could be not
>>>>> affine at all.
>>>>
>>>> Sure, but that's a different issue.
>>>>
>>>>>>
>>>>>> To adjust the references we collect you'd maybe could use a callback
>>>>>> to get_references_in_stmt
>>>>>> to adjust them.
>>>>>>
>>>>>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple
>>>>>> as
>>>>> Is this a part of the method you suggested above, or is it an
>>>>> alternative one?  If it's the latter, then I have below questions
>>>>> embedded.
>>>>
>>>> It is an alternative to adding a hook to get_references_in_stmt and
>>>> probably "easier".
>>>>
>>>>>>
>>>>>> Index: tree-if-conv.c
>>>>>> ===================================================================
>>>>>> --- tree-if-conv.c      (revision 234215)
>>>>>> +++ tree-if-conv.c      (working copy)
>>>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>>>>
>>>>>>    for (i = 0; refs->iterate (i, &dr); i++)
>>>>>>      {
>>>>>> +      tree *refp = &DR_REF (dr);
>>>>>> +      while ((TREE_CODE (*refp) == COMPONENT_REF
>>>>>> +             && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>>>>> +            || TREE_CODE (*refp) == IMAGPART_EXPR
>>>>>> +            || TREE_CODE (*refp) == REALPART_EXPR)
>>>>>> +       refp = &TREE_OPERAND (*refp, 0);
>>>>>> +      if (refp != &DR_REF (dr))
>>>>>> +       {
>>>>>> +         tree saved_base = *refp;
>>>>>> +         *refp = integer_zero_node;
>>>>>> +
>>>>>> +         if (DR_INIT (dr))
>>>>>> +           {
>>>>>> +             tree poffset;
>>>>>> +             int punsignedp, preversep, pvolatilep;
>>>>>> +             machine_mode pmode;
>>>>>> +             HOST_WIDE_INT pbitsize, pbitpos;
>>>>>> +             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos,
>>>>>> &poffset,
>>>>>> +                                  &pmode, &punsignedp, &preversep,
>>>>>> &pvolatilep,
>>>>>> +                                  false);
>>>>>> +             gcc_assert (poffset == NULL_TREE);
>>>>>> +
>>>>>> +             DR_INIT (dr)
>>>>>> +               = wide_int_to_tree (ssizetype,
>>>>>> +                                   wi::sub (DR_INIT (dr),
>>>>>> +                                            pbitpos / BITS_PER_UNIT));
>>>>>> +           }
>>>>>> +
>>>>>> +         *refp = saved_base;
>>>>>> +         DR_REF (dr) = *refp;
>>>>>> +       }
>>>>> Looks to me the code is trying to resolve difference between two (or
>>>>> more) component references, which is DR_INIT in the code.  But DR_INIT
>>>>> is not the only thing needs to be handled.  For a structure containing
>>>>> two sub-arrays, DR_OFFSET may be different too.
>>>>
>>>> Yes, but we can't say that if
>>>>
>>>>   a->a[i]
>>>>
>>>> doesn't trap that then
>>>>
>>>>   a->b[i]
>>>>
>>>> doesn't trap either.  We can only "strip" outermost
>>>> non-variable-offset components.
>>>>
>>>> But maybe I'm missing what example you are thinking of.
>>> Hmm, this was the case I meant.  What I don't understand is current
>>> code logic does infer trap information for a.b[i] from a.a[i].  Given
>>> below example:
>>> struct str
>>> {
>>>   int a[10];
>>>   int b[20];
>>>   char c;
>>> };
>>>
>>> void bar (struct str *);
>>> int foo (int x, int n)
>>> {
>>>   int i;
>>>   struct str s;
>>>   bar (&s);
>>>   for (i = 0; i < n; i++)
>>>     {
>>>       s.a[i] = s.b[i];
>>>       if (x > i)
>>>         s.b[i] = 0;
>>>     }
>>>   bar (&s);
>>>   return 0;
>>> }
>>> The loop is convertible because of below code in function
>>> ifcvt_memrefs_wont_trap:
>>>
>>>   /* If a is unconditionally accessed then ... */
>>>   if (DR_RW_UNCONDITIONALLY (*master_dr))
>>>     {
>>>       /* an unconditional read won't trap.  */
>>>       if (DR_IS_READ (a))
>>>         return true;
>>>
>>>       /* an unconditionaly write won't trap if the base is written
>>>          to unconditionally.  */
>>>       if (base_master_dr
>>>           && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
>>>         return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>>       else
>>>         {
>>>           /* or the base is know to be not readonly.  */
>>>           tree base_tree = get_base_address (DR_REF (a));
>>>           if (DECL_P (base_tree)
>>>               && decl_binds_to_current_def_p (base_tree)
>>>               && ! TREE_READONLY (base_tree))
>>>             return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>>         }
>>>     }
>>> It is the main object '&s' that is recorded in base_master_dr, so
>>> s.b[i] is considered not trap.  Even it's not, I suppose
>>> get_base_address will give same result?
>>
>> Well, for this case it sees that s.b[i] is read from so it can't be an
>> out-of-bound
>> access.  And s.a[i] is written to unconditionally so 's' cannot be a
>> readonly object.
>> With both pieces of information we can conclude that s.b[i] = 0 doesn't trap.
>
> Hi Richard,
> Attachment is the updated patch.  I made below changes wrto your
> review comments:
> 1) Moved hash class for innermost loop behavior from tree-data-ref.h
> to tree-if-conv.c.
>     I also removed check on innermost_loop_behavior.aligned_to which
> seems redundant to me because it's computed from offset and we have
> already checked equality for offset.
> 2) Replaced ref_DR_map with new map innermost_DR_map.
> 3) Post-processed DR in if_convertible_loop_p_1 for compound reference
> or references don't have innermost behavior.  This cleans up following
> code using DR information.
> 4) Added a vectorization test for PR56625.
>
> I didn't incorporate your proposed code because I think it handles
> COMPONENT_REF of structure array like struct_arr[i].x/struct_arr[i].y.

But the previous code already handled it, so not handling it would be
a regression.
Note that I think your patch will handle it as well given you hash innermost
behavior.

> Looks to me it is another ifcvt opportunity different from PR69489.
> Anyway, fix is easy, I can send another patch in GCC7.
>
> Bootstrap and test on x86_64/AArch64, so any comments on this version?

Looks good to me.

Richard.

> Thanks,
> bin
>
> 2016-03-30  Bin Cheng  <bin.cheng@arm.com>
>
>     PR tree-optimization/56625
>     PR tree-optimization/69489
>     * tree-data-ref.h (DR_INNERMOST): New macro.
>     * tree-if-conv.c (innermost_loop_behavior_hash): New class for
>     hashing struct innermost_loop_behavior.
>     (ref_DR_map): Remove.
>     (innermost_DR_map): New map.
>     (baseref_DR_map): Revise comment.
>     (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR
>     to innermost_DR_map accroding to its innermost loop behavior.
>     (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map according
>     to its innermost loop behavior.
>     (if_convertible_loop_p_1): Remove intialization for ref_DR_map.
>     Add initialization for innermost_DR_map.  Record memory reference
>     in DR_BASE_ADDRESS if the reference is compound one or it doesn't
>     have innermost loop behavior.
>     (if_convertible_loop_p): Remove release for ref_DR_map.  Release
>     innermost_DR_map.
>
> gcc/testsuite/ChangeLog
> 2016-03-30  Bin Cheng  <bin.cheng@arm.com>
>
>     PR tree-optimization/56625
>     PR tree-optimization/69489
>     * gcc.dg/vect/pr56625.c: New test.
>     * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
  2016-04-04 13:07               ` Richard Biener
@ 2016-04-04 14:14                 ` Bin.Cheng
  2016-04-05  7:23                   ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Bin.Cheng @ 2016-04-04 14:14 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Mon, Apr 4, 2016 at 2:07 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Thu, Mar 31, 2016 at 6:43 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Tue, Mar 29, 2016 at 9:37 AM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>> Sorry, Should have replied to gcc-patches list.
>>>>
>>>> Thanks,
>>>> bin
>>>>
>>>> ---------- Forwarded message ----------
>>>> From: "Bin.Cheng" <amker.cheng@gmail.com>
>>>> Date: Tue, 29 Mar 2016 03:55:04 +0800
>>>> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
>>>> DR against its innermost loop bahavior if possible
>>>> To: Richard Biener <richard.guenther@gmail.com>
>>>>
>>>> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>
>>>>> It is an alternative to adding a hook to get_references_in_stmt and
>>>>> probably "easier".
>>>>>
>>>>>>>
>>>>>>> Index: tree-if-conv.c
>>>>>>> ===================================================================
>>>>>>> --- tree-if-conv.c      (revision 234215)
>>>>>>> +++ tree-if-conv.c      (working copy)
>>>>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>>>>>
>>>>>>>    for (i = 0; refs->iterate (i, &dr); i++)
>>>>>>>      {
>>>>>>> +      tree *refp = &DR_REF (dr);
>>>>>>> +      while ((TREE_CODE (*refp) == COMPONENT_REF
>>>>>>> +             && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>>>>>> +            || TREE_CODE (*refp) == IMAGPART_EXPR
>>>>>>> +            || TREE_CODE (*refp) == REALPART_EXPR)
>>>>>>> +       refp = &TREE_OPERAND (*refp, 0);
>>>>>>> +      if (refp != &DR_REF (dr))
>>>>>>> +       {
>>>>>>> +         tree saved_base = *refp;
>>>>>>> +         *refp = integer_zero_node;
>>>>>>> +
>>>>>>> +         if (DR_INIT (dr))
>>>>>>> +           {
>>>>>>> +             tree poffset;
>>>>>>> +             int punsignedp, preversep, pvolatilep;
>>>>>>> +             machine_mode pmode;
>>>>>>> +             HOST_WIDE_INT pbitsize, pbitpos;
>>>>>>> +             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos,
>>>>>>> &poffset,
>>>>>>> +                                  &pmode, &punsignedp, &preversep,
>>>>>>> &pvolatilep,
>>>>>>> +                                  false);
>>>>>>> +             gcc_assert (poffset == NULL_TREE);
>>>>>>> +
>>>>>>> +             DR_INIT (dr)
>>>>>>> +               = wide_int_to_tree (ssizetype,
>>>>>>> +                                   wi::sub (DR_INIT (dr),
>>>>>>> +                                            pbitpos / BITS_PER_UNIT));
>>>>>>> +           }
>>>>>>> +
>>>>>>> +         *refp = saved_base;
>>>>>>> +         DR_REF (dr) = *refp;
>>>>>>> +       }
>>>>>> Looks to me the code is trying to resolve difference between two (or
>>>>>> more) component references, which is DR_INIT in the code.  But DR_INIT
>>>>>> is not the only thing needs to be handled.  For a structure containing
>>>>>> two sub-arrays, DR_OFFSET may be different too.
>>>>>
>>>>> Yes, but we can't say that if
>>>>>
>>>>>   a->a[i]
>>>>>
>>>>> doesn't trap that then
>>>>>
>>>>>   a->b[i]
>>>>>
>>>>> doesn't trap either.  We can only "strip" outermost
>>>>> non-variable-offset components.
>>>>>
>>>>> But maybe I'm missing what example you are thinking of.
>>>> Hmm, this was the case I meant.  What I don't understand is current
>>>> code logic does infer trap information for a.b[i] from a.a[i].  Given
>>>> below example:
>>>> struct str
>>>> {
>>>>   int a[10];
>>>>   int b[20];
>>>>   char c;
>>>> };
>>>>
>>>> void bar (struct str *);
>>>> int foo (int x, int n)
>>>> {
>>>>   int i;
>>>>   struct str s;
>>>>   bar (&s);
>>>>   for (i = 0; i < n; i++)
>>>>     {
>>>>       s.a[i] = s.b[i];
>>>>       if (x > i)
>>>>         s.b[i] = 0;
>>>>     }
>>>>   bar (&s);
>>>>   return 0;
>>>> }
>>>> The loop is convertible because of below code in function
>>>> ifcvt_memrefs_wont_trap:
>>>>
>>>>   /* If a is unconditionally accessed then ... */
>>>>   if (DR_RW_UNCONDITIONALLY (*master_dr))
>>>>     {
>>>>       /* an unconditional read won't trap.  */
>>>>       if (DR_IS_READ (a))
>>>>         return true;
>>>>
>>>>       /* an unconditionaly write won't trap if the base is written
>>>>          to unconditionally.  */
>>>>       if (base_master_dr
>>>>           && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
>>>>         return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>>>       else
>>>>         {
>>>>           /* or the base is know to be not readonly.  */
>>>>           tree base_tree = get_base_address (DR_REF (a));
>>>>           if (DECL_P (base_tree)
>>>>               && decl_binds_to_current_def_p (base_tree)
>>>>               && ! TREE_READONLY (base_tree))
>>>>             return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>>>         }
>>>>     }
>>>> It is the main object '&s' that is recorded in base_master_dr, so
>>>> s.b[i] is considered not trap.  Even it's not, I suppose
>>>> get_base_address will give same result?
>>>
>>> Well, for this case it sees that s.b[i] is read from so it can't be an
>>> out-of-bound
>>> access.  And s.a[i] is written to unconditionally so 's' cannot be a
>>> readonly object.
>>> With both pieces of information we can conclude that s.b[i] = 0 doesn't trap.
>>
>> Hi Richard,
>> Attachment is the updated patch.  I made below changes wrto your
>> review comments:
>> 1) Moved hash class for innermost loop behavior from tree-data-ref.h
>> to tree-if-conv.c.
>>     I also removed check on innermost_loop_behavior.aligned_to which
>> seems redundant to me because it's computed from offset and we have
>> already checked equality for offset.
>> 2) Replaced ref_DR_map with new map innermost_DR_map.
>> 3) Post-processed DR in if_convertible_loop_p_1 for compound reference
>> or references don't have innermost behavior.  This cleans up following
>> code using DR information.
>> 4) Added a vectorization test for PR56625.
>>
>> I didn't incorporate your proposed code because I think it handles
>> COMPONENT_REF of structure array like struct_arr[i].x/struct_arr[i].y.
>
> But the previous code already handled it, so not handling it would be
> a regression.
> Note that I think your patch will handle it as well given you hash innermost
> behavior.
If I understand correctly, your code is more precise handling
component reference by stripping const offset from innermost loop
behavior.  Currently tree if-conv just strips component ref for
structure field away.  Yes my patch can handle the case, but that's
done by falling back to the reference itself (the existing code
logic), rather than tunning innermost loop behavior as you suggested:

+          while (TREE_CODE (ref) == COMPONENT_REF
+             || TREE_CODE (ref) == IMAGPART_EXPR
+             || TREE_CODE (ref) == REALPART_EXPR)
+        ref = TREE_OPERAND (ref, 0);
+
+          DR_BASE_ADDRESS (dr) = ref;
+          DR_OFFSET (dr) = NULL;
+          DR_INIT (dr) = NULL;
+          DR_STEP (dr) = NULL;
+          DR_ALIGNED_TO (dr) = NULL;

I think innermost loop behavior is unnecessary here for component
refs, so is there an example showing possible exception?

I will re-base/apply the patch after entering Stage1.

Thanks,
bin
>
>> Looks to me it is another ifcvt opportunity different from PR69489.
>> Anyway, fix is easy, I can send another patch in GCC7.
>>
>> Bootstrap and test on x86_64/AArch64, so any comments on this version?
>
> Looks good to me.
>
> Richard.
>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
  2016-04-04 14:14                 ` Bin.Cheng
@ 2016-04-05  7:23                   ` Richard Biener
  0 siblings, 0 replies; 10+ messages in thread
From: Richard Biener @ 2016-04-05  7:23 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: GCC Patches

On Mon, Apr 4, 2016 at 4:14 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Mon, Apr 4, 2016 at 2:07 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Thu, Mar 31, 2016 at 6:43 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Tue, Mar 29, 2016 at 9:37 AM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Mon, Mar 28, 2016 at 9:57 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>>> Sorry, Should have replied to gcc-patches list.
>>>>>
>>>>> Thanks,
>>>>> bin
>>>>>
>>>>> ---------- Forwarded message ----------
>>>>> From: "Bin.Cheng" <amker.cheng@gmail.com>
>>>>> Date: Tue, 29 Mar 2016 03:55:04 +0800
>>>>> Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking
>>>>> DR against its innermost loop bahavior if possible
>>>>> To: Richard Biener <richard.guenther@gmail.com>
>>>>>
>>>>> On 3/17/16, Richard Biener <richard.guenther@gmail.com> wrote:
>>>>>> On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>>>>> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>>>>>>> <richard.guenther@gmail.com> wrote:
>>>>>>
>>>>>> It is an alternative to adding a hook to get_references_in_stmt and
>>>>>> probably "easier".
>>>>>>
>>>>>>>>
>>>>>>>> Index: tree-if-conv.c
>>>>>>>> ===================================================================
>>>>>>>> --- tree-if-conv.c      (revision 234215)
>>>>>>>> +++ tree-if-conv.c      (working copy)
>>>>>>>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>>>>>>>
>>>>>>>>    for (i = 0; refs->iterate (i, &dr); i++)
>>>>>>>>      {
>>>>>>>> +      tree *refp = &DR_REF (dr);
>>>>>>>> +      while ((TREE_CODE (*refp) == COMPONENT_REF
>>>>>>>> +             && TREE_OPERAND (*refp, 2) == NULL_TREE)
>>>>>>>> +            || TREE_CODE (*refp) == IMAGPART_EXPR
>>>>>>>> +            || TREE_CODE (*refp) == REALPART_EXPR)
>>>>>>>> +       refp = &TREE_OPERAND (*refp, 0);
>>>>>>>> +      if (refp != &DR_REF (dr))
>>>>>>>> +       {
>>>>>>>> +         tree saved_base = *refp;
>>>>>>>> +         *refp = integer_zero_node;
>>>>>>>> +
>>>>>>>> +         if (DR_INIT (dr))
>>>>>>>> +           {
>>>>>>>> +             tree poffset;
>>>>>>>> +             int punsignedp, preversep, pvolatilep;
>>>>>>>> +             machine_mode pmode;
>>>>>>>> +             HOST_WIDE_INT pbitsize, pbitpos;
>>>>>>>> +             get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos,
>>>>>>>> &poffset,
>>>>>>>> +                                  &pmode, &punsignedp, &preversep,
>>>>>>>> &pvolatilep,
>>>>>>>> +                                  false);
>>>>>>>> +             gcc_assert (poffset == NULL_TREE);
>>>>>>>> +
>>>>>>>> +             DR_INIT (dr)
>>>>>>>> +               = wide_int_to_tree (ssizetype,
>>>>>>>> +                                   wi::sub (DR_INIT (dr),
>>>>>>>> +                                            pbitpos / BITS_PER_UNIT));
>>>>>>>> +           }
>>>>>>>> +
>>>>>>>> +         *refp = saved_base;
>>>>>>>> +         DR_REF (dr) = *refp;
>>>>>>>> +       }
>>>>>>> Looks to me the code is trying to resolve difference between two (or
>>>>>>> more) component references, which is DR_INIT in the code.  But DR_INIT
>>>>>>> is not the only thing needs to be handled.  For a structure containing
>>>>>>> two sub-arrays, DR_OFFSET may be different too.
>>>>>>
>>>>>> Yes, but we can't say that if
>>>>>>
>>>>>>   a->a[i]
>>>>>>
>>>>>> doesn't trap that then
>>>>>>
>>>>>>   a->b[i]
>>>>>>
>>>>>> doesn't trap either.  We can only "strip" outermost
>>>>>> non-variable-offset components.
>>>>>>
>>>>>> But maybe I'm missing what example you are thinking of.
>>>>> Hmm, this was the case I meant.  What I don't understand is current
>>>>> code logic does infer trap information for a.b[i] from a.a[i].  Given
>>>>> below example:
>>>>> struct str
>>>>> {
>>>>>   int a[10];
>>>>>   int b[20];
>>>>>   char c;
>>>>> };
>>>>>
>>>>> void bar (struct str *);
>>>>> int foo (int x, int n)
>>>>> {
>>>>>   int i;
>>>>>   struct str s;
>>>>>   bar (&s);
>>>>>   for (i = 0; i < n; i++)
>>>>>     {
>>>>>       s.a[i] = s.b[i];
>>>>>       if (x > i)
>>>>>         s.b[i] = 0;
>>>>>     }
>>>>>   bar (&s);
>>>>>   return 0;
>>>>> }
>>>>> The loop is convertible because of below code in function
>>>>> ifcvt_memrefs_wont_trap:
>>>>>
>>>>>   /* If a is unconditionally accessed then ... */
>>>>>   if (DR_RW_UNCONDITIONALLY (*master_dr))
>>>>>     {
>>>>>       /* an unconditional read won't trap.  */
>>>>>       if (DR_IS_READ (a))
>>>>>         return true;
>>>>>
>>>>>       /* an unconditionaly write won't trap if the base is written
>>>>>          to unconditionally.  */
>>>>>       if (base_master_dr
>>>>>           && DR_BASE_W_UNCONDITIONALLY (*base_master_dr))
>>>>>         return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>>>>       else
>>>>>         {
>>>>>           /* or the base is know to be not readonly.  */
>>>>>           tree base_tree = get_base_address (DR_REF (a));
>>>>>           if (DECL_P (base_tree)
>>>>>               && decl_binds_to_current_def_p (base_tree)
>>>>>               && ! TREE_READONLY (base_tree))
>>>>>             return PARAM_VALUE (PARAM_ALLOW_STORE_DATA_RACES);
>>>>>         }
>>>>>     }
>>>>> It is the main object '&s' that is recorded in base_master_dr, so
>>>>> s.b[i] is considered not trap.  Even it's not, I suppose
>>>>> get_base_address will give same result?
>>>>
>>>> Well, for this case it sees that s.b[i] is read from so it can't be an
>>>> out-of-bound
>>>> access.  And s.a[i] is written to unconditionally so 's' cannot be a
>>>> readonly object.
>>>> With both pieces of information we can conclude that s.b[i] = 0 doesn't trap.
>>>
>>> Hi Richard,
>>> Attachment is the updated patch.  I made below changes wrto your
>>> review comments:
>>> 1) Moved hash class for innermost loop behavior from tree-data-ref.h
>>> to tree-if-conv.c.
>>>     I also removed check on innermost_loop_behavior.aligned_to which
>>> seems redundant to me because it's computed from offset and we have
>>> already checked equality for offset.
>>> 2) Replaced ref_DR_map with new map innermost_DR_map.
>>> 3) Post-processed DR in if_convertible_loop_p_1 for compound reference
>>> or references don't have innermost behavior.  This cleans up following
>>> code using DR information.
>>> 4) Added a vectorization test for PR56625.
>>>
>>> I didn't incorporate your proposed code because I think it handles
>>> COMPONENT_REF of structure array like struct_arr[i].x/struct_arr[i].y.
>>
>> But the previous code already handled it, so not handling it would be
>> a regression.
>> Note that I think your patch will handle it as well given you hash innermost
>> behavior.
> If I understand correctly, your code is more precise handling
> component reference by stripping const offset from innermost loop
> behavior.  Currently tree if-conv just strips component ref for
> structure field away.  Yes my patch can handle the case, but that's
> done by falling back to the reference itself (the existing code
> logic), rather than tunning innermost loop behavior as you suggested:

Ah, indeed.

> +          while (TREE_CODE (ref) == COMPONENT_REF
> +             || TREE_CODE (ref) == IMAGPART_EXPR
> +             || TREE_CODE (ref) == REALPART_EXPR)
> +        ref = TREE_OPERAND (ref, 0);
> +
> +          DR_BASE_ADDRESS (dr) = ref;
> +          DR_OFFSET (dr) = NULL;
> +          DR_INIT (dr) = NULL;
> +          DR_STEP (dr) = NULL;
> +          DR_ALIGNED_TO (dr) = NULL;
>
> I think innermost loop behavior is unnecessary here for component
> refs, so is there an example showing possible exception?

Well, we don't need the outer component refs but DR analyze included
them in its analysis (and thus DR_INIT will differ).  For cases DR analyze
failed on that doesn't matter of course.

> I will re-base/apply the patch after entering Stage1.

Thanks,
Richard.

> Thanks,
> bin
>>
>>> Looks to me it is another ifcvt opportunity different from PR69489.
>>> Anyway, fix is easy, I can send another patch in GCC7.
>>>
>>> Bootstrap and test on x86_64/AArch64, so any comments on this version?
>>
>> Looks good to me.
>>
>> Richard.
>>

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2016-04-05  7:23 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-03-16 10:00 [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible Bin Cheng
2016-03-16 12:20 ` Richard Biener
2016-03-16 16:17   ` Bin.Cheng
2016-03-17  9:53     ` Richard Biener
     [not found]       ` <CAHFci29CHpOBxah4QxrL_JOL6p_NC=r3-e3eoNjHp_4z749PnQ@mail.gmail.com>
2016-03-28 22:04         ` Fwd: " Bin.Cheng
2016-03-29  8:44           ` Richard Biener
2016-03-31 17:08             ` Bin.Cheng
2016-04-04 13:07               ` Richard Biener
2016-04-04 14:14                 ` Bin.Cheng
2016-04-05  7:23                   ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).