public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
@ 2013-11-25 10:43 bin.cheng
  2013-11-25 21:21 ` Jeff Law
  0 siblings, 1 reply; 12+ messages in thread
From: bin.cheng @ 2013-11-25 10:43 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2224 bytes --]

Hi,
I previously committed two patches lowering complex address expression for
IVOPT at http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00546.html and
http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01103.html
When I bootstrapping GCC I found there were some peculiar cases like
&MEM[ptr+CST] + xxxx, which should be handled too.  This patch consists
below two changes:

1) change in alloc_iv:
Original code lowers top level complex address expressions like
&MEM[ptr+off].  The patch relaxes check condition in order to lower
expressions like &MEM[ptr+off] + xxx, just as the BASE from below dump:
use 2
  generic
  in statement _595 = &MEM[(void *)&this_prg + 36B] + _594;

  at position 
  type struct gcov_bucket_type *
  base (struct gcov_bucket_type *) &MEM[(void *)&this_prg + 36B] +
(sizetype) ((unsigned int) (src_i_683 + -1) * 20)
  step 4294967276
  base object (void *) &this_prg
  related candidates  

2) change in tree_to_aff_combination:
The function get_inner_reference returns "&MEM[ptr+off]" as the core for
input like the memory ADDRESS in below dump:
use 2
  address
  in statement _59 = MEM[(const struct gcov_ctr_summary *)summary_22(D) +
4B].histogram[h_ix_111].min_value;

  at position MEM[(const struct gcov_ctr_summary *)summary_22(D) +
4B].histogram[h_ix_111].min_value
  type const gcov_type *
  base (const gcov_type *) &MEM[(const struct gcov_ctr_summary
*)summary_22(D) + 4B] + 36
  step 20
  base object (void *) summary_22(D)
  related candidates 

Which can be further reduced into something like "summary_22(D) + 40B".
This change is necessary for the first one, because I am using
tree_to_aff_combination rather than get_inner_reference_aff now.

Bootstrap and test on x86/x86_64/arm.  Is it OK?

Thanks.
bin

2013-11-25  Bin Cheng  <bin.cheng@arm.com>

	* tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
	(alloc_iv): Lower more cases by calling	contain_complex_addr_expr
	and tree_to_aff_combination.
	* tree-affine.c (tree_to_aff_combination): Handle &MEM[ptr+CST]
	in core part of complex reference.

gcc/testsuite/ChangeLog
2013-11-25  Bin Cheng  <bin.cheng@arm.com>

	* gcc.dg/tree-ssa/ivopts-lower_base.c: New test.

[-- Attachment #2: ivopt-lower-more-addr_expr-20131121.txt --]
[-- Type: text/plain, Size: 4126 bytes --]

Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-lower_base.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopts-lower_base.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-lower_base.c	(revision 0)
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+
+struct tag1
+{
+    int x[100];
+    int y;
+};
+
+struct tag2
+{
+    int a;
+    struct tag1 t1[100];
+    int b;
+};
+
+int
+foo (struct tag2 *t2, int len)
+{
+  int i = 0;
+  for (i = 0; i < len; i++)
+    {
+      (*(struct tag2*)((char *)t2+4)).t1[i].x[len] = len;
+    }
+
+  return 0;
+}
+/* { dg-final { scan-tree-dump-not "base .*&MEM\\\[" "ivopts" } }  */
+/* { dg-final { cleanup-tree-dump "ivopts" } }  */
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 205087)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -924,13 +924,40 @@ determine_base_object (tree expr)
     }
 }
 
+/* Return true if complex address expression appears in EXPR.  */
+
+static bool
+contain_complex_addr_expr (tree expr)
+{
+  bool res = false;
+
+  STRIP_NOPS (expr);
+  switch (TREE_CODE (expr))
+    {
+    case POINTER_PLUS_EXPR:
+    case PLUS_EXPR:
+    case MINUS_EXPR:
+      res |= contain_complex_addr_expr (TREE_OPERAND (expr, 0));
+      res |= contain_complex_addr_expr (TREE_OPERAND (expr, 1));
+      break;
+
+    case ADDR_EXPR:
+      return (!DECL_P (TREE_OPERAND (expr, 0)));
+
+    default:
+      return false;
+    }
+
+  return res;
+}
+
 /* Allocates an induction variable with given initial value BASE and step STEP
    for loop LOOP.  */
 
 static struct iv *
 alloc_iv (tree base, tree step)
 {
-  tree base_object = base;
+  tree expr = base;
   struct iv *iv = XCNEW (struct iv);
   gcc_assert (step != NULL_TREE);
 
@@ -939,21 +966,17 @@ alloc_iv (tree base, tree step)
        1) More accurate cost can be computed for address expressions;
        2) Duplicate candidates won't be created for bases in different
           forms, like &a[0] and &a.  */
-  STRIP_NOPS (base_object);
-  if (TREE_CODE (base_object) == ADDR_EXPR
-      && !DECL_P (TREE_OPERAND (base_object, 0)))
+  STRIP_NOPS (expr);
+  if ((TREE_CODE (expr) == ADDR_EXPR && !DECL_P (TREE_OPERAND (expr, 0)))
+      || contain_complex_addr_expr (expr))
     {
       aff_tree comb;
-      double_int size;
-      base_object = get_inner_reference_aff (TREE_OPERAND (base_object, 0),
-					     &comb, &size);
-      gcc_assert (base_object != NULL_TREE);
-      base_object = build_fold_addr_expr (base_object);
+      tree_to_aff_combination (expr, TREE_TYPE (base), &comb);
       base = fold_convert (TREE_TYPE (base), aff_combination_to_tree (&comb));
     }
 
   iv->base = base;
-  iv->base_object = determine_base_object (base_object);
+  iv->base_object = determine_base_object (base);
   iv->step = step;
   iv->biv_p = false;
   iv->have_use_for = false;
Index: gcc/tree-affine.c
===================================================================
--- gcc/tree-affine.c	(revision 205087)
+++ gcc/tree-affine.c	(working copy)
@@ -328,7 +328,19 @@ tree_to_aff_combination (tree expr, tree type, aff
 			     double_int::from_uhwi (bitpos / BITS_PER_UNIT));
       core = build_fold_addr_expr (core);
       if (TREE_CODE (core) == ADDR_EXPR)
-	aff_combination_add_elt (comb, core, double_int_one);
+	{
+	  /* Handle &MEM[ptr + CST] in core part of complex reference.  */
+	  if (TREE_CODE (TREE_OPERAND (core, 0)) == MEM_REF)
+	    {
+	      core = TREE_OPERAND (core, 0);
+	      tree_to_aff_combination (TREE_OPERAND (core, 0), type, &tmp);
+	      aff_combination_add (comb, &tmp);
+	      tree_to_aff_combination (TREE_OPERAND (core, 1), sizetype, &tmp);
+	      aff_combination_add (comb, &tmp);
+	    }
+	  else
+	    aff_combination_add_elt (comb, core, double_int_one);
+	}
       else
 	{
 	  tree_to_aff_combination (core, type, &tmp);

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
  2013-11-25 10:43 [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c bin.cheng
@ 2013-11-25 21:21 ` Jeff Law
  2013-12-06 10:19   ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Jeff Law @ 2013-11-25 21:21 UTC (permalink / raw)
  To: bin.cheng, gcc-patches

On 11/25/13 02:22, bin.cheng wrote:
> Hi,
> I previously committed two patches lowering complex address expression for
> IVOPT at http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00546.html and
> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01103.html
> When I bootstrapping GCC I found there were some peculiar cases like
> &MEM[ptr+CST] + xxxx, which should be handled too.  This patch consists
> below two changes:
>
> 1) change in alloc_iv:
> Original code lowers top level complex address expressions like
> &MEM[ptr+off].  The patch relaxes check condition in order to lower
> expressions like &MEM[ptr+off] + xxx, just as the BASE from below dump:
> use 2
>    generic
>    in statement _595 = &MEM[(void *)&this_prg + 36B] + _594;
>
>    at position
>    type struct gcov_bucket_type *
>    base (struct gcov_bucket_type *) &MEM[(void *)&this_prg + 36B] +
> (sizetype) ((unsigned int) (src_i_683 + -1) * 20)
>    step 4294967276
>    base object (void *) &this_prg
>    related candidates
>
> 2) change in tree_to_aff_combination:
> The function get_inner_reference returns "&MEM[ptr+off]" as the core for
> input like the memory ADDRESS in below dump:
> use 2
>    address
>    in statement _59 = MEM[(const struct gcov_ctr_summary *)summary_22(D) +
> 4B].histogram[h_ix_111].min_value;
>
>    at position MEM[(const struct gcov_ctr_summary *)summary_22(D) +
> 4B].histogram[h_ix_111].min_value
>    type const gcov_type *
>    base (const gcov_type *) &MEM[(const struct gcov_ctr_summary
> *)summary_22(D) + 4B] + 36
>    step 20
>    base object (void *) summary_22(D)
>    related candidates
>
> Which can be further reduced into something like "summary_22(D) + 40B".
> This change is necessary for the first one, because I am using
> tree_to_aff_combination rather than get_inner_reference_aff now.
>
> Bootstrap and test on x86/x86_64/arm.  Is it OK?
>
> Thanks.
> bin
>
> 2013-11-25  Bin Cheng  <bin.cheng@arm.com>
>
> 	* tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
> 	(alloc_iv): Lower more cases by calling	contain_complex_addr_expr
> 	and tree_to_aff_combination.
> 	* tree-affine.c (tree_to_aff_combination): Handle &MEM[ptr+CST]
> 	in core part of complex reference.
>
> gcc/testsuite/ChangeLog
> 2013-11-25  Bin Cheng  <bin.cheng@arm.com>
>
> 	* gcc.dg/tree-ssa/ivopts-lower_base.c: New test.
Unless there's a PR for this problem, I think this needs to wait.

jeff


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
  2013-11-25 21:21 ` Jeff Law
@ 2013-12-06 10:19   ` Richard Biener
  2013-12-06 10:40     ` Bin.Cheng
  2014-05-06  8:39     ` Bin.Cheng
  0 siblings, 2 replies; 12+ messages in thread
From: Richard Biener @ 2013-12-06 10:19 UTC (permalink / raw)
  To: Jeff Law; +Cc: bin.cheng, GCC Patches

On Mon, Nov 25, 2013 at 7:41 PM, Jeff Law <law@redhat.com> wrote:
> On 11/25/13 02:22, bin.cheng wrote:
>>
>> Hi,
>> I previously committed two patches lowering complex address expression for
>> IVOPT at http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00546.html and
>> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01103.html
>> When I bootstrapping GCC I found there were some peculiar cases like
>> &MEM[ptr+CST] + xxxx, which should be handled too.  This patch consists
>> below two changes:
>>
>> 1) change in alloc_iv:
>> Original code lowers top level complex address expressions like
>> &MEM[ptr+off].  The patch relaxes check condition in order to lower
>> expressions like &MEM[ptr+off] + xxx, just as the BASE from below dump:
>> use 2
>>    generic
>>    in statement _595 = &MEM[(void *)&this_prg + 36B] + _594;
>>
>>    at position
>>    type struct gcov_bucket_type *
>>    base (struct gcov_bucket_type *) &MEM[(void *)&this_prg + 36B] +
>> (sizetype) ((unsigned int) (src_i_683 + -1) * 20)
>>    step 4294967276
>>    base object (void *) &this_prg
>>    related candidates
>>
>> 2) change in tree_to_aff_combination:
>> The function get_inner_reference returns "&MEM[ptr+off]" as the core for
>> input like the memory ADDRESS in below dump:
>> use 2
>>    address
>>    in statement _59 = MEM[(const struct gcov_ctr_summary *)summary_22(D) +
>> 4B].histogram[h_ix_111].min_value;
>>
>>    at position MEM[(const struct gcov_ctr_summary *)summary_22(D) +
>> 4B].histogram[h_ix_111].min_value
>>    type const gcov_type *
>>    base (const gcov_type *) &MEM[(const struct gcov_ctr_summary
>> *)summary_22(D) + 4B] + 36
>>    step 20
>>    base object (void *) summary_22(D)
>>    related candidates
>>
>> Which can be further reduced into something like "summary_22(D) + 40B".
>> This change is necessary for the first one, because I am using
>> tree_to_aff_combination rather than get_inner_reference_aff now.
>>
>> Bootstrap and test on x86/x86_64/arm.  Is it OK?
>>
>> Thanks.
>> bin
>>
>> 2013-11-25  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
>>         (alloc_iv): Lower more cases by calling contain_complex_addr_expr
>>         and tree_to_aff_combination.
>>         * tree-affine.c (tree_to_aff_combination): Handle &MEM[ptr+CST]
>>         in core part of complex reference.
>>
>> gcc/testsuite/ChangeLog
>> 2013-11-25  Bin Cheng  <bin.cheng@arm.com>
>>
>>         * gcc.dg/tree-ssa/ivopts-lower_base.c: New test.
>
> Unless there's a PR for this problem, I think this needs to wait.

I agree.  Btw, please split the patch.

Index: gcc/tree-affine.c
===================================================================
--- gcc/tree-affine.c    (revision 205087)
+++ gcc/tree-affine.c    (working copy)
@@ -328,7 +328,19 @@ tree_to_aff_combination (tree expr, tree type, aff
                  double_int::from_uhwi (bitpos / BITS_PER_UNIT));
       core = build_fold_addr_expr (core);
       if (TREE_CODE (core) == ADDR_EXPR)
-    aff_combination_add_elt (comb, core, double_int_one);
+    {
+      /* Handle &MEM[ptr + CST] in core part of complex reference.  */
+      if (TREE_CODE (TREE_OPERAND (core, 0)) == MEM_REF)
+        {
+          core = TREE_OPERAND (core, 0);
+          tree_to_aff_combination (TREE_OPERAND (core, 0), type, &tmp);
+          aff_combination_add (comb, &tmp);
+          tree_to_aff_combination (TREE_OPERAND (core, 1), sizetype, &tmp);
+          aff_combination_add (comb, &tmp);
+        }
+      else
+        aff_combination_add_elt (comb, core, double_int_one);
+    }
       else
     {
       tree_to_aff_combination (core, type, &tmp)

please handle the offset before taking the address, thus

  if (TREE_CODE (core) == MEM_REF)
    {
       add constant offset;
       core = TREE_OPERAND (core, 0);
    }
  else
    core = build_fold_addr_expr (core);

that simplifies things and avoids the address building.

Richard.

> jeff
>
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
  2013-12-06 10:19   ` Richard Biener
@ 2013-12-06 10:40     ` Bin.Cheng
  2013-12-06 11:20       ` Richard Biener
  2014-05-06  8:39     ` Bin.Cheng
  1 sibling, 1 reply; 12+ messages in thread
From: Bin.Cheng @ 2013-12-06 10:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, bin.cheng, GCC Patches

On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Mon, Nov 25, 2013 at 7:41 PM, Jeff Law <law@redhat.com> wrote:
>> On 11/25/13 02:22, bin.cheng wrote:
>>
>> Unless there's a PR for this problem, I think this needs to wait.
>
> I agree.  Btw, please split the patch.
Yes, I will get back to this after entering stage 1 again :)

Hi Richard,
I talked with you about clean strip_offset_1 up after this series of
base simplification patches, but I realized it's not safe because
affine facility has it's limit, like can only support 8 elements.
Though the cleanup passes bootstrap and test on x86/x86_64 and most of
codes in strip_offset_1 won't be executed usually, I guess we'd better
to live with it, so what do you think?

Thanks,
bin

>
> Index: gcc/tree-affine.c
> ===================================================================
> --- gcc/tree-affine.c    (revision 205087)
> +++ gcc/tree-affine.c    (working copy)
> @@ -328,7 +328,19 @@ tree_to_aff_combination (tree expr, tree type, aff
>                   double_int::from_uhwi (bitpos / BITS_PER_UNIT));
>        core = build_fold_addr_expr (core);
>        if (TREE_CODE (core) == ADDR_EXPR)
> -    aff_combination_add_elt (comb, core, double_int_one);
> +    {
> +      /* Handle &MEM[ptr + CST] in core part of complex reference.  */
> +      if (TREE_CODE (TREE_OPERAND (core, 0)) == MEM_REF)
> +        {
> +          core = TREE_OPERAND (core, 0);
> +          tree_to_aff_combination (TREE_OPERAND (core, 0), type, &tmp);
> +          aff_combination_add (comb, &tmp);
> +          tree_to_aff_combination (TREE_OPERAND (core, 1), sizetype, &tmp);
> +          aff_combination_add (comb, &tmp);
> +        }
> +      else
> +        aff_combination_add_elt (comb, core, double_int_one);
> +    }
>        else
>      {
>        tree_to_aff_combination (core, type, &tmp)
>
> please handle the offset before taking the address, thus
>
>   if (TREE_CODE (core) == MEM_REF)
>     {
>        add constant offset;
>        core = TREE_OPERAND (core, 0);
>     }
>   else
>     core = build_fold_addr_expr (core);
>
> that simplifies things and avoids the address building.
>
> Richard.
>
>> jeff
>>
>>



-- 
Best Regards.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
  2013-12-06 10:40     ` Bin.Cheng
@ 2013-12-06 11:20       ` Richard Biener
  2013-12-06 12:04         ` Bin.Cheng
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2013-12-06 11:20 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Jeff Law, bin.cheng, GCC Patches

On Fri, Dec 6, 2013 at 11:40 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Mon, Nov 25, 2013 at 7:41 PM, Jeff Law <law@redhat.com> wrote:
>>> On 11/25/13 02:22, bin.cheng wrote:
>>>
>>> Unless there's a PR for this problem, I think this needs to wait.
>>
>> I agree.  Btw, please split the patch.
> Yes, I will get back to this after entering stage 1 again :)
>
> Hi Richard,
> I talked with you about clean strip_offset_1 up after this series of
> base simplification patches, but I realized it's not safe because
> affine facility has it's limit, like can only support 8 elements.
> Though the cleanup passes bootstrap and test on x86/x86_64 and most of
> codes in strip_offset_1 won't be executed usually, I guess we'd better
> to live with it, so what do you think?

Not sure - I'm lacking some context here ;)  If you have a cleanup patch
fine - WRT the affine limit of 8 elements, further elements will just
add to the rest tree.  This is to limit compile-time.

Richard.

> Thanks,
> bin
>
>>
>> Index: gcc/tree-affine.c
>> ===================================================================
>> --- gcc/tree-affine.c    (revision 205087)
>> +++ gcc/tree-affine.c    (working copy)
>> @@ -328,7 +328,19 @@ tree_to_aff_combination (tree expr, tree type, aff
>>                   double_int::from_uhwi (bitpos / BITS_PER_UNIT));
>>        core = build_fold_addr_expr (core);
>>        if (TREE_CODE (core) == ADDR_EXPR)
>> -    aff_combination_add_elt (comb, core, double_int_one);
>> +    {
>> +      /* Handle &MEM[ptr + CST] in core part of complex reference.  */
>> +      if (TREE_CODE (TREE_OPERAND (core, 0)) == MEM_REF)
>> +        {
>> +          core = TREE_OPERAND (core, 0);
>> +          tree_to_aff_combination (TREE_OPERAND (core, 0), type, &tmp);
>> +          aff_combination_add (comb, &tmp);
>> +          tree_to_aff_combination (TREE_OPERAND (core, 1), sizetype, &tmp);
>> +          aff_combination_add (comb, &tmp);
>> +        }
>> +      else
>> +        aff_combination_add_elt (comb, core, double_int_one);
>> +    }
>>        else
>>      {
>>        tree_to_aff_combination (core, type, &tmp)
>>
>> please handle the offset before taking the address, thus
>>
>>   if (TREE_CODE (core) == MEM_REF)
>>     {
>>        add constant offset;
>>        core = TREE_OPERAND (core, 0);
>>     }
>>   else
>>     core = build_fold_addr_expr (core);
>>
>> that simplifies things and avoids the address building.
>>
>> Richard.
>>
>>> jeff
>>>
>>>
>
>
>
> --
> Best Regards.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
  2013-12-06 11:20       ` Richard Biener
@ 2013-12-06 12:04         ` Bin.Cheng
  0 siblings, 0 replies; 12+ messages in thread
From: Bin.Cheng @ 2013-12-06 12:04 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, bin.cheng, GCC Patches

On Fri, Dec 6, 2013 at 7:20 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 11:40 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Mon, Nov 25, 2013 at 7:41 PM, Jeff Law <law@redhat.com> wrote:
>>>> On 11/25/13 02:22, bin.cheng wrote:
>>>>
>>>> Unless there's a PR for this problem, I think this needs to wait.
>>>
>>> I agree.  Btw, please split the patch.
>> Yes, I will get back to this after entering stage 1 again :)
>>
>> Hi Richard,
>> I talked with you about clean strip_offset_1 up after this series of
>> base simplification patches, but I realized it's not safe because
>> affine facility has it's limit, like can only support 8 elements.
>> Though the cleanup passes bootstrap and test on x86/x86_64 and most of
>> codes in strip_offset_1 won't be executed usually, I guess we'd better
>> to live with it, so what do you think?
>
> Not sure - I'm lacking some context here ;)  If you have a cleanup patch
> fine - WRT the affine limit of 8 elements, further elements will just
> add to the rest tree.  This is to limit compile-time.
Yes, so it's possible to have COMPONENT_REF stored in rest tree if we
reach the 8 elements limit.  In this case the COMPONENT_REF will be
visited by the code in strip_offset_1, although it sounds unlikely to
happen.

Thanks,
bin
>
> Richard.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
  2013-12-06 10:19   ` Richard Biener
  2013-12-06 10:40     ` Bin.Cheng
@ 2014-05-06  8:39     ` Bin.Cheng
  2014-05-06 10:44       ` Richard Biener
  1 sibling, 1 reply; 12+ messages in thread
From: Bin.Cheng @ 2014-05-06  8:39 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, bin.cheng, GCC Patches

[-- Attachment #1: Type: text/plain, Size: 4815 bytes --]

On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Mon, Nov 25, 2013 at 7:41 PM, Jeff Law <law@redhat.com> wrote:
>> On 11/25/13 02:22, bin.cheng wrote:
>>>
>>> Hi,
>>> I previously committed two patches lowering complex address expression for
>>> IVOPT at http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00546.html and
>>> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01103.html
>>> When I bootstrapping GCC I found there were some peculiar cases like
>>> &MEM[ptr+CST] + xxxx, which should be handled too.  This patch consists
>>> below two changes:
>>>
>>> 1) change in alloc_iv:
>>> Original code lowers top level complex address expressions like
>>> &MEM[ptr+off].  The patch relaxes check condition in order to lower
>>> expressions like &MEM[ptr+off] + xxx, just as the BASE from below dump:
>>> use 2
>>>    generic
>>>    in statement _595 = &MEM[(void *)&this_prg + 36B] + _594;
>>>
>>>    at position
>>>    type struct gcov_bucket_type *
>>>    base (struct gcov_bucket_type *) &MEM[(void *)&this_prg + 36B] +
>>> (sizetype) ((unsigned int) (src_i_683 + -1) * 20)
>>>    step 4294967276
>>>    base object (void *) &this_prg
>>>    related candidates
>>>
>>> 2) change in tree_to_aff_combination:
>>> The function get_inner_reference returns "&MEM[ptr+off]" as the core for
>>> input like the memory ADDRESS in below dump:
>>> use 2
>>>    address
>>>    in statement _59 = MEM[(const struct gcov_ctr_summary *)summary_22(D) +
>>> 4B].histogram[h_ix_111].min_value;
>>>
>>>    at position MEM[(const struct gcov_ctr_summary *)summary_22(D) +
>>> 4B].histogram[h_ix_111].min_value
>>>    type const gcov_type *
>>>    base (const gcov_type *) &MEM[(const struct gcov_ctr_summary
>>> *)summary_22(D) + 4B] + 36
>>>    step 20
>>>    base object (void *) summary_22(D)
>>>    related candidates
>>>
>>> Which can be further reduced into something like "summary_22(D) + 40B".
>>> This change is necessary for the first one, because I am using
>>> tree_to_aff_combination rather than get_inner_reference_aff now.
>>>
>>> Bootstrap and test on x86/x86_64/arm.  Is it OK?
>>>
>>> Thanks.
>>> bin
>>>
>>> 2013-11-25  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         * tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
>>>         (alloc_iv): Lower more cases by calling contain_complex_addr_expr
>>>         and tree_to_aff_combination.
>>>         * tree-affine.c (tree_to_aff_combination): Handle &MEM[ptr+CST]
>>>         in core part of complex reference.
>>>
>>> gcc/testsuite/ChangeLog
>>> 2013-11-25  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>         * gcc.dg/tree-ssa/ivopts-lower_base.c: New test.
>>
>> Unless there's a PR for this problem, I think this needs to wait.
>
> I agree.  Btw, please split the patch.
>
> Index: gcc/tree-affine.c
> ===================================================================
> --- gcc/tree-affine.c    (revision 205087)
> +++ gcc/tree-affine.c    (working copy)
> @@ -328,7 +328,19 @@ tree_to_aff_combination (tree expr, tree type, aff
>                   double_int::from_uhwi (bitpos / BITS_PER_UNIT));
>        core = build_fold_addr_expr (core);
>        if (TREE_CODE (core) == ADDR_EXPR)
> -    aff_combination_add_elt (comb, core, double_int_one);
> +    {
> +      /* Handle &MEM[ptr + CST] in core part of complex reference.  */
> +      if (TREE_CODE (TREE_OPERAND (core, 0)) == MEM_REF)
> +        {
> +          core = TREE_OPERAND (core, 0);
> +          tree_to_aff_combination (TREE_OPERAND (core, 0), type, &tmp);
> +          aff_combination_add (comb, &tmp);
> +          tree_to_aff_combination (TREE_OPERAND (core, 1), sizetype, &tmp);
> +          aff_combination_add (comb, &tmp);
> +        }
> +      else
> +        aff_combination_add_elt (comb, core, double_int_one);
> +    }
>        else
>      {
>        tree_to_aff_combination (core, type, &tmp)
>
> please handle the offset before taking the address, thus
>
>   if (TREE_CODE (core) == MEM_REF)
>     {
>        add constant offset;
>        core = TREE_OPERAND (core, 0);
>     }
>   else
>     core = build_fold_addr_expr (core);
>
> that simplifies things and avoids the address building.
>
Hi,
I split the patch into two and updated the test case.
The patches pass bootstrap/tests on x86/x86_64, also pass test on arm cortex-m3.
Is it OK?

Thanks,
bin

PATCH 1:

2014-05-06  Bin Cheng  <bin.cheng@arm.com>

    * gcc/tree-affine.c (tree_to_aff_combination): Handle MEM_REF for
    core part of address expressions.

PATCH 2:

2014-05-06  Bin Cheng  <bin.cheng@arm.com>

    * gcc/tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
    (alloc_iv): Lower base expressions containing ADDR_EXPR.

gcc/testsuite/ChangeLog
2014-05-06  Bin Cheng  <bin.cheng@arm.com>

    * gcc.dg/tree-ssa/ivopts-lower_base.c: New test.



-- 
Best Regards.

[-- Attachment #2: ivopt-lower-more-addr_expr-1.txt --]
[-- Type: text/plain, Size: 747 bytes --]

Index: gcc/tree-affine.c
===================================================================
--- gcc/tree-affine.c	(revision 210047)
+++ gcc/tree-affine.c	(working copy)
@@ -331,7 +331,15 @@ tree_to_aff_combination (tree expr, tree type, aff
 	break;
       aff_combination_const (comb, type,
 			     double_int::from_uhwi (bitpos / BITS_PER_UNIT));
-      core = build_fold_addr_expr (core);
+      if (TREE_CODE (core) == MEM_REF)
+	{
+	  tree_to_aff_combination (TREE_OPERAND (core, 1), sizetype, &tmp);
+	  aff_combination_add (comb, &tmp);
+	  core = TREE_OPERAND (core, 0);
+	}
+      else
+	core = build_fold_addr_expr (core);
+
       if (TREE_CODE (core) == ADDR_EXPR)
 	aff_combination_add_elt (comb, core, double_int_one);
       else

[-- Attachment #3: ivopt-lower-more-addr_expr-2.txt --]
[-- Type: text/plain, Size: 4170 bytes --]

Index: gcc/testsuite/gcc.dg/tree-ssa/ivopts-lower_base.c
===================================================================
--- gcc/testsuite/gcc.dg/tree-ssa/ivopts-lower_base.c	(revision 0)
+++ gcc/testsuite/gcc.dg/tree-ssa/ivopts-lower_base.c	(revision 0)
@@ -0,0 +1,61 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ivopts-details" } */
+#include <string.h>
+#include <stdlib.h>
+
+#define MAX_NUM	(256)
+
+void
+sort_pointers (size_t n, void **pointers, void **work)
+{
+  typedef unsigned char digit_t;
+  unsigned int count[MAX_NUM];
+  int big_endian_p;
+  size_t i;
+  size_t j;
+
+  if ((sizeof (void *) / sizeof (digit_t)) % 2 != 0)
+    abort ();
+
+  for (i = 0, j = 0; i < sizeof (size_t); ++i)
+    {
+      j *= MAX_NUM;
+      j += i;
+    }
+
+  big_endian_p = (((char *)&j)[0] == 0);
+  for (i = 0; i < sizeof (void *) / sizeof (digit_t); ++i)
+    {
+      digit_t *digit;
+      digit_t *bias;
+      digit_t *top;
+      unsigned int *countp;
+      void **pointerp;
+
+      if (big_endian_p)
+	j = sizeof (void *) / sizeof (digit_t) - i;
+      else
+	j = i;
+
+      memset (count, 0, MAX_NUM * sizeof (unsigned int));
+      bias = ((digit_t *) pointers) + j;
+      top = ((digit_t *) (pointers + n)) + j;
+      for (digit = bias;
+	   digit < top;
+	   digit += sizeof (void *) / sizeof (digit_t))
+	++count[*digit];
+
+      for (countp = count + 1; countp < count + MAX_NUM; ++countp)
+	*countp += countp[-1];
+
+      for (pointerp = pointers + n - 1; pointerp >= pointers; --pointerp)
+	work[--count[((digit_t *) pointerp)[j]]] = *pointerp;
+
+      pointerp = pointers;
+      pointers = work;
+      work = pointerp;
+    }
+}
+
+/* { dg-final { scan-tree-dump-not "base \[^\\n\]*&MEM\\\[" "ivopts" } }  */
+/* { dg-final { cleanup-tree-dump "ivopts" } }  */
Index: gcc/tree-ssa-loop-ivopts.c
===================================================================
--- gcc/tree-ssa-loop-ivopts.c	(revision 210047)
+++ gcc/tree-ssa-loop-ivopts.c	(working copy)
@@ -928,36 +928,60 @@ determine_base_object (tree expr)
     }
 }
 
+/* Return true if address expression with non-DECL_P operand appears
+   in EXPR.  */
+
+static bool
+contain_complex_addr_expr (tree expr)
+{
+  bool res = false;
+
+  STRIP_NOPS (expr);
+  switch (TREE_CODE (expr))
+    {
+    case POINTER_PLUS_EXPR:
+    case PLUS_EXPR:
+    case MINUS_EXPR:
+      res |= contain_complex_addr_expr (TREE_OPERAND (expr, 0));
+      res |= contain_complex_addr_expr (TREE_OPERAND (expr, 1));
+      break;
+
+    case ADDR_EXPR:
+      return (!DECL_P (TREE_OPERAND (expr, 0)));
+
+    default:
+      return false;
+    }
+
+  return res;
+}
+
 /* Allocates an induction variable with given initial value BASE and step STEP
    for loop LOOP.  */
 
 static struct iv *
 alloc_iv (tree base, tree step)
 {
-  tree base_object = base;
+  tree expr = base;
   struct iv *iv = XCNEW (struct iv);
   gcc_assert (step != NULL_TREE);
 
-  /* Lower all address expressions except ones with DECL_P as operand.
+  /* Lower address expression in base except ones with DECL_P as operand.
      By doing this:
        1) More accurate cost can be computed for address expressions;
        2) Duplicate candidates won't be created for bases in different
           forms, like &a[0] and &a.  */
-  STRIP_NOPS (base_object);
-  if (TREE_CODE (base_object) == ADDR_EXPR
-      && !DECL_P (TREE_OPERAND (base_object, 0)))
+  STRIP_NOPS (expr);
+  if ((TREE_CODE (expr) == ADDR_EXPR && !DECL_P (TREE_OPERAND (expr, 0)))
+      || contain_complex_addr_expr (expr))
     {
       aff_tree comb;
-      double_int size;
-      base_object = get_inner_reference_aff (TREE_OPERAND (base_object, 0),
-					     &comb, &size);
-      gcc_assert (base_object != NULL_TREE);
-      base_object = build_fold_addr_expr (base_object);
+      tree_to_aff_combination (expr, TREE_TYPE (base), &comb);
       base = fold_convert (TREE_TYPE (base), aff_combination_to_tree (&comb));
     }
 
   iv->base = base;
-  iv->base_object = determine_base_object (base_object);
+  iv->base_object = determine_base_object (base);
   iv->step = step;
   iv->biv_p = false;
   iv->have_use_for = false;

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
  2014-05-06  8:39     ` Bin.Cheng
@ 2014-05-06 10:44       ` Richard Biener
  2014-05-08  9:08         ` Bin.Cheng
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2014-05-06 10:44 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Jeff Law, bin.cheng, GCC Patches

On Tue, May 6, 2014 at 10:39 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Mon, Nov 25, 2013 at 7:41 PM, Jeff Law <law@redhat.com> wrote:
>>> On 11/25/13 02:22, bin.cheng wrote:
>>>>
>>>> Hi,
>>>> I previously committed two patches lowering complex address expression for
>>>> IVOPT at http://gcc.gnu.org/ml/gcc-patches/2013-11/msg00546.html and
>>>> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg01103.html
>>>> When I bootstrapping GCC I found there were some peculiar cases like
>>>> &MEM[ptr+CST] + xxxx, which should be handled too.  This patch consists
>>>> below two changes:
>>>>
>>>> 1) change in alloc_iv:
>>>> Original code lowers top level complex address expressions like
>>>> &MEM[ptr+off].  The patch relaxes check condition in order to lower
>>>> expressions like &MEM[ptr+off] + xxx, just as the BASE from below dump:
>>>> use 2
>>>>    generic
>>>>    in statement _595 = &MEM[(void *)&this_prg + 36B] + _594;
>>>>
>>>>    at position
>>>>    type struct gcov_bucket_type *
>>>>    base (struct gcov_bucket_type *) &MEM[(void *)&this_prg + 36B] +
>>>> (sizetype) ((unsigned int) (src_i_683 + -1) * 20)
>>>>    step 4294967276
>>>>    base object (void *) &this_prg
>>>>    related candidates
>>>>
>>>> 2) change in tree_to_aff_combination:
>>>> The function get_inner_reference returns "&MEM[ptr+off]" as the core for
>>>> input like the memory ADDRESS in below dump:
>>>> use 2
>>>>    address
>>>>    in statement _59 = MEM[(const struct gcov_ctr_summary *)summary_22(D) +
>>>> 4B].histogram[h_ix_111].min_value;
>>>>
>>>>    at position MEM[(const struct gcov_ctr_summary *)summary_22(D) +
>>>> 4B].histogram[h_ix_111].min_value
>>>>    type const gcov_type *
>>>>    base (const gcov_type *) &MEM[(const struct gcov_ctr_summary
>>>> *)summary_22(D) + 4B] + 36
>>>>    step 20
>>>>    base object (void *) summary_22(D)
>>>>    related candidates
>>>>
>>>> Which can be further reduced into something like "summary_22(D) + 40B".
>>>> This change is necessary for the first one, because I am using
>>>> tree_to_aff_combination rather than get_inner_reference_aff now.
>>>>
>>>> Bootstrap and test on x86/x86_64/arm.  Is it OK?
>>>>
>>>> Thanks.
>>>> bin
>>>>
>>>> 2013-11-25  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         * tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
>>>>         (alloc_iv): Lower more cases by calling contain_complex_addr_expr
>>>>         and tree_to_aff_combination.
>>>>         * tree-affine.c (tree_to_aff_combination): Handle &MEM[ptr+CST]
>>>>         in core part of complex reference.
>>>>
>>>> gcc/testsuite/ChangeLog
>>>> 2013-11-25  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>         * gcc.dg/tree-ssa/ivopts-lower_base.c: New test.
>>>
>>> Unless there's a PR for this problem, I think this needs to wait.
>>
>> I agree.  Btw, please split the patch.
>>
>> Index: gcc/tree-affine.c
>> ===================================================================
>> --- gcc/tree-affine.c    (revision 205087)
>> +++ gcc/tree-affine.c    (working copy)
>> @@ -328,7 +328,19 @@ tree_to_aff_combination (tree expr, tree type, aff
>>                   double_int::from_uhwi (bitpos / BITS_PER_UNIT));
>>        core = build_fold_addr_expr (core);
>>        if (TREE_CODE (core) == ADDR_EXPR)
>> -    aff_combination_add_elt (comb, core, double_int_one);
>> +    {
>> +      /* Handle &MEM[ptr + CST] in core part of complex reference.  */
>> +      if (TREE_CODE (TREE_OPERAND (core, 0)) == MEM_REF)
>> +        {
>> +          core = TREE_OPERAND (core, 0);
>> +          tree_to_aff_combination (TREE_OPERAND (core, 0), type, &tmp);
>> +          aff_combination_add (comb, &tmp);
>> +          tree_to_aff_combination (TREE_OPERAND (core, 1), sizetype, &tmp);
>> +          aff_combination_add (comb, &tmp);
>> +        }
>> +      else
>> +        aff_combination_add_elt (comb, core, double_int_one);
>> +    }
>>        else
>>      {
>>        tree_to_aff_combination (core, type, &tmp)
>>
>> please handle the offset before taking the address, thus
>>
>>   if (TREE_CODE (core) == MEM_REF)
>>     {
>>        add constant offset;
>>        core = TREE_OPERAND (core, 0);
>>     }
>>   else
>>     core = build_fold_addr_expr (core);
>>
>> that simplifies things and avoids the address building.
>>
> Hi,
> I split the patch into two and updated the test case.
> The patches pass bootstrap/tests on x86/x86_64, also pass test on arm cortex-m3.
> Is it OK?
>
> Thanks,
> bin
>
> PATCH 1:
>
> 2014-05-06  Bin Cheng  <bin.cheng@arm.com>
>
>     * gcc/tree-affine.c (tree_to_aff_combination): Handle MEM_REF for
>     core part of address expressions.

No gcc/ in the changelog

Simplify that by using aff_combination_add_cst:

+      if (TREE_CODE (core) == MEM_REF)
+       {
+         aff_combination_add_cst (comb, mem_ref_offset (core));
+         core = TREE_OPERAND (core, 0);

patch 1 is ok with that change.

> PATCH 2:
>
> 2014-05-06  Bin Cheng  <bin.cheng@arm.com>
>
>     * gcc/tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
>     (alloc_iv): Lower base expressions containing ADDR_EXPR.

So this "lowers" addresses(?) which are based on &<not-decl>,
like &a[0] + 4 but not &a + 4.  I question this odd choice.  ISTR
when originally introducing address lowering (in rev. 204497) I was
concerned about the cost?

Now I wonder why we bother to convert the lowered form back
to trees to store it in ->base and not simply keep (and always
compute) the affine expansion only.  Thus, change struct iv
to have aff_tree base instead of tree base?

Can you see what it takes to do such change?  At the point
we replace uses we go into affine representation again anyway.

Thanks,
Richard.

> gcc/testsuite/ChangeLog
> 2014-05-06  Bin Cheng  <bin.cheng@arm.com>
>
>     * gcc.dg/tree-ssa/ivopts-lower_base.c: New test.
>
>
>
> --
> Best Regards.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
  2014-05-06 10:44       ` Richard Biener
@ 2014-05-08  9:08         ` Bin.Cheng
  2014-05-11 12:49           ` Bin.Cheng
  0 siblings, 1 reply; 12+ messages in thread
From: Bin.Cheng @ 2014-05-08  9:08 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, bin.cheng, GCC Patches

On Tue, May 6, 2014 at 6:44 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, May 6, 2014 at 10:39 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:

>> Hi,
>> I split the patch into two and updated the test case.
>> The patches pass bootstrap/tests on x86/x86_64, also pass test on arm cortex-m3.
>> Is it OK?
>>
>> Thanks,
>> bin
>>
>> PATCH 1:
>>
>> 2014-05-06  Bin Cheng  <bin.cheng@arm.com>
>>
>>     * gcc/tree-affine.c (tree_to_aff_combination): Handle MEM_REF for
>>     core part of address expressions.
>
> No gcc/ in the changelog
>
> Simplify that by using aff_combination_add_cst:
>
> +      if (TREE_CODE (core) == MEM_REF)
> +       {
> +         aff_combination_add_cst (comb, mem_ref_offset (core));
> +         core = TREE_OPERAND (core, 0);
>
> patch 1 is ok with that change.

Installed with below change because of wide-int merge:
-      core = build_fold_addr_expr (core);
+      if (TREE_CODE (core) == MEM_REF)
+    {
+      aff_combination_add_cst (comb, wi::to_widest (TREE_OPERAND (core, 1)));
+      core = TREE_OPERAND (core, 0);

>
>> PATCH 2:
>>
>> 2014-05-06  Bin Cheng  <bin.cheng@arm.com>
>>
>>     * gcc/tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
>>     (alloc_iv): Lower base expressions containing ADDR_EXPR.
>
> So this "lowers" addresses(?) which are based on &<not-decl>,
> like &a[0] + 4 but not &a + 4.  I question this odd choice.  ISTR
&a+4 is already in its lowered form, what we want to handle is &EXPR +
4, in which EXPR is MEM_REF/ARRAY_REF, etc..

> when originally introducing address lowering (in rev. 204497) I was
> concerned about the cost?
Yes, you did.  I still think the iv number is relative small for each
loop, thus the change won't cause compilation time issue considering
the existing tree<->affine transformation in ivopt.
I would like to collect more accurate time information for ivopt in
gcc bootstrap.  Should I use "-ftime-report" then add all time slices
in TV_TREE_LOOP_IVOPTS category?  Is there any better solutions?
Thanks.

>
> Now I wonder why we bother to convert the lowered form back
> to trees to store it in ->base and not simply keep (and always
> compute) the affine expansion only.  Thus, change struct iv
> to have aff_tree base instead of tree base?
>
> Can you see what it takes to do such change?  At the point
> we replace uses we go into affine representation again anyway.
Good idea, I may have a look.

Thanks,
bin

-- 
Best Regards.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
  2014-05-08  9:08         ` Bin.Cheng
@ 2014-05-11 12:49           ` Bin.Cheng
  2014-05-13  8:59             ` Richard Biener
  0 siblings, 1 reply; 12+ messages in thread
From: Bin.Cheng @ 2014-05-11 12:49 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, bin.cheng, GCC Patches

On Thu, May 8, 2014 at 5:08 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Tue, May 6, 2014 at 6:44 PM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Tue, May 6, 2014 at 10:39 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>
>>> Hi,
>>> I split the patch into two and updated the test case.
>>> The patches pass bootstrap/tests on x86/x86_64, also pass test on arm cortex-m3.
>>> Is it OK?
>>>
>>> Thanks,
>>> bin
>>>
>>> PATCH 1:
>>>
>>> 2014-05-06  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>     * gcc/tree-affine.c (tree_to_aff_combination): Handle MEM_REF for
>>>     core part of address expressions.
>>
>> No gcc/ in the changelog
>>
>> Simplify that by using aff_combination_add_cst:
>>
>> +      if (TREE_CODE (core) == MEM_REF)
>> +       {
>> +         aff_combination_add_cst (comb, mem_ref_offset (core));
>> +         core = TREE_OPERAND (core, 0);
>>
>> patch 1 is ok with that change.
>
> Installed with below change because of wide-int merge:
> -      core = build_fold_addr_expr (core);
> +      if (TREE_CODE (core) == MEM_REF)
> +    {
> +      aff_combination_add_cst (comb, wi::to_widest (TREE_OPERAND (core, 1)));
> +      core = TREE_OPERAND (core, 0);
>
>>
>>> PATCH 2:
>>>
>>> 2014-05-06  Bin Cheng  <bin.cheng@arm.com>
>>>
>>>     * gcc/tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
>>>     (alloc_iv): Lower base expressions containing ADDR_EXPR.
>>
>> So this "lowers" addresses(?) which are based on &<not-decl>,
>> like &a[0] + 4 but not &a + 4.  I question this odd choice.  ISTR
> &a+4 is already in its lowered form, what we want to handle is &EXPR +
> 4, in which EXPR is MEM_REF/ARRAY_REF, etc..
>
>> when originally introducing address lowering (in rev. 204497) I was
>> concerned about the cost?
> Yes, you did.  I still think the iv number is relative small for each
> loop, thus the change won't cause compilation time issue considering
> the existing tree<->affine transformation in ivopt.
> I would like to collect more accurate time information for ivopt in
> gcc bootstrap.  Should I use "-ftime-report" then add all time slices
> in TV_TREE_LOOP_IVOPTS category?  Is there any better solutions?
> Thanks.
>
>>
>> Now I wonder why we bother to convert the lowered form back
>> to trees to store it in ->base and not simply keep (and always
>> compute) the affine expansion only.  Thus, change struct iv
>> to have aff_tree base instead of tree base?
>>
>> Can you see what it takes to do such change?  At the point
>> we replace uses we go into affine representation again anyway.
> Good idea, I may have a look.
After going through work flow of ivopt, I think it's practical to make
such change and can help compilation time.  Ivopt calls
tree_to_aff_combination to convert iv->base into affine_tree whenever
it tries to represent an iv_use by an iv_cand,  i.e., N*M times for
computing cost of each iv_use represented by each iv_cand, and M times
for rewriting each iv_use.  Here, N and M stands for number of iv_use
and iv_cand.  Also strip_offset (which is a recursive function) is
called for iv->base also at complexity of O(NM), so it may be improved
too.
To make a start, it's possible to store both tree and affine_tree of
base in struct iv, and use either of them whenever needed.

It seems to me there are two potential issues of this change.  First,
though affine_tree of base is stored, we can't operate on it directly,
which means we have to copy it before using it.  Second, ivopt
sometimes computes affine_tree of base in different type other than
TREE_TYPE(base), we may need to do additional calls to
aff_combination_convert to get wanted type.  All these will introduce
additional computation and compromise benefit of the change.

At last, I did some experiments on how many additional calls to
tree_to_aff_combination are introduced by this patch.  The data of
bootstrap shows it's less than 3% comparing to calls made by current
implementation of ivopt, which confirms that this patch shouldn't have
issue of compilation time.

Any comments?

Thanks,
bin


-- 
Best Regards.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
  2014-05-11 12:49           ` Bin.Cheng
@ 2014-05-13  8:59             ` Richard Biener
  2014-05-13 10:18               ` Bin.Cheng
  0 siblings, 1 reply; 12+ messages in thread
From: Richard Biener @ 2014-05-13  8:59 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: Jeff Law, bin.cheng, GCC Patches

On Sun, May 11, 2014 at 2:49 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Thu, May 8, 2014 at 5:08 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Tue, May 6, 2014 at 6:44 PM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Tue, May 6, 2014 at 10:39 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>> On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener
>>>> <richard.guenther@gmail.com> wrote:
>>
>>>> Hi,
>>>> I split the patch into two and updated the test case.
>>>> The patches pass bootstrap/tests on x86/x86_64, also pass test on arm cortex-m3.
>>>> Is it OK?
>>>>
>>>> Thanks,
>>>> bin
>>>>
>>>> PATCH 1:
>>>>
>>>> 2014-05-06  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>     * gcc/tree-affine.c (tree_to_aff_combination): Handle MEM_REF for
>>>>     core part of address expressions.
>>>
>>> No gcc/ in the changelog
>>>
>>> Simplify that by using aff_combination_add_cst:
>>>
>>> +      if (TREE_CODE (core) == MEM_REF)
>>> +       {
>>> +         aff_combination_add_cst (comb, mem_ref_offset (core));
>>> +         core = TREE_OPERAND (core, 0);
>>>
>>> patch 1 is ok with that change.
>>
>> Installed with below change because of wide-int merge:
>> -      core = build_fold_addr_expr (core);
>> +      if (TREE_CODE (core) == MEM_REF)
>> +    {
>> +      aff_combination_add_cst (comb, wi::to_widest (TREE_OPERAND (core, 1)));
>> +      core = TREE_OPERAND (core, 0);
>>
>>>
>>>> PATCH 2:
>>>>
>>>> 2014-05-06  Bin Cheng  <bin.cheng@arm.com>
>>>>
>>>>     * gcc/tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
>>>>     (alloc_iv): Lower base expressions containing ADDR_EXPR.
>>>
>>> So this "lowers" addresses(?) which are based on &<not-decl>,
>>> like &a[0] + 4 but not &a + 4.  I question this odd choice.  ISTR
>> &a+4 is already in its lowered form, what we want to handle is &EXPR +
>> 4, in which EXPR is MEM_REF/ARRAY_REF, etc..
>>
>>> when originally introducing address lowering (in rev. 204497) I was
>>> concerned about the cost?
>> Yes, you did.  I still think the iv number is relative small for each
>> loop, thus the change won't cause compilation time issue considering
>> the existing tree<->affine transformation in ivopt.
>> I would like to collect more accurate time information for ivopt in
>> gcc bootstrap.  Should I use "-ftime-report" then add all time slices
>> in TV_TREE_LOOP_IVOPTS category?  Is there any better solutions?
>> Thanks.
>>
>>>
>>> Now I wonder why we bother to convert the lowered form back
>>> to trees to store it in ->base and not simply keep (and always
>>> compute) the affine expansion only.  Thus, change struct iv
>>> to have aff_tree base instead of tree base?
>>>
>>> Can you see what it takes to do such change?  At the point
>>> we replace uses we go into affine representation again anyway.
>> Good idea, I may have a look.
> After going through work flow of ivopt, I think it's practical to make
> such change and can help compilation time.  Ivopt calls
> tree_to_aff_combination to convert iv->base into affine_tree whenever
> it tries to represent an iv_use by an iv_cand,  i.e., N*M times for
> computing cost of each iv_use represented by each iv_cand, and M times
> for rewriting each iv_use.  Here, N and M stands for number of iv_use
> and iv_cand.  Also strip_offset (which is a recursive function) is
> called for iv->base also at complexity of O(NM), so it may be improved
> too.
> To make a start, it's possible to store both tree and affine_tree of
> base in struct iv, and use either of them whenever needed.
>
> It seems to me there are two potential issues of this change.  First,
> though affine_tree of base is stored, we can't operate on it directly,
> which means we have to copy it before using it.

You'd use it as unchanged operand to some aff_* function to avoid
actual copying of course.

>  Second, ivopt
> sometimes computes affine_tree of base in different type other than
> TREE_TYPE(base), we may need to do additional calls to
> aff_combination_convert to get wanted type.  All these will introduce
> additional computation and compromise benefit of the change.

Sure.

> At last, I did some experiments on how many additional calls to
> tree_to_aff_combination are introduced by this patch.  The data of
> bootstrap shows it's less than 3% comparing to calls made by current
> implementation of ivopt, which confirms that this patch shouldn't have
> issue of compilation time.
>
> Any comments?

I'd see the use of affines as making the algorithmic structure of
IVOPTs clearer and more consistent (also with possibly using
the _expand variants later to get even more simplifications).

I don't want to force you to do this change but I thought it may
help further changes (as you seem to be doing a lot of IVOPTs
changes lately).

So - the patch is ok as-is then, but please consider refactoring
IVOPTs code when you do changes.  It's current structure
is somewhat awkward.

Thanks,
Richard.

> Thanks,
> bin
>
>
> --
> Best Regards.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c
  2014-05-13  8:59             ` Richard Biener
@ 2014-05-13 10:18               ` Bin.Cheng
  0 siblings, 0 replies; 12+ messages in thread
From: Bin.Cheng @ 2014-05-13 10:18 UTC (permalink / raw)
  To: Richard Biener; +Cc: Jeff Law, bin.cheng, GCC Patches

On Tue, May 13, 2014 at 4:59 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Sun, May 11, 2014 at 2:49 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>> On Thu, May 8, 2014 at 5:08 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>> On Tue, May 6, 2014 at 6:44 PM, Richard Biener
>>> <richard.guenther@gmail.com> wrote:
>>>> On Tue, May 6, 2014 at 10:39 AM, Bin.Cheng <amker.cheng@gmail.com> wrote:
>>>>> On Fri, Dec 6, 2013 at 6:19 PM, Richard Biener
>>>>> <richard.guenther@gmail.com> wrote:
>>>
>>>>> Hi,
>>>>> I split the patch into two and updated the test case.
>>>>> The patches pass bootstrap/tests on x86/x86_64, also pass test on arm cortex-m3.
>>>>> Is it OK?
>>>>>
>>>>> Thanks,
>>>>> bin
>>>>>
>>>>> PATCH 1:
>>>>>
>>>>> 2014-05-06  Bin Cheng  <bin.cheng@arm.com>
>>>>>
>>>>>     * gcc/tree-affine.c (tree_to_aff_combination): Handle MEM_REF for
>>>>>     core part of address expressions.
>>>>
>>>> No gcc/ in the changelog
>>>>
>>>> Simplify that by using aff_combination_add_cst:
>>>>
>>>> +      if (TREE_CODE (core) == MEM_REF)
>>>> +       {
>>>> +         aff_combination_add_cst (comb, mem_ref_offset (core));
>>>> +         core = TREE_OPERAND (core, 0);
>>>>
>>>> patch 1 is ok with that change.
>>>
>>> Installed with below change because of wide-int merge:
>>> -      core = build_fold_addr_expr (core);
>>> +      if (TREE_CODE (core) == MEM_REF)
>>> +    {
>>> +      aff_combination_add_cst (comb, wi::to_widest (TREE_OPERAND (core, 1)));
>>> +      core = TREE_OPERAND (core, 0);
>>>
>>>>
>>>>> PATCH 2:
>>>>>
>>>>> 2014-05-06  Bin Cheng  <bin.cheng@arm.com>
>>>>>
>>>>>     * gcc/tree-ssa-loop-ivopts.c (contain_complex_addr_expr): New.
>>>>>     (alloc_iv): Lower base expressions containing ADDR_EXPR.
>>>>
>>>> So this "lowers" addresses(?) which are based on &<not-decl>,
>>>> like &a[0] + 4 but not &a + 4.  I question this odd choice.  ISTR
>>> &a+4 is already in its lowered form, what we want to handle is &EXPR +
>>> 4, in which EXPR is MEM_REF/ARRAY_REF, etc..
>>>
>>>> when originally introducing address lowering (in rev. 204497) I was
>>>> concerned about the cost?
>>> Yes, you did.  I still think the iv number is relative small for each
>>> loop, thus the change won't cause compilation time issue considering
>>> the existing tree<->affine transformation in ivopt.
>>> I would like to collect more accurate time information for ivopt in
>>> gcc bootstrap.  Should I use "-ftime-report" then add all time slices
>>> in TV_TREE_LOOP_IVOPTS category?  Is there any better solutions?
>>> Thanks.
>>>
>>>>
>>>> Now I wonder why we bother to convert the lowered form back
>>>> to trees to store it in ->base and not simply keep (and always
>>>> compute) the affine expansion only.  Thus, change struct iv
>>>> to have aff_tree base instead of tree base?
>>>>
>>>> Can you see what it takes to do such change?  At the point
>>>> we replace uses we go into affine representation again anyway.
>>> Good idea, I may have a look.
>> After going through work flow of ivopt, I think it's practical to make
>> such change and can help compilation time.  Ivopt calls
>> tree_to_aff_combination to convert iv->base into affine_tree whenever
>> it tries to represent an iv_use by an iv_cand,  i.e., N*M times for
>> computing cost of each iv_use represented by each iv_cand, and M times
>> for rewriting each iv_use.  Here, N and M stands for number of iv_use
>> and iv_cand.  Also strip_offset (which is a recursive function) is
>> called for iv->base also at complexity of O(NM), so it may be improved
>> too.
>> To make a start, it's possible to store both tree and affine_tree of
>> base in struct iv, and use either of them whenever needed.
>>
>> It seems to me there are two potential issues of this change.  First,
>> though affine_tree of base is stored, we can't operate on it directly,
>> which means we have to copy it before using it.
>
> You'd use it as unchanged operand to some aff_* function to avoid
> actual copying of course.
>
>>  Second, ivopt
>> sometimes computes affine_tree of base in different type other than
>> TREE_TYPE(base), we may need to do additional calls to
>> aff_combination_convert to get wanted type.  All these will introduce
>> additional computation and compromise benefit of the change.
>
> Sure.
>
>> At last, I did some experiments on how many additional calls to
>> tree_to_aff_combination are introduced by this patch.  The data of
>> bootstrap shows it's less than 3% comparing to calls made by current
>> implementation of ivopt, which confirms that this patch shouldn't have
>> issue of compilation time.
>>
>> Any comments?
>
> I'd see the use of affines as making the algorithmic structure of
> IVOPTs clearer and more consistent (also with possibly using
> the _expand variants later to get even more simplifications).
Yes, one example, it's possible to rewrite iv uses in a loop-invariant
sensitive manner if we can use affine in the whole process.  Currently
fold_tree routines tend to decompose invariant into different
expressions.

>
> I don't want to force you to do this change but I thought it may
> help further changes (as you seem to be doing a lot of IVOPTs
> changes lately).
>
> So - the patch is ok as-is then, but please consider refactoring
Patch installed.

> IVOPTs code when you do changes.  It's current structure
> is somewhat awkward.
Understood.  I will continue to look at it when have proper time.

Thanks,
bin



-- 
Best Regards.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2014-05-13 10:18 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-11-25 10:43 [PATCH GCC]Pick up more address lowering cases for ivopt and tree-affine.c bin.cheng
2013-11-25 21:21 ` Jeff Law
2013-12-06 10:19   ` Richard Biener
2013-12-06 10:40     ` Bin.Cheng
2013-12-06 11:20       ` Richard Biener
2013-12-06 12:04         ` Bin.Cheng
2014-05-06  8:39     ` Bin.Cheng
2014-05-06 10:44       ` Richard Biener
2014-05-08  9:08         ` Bin.Cheng
2014-05-11 12:49           ` Bin.Cheng
2014-05-13  8:59             ` Richard Biener
2014-05-13 10:18               ` Bin.Cheng

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).