public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
@ 2016-01-14  7:38 Jeff Law
  2016-01-14  7:46 ` Jakub Jelinek
  2016-01-14  9:47 ` Richard Biener
  0 siblings, 2 replies; 14+ messages in thread
From: Jeff Law @ 2016-01-14  7:38 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 839 bytes --]


As noted in the BZ, DOM does not exploit VRP information to create 
additional equivalences in the arms of conditionals.

This can cause DOM to miss relatively simple optimizations that show up 
in the adpcm benchmark as well as within GCC itself.

The fix is quite simple -- we just need to extend the code which used 
the implicit range of booleans to propagate on both sides of equality 
conditionals to use VRP information as well.

Once done we also need to make sure to convert the true/false value 
we're propagating to the right type.  Again, trivial.

The testcase is derived from the adpcm benchmark.  It checks that we 
optimize away the bit-xor on both arms of the conditional and that in 
turn allows other values to be discovered as constants.

Bootstrapped and regression tested on x86_64.  Installed on the trunk.



Jeff

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 6290 bytes --]

commit e9d42e91d1e88ece5be38dbde81843e516b327e0
Author: Jeff Law <law@redhat.com>
Date:   Thu Jan 14 00:37:37 2016 -0700

    [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
    
    	PR tree-optimization/69270
    	* tree-ssa-dom.c (ssa_name_has_boolean_range): New function.
    	(record_edge_info): Use it.  Convert boolean_{true,false}_node
    	to the type of op0.
    
    	PR tree-optimization/69270
    	* gcc.dg/tree-ssa/pr69270.c: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index dc13621..40e3dfb 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2016-01-14  Jeff Law  <law@redhat.com>
+
+	PR tree-optimization/69270
+	* tree-ssa-dom.c (ssa_name_has_boolean_range): New function.
+	(record_edge_info): Use it.  Convert boolean_{true,false}_node
+	to the type of op0.
+
 2016-01-13  Jan Hubicka  <hubicka@ucw.cz>
 
 	PR ipa/66487
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index f393e25..63976ea 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2016-01-14  Jeff Law  <law@redhat.com>
+
+	PR tree-optimization/69270
+	* gcc.dg/tree-ssa/pr69270.c: New test.
+
 2016-01-13  Bernd Schmidt  <bschmidt@redhat.com>
 
 	PR c/66208
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69270.c b/gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
new file mode 100644
index 0000000..529b521
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
@@ -0,0 +1,42 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */
+
+/* There should be two references to bufferstep that turn into
+   constants.  */
+/* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .0." 1 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with constant .1." 1 "dom3"} } */
+
+/* And some assignments ought to fold down to constants.  */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"} } */
+/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"} } */
+
+/* The XOR operations should have been optimized to constants.  */
+/* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */
+
+
+extern int *stepsizeTable;
+
+void
+adpcm_coder (signed char *outdata, int len)
+{
+  signed char *outp;
+  int delta;
+  int outputbuffer;
+  int bufferstep = 0;
+  outp = (signed char *) outdata;
+  int step = 0;
+  int index = 0;
+  int diff = 0;
+  for (; len > 0; len--)
+    {
+      delta = 0;
+      if (diff >= step)
+	delta = 4;
+      step = stepsizeTable[index];
+      if (bufferstep)
+	outputbuffer = (delta << 4) & 0xf0;
+      else
+	*outp++ = (delta & 0x0f) | outputbuffer;
+      bufferstep = !bufferstep;
+    }
+}
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index 9d2e189..a9abed9 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -316,6 +316,38 @@ record_conditions (struct edge_info *edge_info, tree cond, tree inverted)
   edge_info->cond_equivalences.safe_push (c);
 }
 
+/* Return TRUE is OP, an SSA_NAME has a range of values [0..1], false
+   otherwise.
+
+   This can be because it is a boolean type, any type with
+   a single bit of precision, or has known range of values
+   it might old of [0..1] via VRP analysis.  */
+
+static bool
+ssa_name_has_boolean_range (tree op)
+{
+  /* Boolean types always have a range [0..1].  */
+  if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE)
+    return true;
+
+  /* An integral type with a single bit of precision.  */
+  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
+      && TYPE_PRECISION (TREE_TYPE (op)) == 1)
+    return true;
+
+  /* An integral type with more precision, but the object
+     only takes on values [0..1] as determined by VRP
+     analysis.  */
+  wide_int min, max;
+  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
+      && get_range_info (op, &min, &max) == VR_RANGE
+      && wi::eq_p (min, 0)
+      && wi::eq_p (max, 1))
+    return true;
+
+  return false;
+}
+
 /* We have finished optimizing BB, record any information implied by
    taking a specific outgoing edge from BB.  */
 
@@ -390,36 +422,32 @@ record_edge_info (basic_block bb)
              can record an equivalence for OP0 rather than COND.  */
           if ((code == EQ_EXPR || code == NE_EXPR)
               && TREE_CODE (op0) == SSA_NAME
-              && TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE
+	      && ssa_name_has_boolean_range (op0)
               && is_gimple_min_invariant (op1))
             {
+	      tree true_val = fold_convert (TREE_TYPE (op0),
+					    boolean_true_node);
+	      tree false_val = fold_convert (TREE_TYPE (op0),
+					     boolean_false_node);
               if (code == EQ_EXPR)
                 {
                   edge_info = allocate_edge_info (true_edge);
                   edge_info->lhs = op0;
-                  edge_info->rhs = (integer_zerop (op1)
-                                    ? boolean_false_node
-                                    : boolean_true_node);
+                  edge_info->rhs = (integer_zerop (op1) ? false_val : true_val);
 
                   edge_info = allocate_edge_info (false_edge);
                   edge_info->lhs = op0;
-                  edge_info->rhs = (integer_zerop (op1)
-                                    ? boolean_true_node
-                                    : boolean_false_node);
+                  edge_info->rhs = (integer_zerop (op1) ? true_val : false_val);
                 }
               else
                 {
                   edge_info = allocate_edge_info (true_edge);
                   edge_info->lhs = op0;
-                  edge_info->rhs = (integer_zerop (op1)
-                                    ? boolean_true_node
-                                    : boolean_false_node);
+                  edge_info->rhs = (integer_zerop (op1) ? true_val : false_val);
 
                   edge_info = allocate_edge_info (false_edge);
                   edge_info->lhs = op0;
-                  edge_info->rhs = (integer_zerop (op1)
-                                    ? boolean_false_node
-                                    : boolean_true_node);
+                  edge_info->rhs = (integer_zerop (op1) ? false_val : true_val);
                 }
             }
           else if (is_gimple_min_invariant (op0)

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-14  7:38 [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM Jeff Law
@ 2016-01-14  7:46 ` Jakub Jelinek
  2016-01-14  7:49   ` Jakub Jelinek
  2016-01-14  9:47 ` Richard Biener
  1 sibling, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2016-01-14  7:46 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Thu, Jan 14, 2016 at 12:38:52AM -0700, Jeff Law wrote:
> +  /* An integral type with more precision, but the object
> +     only takes on values [0..1] as determined by VRP
> +     analysis.  */
> +  wide_int min, max;
> +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
> +      && get_range_info (op, &min, &max) == VR_RANGE
> +      && wi::eq_p (min, 0)
> +      && wi::eq_p (max, 1))
> +    return true;

You could use and/or:
  if (INTEGRAL_TYPE_P (TREE_TYPE (op)) && wi::eq_p (get_nonzero_bits (op), 1))
set_range_info for VR_RANGE should usually update also the non-zero bits, but
set_nonzero_bits does not update the recorded range.

	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-14  7:46 ` Jakub Jelinek
@ 2016-01-14  7:49   ` Jakub Jelinek
  2016-01-14 18:15     ` Jeff Law
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2016-01-14  7:49 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Thu, Jan 14, 2016 at 08:46:43AM +0100, Jakub Jelinek wrote:
> On Thu, Jan 14, 2016 at 12:38:52AM -0700, Jeff Law wrote:
> > +  /* An integral type with more precision, but the object
> > +     only takes on values [0..1] as determined by VRP
> > +     analysis.  */
> > +  wide_int min, max;
> > +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
> > +      && get_range_info (op, &min, &max) == VR_RANGE
> > +      && wi::eq_p (min, 0)
> > +      && wi::eq_p (max, 1))
> > +    return true;
> 
> You could use and/or:
>   if (INTEGRAL_TYPE_P (TREE_TYPE (op)) && wi::eq_p (get_nonzero_bits (op), 1))
> set_range_info for VR_RANGE should usually update also the non-zero bits, but
> set_nonzero_bits does not update the recorded range.

Though, that would need to be limited to TYPE_PRECISION (TREE_TYPE (op)) > 1
or TYPE_UNSIGNED.

BTW,

+  /* An integral type with a single bit of precision.  */                                                                                         
+  if (INTEGRAL_TYPE_P (TREE_TYPE (op))                                                                                                            
+      && TYPE_PRECISION (TREE_TYPE (op)) == 1)                                                                                                    
+    return true;                                                                                                                                  

does not guarantee values 0, 1, it can mean either 0, 1 or -1, 0.  So, if
-1, 0 is unacceptable, you need to use TYPE_UNSIGNED (TREE_TYPE (op)) too.

	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-14  7:38 [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM Jeff Law
  2016-01-14  7:46 ` Jakub Jelinek
@ 2016-01-14  9:47 ` Richard Biener
  2016-01-14 18:27   ` Jeff Law
  1 sibling, 1 reply; 14+ messages in thread
From: Richard Biener @ 2016-01-14  9:47 UTC (permalink / raw)
  To: Jeff Law; +Cc: GCC Patches

On Thu, Jan 14, 2016 at 8:38 AM, Jeff Law <law@redhat.com> wrote:
>
> As noted in the BZ, DOM does not exploit VRP information to create
> additional equivalences in the arms of conditionals.
>
> This can cause DOM to miss relatively simple optimizations that show up in
> the adpcm benchmark as well as within GCC itself.
>
> The fix is quite simple -- we just need to extend the code which used the
> implicit range of booleans to propagate on both sides of equality
> conditionals to use VRP information as well.
>
> Once done we also need to make sure to convert the true/false value we're
> propagating to the right type.  Again, trivial.
>
> The testcase is derived from the adpcm benchmark.  It checks that we
> optimize away the bit-xor on both arms of the conditional and that in turn
> allows other values to be discovered as constants.
>
> Bootstrapped and regression tested on x86_64.  Installed on the trunk.
>
>
>
> Jeff
>
> commit e9d42e91d1e88ece5be38dbde81843e516b327e0
> Author: Jeff Law <law@redhat.com>
> Date:   Thu Jan 14 00:37:37 2016 -0700
>
>     [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
>
>         PR tree-optimization/69270
>         * tree-ssa-dom.c (ssa_name_has_boolean_range): New function.
>         (record_edge_info): Use it.  Convert boolean_{true,false}_node
>         to the type of op0.
>
>         PR tree-optimization/69270
>         * gcc.dg/tree-ssa/pr69270.c: New test.
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index dc13621..40e3dfb 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,10 @@
> +2016-01-14  Jeff Law  <law@redhat.com>
> +
> +       PR tree-optimization/69270
> +       * tree-ssa-dom.c (ssa_name_has_boolean_range): New function.
> +       (record_edge_info): Use it.  Convert boolean_{true,false}_node
> +       to the type of op0.
> +
>  2016-01-13  Jan Hubicka  <hubicka@ucw.cz>
>
>         PR ipa/66487
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index f393e25..63976ea 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2016-01-14  Jeff Law  <law@redhat.com>
> +
> +       PR tree-optimization/69270
> +       * gcc.dg/tree-ssa/pr69270.c: New test.
> +
>  2016-01-13  Bernd Schmidt  <bschmidt@redhat.com>
>
>         PR c/66208
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
> b/gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
> new file mode 100644
> index 0000000..529b521
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
> @@ -0,0 +1,42 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */
> +
> +/* There should be two references to bufferstep that turn into
> +   constants.  */
> +/* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
> constant .0." 1 "dom3"} } */
> +/* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
> constant .1." 1 "dom3"} } */
> +
> +/* And some assignments ought to fold down to constants.  */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"}
> } */
> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"}
> } */
> +
> +/* The XOR operations should have been optimized to constants.  */
> +/* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */
> +
> +
> +extern int *stepsizeTable;
> +
> +void
> +adpcm_coder (signed char *outdata, int len)
> +{
> +  signed char *outp;
> +  int delta;
> +  int outputbuffer;
> +  int bufferstep = 0;
> +  outp = (signed char *) outdata;
> +  int step = 0;
> +  int index = 0;
> +  int diff = 0;
> +  for (; len > 0; len--)
> +    {
> +      delta = 0;
> +      if (diff >= step)
> +       delta = 4;
> +      step = stepsizeTable[index];
> +      if (bufferstep)
> +       outputbuffer = (delta << 4) & 0xf0;
> +      else
> +       *outp++ = (delta & 0x0f) | outputbuffer;
> +      bufferstep = !bufferstep;
> +    }
> +}
> diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
> index 9d2e189..a9abed9 100644
> --- a/gcc/tree-ssa-dom.c
> +++ b/gcc/tree-ssa-dom.c
> @@ -316,6 +316,38 @@ record_conditions (struct edge_info *edge_info, tree
> cond, tree inverted)
>    edge_info->cond_equivalences.safe_push (c);
>  }
>
> +/* Return TRUE is OP, an SSA_NAME has a range of values [0..1], false
> +   otherwise.
> +
> +   This can be because it is a boolean type, any type with
> +   a single bit of precision, or has known range of values
> +   it might old of [0..1] via VRP analysis.  */
> +
> +static bool
> +ssa_name_has_boolean_range (tree op)
> +{
> +  /* Boolean types always have a range [0..1].  */
> +  if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE)
> +    return true;
> +
> +  /* An integral type with a single bit of precision.  */
> +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
> +      && TYPE_PRECISION (TREE_TYPE (op)) == 1)
> +    return true;
> +
> +  /* An integral type with more precision, but the object
> +     only takes on values [0..1] as determined by VRP
> +     analysis.  */
> +  wide_int min, max;
> +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
> +      && get_range_info (op, &min, &max) == VR_RANGE
> +      && wi::eq_p (min, 0)
> +      && wi::eq_p (max, 1))
> +    return true;
> +
> +  return false;
> +}
> +
>  /* We have finished optimizing BB, record any information implied by
>     taking a specific outgoing edge from BB.  */
>
> @@ -390,36 +422,32 @@ record_edge_info (basic_block bb)
>               can record an equivalence for OP0 rather than COND.  */
>            if ((code == EQ_EXPR || code == NE_EXPR)
>                && TREE_CODE (op0) == SSA_NAME
> -              && TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE
> +             && ssa_name_has_boolean_range (op0)
>                && is_gimple_min_invariant (op1))
>              {
> +             tree true_val = fold_convert (TREE_TYPE (op0),
> +                                           boolean_true_node);
> +             tree false_val = fold_convert (TREE_TYPE (op0),
> +                                            boolean_false_node);

Apart from what Jakub said we have constant_boolean_node for this,

  true_val = constant_boolean_node (true, TREE_TYPE (op0));
...

>                if (code == EQ_EXPR)
>                  {
>                    edge_info = allocate_edge_info (true_edge);
>                    edge_info->lhs = op0;
> -                  edge_info->rhs = (integer_zerop (op1)
> -                                    ? boolean_false_node
> -                                    : boolean_true_node);
> +                  edge_info->rhs = (integer_zerop (op1) ? false_val :
> true_val);
>
>                    edge_info = allocate_edge_info (false_edge);
>                    edge_info->lhs = op0;
> -                  edge_info->rhs = (integer_zerop (op1)
> -                                    ? boolean_true_node
> -                                    : boolean_false_node);
> +                  edge_info->rhs = (integer_zerop (op1) ? true_val :
> false_val);
>                  }
>                else
>                  {
>                    edge_info = allocate_edge_info (true_edge);
>                    edge_info->lhs = op0;
> -                  edge_info->rhs = (integer_zerop (op1)
> -                                    ? boolean_true_node
> -                                    : boolean_false_node);
> +                  edge_info->rhs = (integer_zerop (op1) ? true_val :
> false_val);
>
>                    edge_info = allocate_edge_info (false_edge);
>                    edge_info->lhs = op0;
> -                  edge_info->rhs = (integer_zerop (op1)
> -                                    ? boolean_false_node
> -                                    : boolean_true_node);
> +                  edge_info->rhs = (integer_zerop (op1) ? false_val :
> true_val);
>                  }
>              }
>            else if (is_gimple_min_invariant (op0)
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-14  7:49   ` Jakub Jelinek
@ 2016-01-14 18:15     ` Jeff Law
  2016-01-15 22:32       ` Jeff Law
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Law @ 2016-01-14 18:15 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On 01/14/2016 12:49 AM, Jakub Jelinek wrote:
> On Thu, Jan 14, 2016 at 08:46:43AM +0100, Jakub Jelinek wrote:
>> On Thu, Jan 14, 2016 at 12:38:52AM -0700, Jeff Law wrote:
>>> +  /* An integral type with more precision, but the object
>>> +     only takes on values [0..1] as determined by VRP
>>> +     analysis.  */
>>> +  wide_int min, max;
>>> +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
>>> +      && get_range_info (op, &min, &max) == VR_RANGE
>>> +      && wi::eq_p (min, 0)
>>> +      && wi::eq_p (max, 1))
>>> +    return true;
>>
>> You could use and/or:
>>    if (INTEGRAL_TYPE_P (TREE_TYPE (op)) && wi::eq_p (get_nonzero_bits (op), 1))
>> set_range_info for VR_RANGE should usually update also the non-zero bits, but
>> set_nonzero_bits does not update the recorded range.
>
> Though, that would need to be limited to TYPE_PRECISION (TREE_TYPE (op)) > 1
> or TYPE_UNSIGNED.
Quite surprisingly, this does seem to do better fairly often.  Usually 
it's just getting more constants into the PHI nodes without further 
improvements.  However occasionally I see a PHI that turns into a 
constant, which is then propagated to a use where we're able to simplify 
some arithmetic/logical.

Unfortunately it doesn't make a bit of difference in the final output, 
so something post DOM was picking up these anyway (most likely VRP2). 
But I'm a fan of getting stuff optimized earlier in the pipeline when 
it's reasonable to do so, and this seems reasonable.

Limiting to TYPE_PRECISION > 1 or TYPE_UNSIGNED ought to be trivial.

>
> BTW,
>
> +  /* An integral type with a single bit of precision.  */
> +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
> +      && TYPE_PRECISION (TREE_TYPE (op)) == 1)
> +    return true;
>
> does not guarantee values 0, 1, it can mean either 0, 1 or -1, 0.  So, if
> -1, 0 is unacceptable, you need to use TYPE_UNSIGNED (TREE_TYPE (op)) too.
Hmm, then I think we've got other places that likely need updating.  I 
remember adding similar code elsewhere in the past couple years.  I'll 
have to do a little auditing.

jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-14  9:47 ` Richard Biener
@ 2016-01-14 18:27   ` Jeff Law
  2016-01-15  2:49     ` Jeff Law
  0 siblings, 1 reply; 14+ messages in thread
From: Jeff Law @ 2016-01-14 18:27 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On 01/14/2016 02:47 AM, Richard Biener wrote:
> On Thu, Jan 14, 2016 at 8:38 AM, Jeff Law <law@redhat.com> wrote:
>>
>> As noted in the BZ, DOM does not exploit VRP information to create
>> additional equivalences in the arms of conditionals.
>>
>> This can cause DOM to miss relatively simple optimizations that show up in
>> the adpcm benchmark as well as within GCC itself.
>>
>> The fix is quite simple -- we just need to extend the code which used the
>> implicit range of booleans to propagate on both sides of equality
>> conditionals to use VRP information as well.
>>
>> Once done we also need to make sure to convert the true/false value we're
>> propagating to the right type.  Again, trivial.
>>
>> The testcase is derived from the adpcm benchmark.  It checks that we
>> optimize away the bit-xor on both arms of the conditional and that in turn
>> allows other values to be discovered as constants.
>>
>> Bootstrapped and regression tested on x86_64.  Installed on the trunk.
>>
>>
>>
>> Jeff
>>
>> commit e9d42e91d1e88ece5be38dbde81843e516b327e0
>> Author: Jeff Law <law@redhat.com>
>> Date:   Thu Jan 14 00:37:37 2016 -0700
>>
>>      [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
>>
>>          PR tree-optimization/69270
>>          * tree-ssa-dom.c (ssa_name_has_boolean_range): New function.
>>          (record_edge_info): Use it.  Convert boolean_{true,false}_node
>>          to the type of op0.
>>
>>          PR tree-optimization/69270
>>          * gcc.dg/tree-ssa/pr69270.c: New test.
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index dc13621..40e3dfb 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,10 @@
>> +2016-01-14  Jeff Law  <law@redhat.com>
>> +
>> +       PR tree-optimization/69270
>> +       * tree-ssa-dom.c (ssa_name_has_boolean_range): New function.
>> +       (record_edge_info): Use it.  Convert boolean_{true,false}_node
>> +       to the type of op0.
>> +
>>   2016-01-13  Jan Hubicka  <hubicka@ucw.cz>
>>
>>          PR ipa/66487
>> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
>> index f393e25..63976ea 100644
>> --- a/gcc/testsuite/ChangeLog
>> +++ b/gcc/testsuite/ChangeLog
>> @@ -1,3 +1,8 @@
>> +2016-01-14  Jeff Law  <law@redhat.com>
>> +
>> +       PR tree-optimization/69270
>> +       * gcc.dg/tree-ssa/pr69270.c: New test.
>> +
>>   2016-01-13  Bernd Schmidt  <bschmidt@redhat.com>
>>
>>          PR c/66208
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
>> b/gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
>> new file mode 100644
>> index 0000000..529b521
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69270.c
>> @@ -0,0 +1,42 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fsplit-paths -fdump-tree-dom3-details" } */
>> +
>> +/* There should be two references to bufferstep that turn into
>> +   constants.  */
>> +/* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
>> constant .0." 1 "dom3"} } */
>> +/* { dg-final { scan-tree-dump-times "Replaced .bufferstep_\[0-9\]+. with
>> constant .1." 1 "dom3"} } */
>> +
>> +/* And some assignments ought to fold down to constants.  */
>> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 1;" 2 "dom3"}
>> } */
>> +/* { dg-final { scan-tree-dump-times "Folded to: _\[0-9\]+ = 0;" 2 "dom3"}
>> } */
>> +
>> +/* The XOR operations should have been optimized to constants.  */
>> +/* { dg-final { scan-tree-dump-not "bit_xor" "dom3"} } */
>> +
>> +
>> +extern int *stepsizeTable;
>> +
>> +void
>> +adpcm_coder (signed char *outdata, int len)
>> +{
>> +  signed char *outp;
>> +  int delta;
>> +  int outputbuffer;
>> +  int bufferstep = 0;
>> +  outp = (signed char *) outdata;
>> +  int step = 0;
>> +  int index = 0;
>> +  int diff = 0;
>> +  for (; len > 0; len--)
>> +    {
>> +      delta = 0;
>> +      if (diff >= step)
>> +       delta = 4;
>> +      step = stepsizeTable[index];
>> +      if (bufferstep)
>> +       outputbuffer = (delta << 4) & 0xf0;
>> +      else
>> +       *outp++ = (delta & 0x0f) | outputbuffer;
>> +      bufferstep = !bufferstep;
>> +    }
>> +}
>> diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
>> index 9d2e189..a9abed9 100644
>> --- a/gcc/tree-ssa-dom.c
>> +++ b/gcc/tree-ssa-dom.c
>> @@ -316,6 +316,38 @@ record_conditions (struct edge_info *edge_info, tree
>> cond, tree inverted)
>>     edge_info->cond_equivalences.safe_push (c);
>>   }
>>
>> +/* Return TRUE is OP, an SSA_NAME has a range of values [0..1], false
>> +   otherwise.
>> +
>> +   This can be because it is a boolean type, any type with
>> +   a single bit of precision, or has known range of values
>> +   it might old of [0..1] via VRP analysis.  */
>> +
>> +static bool
>> +ssa_name_has_boolean_range (tree op)
>> +{
>> +  /* Boolean types always have a range [0..1].  */
>> +  if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE)
>> +    return true;
>> +
>> +  /* An integral type with a single bit of precision.  */
>> +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
>> +      && TYPE_PRECISION (TREE_TYPE (op)) == 1)
>> +    return true;
>> +
>> +  /* An integral type with more precision, but the object
>> +     only takes on values [0..1] as determined by VRP
>> +     analysis.  */
>> +  wide_int min, max;
>> +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
>> +      && get_range_info (op, &min, &max) == VR_RANGE
>> +      && wi::eq_p (min, 0)
>> +      && wi::eq_p (max, 1))
>> +    return true;
>> +
>> +  return false;
>> +}
>> +
>>   /* We have finished optimizing BB, record any information implied by
>>      taking a specific outgoing edge from BB.  */
>>
>> @@ -390,36 +422,32 @@ record_edge_info (basic_block bb)
>>                can record an equivalence for OP0 rather than COND.  */
>>             if ((code == EQ_EXPR || code == NE_EXPR)
>>                 && TREE_CODE (op0) == SSA_NAME
>> -              && TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE
>> +             && ssa_name_has_boolean_range (op0)
>>                 && is_gimple_min_invariant (op1))
>>               {
>> +             tree true_val = fold_convert (TREE_TYPE (op0),
>> +                                           boolean_true_node);
>> +             tree false_val = fold_convert (TREE_TYPE (op0),
>> +                                            boolean_false_node);
>
> Apart from what Jakub said we have constant_boolean_node for this,
>
>    true_val = constant_boolean_node (true, TREE_TYPE (op0));
Will update.  Thanks.

jeff

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-14 18:27   ` Jeff Law
@ 2016-01-15  2:49     ` Jeff Law
  0 siblings, 0 replies; 14+ messages in thread
From: Jeff Law @ 2016-01-15  2:49 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

[-- Attachment #1: Type: text/plain, Size: 685 bytes --]

On 01/14/2016 11:27 AM, Jeff Law wrote:
>
>> Apart from what Jakub said we have constant_boolean_node for this,
>>
>>    true_val = constant_boolean_node (true, TREE_TYPE (op0));
> Will update.  Thanks.
Here's the patch which uses constant_boolean_node and verifies the type 
is unsigned when it has a single bit of precision case.

The other part of Jakub's suggestion regresses code generation in the 
final assembly -- we're getting some constant initializations when 
they're not needed.  I'm highly confident we just need to mirror a few 
bits into the uncprop pass, which I'll be looking at directly.

Bootstrapped and regression tested on x86_64.  Installed on the trunk.

Jeff

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 2585 bytes --]

commit d1bdc38e96ec5ae607a472968134e5d8b0ac4456
Author: law <law@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Fri Jan 15 02:45:44 2016 +0000

           PR tree-optimization/69270
            * tree-ssa-dom.c (ssa_name_has_boolean_range): If the type has a
            single bit of precision, verify it's also unsigned.
            (record_edge_info): Use constant_boolean_node rather than fold_convert
            to convert boolean_true/boolean_false to the right type.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@232399 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 790662511..6f9c6a5 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@
+2016-01-14  Jeff Law  <law@redhat.com>
+
+	PR tree-optimization/69270
+	* tree-ssa-dom.c (ssa_name_has_boolean_range): If the type has a
+	single bit of precision, verify it's also unsigned.
+	(record_edge_info): Use constant_boolean_node rather than fold_convert
+	to convert boolean_true/boolean_false to the right type.
+
 2016-01-14  Richard Henderson  <rth@redhat.com>
 
 	PR rtl-opt/69014
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index da4faca..f2257b3 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -319,8 +319,8 @@ record_conditions (struct edge_info *edge_info, tree cond, tree inverted)
 /* Return TRUE is OP, an SSA_NAME has a range of values [0..1], false
    otherwise.
 
-   This can be because it is a boolean type, any type with
-   a single bit of precision, or has known range of [0..1]
+   This can be because it is a boolean type, any unsigned integral
+   type with a single bit of precision, or has known range of [0..1]
    via VRP analysis.  */
 
 static bool
@@ -332,6 +332,7 @@ ssa_name_has_boolean_range (tree op)
 
   /* An integral type with a single bit of precision.  */
   if (INTEGRAL_TYPE_P (TREE_TYPE (op))
+      && TYPE_UNSIGNED (TREE_TYPE (op))
       && TYPE_PRECISION (TREE_TYPE (op)) == 1)
     return true;
 
@@ -425,10 +426,9 @@ record_edge_info (basic_block bb)
 	      && ssa_name_has_boolean_range (op0)
               && is_gimple_min_invariant (op1))
             {
-	      tree true_val = fold_convert (TREE_TYPE (op0),
-					    boolean_true_node);
-	      tree false_val = fold_convert (TREE_TYPE (op0),
-					     boolean_false_node);
+	      tree true_val = constant_boolean_node (true, TREE_TYPE (op0));
+	      tree false_val = constant_boolean_node (false, TREE_TYPE (op0));
+
               if (code == EQ_EXPR)
                 {
                   edge_info = allocate_edge_info (true_edge);

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-14 18:15     ` Jeff Law
@ 2016-01-15 22:32       ` Jeff Law
  2016-01-16  0:31         ` Jakub Jelinek
                           ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Jeff Law @ 2016-01-15 22:32 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 2832 bytes --]

On 01/14/2016 11:14 AM, Jeff Law wrote:
> On 01/14/2016 12:49 AM, Jakub Jelinek wrote:
>> On Thu, Jan 14, 2016 at 08:46:43AM +0100, Jakub Jelinek wrote:
>>> On Thu, Jan 14, 2016 at 12:38:52AM -0700, Jeff Law wrote:
>>>> +  /* An integral type with more precision, but the object
>>>> +     only takes on values [0..1] as determined by VRP
>>>> +     analysis.  */
>>>> +  wide_int min, max;
>>>> +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
>>>> +      && get_range_info (op, &min, &max) == VR_RANGE
>>>> +      && wi::eq_p (min, 0)
>>>> +      && wi::eq_p (max, 1))
>>>> +    return true;
>>>
>>> You could use and/or:
>>>    if (INTEGRAL_TYPE_P (TREE_TYPE (op)) && wi::eq_p (get_nonzero_bits
>>> (op), 1))
>>> set_range_info for VR_RANGE should usually update also the non-zero
>>> bits, but
>>> set_nonzero_bits does not update the recorded range.
>>
>> Though, that would need to be limited to TYPE_PRECISION (TREE_TYPE
>> (op)) > 1
>> or TYPE_UNSIGNED.
> Quite surprisingly, this does seem to do better fairly often.  Usually
> it's just getting more constants into the PHI nodes without further
> improvements.  However occasionally I see a PHI that turns into a
> constant, which is then propagated to a use where we're able to simplify
> some arithmetic/logical.
>
> Unfortunately it doesn't make a bit of difference in the final output,
> so something post DOM was picking up these anyway (most likely VRP2).
> But I'm a fan of getting stuff optimized earlier in the pipeline when
> it's reasonable to do so, and this seems reasonable.
>
> Limiting to TYPE_PRECISION > 1 or TYPE_UNSIGNED ought to be trivial.
So further testing did show some code regressions from this improvement. 
  Specifically we were clearly better at propagating boolean values 
derived from test conditions into PHIs (and ultimately into real 
statements as well).  That was the purpose of the patch.

Where we took a small step backwards was the out-of-ssa translation and 
RTL expansion.  A constant argument in a PHI generates a constant load 
at RTL time.  We have uncprop to detect cases where there are already 
objects holding the value we want and just before out-of-ssa we 
un-propagate the constant.  When the object holding the value we want 
coalesces with the LHS of the PHI (which is most of the time) we win.

uncprop wasn't catching these new cases where we'd propagated constants 
more aggressively into PHI nodes.   This patch fixes that problem.

In all, this is a very small improvement in the generated code.  It may 
ultimately prove more useful in the future to drive path partitioning.

There's two small tests.  One verifies we're able to propagate more 
constants per the original intent of the patch.  The other verifies we 
un-propagate as well.

Bootstrapped and regression tested on x86_64.  Installed on the trunk.

jeff



[-- Attachment #2: P --]
[-- Type: text/plain, Size: 9513 bytes --]

commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0
Author: Jeff Law <law@tor.usersys.redhat.com>
Date:   Fri Jan 15 17:15:24 2016 -0500

    	PR tree-optimization/69270
    	* tree-ssanames.c (ssa_name_has_boolean_range): Moved here from
    	tree-ssa-dom.c.  Improve test for [0..1] ranve from VRP.
    	* tree-ssa-dom.c (ssa_name_has_boolean_range): Remove.
    	* tree-ssanames.h (ssa_name_has_boolean_range): Prototype.
    	* tree-ssa-uncprop.c (associate_equivalences_with_edges): Use
    	ssa_name_has_boolean_range and constant_boolean_node.
    
    	PR tree-optimization/69270
    	* gcc.dg/tree-ssa/pr69270-2.c: New test.
    	* gcc.dg/tree-ssa/pr69270-3.c: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e3dc328..409e981 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,13 @@
+2016-01-15  Jeff Law  <law@redhat.com>
+
+	PR tree-optimization/69270
+	* tree-ssanames.c (ssa_name_has_boolean_range): Moved here from
+	tree-ssa-dom.c.  Improve test for [0..1] ranve from VRP.
+	* tree-ssa-dom.c (ssa_name_has_boolean_range): Remove.
+	* tree-ssanames.h (ssa_name_has_boolean_range): Prototype.
+	* tree-ssa-uncprop.c (associate_equivalences_with_edges): Use
+	ssa_name_has_boolean_range and constant_boolean_node.
+
 2016-01-15  Vladimir Makarov  <vmakarov@redhat.com>
 
 	PR rtl-optimization/69030
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 29291a2..d9a9246 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,9 @@
+2016-01-15  Jeff Law  <law@redhat.com>
+
+	PR tree-optimization/69270
+	* gcc.dg/tree-ssa/pr69270-2.c: New test.
+	* gcc.dg/tree-ssa/pr69270-3.c: New test.
+
 2016-01-15  Paul Thomas  <pault@gcc.gnu.org>
 
 	PR fortran/49630
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69270-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-2.c
new file mode 100644
index 0000000..15c7bdd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-2.c
@@ -0,0 +1,52 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dom3-details -w" } */
+
+/* There should be a reference to usecount that turn into
+   constants.  */
+/* { dg-final { scan-tree-dump-times "Replaced .usecount_\[0-9\]+. with constant .1." 1 "dom3"} } */
+
+/* And an assignment using usecount ought to fold down to constants.  */
+/* { dg-final { scan-tree-dump-times "Folded to: usecount_\[0-9\]+ = 2;" 1 "dom3"} } */
+
+/* The arithmetic using usecount should be gone, except for the one in the
+   details debugging.  */
+/* { dg-final { scan-tree-dump-times "usecount_\[0-9\]+ = usecount_\[0-9\]+ . 1;" 1 "dom3"} } */
+
+typedef union tree_node *tree;
+typedef union gimple_statement_d *gimple;
+extern const int tree_code_type[];
+union tree_node
+{
+  int code:16;
+};
+typedef struct immediate_use_iterator_d
+{
+}
+imm_use_iterator;
+void
+insert_debug_temp_for_var_def (gimple stmt)
+{
+  gimple def_stmt = ((void *) 0);
+  int usecount = 0;
+  tree value = ((void *) 0);
+  for (; arf ();)
+    {
+      if (!gimple_debug_bind_p (stmt))
+        continue;
+      if (usecount++)
+        break;
+      unsigned char no_value = 0;
+      if (!gimple_bb (def_stmt))
+        no_value = 1;
+      if (!no_value)
+        value = gimple_assign_rhs_to_tree ();
+    }
+  if (value)
+    {
+      if ((tree_code_type[(int) (((value)->code))] == 42)
+          || (usecount == 1 && (is_gimple_min_invariant (value))))
+        value = unshare_expr (value);
+    }
+}
+
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c
new file mode 100644
index 0000000..89735f6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr69270-3.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-uncprop-details -w" } */
+
+/* We're looking for a constant argument a PHI node.  There
+   should only be one if we unpropagate correctly.  */
+/* { dg-final { scan-tree-dump-times ", 1" 1 "uncprop1"} } */
+
+typedef long unsigned int size_t;
+typedef union gimple_statement_d *gimple;
+unsigned char
+propagate_with_phi ()
+{
+  gimple use_stmt;
+  unsigned char phi_inserted;
+  phi_inserted = 0;
+  for (; !end_imm_use_stmt_p (); next_imm_use_stmt ())
+    {
+      if (!(arf () == 10 && boo () == 20))
+        continue;
+      if (!phi_inserted)
+        phi_inserted = 1;
+      else
+        update_stmt ();
+    }
+}
+
diff --git a/gcc/tree-ssa-dom.c b/gcc/tree-ssa-dom.c
index f2257b3..8298637 100644
--- a/gcc/tree-ssa-dom.c
+++ b/gcc/tree-ssa-dom.c
@@ -316,39 +316,6 @@ record_conditions (struct edge_info *edge_info, tree cond, tree inverted)
   edge_info->cond_equivalences.safe_push (c);
 }
 
-/* Return TRUE is OP, an SSA_NAME has a range of values [0..1], false
-   otherwise.
-
-   This can be because it is a boolean type, any unsigned integral
-   type with a single bit of precision, or has known range of [0..1]
-   via VRP analysis.  */
-
-static bool
-ssa_name_has_boolean_range (tree op)
-{
-  /* Boolean types always have a range [0..1].  */
-  if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE)
-    return true;
-
-  /* An integral type with a single bit of precision.  */
-  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
-      && TYPE_UNSIGNED (TREE_TYPE (op))
-      && TYPE_PRECISION (TREE_TYPE (op)) == 1)
-    return true;
-
-  /* An integral type with more precision, but the object
-     only takes on values [0..1] as determined by VRP
-     analysis.  */
-  wide_int min, max;
-  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
-      && get_range_info (op, &min, &max) == VR_RANGE
-      && wi::eq_p (min, 0)
-      && wi::eq_p (max, 1))
-    return true;
-
-  return false;
-}
-
 /* We have finished optimizing BB, record any information implied by
    taking a specific outgoing edge from BB.  */
 
diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c
index 4b57578..307bb1f 100644
--- a/gcc/tree-ssa-uncprop.c
+++ b/gcc/tree-ssa-uncprop.c
@@ -94,23 +94,26 @@ associate_equivalences_with_edges (void)
 		 can record an equivalence for OP0 rather than COND.  */
 	      if (TREE_CODE (op0) == SSA_NAME
 		  && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op0)
-		  && TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE
+		  && ssa_name_has_boolean_range (op0)
 		  && is_gimple_min_invariant (op1))
 		{
+		  tree true_val = constant_boolean_node (true, TREE_TYPE (op0));
+		  tree false_val = constant_boolean_node (false,
+							  TREE_TYPE (op0));
 		  if (code == EQ_EXPR)
 		    {
 		      equivalency = XNEW (struct edge_equivalency);
 		      equivalency->lhs = op0;
 		      equivalency->rhs = (integer_zerop (op1)
-					  ? boolean_false_node
-					  : boolean_true_node);
+					  ? false_val
+					  : true_val);
 		      true_edge->aux = equivalency;
 
 		      equivalency = XNEW (struct edge_equivalency);
 		      equivalency->lhs = op0;
 		      equivalency->rhs = (integer_zerop (op1)
-					  ? boolean_true_node
-					  : boolean_false_node);
+					  ? true_val
+					  : false_val);
 		      false_edge->aux = equivalency;
 		    }
 		  else
@@ -118,15 +121,15 @@ associate_equivalences_with_edges (void)
 		      equivalency = XNEW (struct edge_equivalency);
 		      equivalency->lhs = op0;
 		      equivalency->rhs = (integer_zerop (op1)
-					  ? boolean_true_node
-					  : boolean_false_node);
+					  ? true_val
+					  : false_val);
 		      true_edge->aux = equivalency;
 
 		      equivalency = XNEW (struct edge_equivalency);
 		      equivalency->lhs = op0;
 		      equivalency->rhs = (integer_zerop (op1)
-					  ? boolean_false_node
-					  : boolean_true_node);
+					  ? false_val
+					  : true_val);
 		      false_edge->aux = equivalency;
 		    }
 		}
diff --git a/gcc/tree-ssanames.c b/gcc/tree-ssanames.c
index 82866b2..b6f72e2 100644
--- a/gcc/tree-ssanames.c
+++ b/gcc/tree-ssanames.c
@@ -411,6 +411,40 @@ get_nonzero_bits (const_tree name)
   return ri->get_nonzero_bits ();
 }
 
+/* Return TRUE is OP, an SSA_NAME has a range of values [0..1], false
+   otherwise.
+
+   This can be because it is a boolean type, any unsigned integral
+   type with a single bit of precision, or has known range of [0..1]
+   via VRP analysis.  */
+
+bool
+ssa_name_has_boolean_range (tree op)
+{
+  gcc_assert (TREE_CODE (op) == SSA_NAME);
+
+  /* Boolean types always have a range [0..1].  */
+  if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE)
+    return true;
+
+  /* An integral type with a single bit of precision.  */
+  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
+      && TYPE_UNSIGNED (TREE_TYPE (op))
+      && TYPE_PRECISION (TREE_TYPE (op)) == 1)
+    return true;
+
+  /* An integral type with more precision, but the object
+     only takes on values [0..1] as determined by VRP
+     analysis.  */
+  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
+      && (TYPE_PRECISION (TREE_TYPE (op)) > 1
+	  || TYPE_UNSIGNED (TREE_TYPE (op)))
+      && wi::eq_p (get_nonzero_bits (op), 1))
+    return true;
+
+  return false;
+}
+
 /* We no longer need the SSA_NAME expression VAR, release it so that
    it may be reused.
 
diff --git a/gcc/tree-ssanames.h b/gcc/tree-ssanames.h
index d8ce684..c81b1a1 100644
--- a/gcc/tree-ssanames.h
+++ b/gcc/tree-ssanames.h
@@ -75,6 +75,7 @@ extern enum value_range_type get_range_info (const_tree, wide_int *,
 					     wide_int *);
 extern void set_nonzero_bits (tree, const wide_int_ref &);
 extern wide_int get_nonzero_bits (const_tree);
+extern bool ssa_name_has_boolean_range (tree);
 extern void init_ssanames (struct function *, int);
 extern void fini_ssanames (struct function *);
 extern void ssanames_print_statistics (void);

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-15 22:32       ` Jeff Law
@ 2016-01-16  0:31         ` Jakub Jelinek
  2016-01-18 11:31         ` Andreas Schwab
  2016-02-28 20:33         ` H.J. Lu
  2 siblings, 0 replies; 14+ messages in thread
From: Jakub Jelinek @ 2016-01-16  0:31 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Fri, Jan 15, 2016 at 03:32:33PM -0700, Jeff Law wrote:
> +bool
> +ssa_name_has_boolean_range (tree op)
> +{
> +  gcc_assert (TREE_CODE (op) == SSA_NAME);
> +
> +  /* Boolean types always have a range [0..1].  */
> +  if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE)
> +    return true;
> +
> +  /* An integral type with a single bit of precision.  */
> +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
> +      && TYPE_UNSIGNED (TREE_TYPE (op))
> +      && TYPE_PRECISION (TREE_TYPE (op)) == 1)
> +    return true;
> +
> +  /* An integral type with more precision, but the object
> +     only takes on values [0..1] as determined by VRP
> +     analysis.  */
> +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
> +      && (TYPE_PRECISION (TREE_TYPE (op)) > 1
> +	  || TYPE_UNSIGNED (TREE_TYPE (op)))

I think this || TYPE_UNSIGNED (TREE_TYPE (op)) is useless.
Because, if TYPE_PRECISION (TREE_TYPE (op)) > 1, then both signed and
unsigned is fine, and if precision is 1, then already the earlier if
handled it, and precision 0 is hopefully invalid.

> +      && wi::eq_p (get_nonzero_bits (op), 1))
> +    return true;
> +
> +  return false;
> +}

	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-15 22:32       ` Jeff Law
  2016-01-16  0:31         ` Jakub Jelinek
@ 2016-01-18 11:31         ` Andreas Schwab
  2016-01-18 11:38           ` Kyrill Tkachov
  2016-02-28 20:33         ` H.J. Lu
  2 siblings, 1 reply; 14+ messages in thread
From: Andreas Schwab @ 2016-01-18 11:31 UTC (permalink / raw)
  To: Jeff Law; +Cc: Jakub Jelinek, gcc-patches

Jeff Law <law@redhat.com> writes:

> commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0
> Author: Jeff Law <law@tor.usersys.redhat.com>
> Date:   Fri Jan 15 17:15:24 2016 -0500
>
>     	PR tree-optimization/69270
>     	* tree-ssanames.c (ssa_name_has_boolean_range): Moved here from
>     	tree-ssa-dom.c.  Improve test for [0..1] ranve from VRP.
>     	* tree-ssa-dom.c (ssa_name_has_boolean_range): Remove.
>     	* tree-ssanames.h (ssa_name_has_boolean_range): Prototype.
>     	* tree-ssa-uncprop.c (associate_equivalences_with_edges): Use
>     	ssa_name_has_boolean_range and constant_boolean_node.
>     
>     	PR tree-optimization/69270
>     	* gcc.dg/tree-ssa/pr69270-2.c: New test.
>     	* gcc.dg/tree-ssa/pr69270-3.c: New test.

This breaks gcc.target/aarch64/tst_3.c.

 	//.tune generic
 	.type	f1, %function
 f1:
-	tst	x0, 1
-	csinc	w0, w0, wzr, eq
+	ands	w1, w0, 1
+	csel	w0, w1, w0, ne
 	ret
 	.size	f1, .-f1

Andreas.

-- 
Andreas Schwab, SUSE Labs, schwab@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-18 11:31         ` Andreas Schwab
@ 2016-01-18 11:38           ` Kyrill Tkachov
  2016-01-18 11:50             ` Jakub Jelinek
  0 siblings, 1 reply; 14+ messages in thread
From: Kyrill Tkachov @ 2016-01-18 11:38 UTC (permalink / raw)
  To: Andreas Schwab, Jeff Law; +Cc: Jakub Jelinek, gcc-patches


On 18/01/16 11:31, Andreas Schwab wrote:
> Jeff Law <law@redhat.com> writes:
>
>> commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0
>> Author: Jeff Law <law@tor.usersys.redhat.com>
>> Date:   Fri Jan 15 17:15:24 2016 -0500
>>
>>      	PR tree-optimization/69270
>>      	* tree-ssanames.c (ssa_name_has_boolean_range): Moved here from
>>      	tree-ssa-dom.c.  Improve test for [0..1] ranve from VRP.
>>      	* tree-ssa-dom.c (ssa_name_has_boolean_range): Remove.
>>      	* tree-ssanames.h (ssa_name_has_boolean_range): Prototype.
>>      	* tree-ssa-uncprop.c (associate_equivalences_with_edges): Use
>>      	ssa_name_has_boolean_range and constant_boolean_node.
>>      
>>      	PR tree-optimization/69270
>>      	* gcc.dg/tree-ssa/pr69270-2.c: New test.
>>      	* gcc.dg/tree-ssa/pr69270-3.c: New test.
> This breaks gcc.target/aarch64/tst_3.c.
>
>   	//.tune generic
>   	.type	f1, %function
>   f1:
> -	tst	x0, 1
> -	csinc	w0, w0, wzr, eq
> +	ands	w1, w0, 1
> +	csel	w0, w1, w0, ne
>   	ret
>   	.size	f1, .-f1

The two sequences look equally valid to me.
Instead of doing an and-compare followed by a conditional increment
we do an and-compare followed by a conditional select (without discarding
the result of the and).
So the testcase should be adjusted.
I'll do it.

Thanks,
Kyrill

> Andreas.
>

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-18 11:38           ` Kyrill Tkachov
@ 2016-01-18 11:50             ` Jakub Jelinek
  2016-01-18 11:53               ` Kyrill Tkachov
  0 siblings, 1 reply; 14+ messages in thread
From: Jakub Jelinek @ 2016-01-18 11:50 UTC (permalink / raw)
  To: Kyrill Tkachov; +Cc: Andreas Schwab, Jeff Law, gcc-patches

On Mon, Jan 18, 2016 at 11:38:37AM +0000, Kyrill Tkachov wrote:
> On 18/01/16 11:31, Andreas Schwab wrote:
> >Jeff Law <law@redhat.com> writes:
> >
> >>commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0
> >>Author: Jeff Law <law@tor.usersys.redhat.com>
> >>Date:   Fri Jan 15 17:15:24 2016 -0500
> >>
> >>     	PR tree-optimization/69270
> >>     	* tree-ssanames.c (ssa_name_has_boolean_range): Moved here from
> >>     	tree-ssa-dom.c.  Improve test for [0..1] ranve from VRP.
> >>     	* tree-ssa-dom.c (ssa_name_has_boolean_range): Remove.
> >>     	* tree-ssanames.h (ssa_name_has_boolean_range): Prototype.
> >>     	* tree-ssa-uncprop.c (associate_equivalences_with_edges): Use
> >>     	ssa_name_has_boolean_range and constant_boolean_node.
> >>     	PR tree-optimization/69270
> >>     	* gcc.dg/tree-ssa/pr69270-2.c: New test.
> >>     	* gcc.dg/tree-ssa/pr69270-3.c: New test.
> >This breaks gcc.target/aarch64/tst_3.c.
> >
> >  	//.tune generic
> >  	.type	f1, %function
> >  f1:
> >-	tst	x0, 1
> >-	csinc	w0, w0, wzr, eq
> >+	ands	w1, w0, 1
> >+	csel	w0, w1, w0, ne
> >  	ret
> >  	.size	f1, .-f1
> 
> The two sequences look equally valid to me.
> Instead of doing an and-compare followed by a conditional increment
> we do an and-compare followed by a conditional select (without discarding
> the result of the and).
> So the testcase should be adjusted.
> I'll do it.

IMHO please wait for the resolution of PR69320 here.

	Jakub

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-18 11:50             ` Jakub Jelinek
@ 2016-01-18 11:53               ` Kyrill Tkachov
  0 siblings, 0 replies; 14+ messages in thread
From: Kyrill Tkachov @ 2016-01-18 11:53 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Andreas Schwab, Jeff Law, gcc-patches


On 18/01/16 11:49, Jakub Jelinek wrote:
> On Mon, Jan 18, 2016 at 11:38:37AM +0000, Kyrill Tkachov wrote:
>> On 18/01/16 11:31, Andreas Schwab wrote:
>>> Jeff Law <law@redhat.com> writes:
>>>
>>>> commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0
>>>> Author: Jeff Law <law@tor.usersys.redhat.com>
>>>> Date:   Fri Jan 15 17:15:24 2016 -0500
>>>>
>>>>      	PR tree-optimization/69270
>>>>      	* tree-ssanames.c (ssa_name_has_boolean_range): Moved here from
>>>>      	tree-ssa-dom.c.  Improve test for [0..1] ranve from VRP.
>>>>      	* tree-ssa-dom.c (ssa_name_has_boolean_range): Remove.
>>>>      	* tree-ssanames.h (ssa_name_has_boolean_range): Prototype.
>>>>      	* tree-ssa-uncprop.c (associate_equivalences_with_edges): Use
>>>>      	ssa_name_has_boolean_range and constant_boolean_node.
>>>>      	PR tree-optimization/69270
>>>>      	* gcc.dg/tree-ssa/pr69270-2.c: New test.
>>>>      	* gcc.dg/tree-ssa/pr69270-3.c: New test.
>>> This breaks gcc.target/aarch64/tst_3.c.
>>>
>>>   	//.tune generic
>>>   	.type	f1, %function
>>>   f1:
>>> -	tst	x0, 1
>>> -	csinc	w0, w0, wzr, eq
>>> +	ands	w1, w0, 1
>>> +	csel	w0, w1, w0, ne
>>>   	ret
>>>   	.size	f1, .-f1
>> The two sequences look equally valid to me.
>> Instead of doing an and-compare followed by a conditional increment
>> we do an and-compare followed by a conditional select (without discarding
>> the result of the and).
>> So the testcase should be adjusted.
>> I'll do it.
> IMHO please wait for the resolution of PR69320 here.

Ok, but both sequences should be ok for aarch64 here.
The function in the test is:
int
f1 (int x)
{
   if (x & 1)
     return 1;
   return x;
}

and the "return 1" should just be changed to "return 2" to avoid whatever
optimiser decided to perform that transformation.

Kyrill

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM
  2016-01-15 22:32       ` Jeff Law
  2016-01-16  0:31         ` Jakub Jelinek
  2016-01-18 11:31         ` Andreas Schwab
@ 2016-02-28 20:33         ` H.J. Lu
  2 siblings, 0 replies; 14+ messages in thread
From: H.J. Lu @ 2016-02-28 20:33 UTC (permalink / raw)
  To: Jeff Law; +Cc: Jakub Jelinek, GCC Patches

On Fri, Jan 15, 2016 at 2:32 PM, Jeff Law <law@redhat.com> wrote:
> On 01/14/2016 11:14 AM, Jeff Law wrote:
>>
>> On 01/14/2016 12:49 AM, Jakub Jelinek wrote:
>>>
>>> On Thu, Jan 14, 2016 at 08:46:43AM +0100, Jakub Jelinek wrote:
>>>>
>>>> On Thu, Jan 14, 2016 at 12:38:52AM -0700, Jeff Law wrote:
>>>>>
>>>>> +  /* An integral type with more precision, but the object
>>>>> +     only takes on values [0..1] as determined by VRP
>>>>> +     analysis.  */
>>>>> +  wide_int min, max;
>>>>> +  if (INTEGRAL_TYPE_P (TREE_TYPE (op))
>>>>> +      && get_range_info (op, &min, &max) == VR_RANGE
>>>>> +      && wi::eq_p (min, 0)
>>>>> +      && wi::eq_p (max, 1))
>>>>> +    return true;
>>>>
>>>>
>>>> You could use and/or:
>>>>    if (INTEGRAL_TYPE_P (TREE_TYPE (op)) && wi::eq_p (get_nonzero_bits
>>>> (op), 1))
>>>> set_range_info for VR_RANGE should usually update also the non-zero
>>>> bits, but
>>>> set_nonzero_bits does not update the recorded range.
>>>
>>>
>>> Though, that would need to be limited to TYPE_PRECISION (TREE_TYPE
>>> (op)) > 1
>>> or TYPE_UNSIGNED.
>>
>> Quite surprisingly, this does seem to do better fairly often.  Usually
>> it's just getting more constants into the PHI nodes without further
>> improvements.  However occasionally I see a PHI that turns into a
>> constant, which is then propagated to a use where we're able to simplify
>> some arithmetic/logical.
>>
>> Unfortunately it doesn't make a bit of difference in the final output,
>> so something post DOM was picking up these anyway (most likely VRP2).
>> But I'm a fan of getting stuff optimized earlier in the pipeline when
>> it's reasonable to do so, and this seems reasonable.
>>
>> Limiting to TYPE_PRECISION > 1 or TYPE_UNSIGNED ought to be trivial.
>
> So further testing did show some code regressions from this improvement.
> Specifically we were clearly better at propagating boolean values derived
> from test conditions into PHIs (and ultimately into real statements as
> well).  That was the purpose of the patch.
>
> Where we took a small step backwards was the out-of-ssa translation and RTL
> expansion.  A constant argument in a PHI generates a constant load at RTL
> time.  We have uncprop to detect cases where there are already objects
> holding the value we want and just before out-of-ssa we un-propagate the
> constant.  When the object holding the value we want coalesces with the LHS
> of the PHI (which is most of the time) we win.
>
> uncprop wasn't catching these new cases where we'd propagated constants more
> aggressively into PHI nodes.   This patch fixes that problem.
>
> In all, this is a very small improvement in the generated code.  It may
> ultimately prove more useful in the future to drive path partitioning.
>
> There's two small tests.  One verifies we're able to propagate more
> constants per the original intent of the patch.  The other verifies we
> un-propagate as well.
>
>
> Bootstrapped and regression tested on x86_64.  Installed on the trunk.
>
> jeff
>
>
>
> commit 1384b36abcd52a7ac72ca6538afa2aed2e04f8e0
> Author: Jeff Law <law@tor.usersys.redhat.com>
> Date:   Fri Jan 15 17:15:24 2016 -0500
>
>         PR tree-optimization/69270
>         * tree-ssanames.c (ssa_name_has_boolean_range): Moved here from
>         tree-ssa-dom.c.  Improve test for [0..1] ranve from VRP.
>         * tree-ssa-dom.c (ssa_name_has_boolean_range): Remove.
>         * tree-ssanames.h (ssa_name_has_boolean_range): Prototype.
>         * tree-ssa-uncprop.c (associate_equivalences_with_edges): Use
>         ssa_name_has_boolean_range and constant_boolean_node.
>
>         PR tree-optimization/69270
>         * gcc.dg/tree-ssa/pr69270-2.c: New test.
>         * gcc.dg/tree-ssa/pr69270-3.c: New test.
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70005

H.J.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2016-02-28 20:33 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-14  7:38 [PATCH][PR tree-optimization/69270] Exploit VRP information in DOM Jeff Law
2016-01-14  7:46 ` Jakub Jelinek
2016-01-14  7:49   ` Jakub Jelinek
2016-01-14 18:15     ` Jeff Law
2016-01-15 22:32       ` Jeff Law
2016-01-16  0:31         ` Jakub Jelinek
2016-01-18 11:31         ` Andreas Schwab
2016-01-18 11:38           ` Kyrill Tkachov
2016-01-18 11:50             ` Jakub Jelinek
2016-01-18 11:53               ` Kyrill Tkachov
2016-02-28 20:33         ` H.J. Lu
2016-01-14  9:47 ` Richard Biener
2016-01-14 18:27   ` Jeff Law
2016-01-15  2:49     ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).