public inbox for gcc-patches@gcc.gnu.org
* [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
@ 2015-02-10 20:55 Jeff Law
  2015-02-11 10:56 ` Richard Biener
  2015-02-12  7:02 ` Bin.Cheng
  0 siblings, 2 replies; 10+ messages in thread
From: Jeff Law @ 2015-02-10 20:55 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1378 bytes --]


This PR was originally a minor issue where we regressed on this kind of 
sequence:

typedef struct toto_s *toto_t;
toto_t add (toto_t a, toto_t b) {
   int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
   return (toto_t)(intptr_t) tmp;
}


There was talk of trying to peephole this in the x86 backend.  But later 
Jakub speculated that if we had good type narrowing this could be done 
in the tree optimizers...
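
For reference, the end result we're after on an ILP32 target looks 
roughly like this (a hand-written sketch, not compiler output):

#include <stdint.h>

typedef struct toto_s *toto_t;

toto_t
add_narrowed (toto_t a, toto_t b)
{
  /* The arithmetic stays in 32 bits; no 64-bit temporaries needed.  */
  uintptr_t ua = (uintptr_t) a;
  uintptr_t ub = (uintptr_t) b & ~(uintptr_t) 1;
  return (toto_t) (ua + ub);
}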

Soooo, here we go.  I didn't do anything with logicals as those are 
already handled elsewhere in match.pd.  I didn't try to handle MULT; 
in the early experiments I did, it was a loss because of the existing 
mechanisms for widening multiplications.

Interestingly enough, this patch seems to help out libjava more than 
anything else in a GCC build, and it really only helps a few routines. 
There weren't any routines I could see where the code regressed after 
this patch.  This is probably an indicator that these things aren't 
*that* common, or the existing shortening code is better than we 
thought, or some important shortening case is missing.


I think we should pull the other tests from 47477 which are not 
regressions out into their own bug for future work.  Or alternatively, 
when this fix is checked in, remove the regression marker in 47477.


Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for 
the trunk?








[-- Attachment #2: P --]
[-- Type: text/plain, Size: 2959 bytes --]

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7f3816c..7a95029 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2015-02-10  Jeff Law  <law@redhat.com>
+
+	* match.pd (convert (plus/minus (convert @0) (convert @1): New
+	simplifier to narrow arithmetic.
+
 2015-02-10  Richard Biener  <rguenther@suse.de>
 
 	PR tree-optimization/64909
diff --git a/gcc/match.pd b/gcc/match.pd
index 81c4ee6..abc703e 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3.  If not see
    (logs (pows @0 @1))
    (mult @1 (logs @0)))))
 
+/* If we have a narrowing conversion of an arithmetic operation where
+   both operands are widening conversions from the same type as the outer
+   narrowing conversion.  Then convert the innermost operands to a suitable
+   unsigned type (to avoid introducing undefined behaviour), perform the
+   operation and convert the result to the desired type. 
+
+   This narrows the arithmetic operation.  */
+(for op (plus minus)
+  (simplify
+    (convert (op (convert@2 @0) (convert @1)))
+    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
+         && TREE_TYPE (@0) == type
+         && INTEGRAL_TYPE_P (type)
+         && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0))
+	 /* This prevents infinite recursion.  */
+	 && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))
+      (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
+        (convert (op (convert:utype @0) (convert:utype @1)))))))
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 15d5e2d..76e5254 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2015-02-10  Jeff Law  <law@redhat.com>
+
+	PR rtl-optimization/47477
+	* gcc.dg/tree-ssa/narrow-arith-1.c: New test.
+
 2015-02-10  Richard Biener  <rguenther@suse.de>
 
 	PR tree-optimization/64909
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
new file mode 100644
index 0000000..104cb6f5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
@@ -0,0 +1,22 @@
+/* PR tree-optimization/47477 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -w" } */
+/* { dg-require-effective-target ilp32 } */
+
+typedef int int64_t __attribute__ ((__mode__ (__DI__)));
+typedef int * intptr_t;
+
+typedef struct toto_s *toto_t;
+toto_t add (toto_t a, toto_t b) {
+  int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
+  return (toto_t)(intptr_t) tmp;
+}
+
+/* For an ILP32 target there'll be 6 casts when we start, but just 4
+   if the match.pd pattern is successfully matched.  */
+/* { dg-final { scan-tree-dump-times "= \\(int\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\(unsigned int\\)" 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\(struct toto_s \\*\\)" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
+
+


* Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
  2015-02-10 20:55 [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd Jeff Law
@ 2015-02-11 10:56 ` Richard Biener
  2015-02-11 17:05   ` Jeff Law
  2015-02-12  7:02 ` Bin.Cheng
  1 sibling, 1 reply; 10+ messages in thread
From: Richard Biener @ 2015-02-11 10:56 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Tue, Feb 10, 2015 at 9:55 PM, Jeff Law <law@redhat.com> wrote:
>
> This PR was originally a minor issue where we regressed on this kind of
> sequence:
>
> typedef struct toto_s *toto_t;
> toto_t add (toto_t a, toto_t b) {
>   int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
>   return (toto_t)(intptr_t) tmp;
> }
>
>
> There was talk of trying to peephole this in the x86 backend.  But later
> Jakub speculated that if we had good type narrowing this could be done in
> the tree optimizers...
>
> Soooo, here we go.  I didn't do anything with logicals as those are already
> handled elsewhere in match.pd.  I didn't try to handle MULT; in the early
> experiments I did, it was a loss because of the existing mechanisms for
> widening multiplications.
>
> Interestingly enough, this patch seems to help out libjava more than
> anything else in a GCC build and it really only helps a few routines. There
> weren't any routines I could see where the code regressed after this patch.
> This is probably an indicator that these things aren't *that* common, or the
> existing shortening code is better than we thought, or some important
> shortening case is missing.
>
>
> I think we should pull the other tests from 47477 which are not regressions
> out into their own bug for future work.  Or alternatively, when this fix is
> checked in, remove the regression marker in 47477.
>
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for the
> trunk?
>
>
>
>
>
>
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 7f3816c..7a95029 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10  Jeff Law  <law@redhat.com>
> +
> +       * match.pd (convert (plus/minus (convert @0) (convert @1): New
> +       simplifier to narrow arithmetic.
> +

Please reference PR47477 from here

>  2015-02-10  Richard Biener  <rguenther@suse.de>
>
>         PR tree-optimization/64909
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 81c4ee6..abc703e 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3.  If not see
>     (logs (pows @0 @1))
>     (mult @1 (logs @0)))))
>
> +/* If we have a narrowing conversion of an arithmetic operation where
> +   both operands are widening conversions from the same type as the outer
> +   narrowing conversion.  Then convert the innermost operands to a suitable
> +   unsigned type (to avoid introducing undefined behaviour), perform the
> +   operation and convert the result to the desired type.
> +
> +   This narrows the arithmetic operation.  */

This is also a part of what shorten_binary_op does, so please refer to
that in the comment so we can eventually complete this from there
and remove shorten_binary_op.

> +(for op (plus minus)
> +  (simplify
> +    (convert (op (convert@2 @0) (convert @1)))
> +    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
> +         && TREE_TYPE (@0) == type

Elsewhere in match.pd we use

(GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
 || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1))

for type equality checks.  And I think even for GENERIC you want
to ignore qualifiers and thus use TYPE_MAIN_VARIANT (TREE_TYPE (@0))
== TYPE_MAIN_VARIANT (TREE_TYPE (@1)).

> +         && INTEGRAL_TYPE_P (type)

So instead of testing TREE_TYPE (@0) == type we probably want
to just assert that TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION
(type) to allow sign-changes.  Say for

short i, j;
(unsigned short) (i + j)

we still want to narrow the widened i and j.
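
Spelled out as a complete function, the case I mean is roughly this
(just a sketch):

short i, j;

unsigned short
f (void)
{
  /* i and j are promoted to int; only the signedness of the result
     type changes, the precision does not, so narrowing still pays.  */
  return (unsigned short) (i + j);
}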

> +         && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE
> (@0))
> +        /* This prevents infinite recursion.  */
> +        && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))

How can it recurse with the type precision constraint right above?
If you think it's necessary then nesting that condition below as

> +      (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
             (if (utype != TREE_TYPE (@2))

avoids evaluating unsigned_type_for twice.

Note that technically we don't need to perform the operation in an unsigned
type iff TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)).  Thus

           (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
             (convert (op @0 @1)))
           (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
             (convert (op (convert:utype @0) (convert:utype @1))))))

You can see that GCC exploits that with -fwrapv.  Maybe this
simplification should be split out into a separate pattern though,
tackling sign-changes for binary ops only.
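
To illustrate the -fwrapv point, something like the following could
then be narrowed without the unsigned detour (a sketch):

short
add_short (short i, short j)
{
  /* The front end computes i + j in int; with wrapping signed
     overflow the narrowed addition can stay in short directly.  */
  return i + j;
}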

Thanks,
Richard.

> +        (convert (op (convert:utype @0) (convert:utype @1)))))))
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 15d5e2d..76e5254 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10  Jeff Law  <law@redhat.com>
> +
> +       PR rtl-optimization/47477
> +       * gcc.dg/tree-ssa/narrow-arith-1.c: New test.
> +
>  2015-02-10  Richard Biener  <rguenther@suse.de>
>
>         PR tree-optimization/64909
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> new file mode 100644
> index 0000000..104cb6f5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/47477 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -w" } */
> +/* { dg-require-effective-target ilp32 } */
> +
> +typedef int int64_t __attribute__ ((__mode__ (__DI__)));
> +typedef int * intptr_t;
> +
> +typedef struct toto_s *toto_t;
> +toto_t add (toto_t a, toto_t b) {
> +  int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
> +  return (toto_t)(intptr_t) tmp;
> +}
> +
> +/* For an ILP32 target there'll be 6 casts when we start, but just 4
> +   if the match.pd pattern is successfully matched.  */
> +/* { dg-final { scan-tree-dump-times "= \\(int\\)" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= \\(unsigned int\\)" 2 "optimized" }
> } */
> +/* { dg-final { scan-tree-dump-times "= \\(struct toto_s \\*\\)" 1
> "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> +
> +
>


* Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
  2015-02-11 10:56 ` Richard Biener
@ 2015-02-11 17:05   ` Jeff Law
  2015-02-12 13:02     ` Richard Biener
  0 siblings, 1 reply; 10+ messages in thread
From: Jeff Law @ 2015-02-11 17:05 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On 02/11/15 03:56, Richard Biener wrote:
>>
>>
>>
>>
>>
>>
>>
>>
>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>> index 7f3816c..7a95029 100644
>> --- a/gcc/ChangeLog
>> +++ b/gcc/ChangeLog
>> @@ -1,3 +1,8 @@
>> +2015-02-10  Jeff Law  <law@redhat.com>
>> +
>> +       * match.pd (convert (plus/minus (convert @0) (convert @1): New
>> +       simplifier to narrow arithmetic.
>> +
>
> Please reference PR47477 from here
Doh.  Fixed.


>
>>   2015-02-10  Richard Biener  <rguenther@suse.de>
>>
>>          PR tree-optimization/64909
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 81c4ee6..abc703e 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3.  If not see
>>      (logs (pows @0 @1))
>>      (mult @1 (logs @0)))))
>>
>> +/* If we have a narrowing conversion of an arithmetic operation where
>> +   both operands are widening conversions from the same type as the outer
>> +   narrowing conversion.  Then convert the innermost operands to a suitable
>> +   unsigned type (to avoid introducing undefined behaviour), perform the
>> +   operation and convert the result to the desired type.
>> +
>> +   This narrows the arithmetic operation.  */
>
> This is also a part of what shorten_binary_op does, so please refer to
> that in the comment so we can eventually complete this from there
> and remove shorten_binary_op.
Done.  I've created a block comment meant to cover this pattern and any 
additions we need along the way to moving shortening/narrowing out of 
the front-ends.


>
>> +(for op (plus minus)
>> +  (simplify
>> +    (convert (op (convert@2 @0) (convert @1)))
>> +    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
>> +         && TREE_TYPE (@0) == type
>
> Elsewhere in match.pd we use
>
> (GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
>   || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1))
>
> for type equality checks.  And I think even for GENERIC you want
> to ignore qualifiers and thus use TYPE_MAIN_VARIANT (TREE_TYPE (@0))
> == TYPE_MAIN_VARIANT (TREE_TYPE (@1)).
Seems reasonable.    Makes for some ugly formatting though.  It might 
make sense to factor that test so we don't end up repeating it senselessly.

>
>> +         && INTEGRAL_TYPE_P (type)
>
> So instead of testing TREE_TYPE (@0) == type we probably want
> to just assert that TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION
> (type) to allow sign-changes.  Say for
Yea, makes sense.

>
> short i, j;
> (unsigned short) (i + j)
>
> we still want to narrow the widened i and j.
Right.
>
>> +         && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE
>> (@0))
>> +        /* This prevents infinite recursion.  */
>> +        && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))
>
> How can it recurse with the type precision constraint right above?
It may no longer be needed.  The precision check used to be >=, but that 
regressed some Java code.  So it got tightened in the last iteration 
before posting.

>
> Note that technically we don't need to perform the operation in an unsigned
> type iff TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)).  Thus
>
>             (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
>               (convert (op @0 @1)))
>             (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
>               (convert (op (convert:utype @0) (convert:utype @1))))))
>
> You can see that GCC exploits that with -fwrapv.  Maybe this
> simplification should be split out into a separate pattern though,
> tackling sign-changes for binary ops only.
No strong opinion on separate pattern or integrating into existing pattern.

jeff


* Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
  2015-02-10 20:55 [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd Jeff Law
  2015-02-11 10:56 ` Richard Biener
@ 2015-02-12  7:02 ` Bin.Cheng
  2015-02-12 16:43   ` Jeff Law
  1 sibling, 1 reply; 10+ messages in thread
From: Bin.Cheng @ 2015-02-12  7:02 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Wed, Feb 11, 2015 at 4:55 AM, Jeff Law <law@redhat.com> wrote:
>
> This PR was originally a minor issue where we regressed on this kind of
> sequence:
>
> typedef struct toto_s *toto_t;
> toto_t add (toto_t a, toto_t b) {
>   int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
>   return (toto_t)(intptr_t) tmp;
> }
>
>
> There was talk of trying to peephole this in the x86 backend.  But later
> Jakub speculated that if we had good type narrowing this could be done in
> the tree optimizers...
>
> Soooo, here we go.  I didn't do anything with logicals as those are already
> handled elsewhere in match.pd.  I didn't try to handle MULT; in the early
> experiments I did, it was a loss because of the existing mechanisms for
> widening multiplications.
>
> Interestingly enough, this patch seems to help out libjava more than
> anything else in a GCC build and it really only helps a few routines. There
> weren't any routines I could see where the code regressed after this patch.
> This is probably an indicator that these things aren't *that* common, or the
> existing shortening code is better than we thought, or some important
> shortening case is missing.

Cool that we are trying to simplify type conversions using the generic
match facility.  I have thought about type promotion in match.pd too.
For example, in (unsigned long long)(unsigned long)(int_expr), if we can
prove int_expr is always positive (in my case, this is from VRP
information), then the first conversion can be eliminated.  I suppose
this is another way of approaching (and related to?  I didn't look at
the code) the sign/zero extension elimination work using VRP?
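
A tiny hand-written example of the kind of code I mean (sketch only):

unsigned long long
widen (int i)
{
  if (i < 0)
    i = 0;
  /* i is known non-negative here, so the intermediate conversion to
     unsigned long adds nothing; a single zero-extension suffices.  */
  return (unsigned long long) (unsigned long) i;
}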

Thanks,
bin
>
>
> I think we should pull the other tests from 47477 which are not regressions
> out into their own bug for future work.  Or alternatively, when this fix is
> checked in, remove the regression marker in 47477.
>
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.  OK for the
> trunk?
>
>
>
>
>
>
>
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 7f3816c..7a95029 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10  Jeff Law  <law@redhat.com>
> +
> +       * match.pd (convert (plus/minus (convert @0) (convert @1): New
> +       simplifier to narrow arithmetic.
> +
>  2015-02-10  Richard Biener  <rguenther@suse.de>
>
>         PR tree-optimization/64909
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 81c4ee6..abc703e 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3.  If not see
>     (logs (pows @0 @1))
>     (mult @1 (logs @0)))))
>
> +/* If we have a narrowing conversion of an arithmetic operation where
> +   both operands are widening conversions from the same type as the outer
> +   narrowing conversion.  Then convert the innermost operands to a suitable
> +   unsigned type (to avoid introducing undefined behaviour), perform the
> +   operation and convert the result to the desired type.
> +
> +   This narrows the arithmetic operation.  */
> +(for op (plus minus)
> +  (simplify
> +    (convert (op (convert@2 @0) (convert @1)))
> +    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
> +         && TREE_TYPE (@0) == type
> +         && INTEGRAL_TYPE_P (type)
> +         && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE
> (@0))
> +        /* This prevents infinite recursion.  */
> +        && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))
> +      (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
> +        (convert (op (convert:utype @0) (convert:utype @1)))))))
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 15d5e2d..76e5254 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-10  Jeff Law  <law@redhat.com>
> +
> +       PR rtl-optimization/47477
> +       * gcc.dg/tree-ssa/narrow-arith-1.c: New test.
> +
>  2015-02-10  Richard Biener  <rguenther@suse.de>
>
>         PR tree-optimization/64909
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> new file mode 100644
> index 0000000..104cb6f5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/narrow-arith-1.c
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/47477 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -w" } */
> +/* { dg-require-effective-target ilp32 } */
> +
> +typedef int int64_t __attribute__ ((__mode__ (__DI__)));
> +typedef int * intptr_t;
> +
> +typedef struct toto_s *toto_t;
> +toto_t add (toto_t a, toto_t b) {
> +  int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
> +  return (toto_t)(intptr_t) tmp;
> +}
> +
> +/* For an ILP32 target there'll be 6 casts when we start, but just 4
> +   if the match.pd pattern is successfully matched.  */
> +/* { dg-final { scan-tree-dump-times "= \\(int\\)" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= \\(unsigned int\\)" 2 "optimized" }
> } */
> +/* { dg-final { scan-tree-dump-times "= \\(struct toto_s \\*\\)" 1
> "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> +
> +
>


* Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
  2015-02-11 17:05   ` Jeff Law
@ 2015-02-12 13:02     ` Richard Biener
  2015-02-12 16:55       ` Jeff Law
  2015-02-12 23:23       ` Jeff Law
  0 siblings, 2 replies; 10+ messages in thread
From: Richard Biener @ 2015-02-12 13:02 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Wed, Feb 11, 2015 at 6:05 PM, Jeff Law <law@redhat.com> wrote:
> On 02/11/15 03:56, Richard Biener wrote:
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
>>> index 7f3816c..7a95029 100644
>>> --- a/gcc/ChangeLog
>>> +++ b/gcc/ChangeLog
>>> @@ -1,3 +1,8 @@
>>> +2015-02-10  Jeff Law  <law@redhat.com>
>>> +
>>> +       * match.pd (convert (plus/minus (convert @0) (convert @1): New
>>> +       simplifier to narrow arithmetic.
>>> +
>>
>>
>> Please reference PR47477 from here
>
> Doh.  Fixed.
>
>
>>
>>>   2015-02-10  Richard Biener  <rguenther@suse.de>
>>>
>>>          PR tree-optimization/64909
>>> diff --git a/gcc/match.pd b/gcc/match.pd
>>> index 81c4ee6..abc703e 100644
>>> --- a/gcc/match.pd
>>> +++ b/gcc/match.pd
>>> @@ -1018,3 +1018,21 @@ along with GCC; see the file COPYING3.  If not see
>>>      (logs (pows @0 @1))
>>>      (mult @1 (logs @0)))))
>>>
>>> +/* If we have a narrowing conversion of an arithmetic operation where
>>> +   both operands are widening conversions from the same type as the
>>> outer
>>> +   narrowing conversion.  Then convert the innermost operands to a
>>> suitable
>>> +   unsigned type (to avoid introducing undefined behaviour), perform the
>>> +   operation and convert the result to the desired type.
>>> +
>>> +   This narrows the arithmetic operation.  */
>>
>>
>> This is also a part of what shorten_binary_op does, so please refer to
>> that in the comment so we can eventually complete this from there
>> and remove shorten_binary_op.
>
> Done.  I've created a block comment meant to cover this pattern and any
> additions we need along the way to moving shortening/narrowing out of the
> front-ends.
>
>
>>
>>> +(for op (plus minus)
>>> +  (simplify
>>> +    (convert (op (convert@2 @0) (convert @1)))
>>> +    (if (TREE_TYPE (@0) == TREE_TYPE (@1)
>>> +         && TREE_TYPE (@0) == type
>>
>>
>> Elsewhere in match.pd we use
>>
>> (GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
>>   || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1))
>>
>> for type equality checks.  And I think even for GENERIC you want
>> to ignore qualifiers and thus use TYPE_MAIN_VARIANT (TREE_TYPE (@0))
>> == TYPE_MAIN_VARIANT (TREE_TYPE (@1)).
>
> Seems reasonable.    Makes for some ugly formatting though.  It might make
> sense to factor that test so we don't end up repeating it senselessly.
>
>>
>>> +         && INTEGRAL_TYPE_P (type)
>>
>>
>> So instead of testing TREE_TYPE (@0) == type we probably want
>> to just assert that TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION
>> (type) to allow sign-changes.  Say for
>
> Yea, makes sense.
>
>>
>> short i, j;
>> (unsigned short) (i + j)
>>
>> we still want to narrow the widened i and j.
>
> Right.
>>
>>
>>> +         && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE
>>> (@0))
>>> +        /* This prevents infinite recursion.  */
>>> +        && unsigned_type_for (TREE_TYPE (@0)) != TREE_TYPE (@2))
>>
>>
>> How can it recurse with the type precision constraint right above?
>
> It may no longer be needed.  The precision check used to be >=, but that
> regressed some Java code.  So it got tightened in the last iteration before
> posting.
>
>>
>> Note that technically we don't need to perform the operation in an
>> unsigned
>> type iff TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)).  Thus
>>
>>             (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0))
>>               (convert (op @0 @1)))
>>             (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
>>               (convert (op (convert:utype @0) (convert:utype @1))))))
>>
>> You can see that GCC exploits that with -fwrapv.  Maybe this
>> simplification should be split out into a separate pattern though,
>> tackling sign-changes for binary ops only.
>
> No strong opinion on separate pattern or integrating into existing pattern.

Yeah, I'd leave it as you did it for now.  Btw, -ENOPATCH.

Richard.

> jeff
>


* Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
  2015-02-12  7:02 ` Bin.Cheng
@ 2015-02-12 16:43   ` Jeff Law
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Law @ 2015-02-12 16:43 UTC (permalink / raw)
  To: Bin.Cheng; +Cc: gcc-patches

On 02/12/15 00:01, Bin.Cheng wrote:
>
> Cool that we are trying to simplify type conversions using the generic
> match facility.  I have thought about type promotion in match.pd too.
match.pd really feels like a meta-language which describes how to 
combine gimple or generic statements.


> For example, in (unsigned long long)(unsigned long)(int_expr), if we can
> prove int_expr is always positive (in my case, this is from VRP
> information), then the first conversion can be eliminated.  I suppose
> this is another way of approaching (and related to?  I didn't look at
> the code) the sign/zero extension elimination work using VRP?
In theory, given VRP information attached to an SSA_NAME you could query 
that information in the conditional part of the match.pd pattern.

Another patch I'm working on is best described as detecting cases where 
an outer bit-and masks off all the bits outside the innermost operand's 
type.  Under the right conditions, this also allows for narrowing.
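
The kind of input I have in mind looks something like this (a rough
sketch, not the actual patch or its testcase):

unsigned char a, b;

int
f (void)
{
  /* The mask discards every bit outside the unsigned char operands,
     so the addition could be performed in the narrow type.  */
  return (a + b) & 0xff;
}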

Jeff


* Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
  2015-02-12 13:02     ` Richard Biener
@ 2015-02-12 16:55       ` Jeff Law
  2015-02-12 23:23       ` Jeff Law
  1 sibling, 0 replies; 10+ messages in thread
From: Jeff Law @ 2015-02-12 16:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On 02/12/15 06:02, Richard Biener wrote:
>
> Yeah, I'd leave it as you did it for now.  Btw, -ENOPATCH.
Because it needed to spin through another round of testing :-)   I'll be 
posting it after my morning meetings.

jeff


* Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
  2015-02-12 13:02     ` Richard Biener
  2015-02-12 16:55       ` Jeff Law
@ 2015-02-12 23:23       ` Jeff Law
  2015-02-13  9:09         ` Richard Biener
  1 sibling, 1 reply; 10+ messages in thread
From: Jeff Law @ 2015-02-12 23:23 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 955 bytes --]

On 02/12/15 06:02, Richard Biener wrote:

>>
>> No strong opinion on separate pattern or integrating into existing pattern.
>
> Yeah, I'd leave it as you did it for now.  Btw, -ENOPATCH.
This version should address all your comments.  A few notes.

When in GENERIC, the inner operands can be a fairly arbitrary 
expression, including things like COMPONENT_REFs for bitfields.   We 
filter these out by ensuring the type's precision for both operands is a 
power of two and at least byte sized.
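
For example, a hypothetical case like this one is deliberately left
alone:

struct S { int a : 15; } x;

short
g (short y)
{
  /* x.a has a 15-bit type; its precision is not a power of two, so
     the exact_log2 check keeps the pattern from firing here.  */
  return (short) (x.a + y);
}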

It may not be strictly necessary, but the pattern now verifies that the 
final type, type of the inner conversion and type of the @0/@1 arguments 
are integral types.  Just seemed like the safe thing to do.

I added the TYPE_OVERFLOW_WRAPS support into the new pattern rather than 
create a totally new pattern.

Testcase name change to reflect it's actually from a PR.

Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for 
the trunk?

Jeff

[-- Attachment #2: P --]
[-- Type: text/plain, Size: 4227 bytes --]

commit a0317dfd4a3f7fc907a84e6956a8c116bb941f52
Author: Jeff Law <law@redhat.com>
Date:   Wed Feb 11 16:19:05 2015 -0700

    	PR rtl-optimization/47477
    	* match.pd (convert (plus/minus (convert @0) (convert @1): New
    	simplifier to narrow arithmetic.
    
    	PR rtl-optimization/47477
    	* gcc.dg/tree-ssa/pr47477.c: New test.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c9ac045..1258a0a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,11 @@
 2015-02-11  Jeff Law  <law@redhat.com>
 
+	PR rtl-optimization/47477
+	* match.pd (convert (plus/minus (convert @0) (convert @1): New
+	simplifier to narrow arithmetic.
+
+2015-02-11  Jeff Law  <law@redhat.com>
+
 	PR target/63347
 	* haifa-sched.c (prune_ready_list): If we have a SCHED_GROUP_P insn
 	that needs to be queued, just queue it for a single cycle.
diff --git a/gcc/match.pd b/gcc/match.pd
index 81c4ee6..27e0dd2 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1018,3 +1018,36 @@ along with GCC; see the file COPYING3.  If not see
    (logs (pows @0 @1))
    (mult @1 (logs @0)))))
 
+/* Narrowing of arithmetic and logical operations. 
+
+   These are conceptually similar to the transformations performed for
+   the C/C++ front-ends by shorten_binary_op and shorten_compare.  Long
+   term we want to move all that code out of the front-ends into here.  */
+
+/* If we have a narrowing conversion of an arithmetic operation where
+   both operands are widening conversions from the same type as the outer
+   narrowing conversion.  Then convert the innermost operands to a suitable
+   unsigned type (to avoid introducing undefined behaviour), perform the
+   operation and convert the result to the desired type.  */
+(for op (plus minus)
+  (simplify
+    (convert (op (convert@2 @0) (convert@3 @1)))
+    (if (((GENERIC 
+	   && TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == TYPE_MAIN_VARIANT (type)
+	   /* @0 and @1 could be arbitrary expressions here, including 
+	      COMPONENT_REFs with inconvenient types.  For example, they
+	      might have a TYPE_PRECISION of 15 bits.  Ensure we are working
+	      with TYPE_PRECISION that are powers of two and at least byte
+	      sized.  */
+	   && exact_log2 (TYPE_PRECISION (TREE_TYPE (@0))) >= 3
+	   && exact_log2 (TYPE_PRECISION (TREE_TYPE (@1))) >= 3)
+	  || (GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))))
+	 && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type)
+	 && INTEGRAL_TYPE_P (type)
+	 && INTEGRAL_TYPE_P (TREE_TYPE (@0))
+	 && INTEGRAL_TYPE_P (TREE_TYPE (@2))
+	 && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0)))
+      (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
+	(convert (op @0 @1)))
+      (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
+	(convert (op (convert:utype @0) (convert:utype @1)))))))
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 8d5d3f3..2455c68 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2015-02-11  Jeff Law  <law@redhat.com>
+
+	PR rtl-optimization/47477
+	* gcc.dg/tree-ssa/pr47477.c: New test.
+
 2015-02-11  Jerry DeLisle  <jvdelisle@gcc.gnu.org>
 
 	PR libgfortran/57822
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr47477.c b/gcc/testsuite/gcc.dg/tree-ssa/pr47477.c
new file mode 100644
index 0000000..104cb6f5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr47477.c
@@ -0,0 +1,22 @@
+/* PR tree-optimization/47477 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -w" } */
+/* { dg-require-effective-target ilp32 } */
+
+typedef int int64_t __attribute__ ((__mode__ (__DI__)));
+typedef int * intptr_t;
+
+typedef struct toto_s *toto_t;
+toto_t add (toto_t a, toto_t b) {
+  int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
+  return (toto_t)(intptr_t) tmp;
+}
+
+/* For an ILP32 target there'll be 6 casts when we start, but just 4
+   if the match.pd pattern is successfully matched.  */
+/* { dg-final { scan-tree-dump-times "= \\(int\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\(unsigned int\\)" 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\(struct toto_s \\*\\)" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
+
+


* Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
  2015-02-12 23:23       ` Jeff Law
@ 2015-02-13  9:09         ` Richard Biener
  2015-02-13 21:02           ` Jeff Law
  0 siblings, 1 reply; 10+ messages in thread
From: Richard Biener @ 2015-02-13  9:09 UTC (permalink / raw)
  To: Jeff Law; +Cc: gcc-patches

On Fri, Feb 13, 2015 at 12:23 AM, Jeff Law <law@redhat.com> wrote:
> On 02/12/15 06:02, Richard Biener wrote:
>
>>>
>>> No strong opinion on separate pattern or integrating into existing
>>> pattern.
>>
>>
>> Yeah, I'd leave it as you did it for now.  Btw, -ENOPATCH.
>
> This version should address all your comments.  A few notes.
>
> When in GENERIC, the inner operands can be a fairly arbitrary expression,
> including things like COMPONENT_REFs for bitfields.   We filter these out by
> ensuring the type's precision for both operands is a power of two and at
> least byte sized.
>
> It may not be strictly necessary, but the pattern now verifies that the
> final type, type of the inner conversion and type of the @0/@1 arguments are
> integral types.  Just seemed like the safe thing to do.
>
> I added the TYPE_OVERFLOW_WRAPS support into the new pattern rather than
> create a totally new pattern.
>
> Testcase name change to reflect it's actually from a PR.
>
>
> Bootstrapped and regression tested on x86_64-unknown-linux-gnu. OK for the
> trunk?
>
> Jeff
>
> commit a0317dfd4a3f7fc907a84e6956a8c116bb941f52
> Author: Jeff Law <law@redhat.com>
> Date:   Wed Feb 11 16:19:05 2015 -0700
>
>         PR rtl-optimization/47477
>         * match.pd (convert (plus/minus (convert @0) (convert @1): New
>         simplifier to narrow arithmetic.
>
>         PR rtl-optimization/47477
>         * gcc.dg/tree-ssa/pr47477.c: New test.
>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index c9ac045..1258a0a 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,5 +1,11 @@
>  2015-02-11  Jeff Law  <law@redhat.com>
>
> +       PR rtl-optimization/47477
> +       * match.pd (convert (plus/minus (convert @0) (convert @1): New
> +       simplifier to narrow arithmetic.
> +
> +2015-02-11  Jeff Law  <law@redhat.com>
> +
>         PR target/63347
>         * haifa-sched.c (prune_ready_list): If we have a SCHED_GROUP_P insn
>         that needs to be queued, just queue it for a single cycle.
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 81c4ee6..27e0dd2 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1018,3 +1018,36 @@ along with GCC; see the file COPYING3.  If not see
>     (logs (pows @0 @1))
>     (mult @1 (logs @0)))))
>
> +/* Narrowing of arithmetic and logical operations.
> +
> +   These are conceptually similar to the transformations performed for
> +   the C/C++ front-ends by shorten_binary_op and shorten_compare.  Long
> +   term we want to move all that code out of the front-ends into here.  */
> +
> +/* If we have a narrowing conversion of an arithmetic operation where
> +   both operands are widening conversions from the same type as the outer
> +   narrowing conversion.  Then convert the innermost operands to a suitable
> +   unsigned type (to avoid introducing undefined behaviour), perform the
> +   operation and convert the result to the desired type.  */
> +(for op (plus minus)
> +  (simplify
> +    (convert (op (convert@2 @0) (convert@3 @1)))
> +    (if (((GENERIC
> +          && TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == TYPE_MAIN_VARIANT (type)
> +          /* @0 and @1 could be arbitrary expressions here, including
> +             COMPONENT_REFs with inconvenient types.  For example, they
> +             might have a TYPE_PRECISION of 15 bits.  Ensure we are working
> +             with TYPE_PRECISION that are powers of two and at least byte
> +             sized.  */
> +          && exact_log2 (TYPE_PRECISION (TREE_TYPE (@0))) >= 3
> +          && exact_log2 (TYPE_PRECISION (TREE_TYPE (@1))) >= 3)

Note that this can happen for GIMPLE as well for

struct X { long long a : 33; } x;
long long b;
b = x.a + x.a;

where you'll see 33-bit temporaries from the loads of x.a (because C only
promotes bitfields up to size long).

We already have a related pattern that constrains it with similar ideas and
it IMHO does better by not only requiring a power-of-two precision but
a precision matching the machine mode (thus at RTL expansion we don't
need any extra zero/sign extensions).  It's

/* Try to fold (type) X op CST -> (type) (X op ((type-x) CST))
   when profitable.
   For bitwise binary operations apply operand conversions to the
   binary operation result instead of to the operands.  This allows
   to combine successive conversions and bitwise binary operations.
   We combine the above two cases by using a conditional convert.  */
(for bitop (bit_and bit_ior bit_xor)
 (simplify
  (bitop (convert @0) (convert? @1))
  (if (((TREE_CODE (@1) == INTEGER_CST
         && INTEGRAL_TYPE_P (TREE_TYPE (@0))
         && int_fits_type_p (@1, TREE_TYPE (@0)))
        || (GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1)))
        || (GENERIC && TREE_TYPE (@0) == TREE_TYPE (@1)))
       /* ???  This transform conflicts with fold-const.c doing
          Convert (T)(x & c) into (T)x & (T)c, if c is an integer
          constants (if x has signed type, the sign bit cannot be set
          in c).  This folds extension into the BIT_AND_EXPR.
          Restrict it to GIMPLE to avoid endless recursions.  */
       && (bitop != BIT_AND_EXPR || GIMPLE)
       && (/* That's a good idea if the conversion widens the operand, thus
              after hoisting the conversion the operation will be narrower.  */
           TYPE_PRECISION (TREE_TYPE (@0)) < TYPE_PRECISION (type)
           /* It's also a good idea if the conversion is to a non-integer
              mode.  */
           || GET_MODE_CLASS (TYPE_MODE (type)) != MODE_INT
           /* Or if the precision of TO is not the same as the precision
              of its mode.  */
           || TYPE_PRECISION (type) != GET_MODE_PRECISION (TYPE_MODE (type))))

so I'd say we simply want to generically constrain it as

  TYPE_PRECISION (TREE_TYPE (@0)) == GET_MODE_PRECISION (TYPE_MODE
(TREE_TYPE (@0)))

thus we want to perform the arithmetic in a type whose precision matches
its mode.

> +         || (GIMPLE && types_compatible_p (TREE_TYPE (@0), TREE_TYPE
> (@1))))

Hmm, here you compare the types of @0 and @1 while for GENERIC you checked
'type' and the type of @0.  I believe we want to compare @0 and @1 in both cases.

> +        && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type)
> +        && INTEGRAL_TYPE_P (type)
> +        && INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +        && INTEGRAL_TYPE_P (TREE_TYPE (@2))

Better to do the INTEGRAL_TYPE_P checks as the very first thing.

Ok with these changes.

Thanks,
Richard.

> +        && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE
> (@0)))
> +      (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
> +       (convert (op @0 @1)))
> +      (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
> +       (convert (op (convert:utype @0) (convert:utype @1)))))))
> diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
> index 8d5d3f3..2455c68 100644
> --- a/gcc/testsuite/ChangeLog
> +++ b/gcc/testsuite/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-02-11  Jeff Law  <law@redhat.com>
> +
> +       PR rtl-optimization/47477
> +       * gcc.dg/tree-ssa/pr47477.c: New test.
> +
>  2015-02-11  Jerry DeLisle  <jvdelisle@gcc.gnu.org>
>
>         PR libgfortran/57822
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr47477.c
> b/gcc/testsuite/gcc.dg/tree-ssa/pr47477.c
> new file mode 100644
> index 0000000..104cb6f5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr47477.c
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/47477 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized -w" } */
> +/* { dg-require-effective-target ilp32 } */
> +
> +typedef int int64_t __attribute__ ((__mode__ (__DI__)));
> +typedef int * intptr_t;
> +
> +typedef struct toto_s *toto_t;
> +toto_t add (toto_t a, toto_t b) {
> +  int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
> +  return (toto_t)(intptr_t) tmp;
> +}
> +
> +/* For an ILP32 target there'll be 6 casts when we start, but just 4
> +   if the match.pd pattern is successfully matched.  */
> +/* { dg-final { scan-tree-dump-times "= \\(int\\)" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump-times "= \\(unsigned int\\)" 2 "optimized" }
> } */
> +/* { dg-final { scan-tree-dump-times "= \\(struct toto_s \\*\\)" 1
> "optimized" } } */
> +/* { dg-final { cleanup-tree-dump "optimized" } } */
> +
> +
>


* Re: [RFA][PATCH][PR rtl-optimization/47477] Type narrowing in match.pd
  2015-02-13  9:09         ` Richard Biener
@ 2015-02-13 21:02           ` Jeff Law
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff Law @ 2015-02-13 21:02 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1196 bytes --]

On 02/13/15 02:09, Richard Biener wrote:
>
> Note that this can happen for GIMPLE as well for
>
> struct X { long long a : 33; } x;
> long long b;
> b = x.a + x.a;
>
> where you'll see 33-bit temporaries from the loads of x.a (because C only
> promotes bitfields up to size long).
>
> We already have a related pattern that constrains it with similar ideas and
> it IMHO does better by not only requiring a power-of-two precision but
> a precision matching the machine mode (thus at RTL expansion we don't
> need any extra zero/sign extensions).  It's
Thanks.  Good to know it can happen for GIMPLE as well.  I've pulled 
the updated tests out to the top level.

I'd been trying to avoid TYPE_MODE since it felt like a bit of letting 
RTL-isms sneak in here.  But that's probably inevitable.


>
>> +        && TYPE_PRECISION (TREE_TYPE (@0)) == TYPE_PRECISION (type)
>> +        && INTEGRAL_TYPE_P (type)
>> +        && INTEGRAL_TYPE_P (TREE_TYPE (@0))
>> +        && INTEGRAL_TYPE_P (TREE_TYPE (@2))
>
> Better to do the INTEGRAL_TYPE_P checks as the very first thing.
Works for me.

Updated, bootstrapped & regression tested on x86_64-unknown-linux-gnu. 
Installed on trunk.

Final patch attached.



[-- Attachment #2: P1 --]
[-- Type: text/plain, Size: 4501 bytes --]

commit fff56e577c989e828720cba4992b7379057bf86b
Author: law <law@138bc75d-0d04-0410-961f-82ee72b054a4>
Date:   Fri Feb 13 20:17:55 2015 +0000

    	PR rtl-optimization/47477
    	* match.pd (convert (plus/minus (convert @0) (convert @1): New
    	simplifier to narrow arithmetic.
    
    	PR rtl-optimization/47477
    	* gcc.dg/tree-ssa/pr47477.c: New test.
    
    git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@220695 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 290d3ac..f36e16c 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2015-02-13  Jeff Law  <law@redhat.com>
+
+	PR rtl-optimization/47477
+	* match.pd (convert (plus/minus (convert @0) (convert @1): New
+	simplifier to narrow arithmetic.
+
 2015-02-13  Jan Hubicka  <hubicka@ucw.cz>
 
 	PR ipa/65028
diff --git a/gcc/match.pd b/gcc/match.pd
index 81c4ee6..d438179 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1018,3 +1018,44 @@ along with GCC; see the file COPYING3.  If not see
    (logs (pows @0 @1))
    (mult @1 (logs @0)))))
 
+/* Narrowing of arithmetic and logical operations. 
+
+   These are conceptually similar to the transformations performed for
+   the C/C++ front-ends by shorten_binary_op and shorten_compare.  Long
+   term we want to move all that code out of the front-ends into here.  */
+
+/* If we have a narrowing conversion of an arithmetic operation where
+   both operands are widening conversions from the same type as the outer
+   narrowing conversion.  Then convert the innermost operands to a suitable
+   unsigned type (to avoid introducing undefined behaviour), perform the
+   operation and convert the result to the desired type.  */
+(for op (plus minus)
+  (simplify
+    (convert (op (convert@2 @0) (convert@3 @1)))
+    (if (INTEGRAL_TYPE_P (type)
+	 /* We check for type compatibility between @0 and @1 below,
+	    so there's no need to check that @1/@3 are integral types.  */
+	 && INTEGRAL_TYPE_P (TREE_TYPE (@0))
+	 && INTEGRAL_TYPE_P (TREE_TYPE (@2))
+	 /* The precision of the type of each operand must match the
+	    precision of the mode of each operand, similarly for the
+	    result.  */
+	 && (TYPE_PRECISION (TREE_TYPE (@0))
+	     == GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@0))))
+	 && (TYPE_PRECISION (TREE_TYPE (@1))
+	     == GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@1))))
+	 && TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
+	 /* The inner conversion must be a widening conversion.  */
+	 && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0))
+	 && ((GENERIC 
+	      && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
+		  == TYPE_MAIN_VARIANT (TREE_TYPE (@1)))
+	      && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
+		  == TYPE_MAIN_VARIANT (type)))
+	     || (GIMPLE
+		 && types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))
+		 && types_compatible_p (TREE_TYPE (@0), type))))
+      (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
+	(convert (op @0 @1)))
+      (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
+	(convert (op (convert:utype @0) (convert:utype @1)))))))
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index b2d1c11..2fe9698 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2015-02-13  Jeff Law  <law@redhat.com>
+
+	PR rtl-optimization/47477
+	* gcc.dg/tree-ssa/pr47477.c: New test.
+
 2015-02-13  Paolo Carlini  <paolo.carlini@oracle.com>
 
 	PR c++/60211
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr47477.c b/gcc/testsuite/gcc.dg/tree-ssa/pr47477.c
new file mode 100644
index 0000000..104cb6f5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr47477.c
@@ -0,0 +1,22 @@
+/* PR tree-optimization/47477 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized -w" } */
+/* { dg-require-effective-target ilp32 } */
+
+typedef int int64_t __attribute__ ((__mode__ (__DI__)));
+typedef int * intptr_t;
+
+typedef struct toto_s *toto_t;
+toto_t add (toto_t a, toto_t b) {
+  int64_t tmp = (int64_t)(intptr_t)a + ((int64_t)(intptr_t)b&~1L);
+  return (toto_t)(intptr_t) tmp;
+}
+
+/* For an ILP32 target there'll be 6 casts when we start, but just 4
+   if the match.pd pattern is successfully matched.  */
+/* { dg-final { scan-tree-dump-times "= \\(int\\)" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\(unsigned int\\)" 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "= \\(struct toto_s \\*\\)" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
+
+

