[PATCH] Optimize certain end of loop conditions into min/max operation

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* [PATCH] Optimize certain end of loop conditions into min/max operation
@ 2015-07-27  4:23 Michael Collison
  2015-07-27  7:36 ` Bin.Cheng
  2015-07-27  9:47 ` Richard Biener
  0 siblings, 2 replies; 29+ messages in thread
From: Michael Collison @ 2015-07-27  4:23 UTC (permalink / raw)
  To: gcc Patches

This patch is designed to optimize end of loop conditions involving of 
the form
  i < x && i < y into i < min (x, y). Loop condition involving '>' are 
handled similarly using max(x,y).
As an example:

#define N 1024

int  a[N], b[N], c[N];

void add (unsignedint  m, unsignedint  n)
{
   unsignedint  i, bound = (m < n) ? m : n;
   for  (i = 0; i < m && i < n; ++i)
     a[i] = b[i] + c[i];
}


Performed bootstrap and make check on: x86_64_unknown-linux-gnu, 
arm-linux-gnueabihf, and aarch64-linux-gnu.
Okay for trunk?

2015-07-24  Michael Collison  <michael.collison@linaro.org>
                     Andrew Pinski <andrew.pinski@caviumnetworks.com>

                 * match.pd ((x < y) && (x < z) -> x < min (y,z),
                 (x > y) and (x > z) -> x > max (y,z))

diff --git a/gcc/match.pd b/gcc/match.pd
index 5e8fd32..8691710 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not see
      (convert (bit_and (op (convert:utype @0) (convert:utype @1))
                (convert:utype @4)))))))

+
+/* Transform (@0 < @1 and @0 < @2) to use min */
+(for op (lt le)
+(simplify
+(bit_and:c (op @0 @1) (op @0 @2))
+(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+(op @0 (min @1 @2)))))
+
+/* Transform (@0 > @1 and @0 > @2) to use max */
+(for op (gt ge)
+(simplify
+(bit_and:c (op @0 @1) (op @0 @2))
+(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+(op @0 (max @1 @2)))))
-- 

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-27  4:23 [PATCH] Optimize certain end of loop conditions into min/max operation Michael Collison
@ 2015-07-27  7:36 ` Bin.Cheng
  2015-07-27  8:11   ` Bin.Cheng
  2015-07-27  9:47 ` Richard Biener
  1 sibling, 1 reply; 29+ messages in thread
From: Bin.Cheng @ 2015-07-27  7:36 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc Patches

On Mon, Jul 27, 2015 at 11:41 AM, Michael Collison
<michael.collison@linaro.org> wrote:
> This patch is designed to optimize end of loop conditions involving of the
> form
>  i < x && i < y into i < min (x, y). Loop condition involving '>' are
> handled similarly using max(x,y).
> As an example:
>
> #define N 1024
>
> int  a[N], b[N], c[N];
>
> void add (unsignedint  m, unsignedint  n)
> {
>   unsignedint  i, bound = (m < n) ? m : n;
>   for  (i = 0; i < m && i < n; ++i)
>     a[i] = b[i] + c[i];
> }
>
>
> Performed bootstrap and make check on: x86_64_unknown-linux-gnu,
> arm-linux-gnueabihf, and aarch64-linux-gnu.
> Okay for trunk?
>
> 2015-07-24  Michael Collison  <michael.collison@linaro.org>
>                     Andrew Pinski <andrew.pinski@caviumnetworks.com>
>
>                 * match.pd ((x < y) && (x < z) -> x < min (y,z),
>                 (x > y) and (x > z) -> x > max (y,z))
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 5e8fd32..8691710 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not see
>      (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>                (convert:utype @4)))))))
>
> +
> +/* Transform (@0 < @1 and @0 < @2) to use min */
> +(for op (lt le)
> +(simplify
> +(bit_and:c (op @0 @1) (op @0 @2))
> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
> +(op @0 (min @1 @2)))))
> +
> +/* Transform (@0 > @1 and @0 > @2) to use max */
> +(for op (gt ge)
> +(simplify
> +(bit_and:c (op @0 @1) (op @0 @2))
> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
> +(op @0 (max @1 @2)))))

Could you please give a test case for it?  Also IIUC, this is not only
simplification, but also loop invariant hoist, so how does it check
invariantness?

Thanks,
bin
> --
>
> --
> Michael Collison
> Linaro Toolchain Working Group
> michael.collison@linaro.org
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-27  7:36 ` Bin.Cheng
@ 2015-07-27  8:11   ` Bin.Cheng
  0 siblings, 0 replies; 29+ messages in thread
From: Bin.Cheng @ 2015-07-27  8:11 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc Patches

On Mon, Jul 27, 2015 at 12:23 PM, Bin.Cheng <amker.cheng@gmail.com> wrote:
> On Mon, Jul 27, 2015 at 11:41 AM, Michael Collison
> <michael.collison@linaro.org> wrote:
>> This patch is designed to optimize end of loop conditions involving of the
>> form
>>  i < x && i < y into i < min (x, y). Loop condition involving '>' are
>> handled similarly using max(x,y).
>> As an example:
>>
>> #define N 1024
>>
>> int  a[N], b[N], c[N];
>>
>> void add (unsignedint  m, unsignedint  n)
>> {
>>   unsignedint  i, bound = (m < n) ? m : n;
>>   for  (i = 0; i < m && i < n; ++i)
>>     a[i] = b[i] + c[i];
>> }
>>
>>
>> Performed bootstrap and make check on: x86_64_unknown-linux-gnu,
>> arm-linux-gnueabihf, and aarch64-linux-gnu.
>> Okay for trunk?
>>
>> 2015-07-24  Michael Collison  <michael.collison@linaro.org>
>>                     Andrew Pinski <andrew.pinski@caviumnetworks.com>
>>
>>                 * match.pd ((x < y) && (x < z) -> x < min (y,z),
>>                 (x > y) and (x > z) -> x > max (y,z))
>>
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 5e8fd32..8691710 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not see
>>      (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>>                (convert:utype @4)))))))
>>
>> +
>> +/* Transform (@0 < @1 and @0 < @2) to use min */
>> +(for op (lt le)
>> +(simplify
>> +(bit_and:c (op @0 @1) (op @0 @2))
>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>> +(op @0 (min @1 @2)))))
>> +
>> +/* Transform (@0 > @1 and @0 > @2) to use max */
>> +(for op (gt ge)
>> +(simplify
>> +(bit_and:c (op @0 @1) (op @0 @2))
>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>> +(op @0 (max @1 @2)))))
>
> Could you please give a test case for it?  Also IIUC, this is not only
> simplification, but also loop invariant hoist, so how does it check
> invariantness?

Sorry I realized this patch only does simplification and then let lim
pass decide if it can be moved?  In this way, there is no invariant
problem, please ignore previous message.

Thanks,
bin

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-27  4:23 [PATCH] Optimize certain end of loop conditions into min/max operation Michael Collison
  2015-07-27  7:36 ` Bin.Cheng
@ 2015-07-27  9:47 ` Richard Biener
  2015-07-27 16:25   ` Jeff Law
  2015-08-05 16:08   ` Alan Lawrence
  1 sibling, 2 replies; 29+ messages in thread
From: Richard Biener @ 2015-07-27  9:47 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc Patches

On Mon, Jul 27, 2015 at 5:41 AM, Michael Collison
<michael.collison@linaro.org> wrote:
> This patch is designed to optimize end of loop conditions involving of the
> form
>  i < x && i < y into i < min (x, y). Loop condition involving '>' are
> handled similarly using max(x,y).
> As an example:
>
> #define N 1024
>
> int  a[N], b[N], c[N];
>
> void add (unsignedint  m, unsignedint  n)
> {
>   unsignedint  i, bound = (m < n) ? m : n;
>   for  (i = 0; i < m && i < n; ++i)
>     a[i] = b[i] + c[i];
> }
>
>
> Performed bootstrap and make check on: x86_64_unknown-linux-gnu,
> arm-linux-gnueabihf, and aarch64-linux-gnu.
> Okay for trunk?

So this works only for && that has been lowered to non-CFG form
(I suppose phiopt would catch that?  If not, ifcombine would be the
place to implement it I guess).

Furthermore it doesn't work for three such ops which would require
an additional pattern like

 (simplfiy
  (bit_and:c (op @0 (min @1 @2)) (op @0 @3))
  (op @0 (min (min @1 @2) @3))))

if that's profitable?

Richard.

> 2015-07-24  Michael Collison  <michael.collison@linaro.org>
>                     Andrew Pinski <andrew.pinski@caviumnetworks.com>
>
>                 * match.pd ((x < y) && (x < z) -> x < min (y,z),
>                 (x > y) and (x > z) -> x > max (y,z))
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 5e8fd32..8691710 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not see
>      (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>                (convert:utype @4)))))))
>
> +
> +/* Transform (@0 < @1 and @0 < @2) to use min */
> +(for op (lt le)
> +(simplify
> +(bit_and:c (op @0 @1) (op @0 @2))
> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
> +(op @0 (min @1 @2)))))
> +
> +/* Transform (@0 > @1 and @0 > @2) to use max */
> +(for op (gt ge)
> +(simplify
> +(bit_and:c (op @0 @1) (op @0 @2))
> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
> +(op @0 (max @1 @2)))))
> --
>
> --
> Michael Collison
> Linaro Toolchain Working Group
> michael.collison@linaro.org
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-27  9:47 ` Richard Biener
@ 2015-07-27 16:25   ` Jeff Law
  2015-07-28  7:59     ` Richard Biener
  2015-08-05 16:08   ` Alan Lawrence
  1 sibling, 1 reply; 29+ messages in thread
From: Jeff Law @ 2015-07-27 16:25 UTC (permalink / raw)
  To: Richard Biener, Michael Collison; +Cc: gcc Patches

On 07/27/2015 03:25 AM, Richard Biener wrote:
> On Mon, Jul 27, 2015 at 5:41 AM, Michael Collison
> <michael.collison@linaro.org> wrote:
>> This patch is designed to optimize end of loop conditions involving of the
>> form
>>   i < x && i < y into i < min (x, y). Loop condition involving '>' are
>> handled similarly using max(x,y).
>> As an example:
>>
>> #define N 1024
>>
>> int  a[N], b[N], c[N];
>>
>> void add (unsignedint  m, unsignedint  n)
>> {
>>    unsignedint  i, bound = (m < n) ? m : n;
>>    for  (i = 0; i < m && i < n; ++i)
>>      a[i] = b[i] + c[i];
>> }
>>
>>
>> Performed bootstrap and make check on: x86_64_unknown-linux-gnu,
>> arm-linux-gnueabihf, and aarch64-linux-gnu.
>> Okay for trunk?
>
> So this works only for && that has been lowered to non-CFG form
> (I suppose phiopt would catch that?  If not, ifcombine would be the
> place to implement it I guess).
phiopt is supposed to be generating MIN/MAX expressions for us.  If it 
isn't it'd be good to see any testcases where it isn't.

I think that raises a general question though.  Does it make more sense 
to capture MIN/MAX (and others) in phiopt or in the match.pd framework?

Jeff

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-27 16:25   ` Jeff Law
@ 2015-07-28  7:59     ` Richard Biener
  2015-07-29 22:38       ` Michael Collison
  2015-07-31 16:40       ` Jeff Law
  0 siblings, 2 replies; 29+ messages in thread
From: Richard Biener @ 2015-07-28  7:59 UTC (permalink / raw)
  To: Jeff Law; +Cc: Michael Collison, gcc Patches

On Mon, Jul 27, 2015 at 6:20 PM, Jeff Law <law@redhat.com> wrote:
> On 07/27/2015 03:25 AM, Richard Biener wrote:
>>
>> On Mon, Jul 27, 2015 at 5:41 AM, Michael Collison
>> <michael.collison@linaro.org> wrote:
>>>
>>> This patch is designed to optimize end of loop conditions involving of
>>> the
>>> form
>>>   i < x && i < y into i < min (x, y). Loop condition involving '>' are
>>> handled similarly using max(x,y).
>>> As an example:
>>>
>>> #define N 1024
>>>
>>> int  a[N], b[N], c[N];
>>>
>>> void add (unsignedint  m, unsignedint  n)
>>> {
>>>    unsignedint  i, bound = (m < n) ? m : n;
>>>    for  (i = 0; i < m && i < n; ++i)
>>>      a[i] = b[i] + c[i];
>>> }
>>>
>>>
>>> Performed bootstrap and make check on: x86_64_unknown-linux-gnu,
>>> arm-linux-gnueabihf, and aarch64-linux-gnu.
>>> Okay for trunk?
>>
>>
>> So this works only for && that has been lowered to non-CFG form
>> (I suppose phiopt would catch that?  If not, ifcombine would be the
>> place to implement it I guess).
>
> phiopt is supposed to be generating MIN/MAX expressions for us.  If it isn't
> it'd be good to see any testcases where it isn't.
>
> I think that raises a general question though.  Does it make more sense to
> capture MIN/MAX (and others) in phiopt or in the match.pd framework?

match.pd is good for pattern recognition - patterns of fixed size.  There are
cases that are done in fold-const.c for example that doesn't fit very well
and should be done as separate pass, like for example figuring out whether
an expression can be easily negated or whether there are sign-changes that
can be stripped.  Basically all cases where fold currently recurses (unbound).

The above case is a corner case I think - the number of && you can change
into (multiple) MIN/MAX is unbound but we might only care about the case
where there will be one MIN/MAX operation.

Generally phiopt and other patterns that match the CFG are not yet well
supported by match.pd (though I outlined how matching PHI nodes when
facing (simplify (cond ...) ...) would be possible).

So while putting something into match.pd is easy I'd like people to
think if doing the same thing elsewhere is better - that is, if this is really
a pattern transform operation or if you are just implementing a special-case
of a general transform as a pattern.

Richard.

> Jeff
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-28  7:59     ` Richard Biener
@ 2015-07-29 22:38       ` Michael Collison
  2015-07-31 16:40       ` Jeff Law
  1 sibling, 0 replies; 29+ messages in thread
From: Michael Collison @ 2015-07-29 22:38 UTC (permalink / raw)
  To: Richard Biener, Jeff Law; +Cc: gcc Patches

Richard and Jeff,

Any conclusion to this discussion? Is this okay in match.pd or would you 
like to see it implemented elsewhere?

On 7/28/2015 12:41 AM, Richard Biener wrote:
> On Mon, Jul 27, 2015 at 6:20 PM, Jeff Law <law@redhat.com> wrote:
>> On 07/27/2015 03:25 AM, Richard Biener wrote:
>>> On Mon, Jul 27, 2015 at 5:41 AM, Michael Collison
>>> <michael.collison@linaro.org> wrote:
>>>> This patch is designed to optimize end of loop conditions involving of
>>>> the
>>>> form
>>>>    i < x && i < y into i < min (x, y). Loop condition involving '>' are
>>>> handled similarly using max(x,y).
>>>> As an example:
>>>>
>>>> #define N 1024
>>>>
>>>> int  a[N], b[N], c[N];
>>>>
>>>> void add (unsignedint  m, unsignedint  n)
>>>> {
>>>>     unsignedint  i, bound = (m < n) ? m : n;
>>>>     for  (i = 0; i < m && i < n; ++i)
>>>>       a[i] = b[i] + c[i];
>>>> }
>>>>
>>>>
>>>> Performed bootstrap and make check on: x86_64_unknown-linux-gnu,
>>>> arm-linux-gnueabihf, and aarch64-linux-gnu.
>>>> Okay for trunk?
>>>
>>> So this works only for && that has been lowered to non-CFG form
>>> (I suppose phiopt would catch that?  If not, ifcombine would be the
>>> place to implement it I guess).
>> phiopt is supposed to be generating MIN/MAX expressions for us.  If it isn't
>> it'd be good to see any testcases where it isn't.
>>
>> I think that raises a general question though.  Does it make more sense to
>> capture MIN/MAX (and others) in phiopt or in the match.pd framework?
> match.pd is good for pattern recognition - patterns of fixed size.  There are
> cases that are done in fold-const.c for example that doesn't fit very well
> and should be done as separate pass, like for example figuring out whether
> an expression can be easily negated or whether there are sign-changes that
> can be stripped.  Basically all cases where fold currently recurses (unbound).
>
> The above case is a corner case I think - the number of && you can change
> into (multiple) MIN/MAX is unbound but we might only care about the case
> where there will be one MIN/MAX operation.
>
> Generally phiopt and other patterns that match the CFG are not yet well
> supported by match.pd (though I outlined how matching PHI nodes when
> facing (simplify (cond ...) ...) would be possible).
>
> So while putting something into match.pd is easy I'd like people to
> think if doing the same thing elsewhere is better - that is, if this is really
> a pattern transform operation or if you are just implementing a special-case
> of a general transform as a pattern.
>
> Richard.
>
>> Jeff
>>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-28  7:59     ` Richard Biener
  2015-07-29 22:38       ` Michael Collison
@ 2015-07-31 16:40       ` Jeff Law
  2015-07-31 18:48         ` Michael Collison
  1 sibling, 1 reply; 29+ messages in thread
From: Jeff Law @ 2015-07-31 16:40 UTC (permalink / raw)
  To: Richard Biener; +Cc: Michael Collison, gcc Patches

On 07/28/2015 01:41 AM, Richard Biener wrote:
>
> The above case is a corner case I think - the number of && you can change
> into (multiple) MIN/MAX is unbound but we might only care about the case
> where there will be one MIN/MAX operation.
I suspect that's going to be the most important/common case.

>
> Generally phiopt and other patterns that match the CFG are not yet well
> supported by match.pd (though I outlined how matching PHI nodes when
> facing (simplify (cond ...) ...) would be possible).
Right.  Though I thought the conclusion after outlining we determined it 
wasn't really feasible, yet.


>
> So while putting something into match.pd is easy I'd like people to
> think if doing the same thing elsewhere is better - that is, if this is really
> a pattern transform operation or if you are just implementing a special-case
> of a general transform as a pattern.
So in this case we're taking something like:

  _6 = i_1 < m_4(D);
   _7 = i_1 < n_3(D);
   _8 = _6 & _7;
   if (_8 != 0)


And turning it into

_X = MIN (m_4, n_3)
if (i_1 < _X)

That seems to me like a good match for match.pd given its generality and 
the need to walk up the use-def chain.  It's certainly not a good fit 
for phi-opt since we're not looking at PHIs :-)



Michael -- can you take your sample code and turn it into a test for the 
testsuite.  I'd hazard a guess it'll need to be target specific because 
of its interactions with branch-costing.  Might as well make 4 variants 
(lt -> MIN, le -> MIN, ge->MAX, gt->MAX).

We're going to want that regardless of whether tackling this issue in 
match.pd (my current preference) or elsewhere.

jeff

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-31 16:40       ` Jeff Law
@ 2015-07-31 18:48         ` Michael Collison
  2015-07-31 19:10           ` Jeff Law
  0 siblings, 1 reply; 29+ messages in thread
From: Michael Collison @ 2015-07-31 18:48 UTC (permalink / raw)
  To: Jeff Law, Richard Biener; +Cc: gcc Patches

Hi Jeff,

Yes I will create a test case. I'm not quite sure what to check for even 
in the machine dependent test case. It's quite possible for the 
instructions that are generated to change over time.

On 7/31/2015 9:20 AM, Jeff Law wrote:
> On 07/28/2015 01:41 AM, Richard Biener wrote:
>>
>> The above case is a corner case I think - the number of && you can 
>> change
>> into (multiple) MIN/MAX is unbound but we might only care about the case
>> where there will be one MIN/MAX operation.
> I suspect that's going to be the most important/common case.
>
>>
>> Generally phiopt and other patterns that match the CFG are not yet well
>> supported by match.pd (though I outlined how matching PHI nodes when
>> facing (simplify (cond ...) ...) would be possible).
> Right.  Though I thought the conclusion after outlining we determined 
> it wasn't really feasible, yet.
>
>
>>
>> So while putting something into match.pd is easy I'd like people to
>> think if doing the same thing elsewhere is better - that is, if this 
>> is really
>> a pattern transform operation or if you are just implementing a 
>> special-case
>> of a general transform as a pattern.
> So in this case we're taking something like:
>
>  _6 = i_1 < m_4(D);
>   _7 = i_1 < n_3(D);
>   _8 = _6 & _7;
>   if (_8 != 0)
>
>
> And turning it into
>
> _X = MIN (m_4, n_3)
> if (i_1 < _X)
>
> That seems to me like a good match for match.pd given its generality 
> and the need to walk up the use-def chain.  It's certainly not a good 
> fit for phi-opt since we're not looking at PHIs :-)
>
>
>
> Michael -- can you take your sample code and turn it into a test for 
> the testsuite.  I'd hazard a guess it'll need to be target specific 
> because of its interactions with branch-costing.  Might as well make 4 
> variants (lt -> MIN, le -> MIN, ge->MAX, gt->MAX).
>
> We're going to want that regardless of whether tackling this issue in 
> match.pd (my current preference) or elsewhere.
>
> jeff

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-31 18:48         ` Michael Collison
@ 2015-07-31 19:10           ` Jeff Law
  2015-08-03  7:34             ` Richard Biener
  2015-09-18  7:00             ` Michael Collison
  0 siblings, 2 replies; 29+ messages in thread
From: Jeff Law @ 2015-07-31 19:10 UTC (permalink / raw)
  To: Michael Collison, Richard Biener; +Cc: gcc Patches

On 07/31/2015 12:18 PM, Michael Collison wrote:
> Hi Jeff,
>
> Yes I will create a test case. I'm not quite sure what to check for even
> in the machine dependent test case. It's quite possible for the
> instructions that are generated to change over time.
I think we're going to want to look at the gimple IR and search for the 
MIN/MAX expressions rather than the instructions.  Given we don't know 
where the transformation is going to land (yet), you can probably start 
with -fdump-tree-optimized and scanning the .optimized dump.

We can still do that and have the test be target specific.

jeff

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-31 19:10           ` Jeff Law
@ 2015-08-03  7:34             ` Richard Biener
  2015-09-18  7:00             ` Michael Collison
  1 sibling, 0 replies; 29+ messages in thread
From: Richard Biener @ 2015-08-03  7:34 UTC (permalink / raw)
  To: Jeff Law; +Cc: Michael Collison, gcc Patches

On Fri, Jul 31, 2015 at 8:27 PM, Jeff Law <law@redhat.com> wrote:
> On 07/31/2015 12:18 PM, Michael Collison wrote:
>>
>> Hi Jeff,
>>
>> Yes I will create a test case. I'm not quite sure what to check for even
>> in the machine dependent test case. It's quite possible for the
>> instructions that are generated to change over time.
>
> I think we're going to want to look at the gimple IR and search for the
> MIN/MAX expressions rather than the instructions.  Given we don't know where
> the transformation is going to land (yet), you can probably start with
> -fdump-tree-optimized and scanning the .optimized dump.

Yeah.  FWIW I'm ok with the specialized pattern in match.pd.  Indeed phiopt
isn't a good fit here (though on some targets you may end up seeing a PHI node,
dependent on LOGICAL_OP_NON_SHORT_CIRCUIT).  For longer chains
I'd say reassoc is the appropriate place to handle this as it has
already code to
combine &&s and ||s of conditions.  Maybe you can give it a short try to only
handle it there.

I'm also somewhat worried about code-generation on random targets.  So can
you have a quick look (just for your testcase) what the code-gen difference is
on primary platforms (just build some stage1 cc1s)?

Thanks,
Richard.

> We can still do that and have the test be target specific.
>
> jeff
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-27  9:47 ` Richard Biener
  2015-07-27 16:25   ` Jeff Law
@ 2015-08-05 16:08   ` Alan Lawrence
  2015-08-05 16:15     ` Jeff Law
  1 sibling, 1 reply; 29+ messages in thread
From: Alan Lawrence @ 2015-08-05 16:08 UTC (permalink / raw)
  To: Richard Biener; +Cc: Michael Collison, gcc Patches

Richard Biener wrote:

> Furthermore it doesn't work for three such ops which would require
> an additional pattern like
> 
>  (simplfiy
>   (bit_and:c (op @0 (min @1 @2)) (op @0 @3))
>   (op @0 (min (min @1 @2) @3))))
> 
> if that's profitable?

Shouldn't that be just a case of binding @1 in the original pattern:

>> +/* Transform (@0 < @1 and @0 < @2) to use min */
>> +(for op (lt le)
>> +(simplify
>> +(bit_and:c (op @0 @1) (op @0 @2))
 >> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
 >> +(op @0 (min @1 @2)))))

to (min @1 @2) in your three-way pattern, @2 in the original to @3 in yours, and 
@0 to @0 ?

Or is @1 not allowed to bind to a 'min' ?


Secondly: should/can Michael's fix reference PR57600 ?

Cheers,
Alan

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-08-05 16:08   ` Alan Lawrence
@ 2015-08-05 16:15     ` Jeff Law
  0 siblings, 0 replies; 29+ messages in thread
From: Jeff Law @ 2015-08-05 16:15 UTC (permalink / raw)
  To: Alan Lawrence, Richard Biener; +Cc: Michael Collison, gcc Patches

On 08/05/2015 10:07 AM, Alan Lawrence wrote:
> Richard Biener wrote:
>
>> Furthermore it doesn't work for three such ops which would require
>> an additional pattern like
>>
>>  (simplfiy
>>   (bit_and:c (op @0 (min @1 @2)) (op @0 @3))
>>   (op @0 (min (min @1 @2) @3))))
>>
>> if that's profitable?
>
> Shouldn't that be just a case of binding @1 in the original pattern:
>
>>> +/* Transform (@0 < @1 and @0 < @2) to use min */
>>> +(for op (lt le)
>>> +(simplify
>>> +(bit_and:c (op @0 @1) (op @0 @2))
>  >> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>  >> +(op @0 (min @1 @2)))))
>
> to (min @1 @2) in your three-way pattern, @2 in the original to @3 in
> yours, and @0 to @0 ?
>
> Or is @1 not allowed to bind to a 'min' ?
>
>
> Secondly: should/can Michael's fix reference PR57600 ?
It probably should.

Reading that BZ ISTM that we probably are missing some expression 
equivalences in DOM.  Given

x = min (a, b)

We should be recording
x <= a -> true
x <= b -> true

in our expression equivalency tables (plus all the related 
equivalences).  I'm not sure if that's the cause of the missed 
optimizations or not though.  Just something I noticed while reading the BZ.

jeff

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-07-31 19:10           ` Jeff Law
  2015-08-03  7:34             ` Richard Biener
@ 2015-09-18  7:00             ` Michael Collison
  2015-09-18  7:38               ` Marc Glisse
  1 sibling, 1 reply; 29+ messages in thread
From: Michael Collison @ 2015-09-18  7:00 UTC (permalink / raw)
  To: Jeff Law, Richard Biener; +Cc: gcc Patches

On 07/31/2015 11:27 AM, Jeff Law wrote:
> On 07/31/2015 12:18 PM, Michael Collison wrote:
>> Hi Jeff,
>>
>> Yes I will create a test case. I'm not quite sure what to check for even
>> in the machine dependent test case. It's quite possible for the
>> instructions that are generated to change over time.
> I think we're going to want to look at the gimple IR and search for 
> the MIN/MAX expressions rather than the instructions.  Given we don't 
> know where the transformation is going to land (yet), you can probably 
> start with -fdump-tree-optimized and scanning the .optimized dump.
>
> We can still do that and have the test be target specific.
>
> jeff
>
Jeff and Richard,

Here is the the patch modified with test cases for MIN_EXPR and MAX_EXPR 
expressions. I need some assistance; this test case will fail on targets 
that don't have support for MIN/MAX such as 68k. Is there any way to 
remedy this short of enumerating whether a target support MIN/MAX in 
testsuite/lib/target_support?

2015-07-24  Michael Collison <michael.collison@linaro.org>
                     Andrew Pinski <andrew.pinski@caviumnetworks.com>

                 * match.pd ((x < y) && (x < z) -> x < min (y,z),
                 (x > y) and (x > z) -> x > max (y,z))
                 * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.

diff --git a/gcc/match.pd b/gcc/match.pd
index 5e8fd32..8691710 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not see
      (convert (bit_and (op (convert:utype @0) (convert:utype @1))
                (convert:utype @4)))))))

+
+/* Transform (@0 < @1 and @0 < @2) to use min */
+(for op (lt le)
+(simplify
+(bit_and:c (op @0 @1) (op @0 @2))
+(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+(op @0 (min @1 @2)))))
+
+/* Transform (@0 > @1 and @0 > @2) to use max */
+(for op (gt ge)
+(simplify
+(bit_and:c (op @0 @1) (op @0 @2))
+(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+(op @0 (max @1 @2)))))
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c 
b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
new file mode 100644
index 0000000..cc0189a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+#define N 1024
+
+int a[N], b[N], c[N];
+
+void add (unsigned int m, unsigned int n)
+{
+  unsigned int i;
+  for (i = 0; i < m && i < n; ++i)
+    a[i] = b[i] + c[i];
+}
+
+void add2 (unsigned int m, unsigned int n)
+{
+  unsigned int i;
+  for (i = N-1; i > m && i > n; --i)
+    a[i] = b[i] + c[i];
+}
+
+/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
-- 
1.9.1

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-09-18  7:00             ` Michael Collison
@ 2015-09-18  7:38               ` Marc Glisse
  2015-09-18  7:41                 ` Marc Glisse
  2015-09-18 22:01                 ` Michael Collison
  0 siblings, 2 replies; 29+ messages in thread
From: Marc Glisse @ 2015-09-18  7:38 UTC (permalink / raw)
  To: Michael Collison; +Cc: Jeff Law, Richard Biener, gcc Patches

On Thu, 17 Sep 2015, Michael Collison wrote:

> Here is the the patch modified with test cases for MIN_EXPR and MAX_EXPR 
> expressions. I need some assistance; this test case will fail on targets that 
> don't have support for MIN/MAX such as 68k. Is there any way to remedy this 
> short of enumerating whether a target support MIN/MAX in 
> testsuite/lib/target_support?
>
> 2015-07-24  Michael Collison <michael.collison@linaro.org>
>                    Andrew Pinski <andrew.pinski@caviumnetworks.com>
>
>                * match.pd ((x < y) && (x < z) -> x < min (y,z),
>                (x > y) and (x > z) -> x > max (y,z))
>                * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 5e8fd32..8691710 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not see
>     (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>               (convert:utype @4)))))))
>
> +
> +/* Transform (@0 < @1 and @0 < @2) to use min */
> +(for op (lt le)
> +(simplify

You seem to be missing all indentation.

> +(bit_and:c (op @0 @1) (op @0 @2))

:c seems useless here. On the other hand, it might make sense to use op:s 
since this is mostly useful if it removes the 2 original comparisons.

> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))

How did you chose this restriction? It seems safe enough, but the 
transformation could make sense in other cases as well. It can always be 
generalized later though.

> +(op @0 (min @1 @2)))))
> +
> +/* Transform (@0 > @1 and @0 > @2) to use max */
> +(for op (gt ge)

Note that you could unify the patterns with something like:
(for op (lt le gt ge)
      ext (min min max max)
  (simplify ...

> +(simplify
> +(bit_and:c (op @0 @1) (op @0 @2))
> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
> +(op @0 (max @1 @2)))))
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
> new file mode 100644
> index 0000000..cc0189a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +#define N 1024
> +
> +int a[N], b[N], c[N];
> +
> +void add (unsigned int m, unsigned int n)
> +{
> +  unsigned int i;
> +  for (i = 0; i < m && i < n; ++i)

Maybe writing '&' instead of '&&' would make it depend less on the target. 
Also, both tests seem to be for GENERIC (i.e. I expect that you are 
already seeing the optimized version with -fdump-tree-original or 
-fdump-tree-gimple). Maybe something as simple as:
int f(long a, long b, long c) {
   int cmp1 = a < b;
   int cmp2 = a < c;
   return cmp1 & cmp2;
}

> +    a[i] = b[i] + c[i];
> +}
> +
> +void add2 (unsigned int m, unsigned int n)
> +{
> +  unsigned int i;
> +  for (i = N-1; i > m && i > n; --i)
> +    a[i] = b[i] + c[i];
> +}
> +
> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-09-18  7:38               ` Marc Glisse
@ 2015-09-18  7:41                 ` Marc Glisse
  2015-09-18 10:06                   ` Richard Biener
  2015-09-18 22:01                 ` Michael Collison
  1 sibling, 1 reply; 29+ messages in thread
From: Marc Glisse @ 2015-09-18  7:41 UTC (permalink / raw)
  To: gcc Patches; +Cc: Michael Collison, Jeff Law, Richard Biener

Just a couple extra points. We can end up with a mix of < and >, which 
might prevent from matching:

   _3 = b_1(D) > a_2(D);
   _5 = a_2(D) < c_4(D);
   _8 = _3 & _5;

Just like with &, we could also transform:
x < y | x < z  --->  x < max(y, z)

(but maybe wait to make sure reviewers are ok with the first 
transformation before generalizing)

On Fri, 18 Sep 2015, Marc Glisse wrote:

> On Thu, 17 Sep 2015, Michael Collison wrote:
>
>> Here is the the patch modified with test cases for MIN_EXPR and MAX_EXPR 
>> expressions. I need some assistance; this test case will fail on targets 
>> that don't have support for MIN/MAX such as 68k. Is there any way to remedy 
>> this short of enumerating whether a target support MIN/MAX in 
>> testsuite/lib/target_support?
>> 
>> 2015-07-24  Michael Collison <michael.collison@linaro.org>
>>                    Andrew Pinski <andrew.pinski@caviumnetworks.com>
>>
>>                * match.pd ((x < y) && (x < z) -> x < min (y,z),
>>                (x > y) and (x > z) -> x > max (y,z))
>>                * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.
>> 
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 5e8fd32..8691710 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not see
>>     (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>>               (convert:utype @4)))))))
>> 
>> +
>> +/* Transform (@0 < @1 and @0 < @2) to use min */
>> +(for op (lt le)
>> +(simplify
>
> You seem to be missing all indentation.
>
>> +(bit_and:c (op @0 @1) (op @0 @2))
>
> :c seems useless here. On the other hand, it might make sense to use op:s 
> since this is mostly useful if it removes the 2 original comparisons.
>
>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>
> How did you chose this restriction? It seems safe enough, but the 
> transformation could make sense in other cases as well. It can always be 
> generalized later though.
>
>> +(op @0 (min @1 @2)))))
>> +
>> +/* Transform (@0 > @1 and @0 > @2) to use max */
>> +(for op (gt ge)
>
> Note that you could unify the patterns with something like:
> (for op (lt le gt ge)
>     ext (min min max max)
> (simplify ...
>
>> +(simplify
>> +(bit_and:c (op @0 @1) (op @0 @2))
>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>> +(op @0 (max @1 @2)))))
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c 
>> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>> new file mode 100644
>> index 0000000..cc0189a
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>> @@ -0,0 +1,23 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>> +
>> +#define N 1024
>> +
>> +int a[N], b[N], c[N];
>> +
>> +void add (unsigned int m, unsigned int n)
>> +{
>> +  unsigned int i;
>> +  for (i = 0; i < m && i < n; ++i)
>
> Maybe writing '&' instead of '&&' would make it depend less on the target. 
> Also, both tests seem to be for GENERIC (i.e. I expect that you are already 
> seeing the optimized version with -fdump-tree-original or 
> -fdump-tree-gimple). Maybe something as simple as:
> int f(long a, long b, long c) {
>  int cmp1 = a < b;
>  int cmp2 = a < c;
>  return cmp1 & cmp2;
> }
>
>> +    a[i] = b[i] + c[i];
>> +}
>> +
>> +void add2 (unsigned int m, unsigned int n)
>> +{
>> +  unsigned int i;
>> +  for (i = N-1; i > m && i > n; --i)
>> +    a[i] = b[i] + c[i];
>> +}
>> +
>> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
>
>

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-09-18  7:41                 ` Marc Glisse
@ 2015-09-18 10:06                   ` Richard Biener
  2015-09-30  8:08                     ` Michael Collison
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Biener @ 2015-09-18 10:06 UTC (permalink / raw)
  To: GCC Patches; +Cc: Michael Collison, Jeff Law

On Fri, Sep 18, 2015 at 9:38 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Just a couple extra points. We can end up with a mix of < and >, which might
> prevent from matching:
>
>   _3 = b_1(D) > a_2(D);
>   _5 = a_2(D) < c_4(D);
>   _8 = _3 & _5;
>
> Just like with &, we could also transform:
> x < y | x < z  --->  x < max(y, z)
>
> (but maybe wait to make sure reviewers are ok with the first transformation
> before generalizing)

Please merge the patterns as suggested and do the :c/:s changes as well.

The issue with getting mixed < and > is indeed there - I've wanted to
extend :c to handle tcc_comparison in some way at some point but
didn't get to how best to implement that yet...

So to fix that currently you have to replicate the merged pattern
with swapped comparison operands.

Otherwise I'm fine with the general approach.

Richard.

>
> On Fri, 18 Sep 2015, Marc Glisse wrote:
>
>> On Thu, 17 Sep 2015, Michael Collison wrote:
>>
>>> Here is the the patch modified with test cases for MIN_EXPR and MAX_EXPR
>>> expressions. I need some assistance; this test case will fail on targets
>>> that don't have support for MIN/MAX such as 68k. Is there any way to remedy
>>> this short of enumerating whether a target support MIN/MAX in
>>> testsuite/lib/target_support?
>>>
>>> 2015-07-24  Michael Collison <michael.collison@linaro.org>
>>>                    Andrew Pinski <andrew.pinski@caviumnetworks.com>
>>>
>>>                * match.pd ((x < y) && (x < z) -> x < min (y,z),
>>>                (x > y) and (x > z) -> x > max (y,z))
>>>                * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.
>>>
>>> diff --git a/gcc/match.pd b/gcc/match.pd
>>> index 5e8fd32..8691710 100644
>>> --- a/gcc/match.pd
>>> +++ b/gcc/match.pd
>>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not see
>>>     (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>>>               (convert:utype @4)))))))
>>>
>>> +
>>> +/* Transform (@0 < @1 and @0 < @2) to use min */
>>> +(for op (lt le)
>>> +(simplify
>>
>>
>> You seem to be missing all indentation.
>>
>>> +(bit_and:c (op @0 @1) (op @0 @2))
>>
>>
>> :c seems useless here. On the other hand, it might make sense to use op:s
>> since this is mostly useful if it removes the 2 original comparisons.
>>
>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>>
>>
>> How did you chose this restriction? It seems safe enough, but the
>> transformation could make sense in other cases as well. It can always be
>> generalized later though.
>>
>>> +(op @0 (min @1 @2)))))
>>> +
>>> +/* Transform (@0 > @1 and @0 > @2) to use max */
>>> +(for op (gt ge)
>>
>>
>> Note that you could unify the patterns with something like:
>> (for op (lt le gt ge)
>>     ext (min min max max)
>> (simplify ...
>>
>>> +(simplify
>>> +(bit_and:c (op @0 @1) (op @0 @2))
>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>>> +(op @0 (max @1 @2)))))
>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>> new file mode 100644
>>> index 0000000..cc0189a
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>> @@ -0,0 +1,23 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>>> +
>>> +#define N 1024
>>> +
>>> +int a[N], b[N], c[N];
>>> +
>>> +void add (unsigned int m, unsigned int n)
>>> +{
>>> +  unsigned int i;
>>> +  for (i = 0; i < m && i < n; ++i)
>>
>>
>> Maybe writing '&' instead of '&&' would make it depend less on the target.
>> Also, both tests seem to be for GENERIC (i.e. I expect that you are already
>> seeing the optimized version with -fdump-tree-original or
>> -fdump-tree-gimple). Maybe something as simple as:
>> int f(long a, long b, long c) {
>>  int cmp1 = a < b;
>>  int cmp2 = a < c;
>>  return cmp1 & cmp2;
>> }
>>
>>> +    a[i] = b[i] + c[i];
>>> +}
>>> +
>>> +void add2 (unsigned int m, unsigned int n)
>>> +{
>>> +  unsigned int i;
>>> +  for (i = N-1; i > m && i > n; --i)
>>> +    a[i] = b[i] + c[i];
>>> +}
>>> +
>>> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
>>> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
>>
>>
>>
>
> --
> Marc Glisse

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-09-18  7:38               ` Marc Glisse
  2015-09-18  7:41                 ` Marc Glisse
@ 2015-09-18 22:01                 ` Michael Collison
  2015-09-18 22:09                   ` Marc Glisse
  1 sibling, 1 reply; 29+ messages in thread
From: Michael Collison @ 2015-09-18 22:01 UTC (permalink / raw)
  To: gcc-patches; +Cc: Jeff Law, marc.glisse, Richard Biener

Marc,

Can you elaborate on merging the patterns using 'ext' as mentioned in 
your post? I don't see any documentation or examples.

On 09/18/2015 12:00 AM, Marc Glisse wrote:
> On Thu, 17 Sep 2015, Michael Collison wrote:
>
>> Here is the the patch modified with test cases for MIN_EXPR and 
>> MAX_EXPR expressions. I need some assistance; this test case will 
>> fail on targets that don't have support for MIN/MAX such as 68k. Is 
>> there any way to remedy this short of enumerating whether a target 
>> support MIN/MAX in testsuite/lib/target_support?
>>
>> 2015-07-24  Michael Collison <michael.collison@linaro.org>
>>                    Andrew Pinski <andrew.pinski@caviumnetworks.com>
>>
>>                * match.pd ((x < y) && (x < z) -> x < min (y,z),
>>                (x > y) and (x > z) -> x > max (y,z))
>>                * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.
>>
>> diff --git a/gcc/match.pd b/gcc/match.pd
>> index 5e8fd32..8691710 100644
>> --- a/gcc/match.pd
>> +++ b/gcc/match.pd
>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3. If not see
>>     (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>>               (convert:utype @4)))))))
>>
>> +
>> +/* Transform (@0 < @1 and @0 < @2) to use min */
>> +(for op (lt le)
>> +(simplify
>
> You seem to be missing all indentation.
>
>> +(bit_and:c (op @0 @1) (op @0 @2))
>
> :c seems useless here. On the other hand, it might make sense to use 
> op:s since this is mostly useful if it removes the 2 original 
> comparisons.
>
>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>
> How did you chose this restriction? It seems safe enough, but the 
> transformation could make sense in other cases as well. It can always 
> be generalized later though.
>
>> +(op @0 (min @1 @2)))))
>> +
>> +/* Transform (@0 > @1 and @0 > @2) to use max */
>> +(for op (gt ge)
>
> Note that you could unify the patterns with something like:
> (for op (lt le gt ge)
>      ext (min min max max)
>  (simplify ...
>
>> +(simplify
>> +(bit_and:c (op @0 @1) (op @0 @2))
>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>> +(op @0 (max @1 @2)))))
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c 
>> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>> new file mode 100644
>> index 0000000..cc0189a
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>> @@ -0,0 +1,23 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>> +
>> +#define N 1024
>> +
>> +int a[N], b[N], c[N];
>> +
>> +void add (unsigned int m, unsigned int n)
>> +{
>> +  unsigned int i;
>> +  for (i = 0; i < m && i < n; ++i)
>
> Maybe writing '&' instead of '&&' would make it depend less on the 
> target. Also, both tests seem to be for GENERIC (i.e. I expect that 
> you are already seeing the optimized version with -fdump-tree-original 
> or -fdump-tree-gimple). Maybe something as simple as:
> int f(long a, long b, long c) {
>   int cmp1 = a < b;
>   int cmp2 = a < c;
>   return cmp1 & cmp2;
> }
>
>> +    a[i] = b[i] + c[i];
>> +}
>> +
>> +void add2 (unsigned int m, unsigned int n)
>> +{
>> +  unsigned int i;
>> +  for (i = N-1; i > m && i > n; --i)
>> +    a[i] = b[i] + c[i];
>> +}
>> +
>> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
>> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
>

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-09-18 22:01                 ` Michael Collison
@ 2015-09-18 22:09                   ` Marc Glisse
  0 siblings, 0 replies; 29+ messages in thread
From: Marc Glisse @ 2015-09-18 22:09 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc-patches, Jeff Law, Richard Biener

On Fri, 18 Sep 2015, Michael Collison wrote:

> Can you elaborate on merging the patterns using 'ext' as mentioned in your 
> post? I don't see any documentation or examples.

https://gcc.gnu.org/onlinedocs/gccint/The-Language.html
"for" lets you iterate on several variables at the same time.

For instance,

(for bitop (bit_and bit_ior)
      rbitop (bit_ior bit_and)
  (simplify
   (bitop:c (rbitop:c @0 @1) (bit_not@2 @0))
   (bitop @1 @2)))

expands to

(simplify
  (bit_and:c (bit_ior:c @0 @1) (bit_not@2 @0))
  (bit_and @1 @2))
(simplify
  (bit_ior:c (bit_and:c @0 @1) (bit_not@2 @0))
  (bit_ior @1 @2))

there are other examples using

(for cmp (eq ne)
      icmp (ne eq)

or

(for cmp (simple_comparison)
      scmp (swapped_simple_comparison)

and you can have repetitions

(for logs (LOG LOG
 	   LOG2 LOG2
 	   LOG10 LOG10)
      exps (SQRT CBRT)

(the iteration wraps around for exps).

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-09-18 10:06                   ` Richard Biener
@ 2015-09-30  8:08                     ` Michael Collison
  2015-09-30  9:10                       ` Richard Biener
  0 siblings, 1 reply; 29+ messages in thread
From: Michael Collison @ 2015-09-30  8:08 UTC (permalink / raw)
  To: Richard Biener, GCC Patches; +Cc: Jeff Law, Marc Glisse

Richard and Marc,

What is ':s'? I don't see any documentation for it. So you would like me 
to remove :c and add :s?


On 09/18/2015 02:23 AM, Richard Biener wrote:
> On Fri, Sep 18, 2015 at 9:38 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>> Just a couple extra points. We can end up with a mix of < and >, which might
>> prevent from matching:
>>
>>    _3 = b_1(D) > a_2(D);
>>    _5 = a_2(D) < c_4(D);
>>    _8 = _3 & _5;
>>
>> Just like with &, we could also transform:
>> x < y | x < z  --->  x < max(y, z)
>>
>> (but maybe wait to make sure reviewers are ok with the first transformation
>> before generalizing)
> Please merge the patterns as suggested and do the :c/:s changes as well.
>
> The issue with getting mixed < and > is indeed there - I've wanted to
> extend :c to handle tcc_comparison in some way at some point but
> didn't get to how best to implement that yet...
>
> So to fix that currently you have to replicate the merged pattern
> with swapped comparison operands.
>
> Otherwise I'm fine with the general approach.
>
> Richard.
>
>> On Fri, 18 Sep 2015, Marc Glisse wrote:
>>
>>> On Thu, 17 Sep 2015, Michael Collison wrote:
>>>
>>>> Here is the the patch modified with test cases for MIN_EXPR and MAX_EXPR
>>>> expressions. I need some assistance; this test case will fail on targets
>>>> that don't have support for MIN/MAX such as 68k. Is there any way to remedy
>>>> this short of enumerating whether a target support MIN/MAX in
>>>> testsuite/lib/target_support?
>>>>
>>>> 2015-07-24  Michael Collison <michael.collison@linaro.org>
>>>>                     Andrew Pinski <andrew.pinski@caviumnetworks.com>
>>>>
>>>>                 * match.pd ((x < y) && (x < z) -> x < min (y,z),
>>>>                 (x > y) and (x > z) -> x > max (y,z))
>>>>                 * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.
>>>>
>>>> diff --git a/gcc/match.pd b/gcc/match.pd
>>>> index 5e8fd32..8691710 100644
>>>> --- a/gcc/match.pd
>>>> +++ b/gcc/match.pd
>>>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not see
>>>>      (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>>>>                (convert:utype @4)))))))
>>>>
>>>> +
>>>> +/* Transform (@0 < @1 and @0 < @2) to use min */
>>>> +(for op (lt le)
>>>> +(simplify
>>>
>>> You seem to be missing all indentation.
>>>
>>>> +(bit_and:c (op @0 @1) (op @0 @2))
>>>
>>> :c seems useless here. On the other hand, it might make sense to use op:s
>>> since this is mostly useful if it removes the 2 original comparisons.
>>>
>>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>>>
>>> How did you chose this restriction? It seems safe enough, but the
>>> transformation could make sense in other cases as well. It can always be
>>> generalized later though.
>>>
>>>> +(op @0 (min @1 @2)))))
>>>> +
>>>> +/* Transform (@0 > @1 and @0 > @2) to use max */
>>>> +(for op (gt ge)
>>>
>>> Note that you could unify the patterns with something like:
>>> (for op (lt le gt ge)
>>>      ext (min min max max)
>>> (simplify ...
>>>
>>>> +(simplify
>>>> +(bit_and:c (op @0 @1) (op @0 @2))
>>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>>>> +(op @0 (max @1 @2)))))
>>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>>> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>>> new file mode 100644
>>>> index 0000000..cc0189a
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>>> @@ -0,0 +1,23 @@
>>>> +/* { dg-do compile } */
>>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>>>> +
>>>> +#define N 1024
>>>> +
>>>> +int a[N], b[N], c[N];
>>>> +
>>>> +void add (unsigned int m, unsigned int n)
>>>> +{
>>>> +  unsigned int i;
>>>> +  for (i = 0; i < m && i < n; ++i)
>>>
>>> Maybe writing '&' instead of '&&' would make it depend less on the target.
>>> Also, both tests seem to be for GENERIC (i.e. I expect that you are already
>>> seeing the optimized version with -fdump-tree-original or
>>> -fdump-tree-gimple). Maybe something as simple as:
>>> int f(long a, long b, long c) {
>>>   int cmp1 = a < b;
>>>   int cmp2 = a < c;
>>>   return cmp1 & cmp2;
>>> }
>>>
>>>> +    a[i] = b[i] + c[i];
>>>> +}
>>>> +
>>>> +void add2 (unsigned int m, unsigned int n)
>>>> +{
>>>> +  unsigned int i;
>>>> +  for (i = N-1; i > m && i > n; --i)
>>>> +    a[i] = b[i] + c[i];
>>>> +}
>>>> +
>>>> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
>>>> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
>>>
>>>
>> --
>> Marc Glisse

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-09-30  8:08                     ` Michael Collison
@ 2015-09-30  9:10                       ` Richard Biener
  2015-09-30 16:51                         ` Michael Collison
  0 siblings, 1 reply; 29+ messages in thread
From: Richard Biener @ 2015-09-30  9:10 UTC (permalink / raw)
  To: Michael Collison; +Cc: GCC Patches, Jeff Law, Marc Glisse

On Wed, Sep 30, 2015 at 9:29 AM, Michael Collison
<michael.collison@linaro.org> wrote:
> Richard and Marc,
>
> What is ':s'? I don't see any documentation for it. So you would like me to
> remove :c and add :s?

There is documentation for both in the internals manual.

I don't have enough context to say whether you should remove "them" or
not.  What's
the current patch?  If you made the suggested changes you should be left with
only required :s and :c.

Richard.

>
>
> On 09/18/2015 02:23 AM, Richard Biener wrote:
>>
>> On Fri, Sep 18, 2015 at 9:38 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>
>>> Just a couple extra points. We can end up with a mix of < and >, which
>>> might
>>> prevent from matching:
>>>
>>>    _3 = b_1(D) > a_2(D);
>>>    _5 = a_2(D) < c_4(D);
>>>    _8 = _3 & _5;
>>>
>>> Just like with &, we could also transform:
>>> x < y | x < z  --->  x < max(y, z)
>>>
>>> (but maybe wait to make sure reviewers are ok with the first
>>> transformation
>>> before generalizing)
>>
>> Please merge the patterns as suggested and do the :c/:s changes as well.
>>
>> The issue with getting mixed < and > is indeed there - I've wanted to
>> extend :c to handle tcc_comparison in some way at some point but
>> didn't get to how best to implement that yet...
>>
>> So to fix that currently you have to replicate the merged pattern
>> with swapped comparison operands.
>>
>> Otherwise I'm fine with the general approach.
>>
>> Richard.
>>
>>> On Fri, 18 Sep 2015, Marc Glisse wrote:
>>>
>>>> On Thu, 17 Sep 2015, Michael Collison wrote:
>>>>
>>>>> Here is the the patch modified with test cases for MIN_EXPR and
>>>>> MAX_EXPR
>>>>> expressions. I need some assistance; this test case will fail on
>>>>> targets
>>>>> that don't have support for MIN/MAX such as 68k. Is there any way to
>>>>> remedy
>>>>> this short of enumerating whether a target support MIN/MAX in
>>>>> testsuite/lib/target_support?
>>>>>
>>>>> 2015-07-24  Michael Collison <michael.collison@linaro.org>
>>>>>                     Andrew Pinski <andrew.pinski@caviumnetworks.com>
>>>>>
>>>>>                 * match.pd ((x < y) && (x < z) -> x < min (y,z),
>>>>>                 (x > y) and (x > z) -> x > max (y,z))
>>>>>                 * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.
>>>>>
>>>>> diff --git a/gcc/match.pd b/gcc/match.pd
>>>>> index 5e8fd32..8691710 100644
>>>>> --- a/gcc/match.pd
>>>>> +++ b/gcc/match.pd
>>>>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not
>>>>> see
>>>>>      (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>>>>>                (convert:utype @4)))))))
>>>>>
>>>>> +
>>>>> +/* Transform (@0 < @1 and @0 < @2) to use min */
>>>>> +(for op (lt le)
>>>>> +(simplify
>>>>
>>>>
>>>> You seem to be missing all indentation.
>>>>
>>>>> +(bit_and:c (op @0 @1) (op @0 @2))
>>>>
>>>>
>>>> :c seems useless here. On the other hand, it might make sense to use
>>>> op:s
>>>> since this is mostly useful if it removes the 2 original comparisons.
>>>>
>>>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>>>>
>>>>
>>>> How did you chose this restriction? It seems safe enough, but the
>>>> transformation could make sense in other cases as well. It can always be
>>>> generalized later though.
>>>>
>>>>> +(op @0 (min @1 @2)))))
>>>>> +
>>>>> +/* Transform (@0 > @1 and @0 > @2) to use max */
>>>>> +(for op (gt ge)
>>>>
>>>>
>>>> Note that you could unify the patterns with something like:
>>>> (for op (lt le gt ge)
>>>>      ext (min min max max)
>>>> (simplify ...
>>>>
>>>>> +(simplify
>>>>> +(bit_and:c (op @0 @1) (op @0 @2))
>>>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>>>>> +(op @0 (max @1 @2)))))
>>>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>>>> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>>>> new file mode 100644
>>>>> index 0000000..cc0189a
>>>>> --- /dev/null
>>>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>>>> @@ -0,0 +1,23 @@
>>>>> +/* { dg-do compile } */
>>>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>>>>> +
>>>>> +#define N 1024
>>>>> +
>>>>> +int a[N], b[N], c[N];
>>>>> +
>>>>> +void add (unsigned int m, unsigned int n)
>>>>> +{
>>>>> +  unsigned int i;
>>>>> +  for (i = 0; i < m && i < n; ++i)
>>>>
>>>>
>>>> Maybe writing '&' instead of '&&' would make it depend less on the
>>>> target.
>>>> Also, both tests seem to be for GENERIC (i.e. I expect that you are
>>>> already
>>>> seeing the optimized version with -fdump-tree-original or
>>>> -fdump-tree-gimple). Maybe something as simple as:
>>>> int f(long a, long b, long c) {
>>>>   int cmp1 = a < b;
>>>>   int cmp2 = a < c;
>>>>   return cmp1 & cmp2;
>>>> }
>>>>
>>>>> +    a[i] = b[i] + c[i];
>>>>> +}
>>>>> +
>>>>> +void add2 (unsigned int m, unsigned int n)
>>>>> +{
>>>>> +  unsigned int i;
>>>>> +  for (i = N-1; i > m && i > n; --i)
>>>>> +    a[i] = b[i] + c[i];
>>>>> +}
>>>>> +
>>>>> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
>>>>> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
>>>>
>>>>
>>>>
>>> --
>>> Marc Glisse
>
>
> --
> Michael Collison
> Linaro Toolchain Working Group
> michael.collison@linaro.org
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-09-30  9:10                       ` Richard Biener
@ 2015-09-30 16:51                         ` Michael Collison
  2015-09-30 21:12                           ` Marc Glisse
  0 siblings, 1 reply; 29+ messages in thread
From: Michael Collison @ 2015-09-30 16:51 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches, Jeff Law, Marc Glisse

[-- Attachment #1: Type: text/plain, Size: 5723 bytes --]


The current patch is attached.

2015-09-30  Michael Collison  <michael.collison@linaro.org>
         Andrew Pinski <andrew.pinski@caviumnetworks.com>

     * match.pd ((x < y) && (x < z) -> x < min (y,z),
     (x > y) and (x > z) -> x > max (y,z))
     * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.


On 09/30/2015 01:14 AM, Richard Biener wrote:
> On Wed, Sep 30, 2015 at 9:29 AM, Michael Collison
> <michael.collison@linaro.org> wrote:
>> Richard and Marc,
>>
>> What is ':s'? I don't see any documentation for it. So you would like me to
>> remove :c and add :s?
> There is documentation for both in the internals manual.
>
> I don't have enough context to say whether you should remove "them" or
> not.  What's
> the current patch?  If you made the suggested changes you should be left with
> only required :s and :c.
>
> Richard.
>
>>
>> On 09/18/2015 02:23 AM, Richard Biener wrote:
>>> On Fri, Sep 18, 2015 at 9:38 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
>>>> Just a couple extra points. We can end up with a mix of < and >, which
>>>> might
>>>> prevent from matching:
>>>>
>>>>     _3 = b_1(D) > a_2(D);
>>>>     _5 = a_2(D) < c_4(D);
>>>>     _8 = _3 & _5;
>>>>
>>>> Just like with &, we could also transform:
>>>> x < y | x < z  --->  x < max(y, z)
>>>>
>>>> (but maybe wait to make sure reviewers are ok with the first
>>>> transformation
>>>> before generalizing)
>>> Please merge the patterns as suggested and do the :c/:s changes as well.
>>>
>>> The issue with getting mixed < and > is indeed there - I've wanted to
>>> extend :c to handle tcc_comparison in some way at some point but
>>> didn't get to how best to implement that yet...
>>>
>>> So to fix that currently you have to replicate the merged pattern
>>> with swapped comparison operands.
>>>
>>> Otherwise I'm fine with the general approach.
>>>
>>> Richard.
>>>
>>>> On Fri, 18 Sep 2015, Marc Glisse wrote:
>>>>
>>>>> On Thu, 17 Sep 2015, Michael Collison wrote:
>>>>>
>>>>>> Here is the the patch modified with test cases for MIN_EXPR and
>>>>>> MAX_EXPR
>>>>>> expressions. I need some assistance; this test case will fail on
>>>>>> targets
>>>>>> that don't have support for MIN/MAX such as 68k. Is there any way to
>>>>>> remedy
>>>>>> this short of enumerating whether a target support MIN/MAX in
>>>>>> testsuite/lib/target_support?
>>>>>>
>>>>>> 2015-07-24  Michael Collison <michael.collison@linaro.org>
>>>>>>                      Andrew Pinski <andrew.pinski@caviumnetworks.com>
>>>>>>
>>>>>>                  * match.pd ((x < y) && (x < z) -> x < min (y,z),
>>>>>>                  (x > y) and (x > z) -> x > max (y,z))
>>>>>>                  * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.
>>>>>>
>>>>>> diff --git a/gcc/match.pd b/gcc/match.pd
>>>>>> index 5e8fd32..8691710 100644
>>>>>> --- a/gcc/match.pd
>>>>>> +++ b/gcc/match.pd
>>>>>> @@ -1793,3 +1793,17 @@ along with GCC; see the file COPYING3.  If not
>>>>>> see
>>>>>>       (convert (bit_and (op (convert:utype @0) (convert:utype @1))
>>>>>>                 (convert:utype @4)))))))
>>>>>>
>>>>>> +
>>>>>> +/* Transform (@0 < @1 and @0 < @2) to use min */
>>>>>> +(for op (lt le)
>>>>>> +(simplify
>>>>>
>>>>> You seem to be missing all indentation.
>>>>>
>>>>>> +(bit_and:c (op @0 @1) (op @0 @2))
>>>>>
>>>>> :c seems useless here. On the other hand, it might make sense to use
>>>>> op:s
>>>>> since this is mostly useful if it removes the 2 original comparisons.
>>>>>
>>>>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>>>>>
>>>>> How did you chose this restriction? It seems safe enough, but the
>>>>> transformation could make sense in other cases as well. It can always be
>>>>> generalized later though.
>>>>>
>>>>>> +(op @0 (min @1 @2)))))
>>>>>> +
>>>>>> +/* Transform (@0 > @1 and @0 > @2) to use max */
>>>>>> +(for op (gt ge)
>>>>>
>>>>> Note that you could unify the patterns with something like:
>>>>> (for op (lt le gt ge)
>>>>>       ext (min min max max)
>>>>> (simplify ...
>>>>>
>>>>>> +(simplify
>>>>>> +(bit_and:c (op @0 @1) (op @0 @2))
>>>>>> +(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>>>>>> +(op @0 (max @1 @2)))))
>>>>>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>>>>> b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>>>>> new file mode 100644
>>>>>> index 0000000..cc0189a
>>>>>> --- /dev/null
>>>>>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
>>>>>> @@ -0,0 +1,23 @@
>>>>>> +/* { dg-do compile } */
>>>>>> +/* { dg-options "-O2 -fdump-tree-optimized" } */
>>>>>> +
>>>>>> +#define N 1024
>>>>>> +
>>>>>> +int a[N], b[N], c[N];
>>>>>> +
>>>>>> +void add (unsigned int m, unsigned int n)
>>>>>> +{
>>>>>> +  unsigned int i;
>>>>>> +  for (i = 0; i < m && i < n; ++i)
>>>>>
>>>>> Maybe writing '&' instead of '&&' would make it depend less on the
>>>>> target.
>>>>> Also, both tests seem to be for GENERIC (i.e. I expect that you are
>>>>> already
>>>>> seeing the optimized version with -fdump-tree-original or
>>>>> -fdump-tree-gimple). Maybe something as simple as:
>>>>> int f(long a, long b, long c) {
>>>>>    int cmp1 = a < b;
>>>>>    int cmp2 = a < c;
>>>>>    return cmp1 & cmp2;
>>>>> }
>>>>>
>>>>>> +    a[i] = b[i] + c[i];
>>>>>> +}
>>>>>> +
>>>>>> +void add2 (unsigned int m, unsigned int n)
>>>>>> +{
>>>>>> +  unsigned int i;
>>>>>> +  for (i = N-1; i > m && i > n; --i)
>>>>>> +    a[i] = b[i] + c[i];
>>>>>> +}
>>>>>> +
>>>>>> +/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
>>>>>> +/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
>>>>>
>>>>>
>>>> --
>>>> Marc Glisse
>>
>> --
>> Michael Collison
>> Linaro Toolchain Working Group
>> michael.collison@linaro.org
>>

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org


[-- Attachment #2: tcwg-140-v6-upstream.patch --]
[-- Type: text/x-patch, Size: 1207 bytes --]

diff --git a/gcc/match.pd b/gcc/match.pd
index 5e8fd32..77cb91d 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1793,3 +1793,13 @@ along with GCC; see the file COPYING3.  If not see
 	(convert (bit_and (op (convert:utype @0) (convert:utype @1))
 			  (convert:utype @4)))))))
 
+
+/* Transform (@0 < @1 and @0 < @2) to use min, 
+   (@0 > @1 and @0 > @2) to use max */
+(for op (lt le gt ge)
+     ext (min min max max)
+(simplify
+(bit_and:c (op @0 @1) (op @0 @2))
+(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+(op @0 (ext @1 @2)))))
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
new file mode 100644
index 0000000..dfe6120
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int min_test(long a, long b, long c) {
+  int cmp1 = a < b;
+  int cmp2 = a < c;
+  return cmp1 & cmp2;
+}
+
+int max_test (long a, long b, long c) {
+  int cmp1 = a > b;
+  int cmp2 = a > c;
+  return cmp1 & cmp2;
+}
+
+/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
-- 
1.9.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-09-30 16:51                         ` Michael Collison
@ 2015-09-30 21:12                           ` Marc Glisse
  2015-10-01  5:40                             ` Michael Collison
  2015-10-01  7:59                             ` Michael Collison
  0 siblings, 2 replies; 29+ messages in thread
From: Marc Glisse @ 2015-09-30 21:12 UTC (permalink / raw)
  To: Michael Collison; +Cc: Richard Biener, GCC Patches, Jeff Law

>>>>> On Fri, 18 Sep 2015, Marc Glisse wrote:
>>>>>>> +(bit_and:c (op @0 @1) (op @0 @2))
>>>>>> 
>>>>>> :c seems useless here. On the other hand, it might make sense to use op:s
>>>>>> since this is mostly useful if it removes the 2 original comparisons.

As I was saying, :c is useless.
(x:c y z)
is replaced by two copies of the transformation, one with
(x y z)
and the other with
(x z y)
In your transformation, both versions would be equivalent, so the second
one is redundant.

Also, if you have:
a=x<y;
b=x<z;
c=a&b;
reuse(a);
reuse(b);

(i.e. the comparison results are used for more than just this bit_and)
then your transformation may make the code more expensive. To avoid
this, you can write op:s, meaning that the result of op is used only
once.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-09-30 21:12                           ` Marc Glisse
@ 2015-10-01  5:40                             ` Michael Collison
  2015-10-01  6:42                               ` Marc Glisse
  2015-10-01  7:59                             ` Michael Collison
  1 sibling, 1 reply; 29+ messages in thread
From: Michael Collison @ 2015-10-01  5:40 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener, marc.glisse, Jeff Law

[-- Attachment #1: Type: text/plain, Size: 1298 bytes --]

Richard and Marc,

Latest patch attached which incorporates all comments.

2015-09-30  Michael Collison <michael.collison@linaro.org>
         Andrew Pinski <andrew.pinski@caviumnetworks.com>

     * match.pd ((x < y) && (x < z) -> x < min (y,z),
     (x > y) and (x > z) -> x > max (y,z))
     * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.

On 09/30/2015 12:30 PM, Marc Glisse wrote:
>>>>>> On Fri, 18 Sep 2015, Marc Glisse wrote:
>>>>>>>> +(bit_and:c (op @0 @1) (op @0 @2))
>>>>>>>
>>>>>>> :c seems useless here. On the other hand, it might make sense to 
>>>>>>> use op:s
>>>>>>> since this is mostly useful if it removes the 2 original 
>>>>>>> comparisons.
>
> As I was saying, :c is useless.
> (x:c y z)
> is replaced by two copies of the transformation, one with
> (x y z)
> and the other with
> (x z y)
> In your transformation, both versions would be equivalent, so the second
> one is redundant.
>
> Also, if you have:
> a=x<y;
> b=x<z;
> c=a&b;
> reuse(a);
> reuse(b);
>
> (i.e. the comparison results are used for more than just this bit_and)
> then your transformation may make the code more expensive. To avoid
> this, you can write op:s, meaning that the result of op is used only
> once.
>

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org


[-- Attachment #2: tcwg-140-v7.patch --]
[-- Type: text/x-patch, Size: 1279 bytes --]

diff --git a/gcc/match.pd b/gcc/match.pd
index bd5c267..ef2e025 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2311,3 +2311,13 @@ along with GCC; see the file COPYING3.  If not see
     (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
      (convert (bit_and (op (convert:utype @0) (convert:utype @1))
 	       (convert:utype @4))))))))
+
+/* Transform (@0 < @1 and @0 < @2) to use min, 
+   (@0 > @1 and @0 > @2) to use max */
+(for op (lt le gt ge)
+     ext (min min max max)
+(simplify
+(bit_and (op:s @0 @1) (op:s @0 @2))
+(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+(op @0 (ext @1 @2)))))
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
new file mode 100644
index 0000000..dfe6120
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int min_test(long a, long b, long c) {
+  int cmp1 = a < b;
+  int cmp2 = a < c;
+  return cmp1 & cmp2;
+}
+
+int max_test (long a, long b, long c) {
+  int cmp1 = a > b;
+  int cmp2 = a > c;
+  return cmp1 & cmp2;
+}
+
+/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump "MAX_EXPR" 1 "optimized" } } */
-- 
1.9.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-10-01  5:40                             ` Michael Collison
@ 2015-10-01  6:42                               ` Marc Glisse
  0 siblings, 0 replies; 29+ messages in thread
From: Marc Glisse @ 2015-10-01  6:42 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc-patches, Richard Biener, Jeff Law

On Wed, 30 Sep 2015, Michael Collison wrote:

> Richard and Marc,
>
> Latest patch attached which incorporates all comments.
>
> 2015-09-30  Michael Collison <michael.collison@linaro.org>
>        Andrew Pinski <andrew.pinski@caviumnetworks.com>
>
>    * match.pd ((x < y) && (x < z) -> x < min (y,z),
>    (x > y) and (x > z) -> x > max (y,z))
>    * testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.

You are still missing at least the indentation.

+/* { dg-final { scan-tree-dump "MIN_EXPR" 1 "optimized" } } */

I believe it is only scan-tree-dump-times that takes a number. 
scan-tree-dump seems to have only 2 arguments.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-09-30 21:12                           ` Marc Glisse
  2015-10-01  5:40                             ` Michael Collison
@ 2015-10-01  7:59                             ` Michael Collison
  2015-10-01  8:05                               ` Marc Glisse
  1 sibling, 1 reply; 29+ messages in thread
From: Michael Collison @ 2015-10-01  7:59 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener, marc.glisse, Jeff Law

[-- Attachment #1: Type: text/plain, Size: 990 bytes --]


ChangeLog formatting and test case fixed.

On 09/30/2015 12:30 PM, Marc Glisse wrote:
>>>>>> On Fri, 18 Sep 2015, Marc Glisse wrote:
>>>>>>>> +(bit_and:c (op @0 @1) (op @0 @2))
>>>>>>>
>>>>>>> :c seems useless here. On the other hand, it might make sense to 
>>>>>>> use op:s
>>>>>>> since this is mostly useful if it removes the 2 original 
>>>>>>> comparisons.
>
> As I was saying, :c is useless.
> (x:c y z)
> is replaced by two copies of the transformation, one with
> (x y z)
> and the other with
> (x z y)
> In your transformation, both versions would be equivalent, so the second
> one is redundant.
>
> Also, if you have:
> a=x<y;
> b=x<z;
> c=a&b;
> reuse(a);
> reuse(b);
>
> (i.e. the comparison results are used for more than just this bit_and)
> then your transformation may make the code more expensive. To avoid
> this, you can write op:s, meaning that the result of op is used only
> once.
>

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org


[-- Attachment #2: ChangeLog.txt --]
[-- Type: text/plain, Size: 261 bytes --]

2015-09-30  Michael Collison  <michael.collison@linaro.org>
	    Andrew Pinski <andrew.pinski@caviumnetworks.com>

	* match.pd ((x < y) && (x < z) -> x < min (y,z),
	(x > y) and (x > z) -> x > max (y,z))
	* testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.

[-- Attachment #3: tcwg-140-v8.patch --]
[-- Type: text/x-patch, Size: 1291 bytes --]

diff --git a/gcc/match.pd b/gcc/match.pd
index bd5c267..ef2e025 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2311,3 +2311,13 @@ along with GCC; see the file COPYING3.  If not see
     (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
      (convert (bit_and (op (convert:utype @0) (convert:utype @1))
 	       (convert:utype @4))))))))
+
+/* Transform (@0 < @1 and @0 < @2) to use min, 
+   (@0 > @1 and @0 > @2) to use max */
+(for op (lt le gt ge)
+     ext (min min max max)
+(simplify
+(bit_and (op:s @0 @1) (op:s @0 @2))
+(if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+(op @0 (ext @1 @2)))))
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
new file mode 100644
index 0000000..2e4300c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int min_test(long a, long b, long c) {
+  int cmp1 = a < b;
+  int cmp2 = a < c;
+  return cmp1 & cmp2;
+}
+
+int max_test (long a, long b, long c) {
+  int cmp1 = a > b;
+  int cmp2 = a > c;
+  return cmp1 & cmp2;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
-- 
1.9.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-10-01  7:59                             ` Michael Collison
@ 2015-10-01  8:05                               ` Marc Glisse
  2015-10-01  8:21                                 ` Michael Collison
  0 siblings, 1 reply; 29+ messages in thread
From: Marc Glisse @ 2015-10-01  8:05 UTC (permalink / raw)
  To: Michael Collison; +Cc: gcc-patches, Richard Biener, Jeff Law

On Thu, 1 Oct 2015, Michael Collison wrote:

> ChangeLog formatting and test case fixed.

Oups, sorry for the lack of precision, but I meant indenting the code in 
match.pd, I hadn't even looked at the ChangeLog.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-10-01  8:05                               ` Marc Glisse
@ 2015-10-01  8:21                                 ` Michael Collison
  2015-10-06  9:21                                   ` Richard Biener
  0 siblings, 1 reply; 29+ messages in thread
From: Michael Collison @ 2015-10-01  8:21 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Biener, Marc Glisse, Jeff Law

[-- Attachment #1: Type: text/plain, Size: 421 bytes --]

Marc,

Ah I did misunderstand you. Patch with match.pd formatting fix.

On 10/01/2015 01:05 AM, Marc Glisse wrote:
> On Thu, 1 Oct 2015, Michael Collison wrote:
>
>> ChangeLog formatting and test case fixed.
>
> Oups, sorry for the lack of precision, but I meant indenting the code 
> in match.pd, I hadn't even looked at the ChangeLog.
>

-- 
Michael Collison
Linaro Toolchain Working Group
michael.collison@linaro.org


[-- Attachment #2: ChangeLog.txt --]
[-- Type: text/plain, Size: 261 bytes --]

2015-09-30  Michael Collison  <michael.collison@linaro.org>
	    Andrew Pinski <andrew.pinski@caviumnetworks.com>

	* match.pd ((x < y) && (x < z) -> x < min (y,z),
	(x > y) and (x > z) -> x > max (y,z))
	* testsuite/gcc.dg/tree-ssa/minmax-loopend.c: New test.

[-- Attachment #3: tcwg-140-v9.patch --]
[-- Type: text/x-patch, Size: 1299 bytes --]

diff --git a/gcc/match.pd b/gcc/match.pd
index bd5c267..caf3c82 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2311,3 +2311,13 @@ along with GCC; see the file COPYING3.  If not see
     (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
      (convert (bit_and (op (convert:utype @0) (convert:utype @1))
 	       (convert:utype @4))))))))
+
+/* Transform (@0 < @1 and @0 < @2) to use min, 
+   (@0 > @1 and @0 > @2) to use max */
+(for op (lt le gt ge)
+     ext (min min max max)
+ (simplify
+  (bit_and (op:s @0 @1) (op:s @0 @2))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)))
+   (op @0 (ext @1 @2)))))
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
new file mode 100644
index 0000000..2e4300c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/minmax-loopend.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int min_test(long a, long b, long c) {
+  int cmp1 = a < b;
+  int cmp2 = a < c;
+  return cmp1 & cmp2;
+}
+
+int max_test (long a, long b, long c) {
+  int cmp1 = a > b;
+  int cmp2 = a > c;
+  return cmp1 & cmp2;
+}
+
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "optimized" } } */
-- 
1.9.1


^ permalink raw reply	[flat|nested] 29+ messages in thread

* Re: [PATCH] Optimize certain end of loop conditions into min/max operation
  2015-10-01  8:21                                 ` Michael Collison
@ 2015-10-06  9:21                                   ` Richard Biener
  0 siblings, 0 replies; 29+ messages in thread
From: Richard Biener @ 2015-10-06  9:21 UTC (permalink / raw)
  To: Michael Collison; +Cc: GCC Patches, Marc Glisse, Jeff Law

On Thu, Oct 1, 2015 at 10:20 AM, Michael Collison
<michael.collison@linaro.org> wrote:
> Marc,
>
> Ah I did misunderstand you. Patch with match.pd formatting fix.

Ok if bootstrap/regtest passes.

Thanks,
Richard.

> On 10/01/2015 01:05 AM, Marc Glisse wrote:
>>
>> On Thu, 1 Oct 2015, Michael Collison wrote:
>>
>>> ChangeLog formatting and test case fixed.
>>
>>
>> Oups, sorry for the lack of precision, but I meant indenting the code in
>> match.pd, I hadn't even looked at the ChangeLog.
>>
>
> --
> Michael Collison
> Linaro Toolchain Working Group
> michael.collison@linaro.org
>

^ permalink raw reply	[flat|nested] 29+ messages in thread

end of thread, other threads:[~2015-10-06  9:21 UTC | newest]

Thread overview: 29+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-07-27  4:23 [PATCH] Optimize certain end of loop conditions into min/max operation Michael Collison
2015-07-27  7:36 ` Bin.Cheng
2015-07-27  8:11   ` Bin.Cheng
2015-07-27  9:47 ` Richard Biener
2015-07-27 16:25   ` Jeff Law
2015-07-28  7:59     ` Richard Biener
2015-07-29 22:38       ` Michael Collison
2015-07-31 16:40       ` Jeff Law
2015-07-31 18:48         ` Michael Collison
2015-07-31 19:10           ` Jeff Law
2015-08-03  7:34             ` Richard Biener
2015-09-18  7:00             ` Michael Collison
2015-09-18  7:38               ` Marc Glisse
2015-09-18  7:41                 ` Marc Glisse
2015-09-18 10:06                   ` Richard Biener
2015-09-30  8:08                     ` Michael Collison
2015-09-30  9:10                       ` Richard Biener
2015-09-30 16:51                         ` Michael Collison
2015-09-30 21:12                           ` Marc Glisse
2015-10-01  5:40                             ` Michael Collison
2015-10-01  6:42                               ` Marc Glisse
2015-10-01  7:59                             ` Michael Collison
2015-10-01  8:05                               ` Marc Glisse
2015-10-01  8:21                                 ` Michael Collison
2015-10-06  9:21                                   ` Richard Biener
2015-09-18 22:01                 ` Michael Collison
2015-09-18 22:09                   ` Marc Glisse
2015-08-05 16:08   ` Alan Lawrence
2015-08-05 16:15     ` Jeff Law

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).