match.pd: Optimize (x & y) ^ (x

public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed

* match.pd: Optimize (x & y) ^ (x | y)
@ 2015-06-11 11:08 Marek Polacek
  2015-06-11 11:09 ` Richard Biener
                   ` (2 more replies)
  0 siblings, 3 replies; 16+ messages in thread
From: Marek Polacek @ 2015-06-11 11:08 UTC (permalink / raw)
  To: GCC Patches, Richard Biener

This patch introduces a new pattern for the match-and-simplify
machinery.

I have verified this transformation on a toy testcase (tried x and y
in the range [-1000,1000]) and it does a correct thing for all integers.

The asm diff for fn1 is
-	andl	%esi, %eax
-	orl	%edi, %esi
so clearly a win.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2015-06-11  Marek Polacek  <polacek@redhat.com>

	* match.pd ((x & y) ^ (x | y) -> x ^ y): New pattern.

	* gcc.dg/fold-xor-3.c: New test.

diff --git gcc/match.pd gcc/match.pd
index 48358a8..7a7b201 100644
--- gcc/match.pd
+++ gcc/match.pd
@@ -320,6 +320,12 @@ along with GCC; see the file COPYING3.  If not see
   (bitop:c (rbitop:c @0 @1) (bit_not@2 @0))
   (bitop @1 @2)))
 
+/* (x & y) ^ (x | y) -> x ^ y */
+(simplify
+ (bit_xor:c (bit_and@2 @0 @1) (bit_ior@3 @0 @1))
+  (if (single_use (@2) && single_use (@3))
+   (bit_xor @0 @1)))
+
 (simplify
  (abs (negate @0))
  (abs @0))
diff --git gcc/testsuite/gcc.dg/fold-xor-3.c gcc/testsuite/gcc.dg/fold-xor-3.c
index e69de29..a66f89d 100644
--- gcc/testsuite/gcc.dg/fold-xor-3.c
+++ gcc/testsuite/gcc.dg/fold-xor-3.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-fdump-tree-original" } */
+
+int
+fn1 (signed int x, signed int y)
+{
+  return (x & y) ^ (x | y);
+}
+
+unsigned int
+fn2 (unsigned int x, unsigned int y)
+{
+  return (x & y) ^ (x | y);
+}
+
+int
+fn3 (signed int x, signed int y)
+{
+  return (x | y) ^ (x & y);
+}
+
+unsigned int
+fn4 (unsigned int x, unsigned int y)
+{
+  return (x | y) ^ (x & y);
+}
+
+/* { dg-final { scan-tree-dump-times "return x \\^ y;" 4 "original" } } */

	Marek

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-11 11:08 match.pd: Optimize (x & y) ^ (x | y) Marek Polacek
@ 2015-06-11 11:09 ` Richard Biener
  2015-06-11 12:14   ` Marek Polacek
  2015-06-11 11:18 ` Jakub Jelinek
  2015-06-11 15:26 ` Marc Glisse
  2 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2015-06-11 11:09 UTC (permalink / raw)
  To: Marek Polacek; +Cc: GCC Patches

On Thu, 11 Jun 2015, Marek Polacek wrote:

> This patch introduces a new pattern for the match-and-simplify
> machinery.
> 
> I have verified this transformation on a toy testcase (tried x and y
> in the range [-1000,1000]) and it does a correct thing for all integers.
> 
> The asm diff for fn1 is
> -	andl	%esi, %eax
> -	orl	%edi, %esi
> so clearly a win.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

If the pattern doesn't exist in fold-const.c (obviously it doesn't)
then I think it makes sense to only add testcases to match this on
GIMPLE (because in the long run we'd like to _not_ transform things
in GENERIC, at least not without -O).

Thus sth like

> +int
> +fn1 (signed int x, signed int y)
> +{
   signed int tem1 = x & y;
   singed int tem2 = x | y;
> +  return tem1 ^ tem2;
> +}

and scanning the first DCE dump (cddec1) for the applied transform.

Yes, the scanning is a little bit more complicated...

Otherwise the patch is of course ok.  (take the suggestion above
for followup patches)

Thanks,
Richard.

> 2015-06-11  Marek Polacek  <polacek@redhat.com>
> 
> 	* match.pd ((x & y) ^ (x | y) -> x ^ y): New pattern.
> 
> 	* gcc.dg/fold-xor-3.c: New test.
> 
> diff --git gcc/match.pd gcc/match.pd
> index 48358a8..7a7b201 100644
> --- gcc/match.pd
> +++ gcc/match.pd
> @@ -320,6 +320,12 @@ along with GCC; see the file COPYING3.  If not see
>    (bitop:c (rbitop:c @0 @1) (bit_not@2 @0))
>    (bitop @1 @2)))
>  
> +/* (x & y) ^ (x | y) -> x ^ y */
> +(simplify
> + (bit_xor:c (bit_and@2 @0 @1) (bit_ior@3 @0 @1))
> +  (if (single_use (@2) && single_use (@3))
> +   (bit_xor @0 @1)))
> +
>  (simplify
>   (abs (negate @0))
>   (abs @0))
> diff --git gcc/testsuite/gcc.dg/fold-xor-3.c gcc/testsuite/gcc.dg/fold-xor-3.c
> index e69de29..a66f89d 100644
> --- gcc/testsuite/gcc.dg/fold-xor-3.c
> +++ gcc/testsuite/gcc.dg/fold-xor-3.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-options "-fdump-tree-original" } */
> +
> +int
> +fn1 (signed int x, signed int y)
> +{
> +  return (x & y) ^ (x | y);
> +}
> +
> +unsigned int
> +fn2 (unsigned int x, unsigned int y)
> +{
> +  return (x & y) ^ (x | y);
> +}
> +
> +int
> +fn3 (signed int x, signed int y)
> +{
> +  return (x | y) ^ (x & y);
> +}
> +
> +unsigned int
> +fn4 (unsigned int x, unsigned int y)
> +{
> +  return (x | y) ^ (x & y);
> +}
> +
> +/* { dg-final { scan-tree-dump-times "return x \\^ y;" 4 "original" } } */
> 
> 	Marek
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Dilip Upmanyu, Graham Norton, HRB 21284 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-11 11:08 match.pd: Optimize (x & y) ^ (x | y) Marek Polacek
  2015-06-11 11:09 ` Richard Biener
@ 2015-06-11 11:18 ` Jakub Jelinek
  2015-06-11 12:07   ` Marek Polacek
  2015-06-11 15:26 ` Marc Glisse
  2 siblings, 1 reply; 16+ messages in thread
From: Jakub Jelinek @ 2015-06-11 11:18 UTC (permalink / raw)
  To: Marek Polacek; +Cc: GCC Patches, Richard Biener

On Thu, Jun 11, 2015 at 01:04:32PM +0200, Marek Polacek wrote:
> This patch introduces a new pattern for the match-and-simplify
> machinery.
> 
> I have verified this transformation on a toy testcase (tried x and y
> in the range [-1000,1000]) and it does a correct thing for all integers.
> 
> The asm diff for fn1 is
> -	andl	%esi, %eax
> -	orl	%edi, %esi
> so clearly a win.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?

What about some nop type conversions in between?
int
fn1 (unsigned int x, unsigned int y)
{
  int a = x;
  int b = y;
  unsigned int c = x & y;
  int d = a | b;
  return (int) (c ^ d);
}
?  Also wonder, if some testcases for match.pd shouldn't be
in separate statements so that they don't test the GENERIC folding,
but GIMPLE folding.

	Jakub

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-11 11:18 ` Jakub Jelinek
@ 2015-06-11 12:07   ` Marek Polacek
  2015-06-11 20:12     ` Marc Glisse
  0 siblings, 1 reply; 16+ messages in thread
From: Marek Polacek @ 2015-06-11 12:07 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: GCC Patches, Richard Biener

On Thu, Jun 11, 2015 at 01:09:05PM +0200, Jakub Jelinek wrote:
> What about some nop type conversions in between?
> int
> fn1 (unsigned int x, unsigned int y)
> {
>   int a = x;
>   int b = y;
>   unsigned int c = x & y;
>   int d = a | b;
>   return (int) (c ^ d);
> }
> ?  Also wonder, if some testcases for match.pd shouldn't be

It doesn't work then.  Adding some convert?s into the pattern didn't help
either.  Note that similar patterns don't use convert?s either, so I'm
inclined to keep the pattern as is was.

> in separate statements so that they don't test the GENERIC folding,
> but GIMPLE folding.

Sure - fixed in second version of the patch that I'm regtesting right now.

	Marek

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-11 11:09 ` Richard Biener
@ 2015-06-11 12:14   ` Marek Polacek
  0 siblings, 0 replies; 16+ messages in thread
From: Marek Polacek @ 2015-06-11 12:14 UTC (permalink / raw)
  To: Richard Biener; +Cc: GCC Patches

On Thu, Jun 11, 2015 at 01:08:41PM +0200, Richard Biener wrote:
> If the pattern doesn't exist in fold-const.c (obviously it doesn't)
> then I think it makes sense to only add testcases to match this on
> GIMPLE (because in the long run we'd like to _not_ transform things
> in GENERIC, at least not without -O).
> 
> Thus sth like
> 
> > +int
> > +fn1 (signed int x, signed int y)
> > +{
>    signed int tem1 = x & y;
>    singed int tem2 = x | y;
> > +  return tem1 ^ tem2;
> > +}
> 
> and scanning the first DCE dump (cddec1) for the applied transform.
 
Makes sense.  Fixed.

> Otherwise the patch is of course ok.  (take the suggestion above
> for followup patches)

I'll commit the following version if testing passes then, thanks.

2015-06-11  Marek Polacek  <polacek@redhat.com>

	* match.pd ((x & y) ^ (x | y) -> x ^ y): New pattern.

	* gcc.dg/fold-xor-3.c: New test.

diff --git gcc/match.pd gcc/match.pd
index 33fa717..9a1317e 100644
--- gcc/match.pd
+++ gcc/match.pd
@@ -320,6 +320,12 @@ along with GCC; see the file COPYING3.  If not see
   (bitop:c (rbitop:c @0 @1) (bit_not@2 @0))
   (bitop @1 @2)))
 
+/* (x & y) ^ (x | y) -> x ^ y */
+(simplify
+ (bit_xor:c (bit_and@2 @0 @1) (bit_ior@3 @0 @1))
+  (if (single_use (@2) && single_use (@3))
+   (bit_xor @0 @1)))
+
 (simplify
  (abs (negate @0))
  (abs @0))
diff --git gcc/testsuite/gcc.dg/fold-xor-3.c gcc/testsuite/gcc.dg/fold-xor-3.c
index e69de29..c2c0af6 100644
--- gcc/testsuite/gcc.dg/fold-xor-3.c
+++ gcc/testsuite/gcc.dg/fold-xor-3.c
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fdump-tree-cddce1" } */
+
+int
+fn1 (signed int x, signed int y)
+{
+  signed int tem1 = x & y;
+  signed int tem2 = x | y;
+  return tem1 ^ tem2;
+}
+
+unsigned int
+fn2 (unsigned int x, unsigned int y)
+{
+  unsigned int tem1 = x & y;
+  unsigned int tem2 = x | y;
+  return tem1 ^ tem2;
+}
+
+int
+fn3 (signed int x, signed int y)
+{
+  signed int tem1 = x & y;
+  signed int tem2 = x | y;
+  return tem2 ^ tem1;
+}
+
+unsigned int
+fn4 (unsigned int x, unsigned int y)
+{
+  unsigned int tem1 = x & y;
+  unsigned int tem2 = x | y;
+  return tem2 ^ tem1;
+}
+
+/* { dg-final { scan-tree-dump-not " & " "cddce1" } } */
+/* { dg-final { scan-tree-dump-not " \\| " "cddce1" } } */

	Marek

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-11 11:08 match.pd: Optimize (x & y) ^ (x | y) Marek Polacek
  2015-06-11 11:09 ` Richard Biener
  2015-06-11 11:18 ` Jakub Jelinek
@ 2015-06-11 15:26 ` Marc Glisse
  2015-06-11 16:12   ` Richard Biener
  2015-06-11 16:14   ` Marek Polacek
  2 siblings, 2 replies; 16+ messages in thread
From: Marc Glisse @ 2015-06-11 15:26 UTC (permalink / raw)
  To: Marek Polacek; +Cc: GCC Patches, Richard Biener

On Thu, 11 Jun 2015, Marek Polacek wrote:

> I have verified this transformation on a toy testcase (tried x and y
> in the range [-1000,1000]) and it does a correct thing for all integers.

Note that for pure bitop (only involving &|^), testing the range [0,1] is 
sufficient.

> +/* (x & y) ^ (x | y) -> x ^ y */
> +(simplify
> + (bit_xor:c (bit_and@2 @0 @1) (bit_ior@3 @0 @1))

Make either bit_and or bit_ior commutative? Or do we canonicalize in a way 
that makes it unnecessary?

> +  (if (single_use (@2) && single_use (@3))
> +   (bit_xor @0 @1)))

I don't think we should use single_use here. The result is never more 
complicated than the original. Sure, it might increase register pressure a 
bit in some cases, but we have not used that as a criterion for other 
simplifications in match.pd yet (LLVM does though).

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-11 15:26 ` Marc Glisse
@ 2015-06-11 16:12   ` Richard Biener
  2015-06-11 16:14   ` Marek Polacek
  1 sibling, 0 replies; 16+ messages in thread
From: Richard Biener @ 2015-06-11 16:12 UTC (permalink / raw)
  To: gcc-patches, Marc Glisse, Marek Polacek

On June 11, 2015 5:25:30 PM GMT+02:00, Marc Glisse <marc.glisse@inria.fr> wrote:
>On Thu, 11 Jun 2015, Marek Polacek wrote:
>
>> I have verified this transformation on a toy testcase (tried x and y
>> in the range [-1000,1000]) and it does a correct thing for all
>integers.
>
>Note that for pure bitop (only involving &|^), testing the range [0,1]
>is 
>sufficient.
>
>> +/* (x & y) ^ (x | y) -> x ^ y */
>> +(simplify
>> + (bit_xor:c (bit_and@2 @0 @1) (bit_ior@3 @0 @1))
>
>Make either bit_and or bit_ior commutative? Or do we canonicalize in a
>way 
>that makes it unnecessary?

Yes, operand canonicalization should make this unnecessary.

>> +  (if (single_use (@2) && single_use (@3))
>> +   (bit_xor @0 @1)))
>
>I don't think we should use single_use here. The result is never more 
>complicated than the original. Sure, it might increase register
>pressure a 
>bit in some cases, but we have not used that as a criterion for other 
>simplifications in match.pd yet (LLVM does though).

Hmm yes, you are right here.

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-11 15:26 ` Marc Glisse
  2015-06-11 16:12   ` Richard Biener
@ 2015-06-11 16:14   ` Marek Polacek
  2015-06-11 16:34     ` Marc Glisse
  1 sibling, 1 reply; 16+ messages in thread
From: Marek Polacek @ 2015-06-11 16:14 UTC (permalink / raw)
  To: Marc Glisse; +Cc: GCC Patches, Richard Biener

On Thu, Jun 11, 2015 at 05:25:30PM +0200, Marc Glisse wrote:
> On Thu, 11 Jun 2015, Marek Polacek wrote:
> 
> >I have verified this transformation on a toy testcase (tried x and y
> >in the range [-1000,1000]) and it does a correct thing for all integers.
> 
> Note that for pure bitop (only involving &|^), testing the range [0,1] is
> sufficient.
 
I'd feel safer when testing a wider range of integers ;).

> >+/* (x & y) ^ (x | y) -> x ^ y */
> >+(simplify
> >+ (bit_xor:c (bit_and@2 @0 @1) (bit_ior@3 @0 @1))
> 
> Make either bit_and or bit_ior commutative? Or do we canonicalize in a way
> that makes it unnecessary?

Correct: bit_and and bit_ior don't need :c here.  (But the bit_xor needs it.)
I've tried various ordering of x and y and all of them were optimized.
Arguably I should've put more tests into the testcase...

> >+  (if (single_use (@2) && single_use (@3))
> >+   (bit_xor @0 @1)))
> 
> I don't think we should use single_use here. The result is never more
> complicated than the original. Sure, it might increase register pressure a
> bit in some cases, but we have not used that as a criterion for other
> simplifications in match.pd yet (LLVM does though).

I don't have a strong preference here but we surely use single_use
in match.pd elsewhere.

	Marek

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-11 16:14   ` Marek Polacek
@ 2015-06-11 16:34     ` Marc Glisse
  2015-06-11 16:58       ` Marek Polacek
  0 siblings, 1 reply; 16+ messages in thread
From: Marc Glisse @ 2015-06-11 16:34 UTC (permalink / raw)
  To: Marek Polacek; +Cc: GCC Patches, Richard Biener

On Thu, 11 Jun 2015, Marek Polacek wrote:

>>> +  (if (single_use (@2) && single_use (@3))
>>> +   (bit_xor @0 @1)))
>>
>> I don't think we should use single_use here. The result is never more
>> complicated than the original. Sure, it might increase register pressure a
>> bit in some cases, but we have not used that as a criterion for other
>> simplifications in match.pd yet (LLVM does though).
>
> I don't have a strong preference here but we surely use single_use
> in match.pd elsewhere.

The criterion for single_use up to now has been whether we may end up with 
more operations after the transformation than before. Take:
(x & ~m) | (y & m) -> ((x ^ y) & m) ^ x

If (x & ~m) and (y & m) have other uses, we are going to compute them 
anyway, and the original is essentially a single bit_ior operation. After 
the transformation, we have 2 more operations. That's worse than we 
started with, so we don't do it.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-11 16:34     ` Marc Glisse
@ 2015-06-11 16:58       ` Marek Polacek
  0 siblings, 0 replies; 16+ messages in thread
From: Marek Polacek @ 2015-06-11 16:58 UTC (permalink / raw)
  To: Marc Glisse; +Cc: GCC Patches, Richard Biener

On Thu, Jun 11, 2015 at 06:21:11PM +0200, Marc Glisse wrote:
> On Thu, 11 Jun 2015, Marek Polacek wrote:
> 
> >>>+  (if (single_use (@2) && single_use (@3))
> >>>+   (bit_xor @0 @1)))
> >>
> >>I don't think we should use single_use here. The result is never more
> >>complicated than the original. Sure, it might increase register pressure a
> >>bit in some cases, but we have not used that as a criterion for other
> >>simplifications in match.pd yet (LLVM does though).
> >
> >I don't have a strong preference here but we surely use single_use
> >in match.pd elsewhere.
> 
> The criterion for single_use up to now has been whether we may end up with
> more operations after the transformation than before. Take:
> (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x
> 
> If (x & ~m) and (y & m) have other uses, we are going to compute them
> anyway, and the original is essentially a single bit_ior operation. After
> the transformation, we have 2 more operations. That's worse than we started
> with, so we don't do it.

Hmm, yeah.  And it was me who added that pattern ;).

So I'm going to apply the following as obvious to remove the single_uses in
my latest pattern, if testing passes.  Thanks,

2015-06-11  Marek Polacek  <polacek@redhat.com>

	* match.pd ((x & y) ^ (x | y)): Don't check for single_use.

diff --git gcc/match.pd gcc/match.pd
index 9a1317e..1ab2b1c 100644
--- gcc/match.pd
+++ gcc/match.pd
@@ -322,9 +322,8 @@ along with GCC; see the file COPYING3.  If not see
 
 /* (x & y) ^ (x | y) -> x ^ y */
 (simplify
- (bit_xor:c (bit_and@2 @0 @1) (bit_ior@3 @0 @1))
-  (if (single_use (@2) && single_use (@3))
-   (bit_xor @0 @1)))
+ (bit_xor:c (bit_and @0 @1) (bit_ior @0 @1))
+ (bit_xor @0 @1))
 
 (simplify
  (abs (negate @0))

	Marek

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-11 12:07   ` Marek Polacek
@ 2015-06-11 20:12     ` Marc Glisse
  2015-06-12  5:59       ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Marc Glisse @ 2015-06-11 20:12 UTC (permalink / raw)
  To: Marek Polacek; +Cc: Jakub Jelinek, GCC Patches, Richard Biener

On Thu, 11 Jun 2015, Marek Polacek wrote:

> On Thu, Jun 11, 2015 at 01:09:05PM +0200, Jakub Jelinek wrote:
>> What about some nop type conversions in between?
>> int
>> fn1 (unsigned int x, unsigned int y)
>> {
>>   int a = x;
>>   int b = y;
>>   unsigned int c = x & y;
>>   int d = a | b;
>>   return (int) (c ^ d);
>> }
>> ?  Also wonder, if some testcases for match.pd shouldn't be
>
> It doesn't work then.  Adding some convert?s into the pattern didn't help
> either.

Not judging at all whether it is desirable or not, but you might have hit 
the issue that when you want several convert?, you need to use the 
spelling convert1?, convert2?, and it stops there, while here you would 
probably want at least 4 (maybe 6?) for this case. You might be able to 
work around it with a user-defined predicate, but I keep getting errors 
like
generic-match.c:6655:16: error: redeclaration of Β‘tree_node* o20_pops [2]Β’

If you want to reproduce the error (this is probably not good as is, it is 
only provided as a reproducer)

(match (nopand @0 @1)
  (bit_and (convert1? @0) (convert2? @1))
  (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
       && tree_nop_conversion_p (type, TREE_TYPE (@1)))))
(match (nopior @0 @1)
  (bit_ior (convert1? @0) (convert2? @1))
  (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
       && tree_nop_conversion_p (type, TREE_TYPE (@1)))))
(simplify
  (bit_xor:c (convert1? (nopand@2 @0 @1))
             (convert2? (nopior@3 @0 @1)))
  (if (tree_nop_conversion_p (type, TREE_TYPE (@2))
       && tree_nop_conversion_p (type, TREE_TYPE (@3)))
   (bit_xor (convert @0) (convert @1))))


fold-const.c traditionally avoided the combinatorial explosion by using 
strip_nops.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-11 20:12     ` Marc Glisse
@ 2015-06-12  5:59       ` Richard Biener
  2015-06-12  7:22         ` Marc Glisse
  0 siblings, 1 reply; 16+ messages in thread
From: Richard Biener @ 2015-06-12  5:59 UTC (permalink / raw)
  To: Marc Glisse, Marek Polacek; +Cc: Jakub Jelinek, GCC Patches

On June 11, 2015 10:09:11 PM GMT+02:00, Marc Glisse <marc.glisse@inria.fr> wrote:
>On Thu, 11 Jun 2015, Marek Polacek wrote:
>
>> On Thu, Jun 11, 2015 at 01:09:05PM +0200, Jakub Jelinek wrote:
>>> What about some nop type conversions in between?
>>> int
>>> fn1 (unsigned int x, unsigned int y)
>>> {
>>>   int a = x;
>>>   int b = y;
>>>   unsigned int c = x & y;
>>>   int d = a | b;
>>>   return (int) (c ^ d);
>>> }
>>> ?  Also wonder, if some testcases for match.pd shouldn't be
>>
>> It doesn't work then.  Adding some convert?s into the pattern didn't
>help
>> either.
>
>Not judging at all whether it is desirable or not, but you might have
>hit 
>the issue that when you want several convert?, you need to use the 
>spelling convert1?, convert2?, and it stops there, while here you would
>
>probably want at least 4 (maybe 6?) for this case. You might be able to
>
>work around it with a user-defined predicate, but I keep getting errors
>
>like
>generic-match.c:6655:16: error: redeclaration of â€˜tree_node* o20_pops
>[2]â€™
>
>If you want to reproduce the error (this is probably not good as is, it
>is 
>only provided as a reproducer)
>
>(match (nopand @0 @1)
>  (bit_and (convert1? @0) (convert2? @1))
>  (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
>       && tree_nop_conversion_p (type, TREE_TYPE (@1)))))
>(match (nopior @0 @1)
>  (bit_ior (convert1? @0) (convert2? @1))
>  (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
>       && tree_nop_conversion_p (type, TREE_TYPE (@1)))))
>(simplify
>  (bit_xor:c (convert1? (nopand@2 @0 @1))
>             (convert2? (nopior@3 @0 @1)))
>  (if (tree_nop_conversion_p (type, TREE_TYPE (@2))
>       && tree_nop_conversion_p (type, TREE_TYPE (@3)))
>   (bit_xor (convert @0) (convert @1))))
>
>
>fold-const.c traditionally avoided the combinatorial explosion by using
>
>strip_nops.

Yeah.  We can probably special case conditional conversions in code generation instead of lowering it.  And then go the full way and special case nop conversions so you can avoid writing the predicate as well.

Richard.


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-12  5:59       ` Richard Biener
@ 2015-06-12  7:22         ` Marc Glisse
  2015-06-12  9:04           ` Marek Polacek
  0 siblings, 1 reply; 16+ messages in thread
From: Marc Glisse @ 2015-06-12  7:22 UTC (permalink / raw)
  To: Richard Biener; +Cc: Marek Polacek, Jakub Jelinek, GCC Patches

On Fri, 12 Jun 2015, Richard Biener wrote:

>> Not judging at all whether it is desirable or not, but you might have 
>> hit the issue that when you want several convert?, you need to use the 
>> spelling convert1?, convert2?, and it stops there, while here you would 
>> probably want at least 4 (maybe 6?) for this case. You might be able to 
>> work around it with a user-defined predicate, but I keep getting errors 
>> like
>> generic-match.c:6655:16: error: redeclaration of â€˜tree_node* o20_pops [2]â€™
>>
>> If you want to reproduce the error (this is probably not good as is, it 
>> is only provided as a reproducer)
>>
>> (match (nopand @0 @1)
>>  (bit_and (convert1? @0) (convert2? @1))
>>  (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
>>       && tree_nop_conversion_p (type, TREE_TYPE (@1)))))
>> (match (nopior @0 @1)
>>  (bit_ior (convert1? @0) (convert2? @1))
>>  (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
>>       && tree_nop_conversion_p (type, TREE_TYPE (@1)))))
>> (simplify
>>  (bit_xor:c (convert1? (nopand@2 @0 @1))
>>             (convert2? (nopior@3 @0 @1)))
>>  (if (tree_nop_conversion_p (type, TREE_TYPE (@2))
>>       && tree_nop_conversion_p (type, TREE_TYPE (@3)))
>>   (bit_xor (convert @0) (convert @1))))
>>
>>
>> fold-const.c traditionally avoided the combinatorial explosion by using
>> strip_nops.
>
> Yeah.  We can probably special case conditional conversions in code 
> generation instead of lowering it.  And then go the full way and special 
> case nop conversions so you can avoid writing the predicate as well.

Without special casing, I currently have:

(match (nopcvt @0)
  (convert? @0)
  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))))
(simplify
  (bit_xor:c (convert1? (bit_and@2 (nopcvt @0) (nopcvt @1)))
             (convert2? (bit_ior:c (nopcvt @0) (nopcvt @1))))
  (if (tree_nop_conversion_p (type, TREE_TYPE (@2)))
   (bit_xor (convert @0) (convert @1))))

which simplifies Jakub's testcase without exploding the size of *-match.c, 
but it is still not very satisfying.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-12  7:22         ` Marc Glisse
@ 2015-06-12  9:04           ` Marek Polacek
  2015-06-13 10:46             ` Marc Glisse
  0 siblings, 1 reply; 16+ messages in thread
From: Marek Polacek @ 2015-06-12  9:04 UTC (permalink / raw)
  To: Marc Glisse; +Cc: Richard Biener, Jakub Jelinek, GCC Patches

On Fri, Jun 12, 2015 at 08:44:45AM +0200, Marc Glisse wrote:
> On Fri, 12 Jun 2015, Richard Biener wrote:
> 
> >>Not judging at all whether it is desirable or not, but you might have
> >>hit the issue that when you want several convert?, you need to use the
> >>spelling convert1?, convert2?, and it stops there, while here you would
> >>probably want at least 4 (maybe 6?) for this case. You might be able to
> >>work around it with a user-defined predicate, but I keep getting errors
> >>like
> >>generic-match.c:6655:16: error: redeclaration of â€˜tree_node* o20_pops [2]â€™

I remember trying convert1?, convert2?, and so on.  But I gave up soon.

> >>If you want to reproduce the error (this is probably not good as is, it
> >>is only provided as a reproducer)
> >>
> >>(match (nopand @0 @1)
> >> (bit_and (convert1? @0) (convert2? @1))
> >> (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
> >>      && tree_nop_conversion_p (type, TREE_TYPE (@1)))))
> >>(match (nopior @0 @1)
> >> (bit_ior (convert1? @0) (convert2? @1))
> >> (if (tree_nop_conversion_p (type, TREE_TYPE (@0))
> >>      && tree_nop_conversion_p (type, TREE_TYPE (@1)))))
> >>(simplify
> >> (bit_xor:c (convert1? (nopand@2 @0 @1))
> >>            (convert2? (nopior@3 @0 @1)))
> >> (if (tree_nop_conversion_p (type, TREE_TYPE (@2))
> >>      && tree_nop_conversion_p (type, TREE_TYPE (@3)))
> >>  (bit_xor (convert @0) (convert @1))))
> >>
> >>
> >>fold-const.c traditionally avoided the combinatorial explosion by using
> >>strip_nops.
> >
> >Yeah.  We can probably special case conditional conversions in code
> >generation instead of lowering it.  And then go the full way and special
> >case nop conversions so you can avoid writing the predicate as well.
> 
> Without special casing, I currently have:
> 
> (match (nopcvt @0)
>  (convert? @0)
>  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))))
> (simplify
>  (bit_xor:c (convert1? (bit_and@2 (nopcvt @0) (nopcvt @1)))
>             (convert2? (bit_ior:c (nopcvt @0) (nopcvt @1))))
>  (if (tree_nop_conversion_p (type, TREE_TYPE (@2)))
>   (bit_xor (convert @0) (convert @1))))
> 
> which simplifies Jakub's testcase without exploding the size of *-match.c,
> but it is still not very satisfying.

Yeah, imagine if we'd have to change every pattern like that :-(.

	Marek

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-12  9:04           ` Marek Polacek
@ 2015-06-13 10:46             ` Marc Glisse
  2015-06-16 13:47               ` Richard Biener
  0 siblings, 1 reply; 16+ messages in thread
From: Marc Glisse @ 2015-06-13 10:46 UTC (permalink / raw)
  To: Marek Polacek; +Cc: Richard Biener, Jakub Jelinek, GCC Patches

On Fri, 12 Jun 2015, Marek Polacek wrote:

>>>> fold-const.c traditionally avoided the combinatorial explosion by using
>>>> strip_nops.
>>>
>>> Yeah.  We can probably special case conditional conversions in code
>>> generation instead of lowering it.  And then go the full way and special
>>> case nop conversions so you can avoid writing the predicate as well.
>>
>> Without special casing, I currently have:
>>
>> (match (nopcvt @0)
>>  (convert? @0)
>>  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))))
>> (simplify
>>  (bit_xor:c (convert1? (bit_and@2 (nopcvt @0) (nopcvt @1)))
>>             (convert2? (bit_ior:c (nopcvt @0) (nopcvt @1))))
>>  (if (tree_nop_conversion_p (type, TREE_TYPE (@2)))

This could use @0 instead of @2 to save 2 characters.

>>   (bit_xor (convert @0) (convert @1))))
>>
>> which simplifies Jakub's testcase without exploding the size of *-match.c,
>> but it is still not very satisfying.
>
> Yeah, imagine if we'd have to change every pattern like that :-(.

I am not sure that we will be able to completely avoid it. Nop conversions 
can be ignored in some places, but not everywhere, so we have to be 
explicit about it. We could make a nice short notation, say bit_and# to 
switch whether there is an implicit strip_nops or not, and pick the 
default that applies to most patterns. We would still need a way to access 
the unstripped version of @2 (@#2 maybe?). And we would still need the 
'convert' in the output pattern, unless we teach genmatch to add nop 
conversions in many places, which sounds complicated.

-- 
Marc Glisse

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: match.pd: Optimize (x & y) ^ (x | y)
  2015-06-13 10:46             ` Marc Glisse
@ 2015-06-16 13:47               ` Richard Biener
  0 siblings, 0 replies; 16+ messages in thread
From: Richard Biener @ 2015-06-16 13:47 UTC (permalink / raw)
  To: Marc Glisse; +Cc: Marek Polacek, Jakub Jelinek, GCC Patches

On Sat, 13 Jun 2015, Marc Glisse wrote:

> On Fri, 12 Jun 2015, Marek Polacek wrote:
> 
> > > > > fold-const.c traditionally avoided the combinatorial explosion by
> > > > > using
> > > > > strip_nops.
> > > > 
> > > > Yeah.  We can probably special case conditional conversions in code
> > > > generation instead of lowering it.  And then go the full way and special
> > > > case nop conversions so you can avoid writing the predicate as well.
> > > 
> > > Without special casing, I currently have:
> > > 
> > > (match (nopcvt @0)
> > >  (convert? @0)
> > >  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))))
> > > (simplify
> > >  (bit_xor:c (convert1? (bit_and@2 (nopcvt @0) (nopcvt @1)))
> > >             (convert2? (bit_ior:c (nopcvt @0) (nopcvt @1))))
> > >  (if (tree_nop_conversion_p (type, TREE_TYPE (@2)))
> 
> This could use @0 instead of @2 to save 2 characters.
> 
> > >   (bit_xor (convert @0) (convert @1))))
> > > 
> > > which simplifies Jakub's testcase without exploding the size of *-match.c,
> > > but it is still not very satisfying.
> > 
> > Yeah, imagine if we'd have to change every pattern like that :-(.

Indeed.  I don't like adding that (match (nopcvt)).  We should improve
on the IL instead.

> I am not sure that we will be able to completely avoid it. Nop conversions can
> be ignored in some places, but not everywhere, so we have to be explicit about
> it. We could make a nice short notation, say bit_and# to switch whether there
> is an implicit strip_nops or not, and pick the default that applies to most
> patterns. We would still need a way to access the unstripped version of @2
> (@#2 maybe?). And we would still need the 'convert' in the output pattern,
> unless we teach genmatch to add nop conversions in many places, which sounds
> complicated.

Yeah, I also think we want to be explicit, which means adding both
the conditional nop-convert and the output convert.  What we can do
is invent new [s]nop (according to STRIP_NOPS / STRIP_SIGN_NOPS)
codes that when used conditional (not sure if we want to support
unconditional use?) always act like if you were using nop1, nop2, nop3, 
... but special-case it in code-generation to magically add
a tree_nop_conversion_p check (somewhere).

I think we can't easily remove the exponential explosion in the number
of nodes in the decision tree though (but for a single leaf op or by
outlining to a function which is the effect of your (match (nopcvt))
approach - though in the end I'd like to have those "inlined")

Richard.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2015-06-16 13:45 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-06-11 11:08 match.pd: Optimize (x & y) ^ (x | y) Marek Polacek
2015-06-11 11:09 ` Richard Biener
2015-06-11 12:14   ` Marek Polacek
2015-06-11 11:18 ` Jakub Jelinek
2015-06-11 12:07   ` Marek Polacek
2015-06-11 20:12     ` Marc Glisse
2015-06-12  5:59       ` Richard Biener
2015-06-12  7:22         ` Marc Glisse
2015-06-12  9:04           ` Marek Polacek
2015-06-13 10:46             ` Marc Glisse
2015-06-16 13:47               ` Richard Biener
2015-06-11 15:26 ` Marc Glisse
2015-06-11 16:12   ` Richard Biener
2015-06-11 16:14   ` Marek Polacek
2015-06-11 16:34     ` Marc Glisse
2015-06-11 16:58       ` Marek Polacek

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).