* [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
@ 2011-04-20 15:27 Kai Tietz
2011-04-20 15:48 ` Richard Henderson
2011-04-20 15:51 ` Jakub Jelinek
0 siblings, 2 replies; 9+ messages in thread
From: Kai Tietz @ 2011-04-20 15:27 UTC (permalink / raw)
To: GCC Patches; +Cc: Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 636 bytes --]
Hello,
well, somebody else might earn the bonus points ... But this patch adds a
missing tree-level optimization, implemented in fold-const.
ChangeLog gcc/
2011-04-20 Kai Tietz
* fold-const.c (fold_binary_loc): Add handling for
(X & ~Y) | (~X & Y) and (X && !Y) || (!X && Y) optimization
to (X ^ Y).
ChangeLog gcc/testsuite
2011-04-20 Kai Tietz
* gcc.dg/binop-xor1.c: New test.
* gcc.dg/binop-xor2.c: New test.
* gcc.dg/binop-xor3.c: New test.
* gcc.dg/binop-xor4.c: New test.
* gcc.dg/binop-xor5.c: New test.
Tested for i686-w64-mingw32, x86_64-w64-mingw32, and
x86_64-pc-linux-gnu (multilib). OK to apply?
Regards,
Kai
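The two identities the patch relies on can be checked by brute force. The following is a minimal sketch, not part of the patch; the function names are made up for illustration. It compares both long forms against XOR over all pairs of 8-bit values. The logical form only agrees with XOR on truth values, so the operands are normalized with `!!` first.

```c
#include <assert.h>
#include <limits.h>

/* The bitwise form the patch recognizes: (X & ~Y) | (~X & Y).  */
unsigned
xor_bitwise_long (unsigned x, unsigned y)
{
  return (x & ~y) | (~x & y);
}

/* The logical form: (X && !Y) || (!X && Y).  The result is 0 or 1.  */
int
xor_logical_long (int x, int y)
{
  return (x && !y) || (!x && y);
}

/* Assert that both long forms agree with XOR for every pair of
   8-bit operand values.  */
void
check_xor_identities (void)
{
  for (unsigned x = 0; x <= UCHAR_MAX; x++)
    for (unsigned y = 0; y <= UCHAR_MAX; y++)
      {
        assert (xor_bitwise_long (x, y) == (x ^ y));
        /* Normalize to truth values before comparing with ^.  */
        assert (xor_logical_long ((int) x, (int) y) == (!!x ^ !!y));
      }
}
```

Calling `check_xor_identities ()` from a test driver completes silently if both identities hold on the sampled range.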
[-- Attachment #2: opt_xor.txt --]
[-- Type: text/plain, Size: 5939 bytes --]
Index: gcc/gcc/fold-const.c
===================================================================
--- gcc.orig/gcc/fold-const.c 2011-04-20 17:10:39.478091900 +0200
+++ gcc/gcc/fold-const.c 2011-04-20 17:11:22.901039400 +0200
@@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc,
&& reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));
+ /* (X & ~Y) | (~X & Y) is X ^ Y */
+ if (TREE_CODE (arg0) == BIT_AND_EXPR
+ && TREE_CODE (arg1) == BIT_AND_EXPR)
+ {
+ tree a0, a1, l0, l1, n0, n1;
+
+ a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
+ a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
+
+ l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
+ l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
+
+ n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
+ n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
+
+ if ((operand_equal_p (n0, a0, 0)
+ && operand_equal_p (n1, a1, 0))
+ || (operand_equal_p (n0, a1, 0)
+ && operand_equal_p (n1, a0, 0)))
+ return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
+ }
+
t1 = distribute_bit_expr (loc, code, type, arg0, arg1);
if (t1 != NULL_TREE)
return t1;
@@ -12039,6 +12061,28 @@ fold_binary_loc (location_t loc,
&& operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
return omit_one_operand_loc (loc, type, integer_one_node, arg0);
+ /* (X && !Y) || (!X && Y) is X ^ Y */
+ if (TREE_CODE (arg0) == TREE_CODE (arg1)
+ && (TREE_CODE (arg1) == TRUTH_AND_EXPR
+ || TREE_CODE (arg1) == TRUTH_ANDIF_EXPR))
+ {
+ tree a0, a1, l0, l1, n0, n1;
+
+ a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
+ a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
+
+ l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
+ l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
+
+ n0 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l0);
+ n1 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l1);
+
+ if ((operand_equal_p (n0, a0, 0)
+ && operand_equal_p (n1, a1, 0))
+ || (operand_equal_p (n0, a1, 0)
+ && operand_equal_p (n1, a0, 0)))
+ return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
+ }
goto truth_andor;
case TRUTH_XOR_EXPR:
Index: gcc/gcc/testsuite/gcc.dg/binop-xor1.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-xor1.c 2011-04-20 17:11:22.905039900 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+ return ((a && !b && c) || (!a && b && c));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+ it in the real test. */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 5 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-xor2.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-xor2.c 2011-04-20 17:11:22.908540300 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b)
+{
+ return ((a & ~b) | (~a & b));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+ it in the real test. */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-xor3.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-xor3.c 2011-04-20 17:11:22.911040600 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b)
+{
+ return ((a && !b) || (!a && b));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+ it in the real test. */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-xor4.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-xor4.c 2011-04-20 17:11:22.913541000 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+ return ((a & ~b) | (~a & b)) & c;
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+ it in the real test. */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-xor5.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-xor5.c 2011-04-20 17:11:22.916541300 +0200
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+ return ((a & ~b & c) | (~a & b & c));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+ it in the real test. */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\&" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
* Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
2011-04-20 15:27 [patch middle-end]: Missed optimization for (x & ~y) | (~x & y) Kai Tietz
@ 2011-04-20 15:48 ` Richard Henderson
2011-04-20 16:10 ` Kai Tietz
2011-04-20 15:51 ` Jakub Jelinek
1 sibling, 1 reply; 9+ messages in thread
From: Richard Henderson @ 2011-04-20 15:48 UTC (permalink / raw)
To: Kai Tietz; +Cc: GCC Patches, Jakub Jelinek
On 04/20/2011 08:22 AM, Kai Tietz wrote:
> + if (TREE_CODE (arg0) == BIT_AND_EXPR
> + && TREE_CODE (arg1) == BIT_AND_EXPR)
> + {
> + tree a0, a1, l0, l1, n0, n1;
> +
> + a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
> + a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
> +
> + l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
> + l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
> +
> + n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
> + n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
> +
> + if ((operand_equal_p (n0, a0, 0)
> + && operand_equal_p (n1, a1, 0))
> + || (operand_equal_p (n0, a1, 0)
> + && operand_equal_p (n1, a0, 0)))
> + return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
First, you typoed BIT_XOR_EXPR in this first block.
Second, I don't see how you're arbitrarily choosing L0 and N1 in the
expansion. If you write the expression the other way around,
(~x & y) | (x & ~y)
don't you wind up with
(~x ^ ~y)
? Or do the extra NOT expressions get folded away anyway?
> + if (TREE_CODE (arg0) == TREE_CODE (arg1)
> + && (TREE_CODE (arg1) == TRUTH_AND_EXPR
> + || TREE_CODE (arg1) == TRUTH_ANDIF_EXPR))
I don't believe you want to apply this transformation with ANDIF.
r~
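Richard does not spell out his objection to ANDIF in the thread. One plausible concern, shown here only as an assumption, is that the short-circuit form controls how often an operand is evaluated, while an XOR evaluates each operand exactly once. The sketch below uses a hypothetical side-effecting operand `next_flag` to count evaluations in both forms.

```c
#include <assert.h>

static int calls;   /* how many times next_flag () has run */
static int flag;    /* the value next_flag () returns */

/* Hypothetical operand with a side effect: it counts its evaluations.  */
int
next_flag (void)
{
  calls++;
  return flag;
}

/* Short-circuit form as it could appear in source.  X is a call, so it
   may be evaluated once or twice depending on the operand values.  */
int
xor_short_circuit (int y)
{
  return (next_flag () && !y) || (!next_flag () && y);
}

/* Folded form: X is evaluated exactly once.  */
int
xor_folded (int y)
{
  return !!next_flag () ^ !!y;
}

/* Run FN with X as the value of the side-effecting operand and return
   the number of times that operand was evaluated.  */
int
count_calls (int (*fn) (int), int x, int y)
{
  flag = x;
  calls = 0;
  (void) fn (y);
  return calls;
}

/* Both forms compute the same value, but not the same side effects.  */
void
check_evaluation_counts (void)
{
  assert (count_calls (xor_short_circuit, 0, 1) == 2);
  assert (count_calls (xor_folded, 0, 1) == 1);
}
```

The values agree in every case; only the number of evaluations differs, which is why a rewrite of the short-circuit form needs more care than the plain TRUTH_AND_EXPR case.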
* Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
2011-04-20 15:27 [patch middle-end]: Missed optimization for (x & ~y) | (~x & y) Kai Tietz
2011-04-20 15:48 ` Richard Henderson
@ 2011-04-20 15:51 ` Jakub Jelinek
2011-04-20 16:18 ` Kai Tietz
1 sibling, 1 reply; 9+ messages in thread
From: Jakub Jelinek @ 2011-04-20 15:51 UTC (permalink / raw)
To: Kai Tietz; +Cc: GCC Patches
On Wed, Apr 20, 2011 at 05:22:31PM +0200, Kai Tietz wrote:
> --- gcc.orig/gcc/fold-const.c 2011-04-20 17:10:39.478091900 +0200
> +++ gcc/gcc/fold-const.c 2011-04-20 17:11:22.901039400 +0200
> @@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc,
> && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
> return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));
>
> + /* (X & ~Y) | (~X & Y) is X ^ Y */
> + if (TREE_CODE (arg0) == BIT_AND_EXPR
> + && TREE_CODE (arg1) == BIT_AND_EXPR)
> + {
> + tree a0, a1, l0, l1, n0, n1;
> +
> + a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
> + a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
> +
> + l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
> + l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
> +
> + n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
> + n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
> +
> + if ((operand_equal_p (n0, a0, 0)
> + && operand_equal_p (n1, a1, 0))
> + || (operand_equal_p (n0, a1, 0)
> + && operand_equal_p (n1, a0, 0)))
> + return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
> + }
> +
I must say I don't like first folding/building new trees, then testing
and then maybe optimizing, that is slow and creates unnecessary garbage
in the likely case the optimization can't do anything.
Wouldn't something like:
int arg0_not = TREE_CODE (TREE_OPERAND (arg0, 1)) == BIT_NOT_EXPR;
int arg1_not = TREE_CODE (TREE_OPERAND (arg1, 1)) == BIT_NOT_EXPR;
if (TREE_CODE (TREE_OPERAND (arg0, arg0_not)) == BIT_NOT_EXPR
&& TREE_CODE (TREE_OPERAND (arg1, arg1_not)) == BIT_NOT_EXPR
&& operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg0, arg0_not), 0),
TREE_OPERAND (arg1, 1 - arg1_not), 0)
&& operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg1, arg1_not), 0),
TREE_OPERAND (arg0, 1 - arg0_not), 0))
return fold_build2_loc (loc, TRUTH_XOR_EXPR, type,
fold_convert_loc (loc, type,
TREE_OPERAND (arg0, 1 - arg0_not)),
fold_convert_loc (loc, type,
TREE_OPERAND (arg1, 1 - arg1_not)));
work better?
Jakub
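Jakub's point is that the pattern can be recognized by inspecting the existing operands, so nothing is built unless the match succeeds. The sketch below illustrates that idea on a toy expression tree. It does not use GCC's tree API: `struct expr`, `expr_equal` (a stand-in for operand_equal_p) and `match_xor` are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

enum kind { VAR, NOT, AND, OR };

struct expr
{
  enum kind kind;
  int id;                      /* variable number, used when kind == VAR */
  const struct expr *op[2];    /* operands; op[1] is unused for NOT */
};

/* Structural equality, a toy stand-in for operand_equal_p.  */
int
expr_equal (const struct expr *a, const struct expr *b)
{
  if (a == b)
    return 1;
  if (!a || !b || a->kind != b->kind)
    return 0;
  if (a->kind == VAR)
    return a->id == b->id;
  if (!expr_equal (a->op[0], b->op[0]))
    return 0;
  return a->kind == NOT || expr_equal (a->op[1], b->op[1]);
}

/* If E is (X & ~Y) | (~X & Y), with the operands of each AND in either
   order, store X and Y and return 1.  Only existing nodes are
   inspected; nothing is allocated on the failure path.  */
int
match_xor (const struct expr *e, const struct expr **x,
           const struct expr **y)
{
  const struct expr *l, *r;
  int l_not, r_not;

  assert (e != NULL);
  if (e->kind != OR || e->op[0]->kind != AND || e->op[1]->kind != AND)
    return 0;
  l = e->op[0];
  r = e->op[1];

  /* Index of the negated operand in each AND, mirroring the
     arg0_not / arg1_not trick in the suggestion above.  */
  l_not = l->op[1]->kind == NOT;
  r_not = r->op[1]->kind == NOT;
  if (l->op[l_not]->kind != NOT || r->op[r_not]->kind != NOT)
    return 0;

  /* The operand under each NOT must equal the plain operand of the
     other AND.  */
  if (!expr_equal (l->op[l_not]->op[0], r->op[1 - r_not])
      || !expr_equal (r->op[r_not]->op[0], l->op[1 - l_not]))
    return 0;

  *x = l->op[1 - l_not];
  *y = r->op[1 - r_not];
  return 1;
}
```

A caller would build the XOR node only after `match_xor` returns 1, so a failed match leaves no garbage behind.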
* Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
2011-04-20 15:48 ` Richard Henderson
@ 2011-04-20 16:10 ` Kai Tietz
2011-04-20 16:16 ` Richard Henderson
0 siblings, 1 reply; 9+ messages in thread
From: Kai Tietz @ 2011-04-20 16:10 UTC (permalink / raw)
To: Richard Henderson; +Cc: GCC Patches, Jakub Jelinek
[-- Attachment #1: Type: text/plain, Size: 1853 bytes --]
2011/4/20 Richard Henderson <rth@redhat.com>:
> On 04/20/2011 08:22 AM, Kai Tietz wrote:
>> + if (TREE_CODE (arg0) == BIT_AND_EXPR
>> + && TREE_CODE (arg1) == BIT_AND_EXPR)
>> + {
>> + tree a0, a1, l0, l1, n0, n1;
>> +
>> + a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
>> + a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
>> +
>> + l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
>> + l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
>> +
>> + n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
>> + n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
>> +
>> + if ((operand_equal_p (n0, a0, 0)
>> + && operand_equal_p (n1, a1, 0))
>> + || (operand_equal_p (n0, a1, 0)
>> + && operand_equal_p (n1, a0, 0)))
>> + return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
>
> First, you typoed BIT_XOR_EXPR in this first block.
Duh, corrected.
> Second, I don't see how you're arbitrarily choosing L0 and N1 in the
> expansion. If you write the expression the other way around,
>
> (~x & y) | (x & ~y)
>
> don't you wind up with
>
> (~x ^ ~y)
>
> ? Or do the extra NOT expressions get folded away anyway?
No, I didn't wind up with that. First, ~X ^ ~Y yields the same result
as X ^ Y, and that is why I used the explicit folding here. Well, it
might be a bit slower, but it has the advantage that equivalent
transformations also compare equal when in doubt.
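Reading the patch, the result is built from l0 and n1, so for the swapped input (~x & y) | (x & ~y) the expression handed to the folder is ~x ^ ~y. The claim that this has the same value as x ^ y follows from ~x being x XORed with all ones: the two all-ones masks cancel. A minimal check of that claim, with an illustrative function name that is not from the patch:

```c
#include <assert.h>
#include <limits.h>

/* The form Richard asked about: both operands complemented.  */
unsigned
xor_of_complements (unsigned x, unsigned y)
{
  return ~x ^ ~y;
}

/* Assert that ~x ^ ~y equals x ^ y for all pairs of 8-bit values and
   for one full-width pair.  */
void
check_complements_cancel (void)
{
  for (unsigned x = 0; x <= UCHAR_MAX; x++)
    for (unsigned y = 0; y <= UCHAR_MAX; y++)
      assert (xor_of_complements (x, y) == (x ^ y));

  assert (xor_of_complements (UINT_MAX, 0u) == (UINT_MAX ^ 0u));
}
```

This only shows that the values agree; whether the folder actually strips the two NOT expressions is the separate question Richard raises.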
>> + if (TREE_CODE (arg0) == TREE_CODE (arg1)
>> + && (TREE_CODE (arg1) == TRUTH_AND_EXPR
>> + || TREE_CODE (arg1) == TRUTH_ANDIF_EXPR))
>
> I don't believe you want to apply this transformation with ANDIF.
Yes, it is superfluous. I removed it.
>
> r~
>
Adjusted patch attached.
Kai
[-- Attachment #2: opt_xor.txt --]
[-- Type: text/plain, Size: 5887 bytes --]
Index: gcc/gcc/fold-const.c
===================================================================
--- gcc.orig/gcc/fold-const.c 2011-04-20 17:10:39.478091900 +0200
+++ gcc/gcc/fold-const.c 2011-04-20 17:41:23.427677200 +0200
@@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc,
&& reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));
+ /* (X & ~Y) | (~X & Y) is X ^ Y */
+ if (TREE_CODE (arg0) == BIT_AND_EXPR
+ && TREE_CODE (arg1) == BIT_AND_EXPR)
+ {
+ tree a0, a1, l0, l1, n0, n1;
+
+ a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
+ a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
+
+ l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
+ l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
+
+ n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
+ n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
+
+ if ((operand_equal_p (n0, a0, 0)
+ && operand_equal_p (n1, a1, 0))
+ || (operand_equal_p (n0, a1, 0)
+ && operand_equal_p (n1, a0, 0)))
+ return fold_build2_loc (loc, BIT_XOR_EXPR, type, l0, n1);
+ }
+
t1 = distribute_bit_expr (loc, code, type, arg0, arg1);
if (t1 != NULL_TREE)
return t1;
@@ -12039,6 +12061,27 @@ fold_binary_loc (location_t loc,
&& operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
return omit_one_operand_loc (loc, type, integer_one_node, arg0);
+ /* (X && !Y) || (!X && Y) is X ^ Y */
+ if (TREE_CODE (arg0) == TREE_CODE (arg1)
+ && TREE_CODE (arg1) == TRUTH_AND_EXPR)
+ {
+ tree a0, a1, l0, l1, n0, n1;
+
+ a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
+ a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
+
+ l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
+ l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
+
+ n0 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l0);
+ n1 = fold_build1_loc (loc, TRUTH_NOT_EXPR, type, l1);
+
+ if ((operand_equal_p (n0, a0, 0)
+ && operand_equal_p (n1, a1, 0))
+ || (operand_equal_p (n0, a1, 0)
+ && operand_equal_p (n1, a0, 0)))
+ return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
+ }
goto truth_andor;
case TRUTH_XOR_EXPR:
Index: gcc/gcc/testsuite/gcc.dg/binop-xor1.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-xor1.c 2011-04-20 17:11:22.905039900 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+ return ((a && !b && c) || (!a && b && c));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+ it in the real test. */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 5 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-xor2.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-xor2.c 2011-04-20 17:11:22.908540300 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b)
+{
+ return ((a & ~b) | (~a & b));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+ it in the real test. */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-xor3.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-xor3.c 2011-04-20 17:11:22.911040600 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b)
+{
+ return ((a && !b) || (!a && b));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+ it in the real test. */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-xor4.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-xor4.c 2011-04-20 17:11:22.913541000 +0200
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+ return ((a & ~b) | (~a & b)) & c;
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+ it in the real test. */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
Index: gcc/gcc/testsuite/gcc.dg/binop-xor5.c
===================================================================
--- /dev/null 1970-01-01 00:00:00.000000000 +0000
+++ gcc/gcc/testsuite/gcc.dg/binop-xor5.c 2011-04-20 17:11:22.916541300 +0200
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int
+foo (int a, int b, int c)
+{
+ return ((a & ~b & c) | (~a & b & c));
+}
+
+/* We expect to see "<bb N>"; confirm that, so that we know to count
+ it in the real test. */
+/* { dg-final { scan-tree-dump-times "<bb\[^>\]*>" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\^" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "\&" 1 "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
* Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
2011-04-20 16:10 ` Kai Tietz
@ 2011-04-20 16:16 ` Richard Henderson
2011-04-20 17:08 ` Kai Tietz
0 siblings, 1 reply; 9+ messages in thread
From: Richard Henderson @ 2011-04-20 16:16 UTC (permalink / raw)
To: Kai Tietz; +Cc: GCC Patches, Jakub Jelinek
On 04/20/2011 08:50 AM, Kai Tietz wrote:
> + if (TREE_CODE (arg0) == TREE_CODE (arg1)
> + && TREE_CODE (arg1) == TRUTH_AND_EXPR)
Ok with these both explicitly testing TRUTH_AND_EXPR now.
r~
* Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
2011-04-20 15:51 ` Jakub Jelinek
@ 2011-04-20 16:18 ` Kai Tietz
2011-04-21 9:19 ` Richard Guenther
0 siblings, 1 reply; 9+ messages in thread
From: Kai Tietz @ 2011-04-20 16:18 UTC (permalink / raw)
To: Jakub Jelinek; +Cc: GCC Patches
2011/4/20 Jakub Jelinek <jakub@redhat.com>:
> On Wed, Apr 20, 2011 at 05:22:31PM +0200, Kai Tietz wrote:
>> --- gcc.orig/gcc/fold-const.c 2011-04-20 17:10:39.478091900 +0200
>> +++ gcc/gcc/fold-const.c 2011-04-20 17:11:22.901039400 +0200
>> @@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc,
>> && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
>> return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));
>>
>> + /* (X & ~Y) | (~X & Y) is X ^ Y */
>> + if (TREE_CODE (arg0) == BIT_AND_EXPR
>> + && TREE_CODE (arg1) == BIT_AND_EXPR)
>> + {
>> + tree a0, a1, l0, l1, n0, n1;
>> +
>> + a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
>> + a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
>> +
>> + l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
>> + l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
>> +
>> + n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
>> + n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
>> +
>> + if ((operand_equal_p (n0, a0, 0)
>> + && operand_equal_p (n1, a1, 0))
>> + || (operand_equal_p (n0, a1, 0)
>> + && operand_equal_p (n1, a0, 0)))
>> + return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
>> + }
>> +
>
> I must say I don't like first folding/building new trees, then testing
> and then maybe optimizing, that is slow and creates unnecessary garbage
> in the likely case the optimization can't do anything.
>
> Wouldn't something like:
> int arg0_not = TREE_CODE (TREE_OPERAND (arg0, 1)) == BIT_NOT_EXPR;
> int arg1_not = TREE_CODE (TREE_OPERAND (arg1, 1)) == BIT_NOT_EXPR;
> if (TREE_CODE (TREE_OPERAND (arg0, arg0_not)) == BIT_NOT_EXPR
> && TREE_CODE (TREE_OPERAND (arg1, arg1_not)) == BIT_NOT_EXPR
> && operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg0, arg0_not), 0),
> TREE_OPERAND (arg1, 1 - arg1_not), 0)
> && operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg1, arg1_not), 0),
> TREE_OPERAND (arg0, 1 - arg0_not), 0))
> return fold_build2_loc (loc, TRUTH_XOR_EXPR, type,
> fold_convert_loc (loc, type,
> TREE_OPERAND (arg0, 1 - arg0_not)),
> fold_convert_loc (loc, type,
> TREE_OPERAND (arg1, 1 - arg1_not)));
> work better?
>
> Jakub
>
Well, as a special case we could use that, but here we also have to
handle integer values, so I used fold to make sure I get the inverse.
Also, there might be some transformations that would otherwise not be
caught, like !(X || Y) == !X && !Y ...
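The transformation mentioned here is De Morgan's law: a purely syntactic matcher sees !(X || Y) and !X && !Y as different trees even though they compute the same value. A small sketch that checks the equivalence on a few sample values; the function names are illustrative only:

```c
#include <assert.h>

/* Left side of the identity: negation of an OR.  */
int
not_of_or (int x, int y)
{
  return !(x || y);
}

/* Right side of the identity: AND of the negations.  */
int
and_of_nots (int x, int y)
{
  return !x && !y;
}

/* Assert that both sides agree on every pair of sample values.  */
void
check_de_morgan (void)
{
  static const int samples[] = { 0, 1, -1, 2, 42 };
  const int n = (int) (sizeof samples / sizeof samples[0]);

  for (int i = 0; i < n; i++)
    for (int j = 0; j < n; j++)
      assert (not_of_or (samples[i], samples[j])
              == and_of_nots (samples[i], samples[j]));
}
```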
Regards,
Kai
--
| (\_/) This is Bunny. Copy and paste
| (='.'=) Bunny into your signature to help
| (")_(") him gain world domination
* Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
2011-04-20 16:16 ` Richard Henderson
@ 2011-04-20 17:08 ` Kai Tietz
2011-04-20 18:00 ` Kai Tietz
0 siblings, 1 reply; 9+ messages in thread
From: Kai Tietz @ 2011-04-20 17:08 UTC (permalink / raw)
To: Richard Henderson; +Cc: GCC Patches, Jakub Jelinek
2011/4/20 Richard Henderson <rth@redhat.com>:
> On 04/20/2011 08:50 AM, Kai Tietz wrote:
>> + if (TREE_CODE (arg0) == TREE_CODE (arg1)
>> + && TREE_CODE (arg1) == TRUTH_AND_EXPR)
>
> Ok with these both explicitly testing TRUTH_AND_EXPR now.
>
>
> r~
>
Committed at revision 172776 with explicit testing for TRUTH_AND_EXPR.
Kai
* Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
2011-04-20 17:08 ` Kai Tietz
@ 2011-04-20 18:00 ` Kai Tietz
0 siblings, 0 replies; 9+ messages in thread
From: Kai Tietz @ 2011-04-20 18:00 UTC (permalink / raw)
To: Richard Henderson; +Cc: GCC Patches, Jakub Jelinek
2011/4/20 Kai Tietz <ktietz70@googlemail.com>:
> 2011/4/20 Richard Henderson <rth@redhat.com>:
>> On 04/20/2011 08:50 AM, Kai Tietz wrote:
>>> + if (TREE_CODE (arg0) == TREE_CODE (arg1)
>>> + && TREE_CODE (arg1) == TRUTH_AND_EXPR)
>>
>> Ok with these both explicitly testing TRUTH_AND_EXPR now.
>>
>>
>> r~
>>
>
> Committed at revision 172776 with explicit testing for TRUTH_AND_EXPR.
>
> Kai
Fixed the backslash encoding issue in the testcases at revision 172781.
Committed as obvious.
Kai
* Re: [patch middle-end]: Missed optimization for (x & ~y) | (~x & y)
2011-04-20 16:18 ` Kai Tietz
@ 2011-04-21 9:19 ` Richard Guenther
0 siblings, 0 replies; 9+ messages in thread
From: Richard Guenther @ 2011-04-21 9:19 UTC (permalink / raw)
To: Kai Tietz; +Cc: Jakub Jelinek, GCC Patches
On Wed, Apr 20, 2011 at 5:58 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> 2011/4/20 Jakub Jelinek <jakub@redhat.com>:
>> On Wed, Apr 20, 2011 at 05:22:31PM +0200, Kai Tietz wrote:
>>> --- gcc.orig/gcc/fold-const.c 2011-04-20 17:10:39.478091900 +0200
>>> +++ gcc/gcc/fold-const.c 2011-04-20 17:11:22.901039400 +0200
>>> @@ -10660,6 +10660,28 @@ fold_binary_loc (location_t loc,
>>> && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
>>> return omit_one_operand_loc (loc, type, arg0, TREE_OPERAND (arg1, 0));
>>>
>>> + /* (X & ~Y) | (~X & Y) is X ^ Y */
>>> + if (TREE_CODE (arg0) == BIT_AND_EXPR
>>> + && TREE_CODE (arg1) == BIT_AND_EXPR)
>>> + {
>>> + tree a0, a1, l0, l1, n0, n1;
>>> +
>>> + a0 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
>>> + a1 = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
>>> +
>>> + l0 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
>>> + l1 = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
>>> +
>>> + n0 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l0);
>>> + n1 = fold_build1_loc (loc, BIT_NOT_EXPR, type, l1);
>>> +
>>> + if ((operand_equal_p (n0, a0, 0)
>>> + && operand_equal_p (n1, a1, 0))
>>> + || (operand_equal_p (n0, a1, 0)
>>> + && operand_equal_p (n1, a0, 0)))
>>> + return fold_build2_loc (loc, TRUTH_XOR_EXPR, type, l0, n1);
>>> + }
>>> +
>>
>> I must say I don't like first folding/building new trees, then testing
>> and then maybe optimizing, that is slow and creates unnecessary garbage
>> in the likely case the optimization can't do anything.
>>
>> Wouldn't something like:
>> int arg0_not = TREE_CODE (TREE_OPERAND (arg0, 1)) == BIT_NOT_EXPR;
>> int arg1_not = TREE_CODE (TREE_OPERAND (arg1, 1)) == BIT_NOT_EXPR;
>> if (TREE_CODE (TREE_OPERAND (arg0, arg0_not)) == BIT_NOT_EXPR
>> && TREE_CODE (TREE_OPERAND (arg1, arg1_not)) == BIT_NOT_EXPR
>> && operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg0, arg0_not), 0),
>> TREE_OPERAND (arg1, 1 - arg1_not), 0)
>> && operand_equal_p (TREE_OPERAND (TREE_OPERAND (arg1, arg1_not), 0),
>> TREE_OPERAND (arg0, 1 - arg0_not), 0))
>> return fold_build2_loc (loc, TRUTH_XOR_EXPR, type,
>> fold_convert_loc (loc, type,
>> TREE_OPERAND (arg0, 1 - arg0_not)),
>> fold_convert_loc (loc, type,
>> TREE_OPERAND (arg1, 1 - arg1_not)));
>> work better?
>>
>> Jakub
>>
>
> Well, as special case we could use that, but we have here also to
> handle integer-values, so I used fold to make sure I get inverse. Also
> there might be some transformations, which otherwise might be not
> caught, like !(X || Y) == !X && !Y ...
Btw, I agree with Jakub. Fold is supposed to not create any garbage
if a folding does not apply. So I don't like your patch either.
Richard.
> Regards,
> Kai
>
>
> --
> | (\_/) This is Bunny. Copy and paste
> | (='.'=) Bunny into your signature to help
> | (")_(") him gain world domination
>
Thread overview: 9+ messages
2011-04-20 15:27 [patch middle-end]: Missed optimization for (x & ~y) | (~x & y) Kai Tietz
2011-04-20 15:48 ` Richard Henderson
2011-04-20 16:10 ` Kai Tietz
2011-04-20 16:16 ` Richard Henderson
2011-04-20 17:08 ` Kai Tietz
2011-04-20 18:00 ` Kai Tietz
2011-04-20 15:51 ` Jakub Jelinek
2011-04-20 16:18 ` Kai Tietz
2011-04-21 9:19 ` Richard Guenther