public inbox for gcc-patches@gcc.gnu.org
* Multiply Optimization in match and Simplify
@ 2015-10-29  8:21 Hurugalawadi, Naveen
  2015-10-30 10:08 ` Richard Biener
  0 siblings, 1 reply; 6+ messages in thread
From: Hurugalawadi, Naveen @ 2015-10-29  8:21 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 1604 bytes --]

Hi,

Please find attached the patch that moves some multiply optimizations
from fold-const.c to match.pd (match and simplify).

Please review the patch and let me know if any modifications are required.

Tested the patch on X86.

Observing the following failures:

>> FAIL: gcc.dg/fold-plusmult.c scan-tree-dump-times original "<a> \\* 4" 2

Should the testcase be changed to suit the current pattern?

>> FAIL: gcc.dg/tree-ssa/vrp47.c scan-tree-dump-times vrp2 " & 1;" 0
>> FAIL: gcc.dg/tree-ssa/vrp59.c scan-tree-dump-not vrp1 " & 3;"

It's due to the following pattern; the pattern itself seems right:
Fold X & (X ^ Y) as X & ~Y
The test PASSes when the result comes out as ~X & Y for some of the
commuted operand combinations, which is wrong:
(bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1)))
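
To illustrate the intended identity (a standalone check, not part of
the patch):

  #include <assert.h>

  int main (void)
  {
    unsigned x = 1, y = 0;
    assert ((x & (x ^ y)) == (x & ~y));  /* the correct fold: 1 == 1 */
    assert ((~x & y) != (x & (x ^ y)));  /* the wrong variant: 0 != 1 */
    return 0;
  }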

Thanks,
Naveen

ChangeLog

2015-10-29  Naveen H.S  <Naveen.Hurugalawadi@caviumnetworks.com>

        * fold-const.c (fold_binary_loc): Remove A - B -> A + (-B) if B
        is easily negatable, as it's already present.
	Move x * -C into -x * C if x is easily negatable to match.pd.
	Move (A + A) * C -> A * 2 * C to match.pd.
	Move Fold (X ^ Y) & Y as ~X & Y to match.pd.
	Move Fold (X ^ Y) & X as ~Y & X to match.pd.
	Move Fold X & (X ^ Y) as X & ~Y to match.pd.
	Move Fold X & (Y ^ X) as ~Y & X to match.pd.

        * match.pd (bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1))):
	New simplifier.
	(mult:c (convert? negate_expr_p@0) INTEGER_CST@1): New simplifier.
	(mult:c (convert? (plus @0 @0)) (convert? @1)): New simplifier.
	(mult:c (convert? (plus @0 @0)) INTEGER_CST@1): New simplifier.

[-- Attachment #2: mult-opt.patch --]
[-- Type: text/x-patch; name="mult-opt.patch", Size: 5421 bytes --]

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index e8ff1de..0334b53 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -9695,19 +9695,6 @@ fold_binary_loc (location_t loc,
 	    }
 	}
 
-      /* A - B -> A + (-B) if B is easily negatable.  */
-      if (negate_expr_p (arg1)
-	  && !TYPE_OVERFLOW_SANITIZED (type)
-	  && ((FLOAT_TYPE_P (type)
-               /* Avoid this transformation if B is a positive REAL_CST.  */
-	       && (TREE_CODE (arg1) != REAL_CST
-		   ||  REAL_VALUE_NEGATIVE (TREE_REAL_CST (arg1))))
-	      || INTEGRAL_TYPE_P (type)))
-	return fold_build2_loc (loc, PLUS_EXPR, type,
-			    fold_convert_loc (loc, type, arg0),
-			    fold_convert_loc (loc, type,
-					      negate_expr (arg1)));
-
       /* Fold &a[i] - &a[j] to i-j.  */
       if (TREE_CODE (arg0) == ADDR_EXPR
 	  && TREE_CODE (TREE_OPERAND (arg0, 0)) == ARRAY_REF
@@ -9749,29 +9736,6 @@ fold_binary_loc (location_t loc,
     case MULT_EXPR:
       if (! FLOAT_TYPE_P (type))
 	{
-	  /* Transform x * -C into -x * C if x is easily negatable.  */
-	  if (TREE_CODE (arg1) == INTEGER_CST
-	      && tree_int_cst_sgn (arg1) == -1
-	      && negate_expr_p (arg0)
-	      && (tem = negate_expr (arg1)) != arg1
-	      && !TREE_OVERFLOW (tem))
-	    return fold_build2_loc (loc, MULT_EXPR, type,
-	    			fold_convert_loc (loc, type,
-						  negate_expr (arg0)),
-				tem);
-
-	  /* (A + A) * C -> A * 2 * C  */
-	  if (TREE_CODE (arg0) == PLUS_EXPR
-	      && TREE_CODE (arg1) == INTEGER_CST
-	      && operand_equal_p (TREE_OPERAND (arg0, 0),
-			          TREE_OPERAND (arg0, 1), 0))
-	    return fold_build2_loc (loc, MULT_EXPR, type,
-				omit_one_operand_loc (loc, type,
-						  TREE_OPERAND (arg0, 0),
-						  TREE_OPERAND (arg0, 1)),
-				fold_build2_loc (loc, MULT_EXPR, type,
-					     build_int_cst (type, 2) , arg1));
-
 	  /* ((T) (X /[ex] C)) * C cancels out if the conversion is
 	     sign-changing only.  */
 	  if (TREE_CODE (arg1) == INTEGER_CST
@@ -9961,45 +9925,6 @@ fold_binary_loc (location_t loc,
 				  build_zero_cst (TREE_TYPE (tem)));
 	}
 
-      /* Fold (X ^ Y) & Y as ~X & Y.  */
-      if (TREE_CODE (arg0) == BIT_XOR_EXPR
-	  && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg1));
-	}
-      /* Fold (X ^ Y) & X as ~Y & X.  */
-      if (TREE_CODE (arg0) == BIT_XOR_EXPR
-	  && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0)
-	  && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg1));
-	}
-      /* Fold X & (X ^ Y) as X & ~Y.  */
-      if (TREE_CODE (arg1) == BIT_XOR_EXPR
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_convert_loc (loc, type, arg0),
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem));
-	}
-      /* Fold X & (Y ^ X) as ~Y & X.  */
-      if (TREE_CODE (arg1) == BIT_XOR_EXPR
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
-	  && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg0));
-	}
-
       /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant
          multiple of 1 << CST.  */
       if (TREE_CODE (arg1) == INTEGER_CST)
diff --git a/gcc/match.pd b/gcc/match.pd
index 1d6dde1..6793f59 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -490,6 +490,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (wi::bit_not (@2) == @1)
   (bit_xor @0 @1)))
 
+/* Fold X & (X ^ Y) as X & ~Y.  */
+(simplify
+ (bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1)))
+  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+   (convert (bit_and @0 (bit_not @1)))))
+
 /* X % Y is smaller than Y.  */
 (for cmp (lt ge)
  (simplify
@@ -1600,12 +1606,31 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    (if (!TREE_OVERFLOW (tem) || !flag_trapping_math)
     (minus @0 { tem; })))))
 
+/* Convert X * -C into -X * C.  */
+(simplify
+ (mult:c (convert? negate_expr_p@0) INTEGER_CST@1)
+  (if (tree_int_cst_sgn (@1) == -1)
+   (with { tree tem = const_unop (NEGATE_EXPR, type, @1); }
+    (if (!TREE_OVERFLOW (tem) && wi::ne_p (tem, @1)
+	  && tree_nop_conversion_p (type, TREE_TYPE (@0)))
+     (mult (convert (negate @0)) @1)))))
+
 /* Convert x+x into x*2.0.  */
 (simplify
  (plus @0 @0)
  (if (SCALAR_FLOAT_TYPE_P (type))
   (mult @0 { build_real (type, dconst2); })))
 
+/* Convert (A + A) * C -> A * 2 * C.  */
+(simplify
+ (mult:c (convert? (plus @0 @0)) (convert? @1))
+  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
+   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)))))
+(simplify
+ (mult:c (convert? (plus @0 @0)) INTEGER_CST@1)
+  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
+   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)))))
+
 (simplify
  (minus integer_zerop @1)
  (negate @1))


* Re: Multiply Optimization in match and Simplify
  2015-10-29  8:21 Multiply Optimization in match and Simplify Hurugalawadi, Naveen
@ 2015-10-30 10:08 ` Richard Biener
  2015-10-30 10:44   ` Marc Glisse
  2015-11-03  5:13   ` Hurugalawadi, Naveen
  0 siblings, 2 replies; 6+ messages in thread
From: Richard Biener @ 2015-10-30 10:08 UTC (permalink / raw)
  To: Hurugalawadi, Naveen; +Cc: gcc-patches

On Thu, Oct 29, 2015 at 5:34 AM, Hurugalawadi, Naveen
<Naveen.Hurugalawadi@caviumnetworks.com> wrote:
> Hi,
>
> Please find attached the patch that moves some multiply optimizations
> from fold-const.c to match.pd (match and simplify).
>
> Please review the patch and let me know if any modifications are required.
>
> Tested the patch on X86.
>
> Observing the following failures:
>
>>> FAIL: gcc.dg/fold-plusmult.c scan-tree-dump-times original "<a> \\* 4" 2
>
> Should the testcase be changed to suit the current pattern?
>
>>> FAIL: gcc.dg/tree-ssa/vrp47.c scan-tree-dump-times vrp2 " & 1;" 0
>>> FAIL: gcc.dg/tree-ssa/vrp59.c scan-tree-dump-not vrp1 " & 3;"
>
> It's due to the following pattern; the pattern itself seems right:
> Fold X & (X ^ Y) as X & ~Y
> The test PASSes when the result comes out as ~X & Y for some of the
> commuted operand combinations, which is wrong:
> (bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1)))

Please do not drop A - B -> A + (-B) from fold-const as match.pd
doesn't implement all of fold-const.c negate_expr_p support.

+/* Fold X & (X ^ Y) as X & ~Y.  */
+(simplify
+ (bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1)))
+  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+   (convert (bit_and @0 (bit_not @1)))))

Ok, so the VRP regression is because we convert

  _8 = x_2(D) ^ 1;
  _4 = (_Bool) _8;
  _5 = (int) _4;

to

  _7 = ~x_2(D);
  _9 = _7 & 1;

via the new pattern and

    /* A truncation to an unsigned type (a zero-extension) should be
       canonicalized as bitwise and of a mask.  */
    (if (final_int && inter_int && inside_int
         && final_prec == inside_prec
         && final_prec > inter_prec
         && inter_unsignedp)
     (convert (bit_and @0 { wide_int_to_tree
                              (inside_type,
                               wi::mask (inter_prec, false,
                                         TYPE_PRECISION (inside_type))); })))

Previously VRP ended up with

  <bb 3>:
  _8 = x_2(D) ^ 1;

and now we have

  _7 = ~x_2(D);
  _9 = _7 & 1;

which is more expensive.  This means that we miss a

(bit_and (bit_not @0) INTEGER_CST@1)

-> (bit_xor @0 @1)

pattern that applies when VRP knows the range of x_2(D)
(all masked bits are known to be zero).
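
A sketch of what that could look like (untested, and assuming
get_nonzero_bits is populated by VRP at this point):

(simplify
 (bit_and (bit_not SSA_NAME@0) INTEGER_CST@1)
 (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
      && wi::bit_and_not (get_nonzero_bits (@0), @1) == 0)
  (bit_xor @0 @1)))

i.e. ~X & C becomes X ^ C whenever every bit of X outside the mask C
is known to be zero.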

+/* Convert X * -C into -X * C.  */
+(simplify
+ (mult:c (convert? negate_expr_p@0) INTEGER_CST@1)
+  (if (tree_int_cst_sgn (@1) == -1)
+   (with { tree tem = const_unop (NEGATE_EXPR, type, @1); }
+    (if (!TREE_OVERFLOW (tem) && wi::ne_p (tem, @1)
+         && tree_nop_conversion_p (type, TREE_TYPE (@0)))
+     (mult (convert (negate @0)) @1)))))

as said above match.pd negate_expr_p doesn't capture everything
fold-const.c does so moving the above isn't a good idea.

+/* Convert (A + A) * C -> A * 2 * C.  */
+(simplify
+ (mult:c (convert? (plus @0 @0)) (convert? @1))
+  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
+   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)))))
+(simplify
+ (mult:c (convert? (plus @0 @0)) INTEGER_CST@1)
+  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
+   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)))))

fold-const.c only handles constant C, so we only need the 2nd pattern.
Also the :c on the mult in that is not needed due to canonicalization rules.
Please build the result of the inner multiplication directly.
I think the fold-const.c code has an overflow issue when the outer
multiplication is signed and the inner addition unsigned.
(signed)((unsigned)INT_MAX + (unsigned)INT_MAX) * 2 is valid but
INT_MAX * 4 is not, as it overflows.  So I think we should _not_ allow
nop conversions here (it's fine if all ops are signed or unsigned).
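
Concretely (an illustration only, not part of the patch):

  #include <limits.h>

  int f (unsigned a)
  {
    /* No undefined behaviour even for a == INT_MAX: the unsigned
       addition wraps and the conversion is implementation-defined.  */
    return (int) (a + a) * 2;
  }

Folding this to (int) a * 4 would make f (INT_MAX) overflow in the
signed multiplication, which is undefined.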

Richard.



> Thanks,
> Naveen
>
> ChangeLog
>
> 2015-10-29  Naveen H.S  <Naveen.Hurugalawadi@caviumnetworks.com>
>
>         * fold-const.c (fold_binary_loc): Remove A - B -> A + (-B) if B
>         is easily negatable, as it's already present.
>         Move x * -C into -x * C if x is easily negatable to match.pd.
>         Move (A + A) * C -> A * 2 * C to match.pd.
>         Move Fold (X ^ Y) & Y as ~X & Y to match.pd.
>         Move Fold (X ^ Y) & X as ~Y & X to match.pd.
>         Move Fold X & (X ^ Y) as X & ~Y to match.pd.
>         Move Fold X & (Y ^ X) as ~Y & X to match.pd.
>
>         * match.pd (bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1))):
>         New simplifier.
>         (mult:c (convert? negate_expr_p@0) INTEGER_CST@1): New simplifier.
>         (mult:c (convert? (plus @0 @0)) (convert? @1)): New simplifier.
>         (mult:c (convert? (plus @0 @0)) INTEGER_CST@1): New simplifier.


* Re: Multiply Optimization in match and Simplify
  2015-10-30 10:08 ` Richard Biener
@ 2015-10-30 10:44   ` Marc Glisse
  2015-10-30 11:06     ` Richard Biener
  2015-11-03  5:13   ` Hurugalawadi, Naveen
  1 sibling, 1 reply; 6+ messages in thread
From: Marc Glisse @ 2015-10-30 10:44 UTC (permalink / raw)
  To: Richard Biener; +Cc: Hurugalawadi, Naveen, gcc-patches

On Fri, 30 Oct 2015, Richard Biener wrote:

> +/* Convert (A + A) * C -> A * 2 * C.  */
> +(simplify
> + (mult:c (convert? (plus @0 @0)) (convert? @1))
> +  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
> +   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)))))
> +(simplify
> + (mult:c (convert? (plus @0 @0)) INTEGER_CST@1)
> +  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
> +   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)))))
>
> fold-const.c only handles constant C, so we only need the 2nd pattern.
> Also the :c on the mult in that is not needed due to canonicalization rules.
> Please build the result of the inner multiplication directly.
> I think the fold-const.c code has an overflow issue when the outer
> multiplication is signed and the inner addition unsigned.
> (signed)((unsigned)INT_MAX + (unsigned)INT_MAX) * 2 is valid but
> INT_MAX * 4 is not, as it overflows.  So I think we should _not_ allow
> nop conversions here (it's fine if all ops are signed or unsigned).

Is there a reason why the simple transformation A+A->2*A is not
generally a desirable canonicalization? We currently restrict it to
SCALAR_FLOAT_TYPE_P.

-- 
Marc Glisse


* Re: Multiply Optimization in match and Simplify
  2015-10-30 10:44   ` Marc Glisse
@ 2015-10-30 11:06     ` Richard Biener
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Biener @ 2015-10-30 11:06 UTC (permalink / raw)
  To: GCC Patches; +Cc: Hurugalawadi, Naveen

On Fri, Oct 30, 2015 at 11:26 AM, Marc Glisse <marc.glisse@inria.fr> wrote:
> On Fri, 30 Oct 2015, Richard Biener wrote:
>
>> +/* Convert (A + A) * C -> A * 2 * C.  */
>> +(simplify
>> + (mult:c (convert? (plus @0 @0)) (convert? @1))
>> +  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
>> +   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)))))
>> +(simplify
>> + (mult:c (convert? (plus @0 @0)) INTEGER_CST@1)
>> +  (if (tree_nop_conversion_p (TREE_TYPE (@0), type))
>> +   (convert (mult @0 (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)))))
>>
>> fold-const.c only handles constant C, so we only need the 2nd pattern.
>> Also the :c on the mult in that is not needed due to canonicalization
>> rules.
>> Please build the result of the inner multiplication directly.
>> I think the fold-const.c code has an overflow issue when the outer
>> multiplication is signed and the inner addition unsigned.
>> (signed)((unsigned)INT_MAX + (unsigned)INT_MAX) * 2 is valid but
>> INT_MAX * 4 is not, as it overflows.  So I think we should _not_ allow
>> nop conversions here (it's fine if all ops are signed or unsigned).
>
>
> Is there a reason why the simple transformation A+A->2*A is not
> generally a desirable canonicalization? We currently restrict it to
> SCALAR_FLOAT_TYPE_P.

No special reason I know of.

>
> --
> Marc Glisse


* Re: Multiply Optimization in match and Simplify
  2015-10-30 10:08 ` Richard Biener
  2015-10-30 10:44   ` Marc Glisse
@ 2015-11-03  5:13   ` Hurugalawadi, Naveen
  2015-11-03 15:42     ` Richard Biener
  1 sibling, 1 reply; 6+ messages in thread
From: Hurugalawadi, Naveen @ 2015-11-03  5:13 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches, marc.glisse

[-- Attachment #1: Type: text/plain, Size: 924 bytes --]

Hi,

Thanks for the review and suggestions.

>> Please do not drop A - B -> A + (-B) from fold-const as match.pd
>> doesn't implement all of fold-const.c negate_expr_p support.

Done.

>> which is more expensive.  This means that we miss a
>> (bit_and (bit_not @0) INTEGER_CST@1)

Should we have this pattern implemented in match.pd?

>> negate_expr_p doesn't capture everything
>> fold-const.c does so moving the above isn't a good idea.

Dropped the pattern. Was working on some more patterns
that had negate_expr_p. Will drop all of them.

>> fold-const.c only handles constant C, so we only need the 2nd pattern.

Yeah.  Thought that even having a variable would be optimized in a
similar manner, and hence had that pattern.

Please find attached the modified pattern as per suggestions.
Please review the patch and let me know if there should be any further
modifications in it.

Thanks,
Naveen

[-- Attachment #2: mult-opt1.patch --]
[-- Type: text/x-patch; name="mult-opt1.patch", Size: 3535 bytes --]

diff --git a/gcc/fold-const.c b/gcc/fold-const.c
index ee9b349..c34d462 100644
--- a/gcc/fold-const.c
+++ b/gcc/fold-const.c
@@ -9739,18 +9739,6 @@ fold_binary_loc (location_t loc,
 						  negate_expr (arg0)),
 				tem);
 
-	  /* (A + A) * C -> A * 2 * C  */
-	  if (TREE_CODE (arg0) == PLUS_EXPR
-	      && TREE_CODE (arg1) == INTEGER_CST
-	      && operand_equal_p (TREE_OPERAND (arg0, 0),
-			          TREE_OPERAND (arg0, 1), 0))
-	    return fold_build2_loc (loc, MULT_EXPR, type,
-				omit_one_operand_loc (loc, type,
-						  TREE_OPERAND (arg0, 0),
-						  TREE_OPERAND (arg0, 1)),
-				fold_build2_loc (loc, MULT_EXPR, type,
-					     build_int_cst (type, 2) , arg1));
-
 	  /* ((T) (X /[ex] C)) * C cancels out if the conversion is
 	     sign-changing only.  */
 	  if (TREE_CODE (arg1) == INTEGER_CST
@@ -9940,45 +9928,6 @@ fold_binary_loc (location_t loc,
 				  build_zero_cst (TREE_TYPE (tem)));
 	}
 
-      /* Fold (X ^ Y) & Y as ~X & Y.  */
-      if (TREE_CODE (arg0) == BIT_XOR_EXPR
-	  && operand_equal_p (TREE_OPERAND (arg0, 1), arg1, 0))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg1));
-	}
-      /* Fold (X ^ Y) & X as ~Y & X.  */
-      if (TREE_CODE (arg0) == BIT_XOR_EXPR
-	  && operand_equal_p (TREE_OPERAND (arg0, 0), arg1, 0)
-	  && reorder_operands_p (TREE_OPERAND (arg0, 1), arg1))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg0, 1));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg1));
-	}
-      /* Fold X & (X ^ Y) as X & ~Y.  */
-      if (TREE_CODE (arg1) == BIT_XOR_EXPR
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 1));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_convert_loc (loc, type, arg0),
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem));
-	}
-      /* Fold X & (Y ^ X) as ~Y & X.  */
-      if (TREE_CODE (arg1) == BIT_XOR_EXPR
-	  && operand_equal_p (arg0, TREE_OPERAND (arg1, 1), 0)
-	  && reorder_operands_p (arg0, TREE_OPERAND (arg1, 0)))
-	{
-	  tem = fold_convert_loc (loc, type, TREE_OPERAND (arg1, 0));
-	  return fold_build2_loc (loc, BIT_AND_EXPR, type,
-			      fold_build1_loc (loc, BIT_NOT_EXPR, type, tem),
-			      fold_convert_loc (loc, type, arg0));
-	}
-
       /* Fold (X * Y) & -(1 << CST) to X * Y if Y is a constant
          multiple of 1 << CST.  */
       if (TREE_CODE (arg1) == INTEGER_CST)
diff --git a/gcc/match.pd b/gcc/match.pd
index f6c5c07..bd33ea2 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -492,6 +492,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (wi::bit_not (@2) == @1)
   (bit_xor @0 @1)))
 
+/* Fold X & (X ^ Y) as X & ~Y.  */
+(simplify
+ (bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1)))
+  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+   (convert (bit_and @0 (bit_not @1)))))
+
 /* X % Y is smaller than Y.  */
 (for cmp (lt ge)
  (simplify
@@ -1608,6 +1614,11 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (if (SCALAR_FLOAT_TYPE_P (type))
   (mult @0 { build_real (type, dconst2); })))
 
+/* Convert (A + A) * C -> A * 2 * C.  */
+(simplify
+ (mult (convert? (plus @0 @0)) INTEGER_CST@1)
+  (mult (convert @0) (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)))
+
 (simplify
  (minus integer_zerop @1)
  (negate @1))


* Re: Multiply Optimization in match and Simplify
  2015-11-03  5:13   ` Hurugalawadi, Naveen
@ 2015-11-03 15:42     ` Richard Biener
  0 siblings, 0 replies; 6+ messages in thread
From: Richard Biener @ 2015-11-03 15:42 UTC (permalink / raw)
  To: Hurugalawadi, Naveen; +Cc: gcc-patches, marc.glisse

On Tue, Nov 3, 2015 at 6:12 AM, Hurugalawadi, Naveen
<Naveen.Hurugalawadi@caviumnetworks.com> wrote:
> Hi,
>
> Thanks for the review and suggestions.
>
>>> Please do not drop A - B -> A + (-B) from fold-const as match.pd
>>> doesn't implement all of fold-const.c negate_expr_p support.
>
> Done.
>
>>> which is more expensive.  This means that we miss a
>>> (bit_and (bit_not @0) INTEGER_CST@1)
>
> Should we have this pattern implemented in match.pd?

I think so.  Please combine with

+/* Fold X & (X ^ Y) as X & ~Y.  */
+(simplify
+ (bit_and:c (convert? @0) (convert? (bit_xor:c @0 @1)))
+  (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
+   (convert (bit_and @0 (bit_not @1)))))


>>> negate_expr_p doesn't capture everything
>>> fold-const.c does so moving the above isn't a good idea.
>
> Dropped the pattern. Was working on some more patterns
> that had negate_expr_p. Will drop all of them.
>
>>> fold-const.c only handles constant C, so we only need the 2nd pattern.
>
> Yeah.  Thought that even having a variable would be optimized in a
> similar manner, and hence had that pattern.

+/* Convert (A + A) * C -> A * 2 * C.  */
+(simplify
+ (mult (convert? (plus @0 @0)) INTEGER_CST@1)
+  (mult (convert @0) (mult { build_int_cst (TREE_TYPE (@1), 2); } @1)))

So you dropped the nop-conversion check but not the converts ...  This should
now simply match

 (mult (plus @0 @0) INTEGER_CST@1)

but I'd rather (as Marc pointed out) enable the

/* Convert x+x into x*2.0.  */
(simplify
 (plus @0 @0)
 (if (SCALAR_FLOAT_TYPE_P (type))
  (mult @0 { build_real (type, dconst2); })))

pattern also for integral types.  That should make the pattern redundant
(and the fold code can be removed anyway).  Like with

@@ -1606,7 +1619,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (plus @0 @0)
  (if (SCALAR_FLOAT_TYPE_P (type))
-  (mult @0 { build_real (type, dconst2); })))
+  (mult @0 { build_real (type, dconst2); })
+  (if (INTEGRAL_TYPE_P (type))
+   (mult @0 { build_int_cst (type, 2); }))))

 (simplify
  (minus integer_zerop @1)
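
For illustration (my expectation of how the folds would compose, not
verified output): with the integral arm enabled,

  int f (int x)
  {
    return (x + x) * 3;  /* x + x -> x * 2, and (x * 2) * 3 should then
                            combine to x * 6 via the existing
                            mult-by-constant folding.  */
  }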

Richard.

> Please find attached the modified pattern as per suggestions.
> Please review the patch and let me know if there should be any further
> modifications in it.
>
> Thanks,
> Naveen


end of thread, other threads:[~2015-11-03 15:42 UTC | newest]

Thread overview: 6+ messages
2015-10-29  8:21 Multiply Optimization in match and Simplify Hurugalawadi, Naveen
2015-10-30 10:08 ` Richard Biener
2015-10-30 10:44   ` Marc Glisse
2015-10-30 11:06     ` Richard Biener
2015-11-03  5:13   ` Hurugalawadi, Naveen
2015-11-03 15:42     ` Richard Biener
