public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314]
@ 2022-06-01 13:55 Jakub Jelinek
  2022-06-01 14:07 ` Jeff Law
  2022-06-02  8:36 ` Richard Biener
  0 siblings, 2 replies; 5+ messages in thread
From: Jakub Jelinek @ 2022-06-01 13:55 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

Hi!

A comparison with a constant is most likely always faster than
.MUL_OVERFLOW from which we only check whether it overflowed and not the
multiplication result, and even if not, it is simpler operation on GIMPLE
and even if a target exists where such multiplications with overflow checking
are cheaper than comparisons, because comparisons are so much more common
than overflow checking multiplications, it would be nice if it simply
arranged for comparisons to be emitted like those multiplications on its
own...

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-06-01  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/30314
	* match.pd (__builtin_mul_overflow_p (x, cst, (utype) 0) ->
	x > ~(utype)0 / cst): New simplification.

	* gcc.dg/tree-ssa/pr30314.c: New test.

--- gcc/match.pd.jj	2022-06-01 13:54:32.000654151 +0200
+++ gcc/match.pd	2022-06-01 15:13:35.473084402 +0200
@@ -5969,6 +5969,17 @@ (define_operator_list SYNC_FETCH_AND_AND
        && (!TYPE_UNSIGNED (TREE_TYPE (@2)) || TYPE_UNSIGNED (TREE_TYPE (@0))))
    (ovf @1 @0))))
 
+/* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types
+   are unsigned to x > (umax / cst).  */
+(simplify
+ (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1))
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+       && TYPE_UNSIGNED (TREE_TYPE (@0))
+       && TYPE_MAX_VALUE (TREE_TYPE (@0))
+       && types_match (TREE_TYPE (@0), TREE_TYPE (TREE_TYPE (@2)))
+       && int_fits_type_p (@1, TREE_TYPE (@0)))
+   (convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1)))))
+
 /* Simplification of math builtins.  These rules must all be optimizations
    as well as IL simplifications.  If there is a possibility that the new
    form could be a pessimization, the rule should go in the canonicalization
--- gcc/testsuite/gcc.dg/tree-ssa/pr30314.c.jj	2022-06-01 15:22:53.201271365 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr30314.c	2022-06-01 15:13:24.725196482 +0200
@@ -0,0 +1,18 @@
+/* PR middle-end/30314 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not "\.MUL_OVERFLOW " "optimized" } } */
+/* { dg-final { scan-tree-dump " > 122713351" "optimized" { target int32 } } } */
+/* { dg-final { scan-tree-dump " > 527049830677415760" "optimized" { target lp64 } } } */
+
+int
+foo (unsigned int x)
+{
+  return __builtin_mul_overflow_p (x, 35U, 0U);
+}
+
+int
+bar (unsigned long int x)
+{
+  return __builtin_mul_overflow_p (x, 35UL, 0UL);
+}

	Jakub


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314]
  2022-06-01 13:55 [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314] Jakub Jelinek
@ 2022-06-01 14:07 ` Jeff Law
  2022-06-02  8:36 ` Richard Biener
  1 sibling, 0 replies; 5+ messages in thread
From: Jeff Law @ 2022-06-01 14:07 UTC (permalink / raw)
  To: gcc-patches



On 6/1/2022 7:55 AM, Jakub Jelinek via Gcc-patches wrote:
> Hi!
>
> A comparison with a constant is most likely always faster than
> .MUL_OVERFLOW from which we only check whether it overflowed and not the
> multiplication result, and even if not, it is simpler operation on GIMPLE
> and even if a target exists where such multiplications with overflow checking
> are cheaper than comparisons, because comparisons are so much more common
> than overflow checking multiplications, it would be nice if it simply
> arranged for comparisons to be emitted like those multiplications on its
> own...
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2022-06-01  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR middle-end/30314
> 	* match.pd (__builtin_mul_overflow_p (x, cst, (utype) 0) ->
> 	x > ~(utype)0 / cst): New simplification.
>
> 	* gcc.dg/tree-ssa/pr30314.c: New test.
OK
jeff


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314]
  2022-06-01 13:55 [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314] Jakub Jelinek
  2022-06-01 14:07 ` Jeff Law
@ 2022-06-02  8:36 ` Richard Biener
  2022-06-02  9:18   ` Jakub Jelinek
  1 sibling, 1 reply; 5+ messages in thread
From: Richard Biener @ 2022-06-02  8:36 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Wed, 1 Jun 2022, Jakub Jelinek wrote:

> Hi!
> 
> A comparison with a constant is most likely always faster than
> .MUL_OVERFLOW from which we only check whether it overflowed and not the
> multiplication result, and even if not, it is simpler operation on GIMPLE
> and even if a target exists where such multiplications with overflow checking
> are cheaper than comparisons, because comparisons are so much more common
> than overflow checking multiplications, it would be nice if it simply
> arranged for comparisons to be emitted like those multiplications on its
> own...
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2022-06-01  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR middle-end/30314
> 	* match.pd (__builtin_mul_overflow_p (x, cst, (utype) 0) ->
> 	x > ~(utype)0 / cst): New simplification.
> 
> 	* gcc.dg/tree-ssa/pr30314.c: New test.
> 
> --- gcc/match.pd.jj	2022-06-01 13:54:32.000654151 +0200
> +++ gcc/match.pd	2022-06-01 15:13:35.473084402 +0200
> @@ -5969,6 +5969,17 @@ (define_operator_list SYNC_FETCH_AND_AND
>         && (!TYPE_UNSIGNED (TREE_TYPE (@2)) || TYPE_UNSIGNED (TREE_TYPE (@0))))
>     (ovf @1 @0))))
>  
> +/* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types
> +   are unsigned to x > (umax / cst).  */
> +(simplify
> + (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1))

does :c work here?  I think it is at least ignored, possibly diagnostic
in genmatch is missing ...

> +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +       && TYPE_UNSIGNED (TREE_TYPE (@0))
> +       && TYPE_MAX_VALUE (TREE_TYPE (@0))
> +       && types_match (TREE_TYPE (@0), TREE_TYPE (TREE_TYPE (@2)))
> +       && int_fits_type_p (@1, TREE_TYPE (@0)))
> +   (convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1)))))
> +
>  /* Simplification of math builtins.  These rules must all be optimizations
>     as well as IL simplifications.  If there is a possibility that the new
>     form could be a pessimization, the rule should go in the canonicalization
> --- gcc/testsuite/gcc.dg/tree-ssa/pr30314.c.jj	2022-06-01 15:22:53.201271365 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr30314.c	2022-06-01 15:13:24.725196482 +0200
> @@ -0,0 +1,18 @@
> +/* PR middle-end/30314 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not "\.MUL_OVERFLOW " "optimized" } } */
> +/* { dg-final { scan-tree-dump " > 122713351" "optimized" { target int32 } } } */
> +/* { dg-final { scan-tree-dump " > 527049830677415760" "optimized" { target lp64 } } } */
> +
> +int
> +foo (unsigned int x)
> +{
> +  return __builtin_mul_overflow_p (x, 35U, 0U);
> +}
> +
> +int
> +bar (unsigned long int x)
> +{
> +  return __builtin_mul_overflow_p (x, 35UL, 0UL);
> +}
> 
> 	Jakub
> 
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314]
  2022-06-02  8:36 ` Richard Biener
@ 2022-06-02  9:18   ` Jakub Jelinek
  2022-06-02  9:58     ` Richard Biener
  0 siblings, 1 reply; 5+ messages in thread
From: Jakub Jelinek @ 2022-06-02  9:18 UTC (permalink / raw)
  To: Richard Biener; +Cc: gcc-patches

On Thu, Jun 02, 2022 at 08:36:42AM +0000, Richard Biener wrote:
> > --- gcc/match.pd.jj	2022-06-01 13:54:32.000654151 +0200
> > +++ gcc/match.pd	2022-06-01 15:13:35.473084402 +0200
> > @@ -5969,6 +5969,17 @@ (define_operator_list SYNC_FETCH_AND_AND
> >         && (!TYPE_UNSIGNED (TREE_TYPE (@2)) || TYPE_UNSIGNED (TREE_TYPE (@0))))
> >     (ovf @1 @0))))
> >  
> > +/* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types
> > +   are unsigned to x > (umax / cst).  */
> > +(simplify
> > + (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1))
> 
> does :c work here?  I think it is at least ignored, possibly diagnostic
> in genmatch is missing ...

I saw it used in another pattern, so thought it will work fine:
  (cmp:c (realpart (IFN_ADD_OVERFLOW:c@2 @0 @1)) @0)

And looking at the generated source, I think it does work:
              case CFN_MUL_OVERFLOW:
                if (gimple_call_num_args (_c1) == 2)
                  {
                    tree _q20 = gimple_call_arg (_c1, 0);
                    _q20 = do_valueize (valueize, _q20);
                    tree _q21 = gimple_call_arg (_c1, 1);
                    _q21 = do_valueize (valueize, _q21);
                    if (integer_nonzerop (_q21))
                      {
                        {
/* #line 5976 "../../gcc/match.pd" */
                          tree captures[3] ATTRIBUTE_UNUSED = { _p0, _q20, _q21 };
                          if (gimple_simplify_256 (res_op, seq, valueize, type, captures))
                            return true;
                        }
                      }
                    if (integer_nonzerop (_q20))
                      {
                        {
/* #line 5976 "../../gcc/match.pd" */
                          tree captures[3] ATTRIBUTE_UNUSED = { _p0, _q21, _q20 };
                          if (gimple_simplify_256 (res_op, seq, valueize, type, captures))
                            return true;
                        }
                      }
                  }
Though, sure, I should test it in the testcase too.
Here is what I've committed as obvious after testing on x86_64-linux
-m32/-m64:

2022-06-02  Jakub Jelinek  <jakub@redhat.com>

	PR middle-end/30314
	* gcc.dg/tree-ssa/pr30314.c: Add tests with swapped arguments.

--- gcc/testsuite/gcc.dg/tree-ssa/pr30314.c.jj	2022-06-01 17:54:30.000000000 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr30314.c	2022-06-02 11:01:27.818987766 +0200
@@ -4,6 +4,8 @@
 /* { dg-final { scan-tree-dump-not "\.MUL_OVERFLOW " "optimized" } } */
 /* { dg-final { scan-tree-dump " > 122713351" "optimized" { target int32 } } } */
 /* { dg-final { scan-tree-dump " > 527049830677415760" "optimized" { target lp64 } } } */
+/* { dg-final { scan-tree-dump " > 102261126" "optimized" { target int32 } } } */
+/* { dg-final { scan-tree-dump " > 439208192231179800" "optimized" { target lp64 } } } */
 
 int
 foo (unsigned int x)
@@ -16,3 +18,15 @@ bar (unsigned long int x)
 {
   return __builtin_mul_overflow_p (x, 35UL, 0UL);
 }
+
+int
+baz (unsigned int x)
+{
+  return __builtin_mul_overflow_p (42, x, 0U);
+}
+
+int
+qux (unsigned long int x)
+{
+  return __builtin_mul_overflow_p (42, x, 0UL);
+}


	Jakub


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314]
  2022-06-02  9:18   ` Jakub Jelinek
@ 2022-06-02  9:58     ` Richard Biener
  0 siblings, 0 replies; 5+ messages in thread
From: Richard Biener @ 2022-06-02  9:58 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: gcc-patches

On Thu, 2 Jun 2022, Jakub Jelinek wrote:

> On Thu, Jun 02, 2022 at 08:36:42AM +0000, Richard Biener wrote:
> > > --- gcc/match.pd.jj	2022-06-01 13:54:32.000654151 +0200
> > > +++ gcc/match.pd	2022-06-01 15:13:35.473084402 +0200
> > > @@ -5969,6 +5969,17 @@ (define_operator_list SYNC_FETCH_AND_AND
> > >         && (!TYPE_UNSIGNED (TREE_TYPE (@2)) || TYPE_UNSIGNED (TREE_TYPE (@0))))
> > >     (ovf @1 @0))))
> > >  
> > > +/* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types
> > > +   are unsigned to x > (umax / cst).  */
> > > +(simplify
> > > + (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1))
> > 
> > does :c work here?  I think it is at least ignored, possibly diagnostic
> > in genmatch is missing ...
> 
> I saw it used in another pattern, so thought it will work fine:
>   (cmp:c (realpart (IFN_ADD_OVERFLOW:c@2 @0 @1)) @0)
> 
> And looking at the generated source, I think it does work:
>               case CFN_MUL_OVERFLOW:
>                 if (gimple_call_num_args (_c1) == 2)
>                   {
>                     tree _q20 = gimple_call_arg (_c1, 0);
>                     _q20 = do_valueize (valueize, _q20);
>                     tree _q21 = gimple_call_arg (_c1, 1);
>                     _q21 = do_valueize (valueize, _q21);
>                     if (integer_nonzerop (_q21))
>                       {
>                         {
> /* #line 5976 "../../gcc/match.pd" */
>                           tree captures[3] ATTRIBUTE_UNUSED = { _p0, _q20, _q21 };
>                           if (gimple_simplify_256 (res_op, seq, valueize, type, captures))
>                             return true;
>                         }
>                       }
>                     if (integer_nonzerop (_q20))
>                       {
>                         {
> /* #line 5976 "../../gcc/match.pd" */
>                           tree captures[3] ATTRIBUTE_UNUSED = { _p0, _q21, _q20 };
>                           if (gimple_simplify_256 (res_op, seq, valueize, type, captures))
>                             return true;
>                         }
>                       }
>                   }
> Though, sure, I should test it in the testcase too.

Ah yeah, we simply trust :c on functions with two arguments (and
explicitely handle CFN_FMA).

Richard.

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2022-06-02  9:58 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-06-01 13:55 [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314] Jakub Jelinek
2022-06-01 14:07 ` Jeff Law
2022-06-02  8:36 ` Richard Biener
2022-06-02  9:18   ` Jakub Jelinek
2022-06-02  9:58     ` Richard Biener

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).