From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.220.28]) by sourceware.org (Postfix) with ESMTPS id 25E6F384BC2C for ; Thu, 2 Jun 2022 13:46:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 25E6F384BC2C Received: from relay2.suse.de (relay2.suse.de [149.44.160.134]) by smtp-out1.suse.de (Postfix) with ESMTP id 0813921AF7; Thu, 2 Jun 2022 13:46:26 +0000 (UTC) Received: from wotan.suse.de (wotan.suse.de [10.160.0.1]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by relay2.suse.de (Postfix) with ESMTPS id DDCB22C141; Thu, 2 Jun 2022 13:46:25 +0000 (UTC) Date: Thu, 2 Jun 2022 13:46:25 +0000 (UTC) From: Richard Biener To: Jakub Jelinek cc: Jeff Law , gcc-patches@gcc.gnu.org Subject: Re: [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (stype)0) [PR105777] In-Reply-To: Message-ID: References: User-Agent: Alpine 2.22 (LSU 394 2020-01-19) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII X-Spam-Status: No, score=-4.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Jun 2022 13:46:29 -0000 On Thu, 2 Jun 2022, Jakub Jelinek wrote: > Hi! > > The following patch is an incremental change to the PR30314 enhancement, > this one handles signed types. > For signed types (but still, the same for 1st and result element type > and non-zero constant that fits into that type), we actually need to > watch for overflow in direction to positive and negative infinity > and it also depends on whether the cst operand is positive or negative. > For __builtin_mul_overflow_p (x, cst, (stype) 0): > For cst > 0, we can simplify it to: > x > INT_MAX / cst || x < INT_MIN / cst > aka: > x + (unsigned) (INT_MIN / cst) > (unsigned) (INT_MAX / cst) - (unsigned) (INT_MIN / cst) > and for cst < 0 to: > x < INT_MAX / cst || x > INT_MIN / cst > aka: > x + (unsigned) (INT_MAX / cst) > (unsigned) (INT_MIN / cst) - (unsigned) (INT_MAX / cst) > > Additionally, I've added executable testcases, so we don't just check for > the optimization to be performed, but also that it is correct (done that > even for the other PR's testcase). > > Starting x86_64-linux and i686-linux bootstrap/regtest, ok for trunk if > it passes them? OK. Thanks, Richard. > 2022-06-02 Jakub Jelinek > > PR middle-end/30314 > PR middle-end/105777 > * match.pd (__builtin_mul_overflow_p (x, cst, (stype) 0) -> > x > stype_max / cst || x < stype_min / cst): New simplification. > > * gcc.dg/tree-ssa/pr30314.c: Add noipa attribute to all functions. > * gcc.dg/tree-ssa/pr105777.c: New test. > * gcc.c-torture/execute/pr30314.c: New test. > * gcc.c-torture/execute/pr105777.c: New test. > > --- gcc/match.pd.jj 2022-06-01 17:54:30.536372912 +0200 > +++ gcc/match.pd 2022-06-02 13:16:17.171415948 +0200 > @@ -5970,15 +5970,37 @@ (define_operator_list SYNC_FETCH_AND_AND > (ovf @1 @0)))) > > /* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types > - are unsigned to x > (umax / cst). */ > + are unsigned to x > (umax / cst). Similarly for signed type, but > + in that case it needs to be outside of a range. */ > (simplify > (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1)) > (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) > - && TYPE_UNSIGNED (TREE_TYPE (@0)) > && TYPE_MAX_VALUE (TREE_TYPE (@0)) > && types_match (TREE_TYPE (@0), TREE_TYPE (TREE_TYPE (@2))) > && int_fits_type_p (@1, TREE_TYPE (@0))) > - (convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1))))) > + (if (TYPE_UNSIGNED (TREE_TYPE (@0))) > + (convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1))) > + (if (TYPE_MIN_VALUE (TREE_TYPE (@0))) > + (with > + { > + tree lo = int_const_binop (TRUNC_DIV_EXPR, > + TYPE_MIN_VALUE (TREE_TYPE (@0)), > + fold_convert (TREE_TYPE (@0), @1)); > + tree hi = int_const_binop (TRUNC_DIV_EXPR, > + TYPE_MAX_VALUE (TREE_TYPE (@0)), > + fold_convert (TREE_TYPE (@0), @1)); > + tree etype = range_check_type (TREE_TYPE (@0)); > + if (etype) > + { > + if (wi::neg_p (wi::to_wide (@1))) > + std::swap (lo, hi); > + lo = fold_convert (etype, lo); > + hi = fold_convert (etype, hi); > + hi = int_const_binop (MINUS_EXPR, hi, lo); > + } > + } > + (if (etype) > + (convert (gt (minus (convert:etype @0) { lo; }) { hi; })))))))) > > /* Simplification of math builtins. These rules must all be optimizations > as well as IL simplifications. If there is a possibility that the new > --- gcc/testsuite/gcc.dg/tree-ssa/pr30314.c.jj 2022-06-02 11:17:23.689835550 +0200 > +++ gcc/testsuite/gcc.dg/tree-ssa/pr30314.c 2022-06-02 14:23:19.294093445 +0200 > @@ -7,25 +7,25 @@ > /* { dg-final { scan-tree-dump " > 102261126" "optimized" { target int32 } } } */ > /* { dg-final { scan-tree-dump " > 439208192231179800" "optimized" { target lp64 } } } */ > > -int > +__attribute__((noipa)) int > foo (unsigned int x) > { > return __builtin_mul_overflow_p (x, 35U, 0U); > } > > -int > +__attribute__((noipa)) int > bar (unsigned long int x) > { > return __builtin_mul_overflow_p (x, 35UL, 0UL); > } > > -int > +__attribute__((noipa)) int > baz (unsigned int x) > { > return __builtin_mul_overflow_p (42, x, 0U); > } > > -int > +__attribute__((noipa)) int > qux (unsigned long int x) > { > return __builtin_mul_overflow_p (42, x, 0UL); > --- gcc/testsuite/gcc.dg/tree-ssa/pr105777.c.jj 2022-06-02 14:22:57.017328731 +0200 > +++ gcc/testsuite/gcc.dg/tree-ssa/pr105777.c 2022-06-02 14:19:29.399521503 +0200 > @@ -0,0 +1,68 @@ > +/* PR middle-end/105777 */ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -fdump-tree-optimized" } */ > +/* { dg-final { scan-tree-dump-not "\.MUL_OVERFLOW " "optimized" } } */ > +/* { dg-final { scan-tree-dump " \\+ 61356675;" "optimized" { target int32 } } } */ > +/* { dg-final { scan-tree-dump " > 122713350" "optimized" { target int32 } } } */ > +/* { dg-final { scan-tree-dump " \\+ 263524915338707880" "optimized" { target lp64 } } } */ > +/* { dg-final { scan-tree-dump " > 527049830677415760" "optimized" { target lp64 } } } */ > +/* { dg-final { scan-tree-dump " \\+ 51130563" "optimized" { target int32 } } } */ > +/* { dg-final { scan-tree-dump " > 102261126" "optimized" { target int32 } } } */ > +/* { dg-final { scan-tree-dump " \\+ 219604096115589900" "optimized" { target lp64 } } } */ > +/* { dg-final { scan-tree-dump " > 439208192231179800" "optimized" { target lp64 } } } */ > +/* { dg-final { scan-tree-dump " \\+ 55063683;" "optimized" { target int32 } } } */ > +/* { dg-final { scan-tree-dump " > 110127366" "optimized" { target int32 } } } */ > +/* { dg-final { scan-tree-dump " \\+ 236496718893712200" "optimized" { target lp64 } } } */ > +/* { dg-final { scan-tree-dump " > 472993437787424400" "optimized" { target lp64 } } } */ > +/* { dg-final { scan-tree-dump " \\+ 46684427" "optimized" { target int32 } } } */ > +/* { dg-final { scan-tree-dump " > 93368854" "optimized" { target int32 } } } */ > +/* { dg-final { scan-tree-dump " \\+ 200508087757712517" "optimized" { target lp64 } } } */ > +/* { dg-final { scan-tree-dump " > 401016175515425034" "optimized" { target lp64 } } } */ > + > +__attribute__((noipa)) int > +foo (int x) > +{ > + return __builtin_mul_overflow_p (x, 35, 0); > +} > + > +__attribute__((noipa)) int > +bar (long int x) > +{ > + return __builtin_mul_overflow_p (x, 35L, 0L); > +} > + > +__attribute__((noipa)) int > +baz (int x) > +{ > + return __builtin_mul_overflow_p (42, x, 0); > +} > + > +__attribute__((noipa)) int > +qux (long int x) > +{ > + return __builtin_mul_overflow_p (42, x, 0L); > +} > + > +__attribute__((noipa)) int > +corge (int x) > +{ > + return __builtin_mul_overflow_p (x, -39, 0); > +} > + > +__attribute__((noipa)) int > +garply (long int x) > +{ > + return __builtin_mul_overflow_p (x, -39L, 0L); > +} > + > +__attribute__((noipa)) int > +grault (int x) > +{ > + return __builtin_mul_overflow_p (-46, x, 0); > +} > + > +__attribute__((noipa)) int > +waldo (long int x) > +{ > + return __builtin_mul_overflow_p (-46, x, 0L); > +} > --- gcc/testsuite/gcc.c-torture/execute/pr30314.c.jj 2022-06-02 14:32:30.070276332 +0200 > +++ gcc/testsuite/gcc.c-torture/execute/pr30314.c 2022-06-02 14:32:22.121360286 +0200 > @@ -0,0 +1,29 @@ > +/* PR middle-end/30314 */ > + > +#include "../../gcc.dg/tree-ssa/pr30314.c" > + > +int > +main () > +{ > + if (foo (0) != 0 > + || foo (~0U / 35) != 0 > + || foo (~0U / 35 + 1) != 1 > + || foo (~0U) != 1) > + __builtin_abort (); > + if (bar (0) != 0 > + || bar (~0UL / 35) != 0 > + || bar (~0UL / 35 + 1) != 1 > + || bar (~0UL) != 1) > + __builtin_abort (); > + if (baz (0) != 0 > + || baz (~0U / 42) != 0 > + || baz (~0U / 42 + 1) != 1 > + || baz (~0U) != 1) > + __builtin_abort (); > + if (qux (0) != 0 > + || qux (~0UL / 42) != 0 > + || qux (~0UL / 42 + 1) != 1 > + || qux (~0UL) != 1) > + __builtin_abort (); > + return 0; > +} > --- gcc/testsuite/gcc.c-torture/execute/pr105777.c.jj 2022-06-02 14:32:50.287062810 +0200 > +++ gcc/testsuite/gcc.c-torture/execute/pr105777.c 2022-06-02 14:38:36.289409212 +0200 > @@ -0,0 +1,73 @@ > +/* PR middle-end/105777 */ > + > +#include "../../gcc.dg/tree-ssa/pr105777.c" > + > +int > +main () > +{ > + if (foo (0) != 0 > + || foo (__INT_MAX__ / 35) != 0 > + || foo (__INT_MAX__ / 35 + 1) != 1 > + || foo (__INT_MAX__) != 1 > + || foo ((-__INT_MAX__ - 1) / 35) != 0 > + || foo ((-__INT_MAX__ - 1) / 35 - 1) != 1 > + || foo (-__INT_MAX__ - 1) != 1) > + __builtin_abort (); > + if (bar (0) != 0 > + || bar (__LONG_MAX__ / 35) != 0 > + || bar (__LONG_MAX__ / 35 + 1) != 1 > + || bar (__LONG_MAX__) != 1 > + || bar ((-__LONG_MAX__ - 1) / 35) != 0 > + || bar ((-__LONG_MAX__ - 1) / 35 - 1) != 1 > + || bar (-__LONG_MAX__ - 1) != 1) > + __builtin_abort (); > + if (baz (0) != 0 > + || baz (__INT_MAX__ / 42) != 0 > + || baz (__INT_MAX__ / 42 + 1) != 1 > + || baz (__INT_MAX__) != 1 > + || baz ((-__INT_MAX__ - 1) / 42) != 0 > + || baz ((-__INT_MAX__ - 1) / 42 - 1) != 1 > + || baz (-__INT_MAX__ - 1) != 1) > + __builtin_abort (); > + if (qux (0) != 0 > + || qux (__LONG_MAX__ / 42) != 0 > + || qux (__LONG_MAX__ / 42 + 1) != 1 > + || qux (__LONG_MAX__) != 1 > + || qux ((-__LONG_MAX__ - 1) / 42) != 0 > + || qux ((-__LONG_MAX__ - 1) / 42 - 1) != 1 > + || qux (-__LONG_MAX__ - 1) != 1) > + __builtin_abort (); > + if (corge (0) != 0 > + || corge (__INT_MAX__ / -39) != 0 > + || corge (__INT_MAX__ / -39 - 1) != 1 > + || corge (__INT_MAX__) != 1 > + || corge ((-__INT_MAX__ - 1) / -39) != 0 > + || corge ((-__INT_MAX__ - 1) / -39 + 1) != 1 > + || corge (-__INT_MAX__ - 1) != 1) > + __builtin_abort (); > + if (garply (0) != 0 > + || garply (__LONG_MAX__ / -39) != 0 > + || garply (__LONG_MAX__ / -39 - 1) != 1 > + || garply (__LONG_MAX__) != 1 > + || garply ((-__LONG_MAX__ - 1) / -39) != 0 > + || garply ((-__LONG_MAX__ - 1) / -39 + 1) != 1 > + || garply (-__LONG_MAX__ - 1) != 1) > + __builtin_abort (); > + if (grault (0) != 0 > + || grault (__INT_MAX__ / -46) != 0 > + || grault (__INT_MAX__ / -46 - 1) != 1 > + || grault (__INT_MAX__) != 1 > + || grault ((-__INT_MAX__ - 1) / -46) != 0 > + || grault ((-__INT_MAX__ - 1) / -46 + 1) != 1 > + || grault (-__INT_MAX__ - 1) != 1) > + __builtin_abort (); > + if (waldo (0) != 0 > + || waldo (__LONG_MAX__ / -46) != 0 > + || waldo (__LONG_MAX__ / -46 - 1) != 1 > + || waldo (__LONG_MAX__) != 1 > + || waldo ((-__LONG_MAX__ - 1) / -46) != 0 > + || waldo ((-__LONG_MAX__ - 1) / -46 + 1) != 1 > + || waldo (-__LONG_MAX__ - 1) != 1) > + __builtin_abort (); > + return 0; > +} > > Jakub > > -- Richard Biener SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman; HRB 36809 (AG Nuernberg)