From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 9B6833856DD4 for ; Thu, 2 Jun 2022 13:11:22 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9B6833856DD4 Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-607-WtxwyX4JOtyKb8XD3hzZIQ-1; Thu, 02 Jun 2022 09:11:19 -0400 X-MC-Unique: WtxwyX4JOtyKb8XD3hzZIQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.rdu2.redhat.com [10.11.54.3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id EF08C85A5B9; Thu, 2 Jun 2022 13:11:18 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.33.36.77]) by smtp.corp.redhat.com (Postfix) with ESMTPS id AF7101121314; Thu, 2 Jun 2022 13:11:18 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 252DBFkW737208 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 2 Jun 2022 15:11:16 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 252DBFEK737207; Thu, 2 Jun 2022 15:11:15 +0200 Date: Thu, 2 Jun 2022 15:11:14 +0200 From: Jakub Jelinek To: Richard Biener , Jeff Law Cc: gcc-patches@gcc.gnu.org Subject: [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (stype)0) [PR105777] Message-ID: Reply-To: Jakub Jelinek MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.78 on 10.11.54.3 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=us-ascii Content-Disposition: inline X-Spam-Status: No, score=-3.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 02 Jun 2022 13:11:26 -0000 Hi! The following patch is an incremental change to the PR30314 enhancement, this one handles signed types. For signed types (but still, the same for 1st and result element type and non-zero constant that fits into that type), we actually need to watch for overflow in direction to positive and negative infinity and it also depends on whether the cst operand is positive or negative. For __builtin_mul_overflow_p (x, cst, (stype) 0): For cst > 0, we can simplify it to: x > INT_MAX / cst || x < INT_MIN / cst aka: x + (unsigned) (INT_MIN / cst) > (unsigned) (INT_MAX / cst) - (unsigned) (INT_MIN / cst) and for cst < 0 to: x < INT_MAX / cst || x > INT_MIN / cst aka: x + (unsigned) (INT_MAX / cst) > (unsigned) (INT_MIN / cst) - (unsigned) (INT_MAX / cst) Additionally, I've added executable testcases, so we don't just check for the optimization to be performed, but also that it is correct (done that even for the other PR's testcase). Starting x86_64-linux and i686-linux bootstrap/regtest, ok for trunk if it passes them? 2022-06-02 Jakub Jelinek PR middle-end/30314 PR middle-end/105777 * match.pd (__builtin_mul_overflow_p (x, cst, (stype) 0) -> x > stype_max / cst || x < stype_min / cst): New simplification. * gcc.dg/tree-ssa/pr30314.c: Add noipa attribute to all functions. * gcc.dg/tree-ssa/pr105777.c: New test. * gcc.c-torture/execute/pr30314.c: New test. * gcc.c-torture/execute/pr105777.c: New test. --- gcc/match.pd.jj 2022-06-01 17:54:30.536372912 +0200 +++ gcc/match.pd 2022-06-02 13:16:17.171415948 +0200 @@ -5970,15 +5970,37 @@ (define_operator_list SYNC_FETCH_AND_AND (ovf @1 @0)))) /* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types - are unsigned to x > (umax / cst). */ + are unsigned to x > (umax / cst). Similarly for signed type, but + in that case it needs to be outside of a range. */ (simplify (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1)) (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) - && TYPE_UNSIGNED (TREE_TYPE (@0)) && TYPE_MAX_VALUE (TREE_TYPE (@0)) && types_match (TREE_TYPE (@0), TREE_TYPE (TREE_TYPE (@2))) && int_fits_type_p (@1, TREE_TYPE (@0))) - (convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1))))) + (if (TYPE_UNSIGNED (TREE_TYPE (@0))) + (convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1))) + (if (TYPE_MIN_VALUE (TREE_TYPE (@0))) + (with + { + tree lo = int_const_binop (TRUNC_DIV_EXPR, + TYPE_MIN_VALUE (TREE_TYPE (@0)), + fold_convert (TREE_TYPE (@0), @1)); + tree hi = int_const_binop (TRUNC_DIV_EXPR, + TYPE_MAX_VALUE (TREE_TYPE (@0)), + fold_convert (TREE_TYPE (@0), @1)); + tree etype = range_check_type (TREE_TYPE (@0)); + if (etype) + { + if (wi::neg_p (wi::to_wide (@1))) + std::swap (lo, hi); + lo = fold_convert (etype, lo); + hi = fold_convert (etype, hi); + hi = int_const_binop (MINUS_EXPR, hi, lo); + } + } + (if (etype) + (convert (gt (minus (convert:etype @0) { lo; }) { hi; })))))))) /* Simplification of math builtins. These rules must all be optimizations as well as IL simplifications. If there is a possibility that the new --- gcc/testsuite/gcc.dg/tree-ssa/pr30314.c.jj 2022-06-02 11:17:23.689835550 +0200 +++ gcc/testsuite/gcc.dg/tree-ssa/pr30314.c 2022-06-02 14:23:19.294093445 +0200 @@ -7,25 +7,25 @@ /* { dg-final { scan-tree-dump " > 102261126" "optimized" { target int32 } } } */ /* { dg-final { scan-tree-dump " > 439208192231179800" "optimized" { target lp64 } } } */ -int +__attribute__((noipa)) int foo (unsigned int x) { return __builtin_mul_overflow_p (x, 35U, 0U); } -int +__attribute__((noipa)) int bar (unsigned long int x) { return __builtin_mul_overflow_p (x, 35UL, 0UL); } -int +__attribute__((noipa)) int baz (unsigned int x) { return __builtin_mul_overflow_p (42, x, 0U); } -int +__attribute__((noipa)) int qux (unsigned long int x) { return __builtin_mul_overflow_p (42, x, 0UL); --- gcc/testsuite/gcc.dg/tree-ssa/pr105777.c.jj 2022-06-02 14:22:57.017328731 +0200 +++ gcc/testsuite/gcc.dg/tree-ssa/pr105777.c 2022-06-02 14:19:29.399521503 +0200 @@ -0,0 +1,68 @@ +/* PR middle-end/105777 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized" } */ +/* { dg-final { scan-tree-dump-not "\.MUL_OVERFLOW " "optimized" } } */ +/* { dg-final { scan-tree-dump " \\+ 61356675;" "optimized" { target int32 } } } */ +/* { dg-final { scan-tree-dump " > 122713350" "optimized" { target int32 } } } */ +/* { dg-final { scan-tree-dump " \\+ 263524915338707880" "optimized" { target lp64 } } } */ +/* { dg-final { scan-tree-dump " > 527049830677415760" "optimized" { target lp64 } } } */ +/* { dg-final { scan-tree-dump " \\+ 51130563" "optimized" { target int32 } } } */ +/* { dg-final { scan-tree-dump " > 102261126" "optimized" { target int32 } } } */ +/* { dg-final { scan-tree-dump " \\+ 219604096115589900" "optimized" { target lp64 } } } */ +/* { dg-final { scan-tree-dump " > 439208192231179800" "optimized" { target lp64 } } } */ +/* { dg-final { scan-tree-dump " \\+ 55063683;" "optimized" { target int32 } } } */ +/* { dg-final { scan-tree-dump " > 110127366" "optimized" { target int32 } } } */ +/* { dg-final { scan-tree-dump " \\+ 236496718893712200" "optimized" { target lp64 } } } */ +/* { dg-final { scan-tree-dump " > 472993437787424400" "optimized" { target lp64 } } } */ +/* { dg-final { scan-tree-dump " \\+ 46684427" "optimized" { target int32 } } } */ +/* { dg-final { scan-tree-dump " > 93368854" "optimized" { target int32 } } } */ +/* { dg-final { scan-tree-dump " \\+ 200508087757712517" "optimized" { target lp64 } } } */ +/* { dg-final { scan-tree-dump " > 401016175515425034" "optimized" { target lp64 } } } */ + +__attribute__((noipa)) int +foo (int x) +{ + return __builtin_mul_overflow_p (x, 35, 0); +} + +__attribute__((noipa)) int +bar (long int x) +{ + return __builtin_mul_overflow_p (x, 35L, 0L); +} + +__attribute__((noipa)) int +baz (int x) +{ + return __builtin_mul_overflow_p (42, x, 0); +} + +__attribute__((noipa)) int +qux (long int x) +{ + return __builtin_mul_overflow_p (42, x, 0L); +} + +__attribute__((noipa)) int +corge (int x) +{ + return __builtin_mul_overflow_p (x, -39, 0); +} + +__attribute__((noipa)) int +garply (long int x) +{ + return __builtin_mul_overflow_p (x, -39L, 0L); +} + +__attribute__((noipa)) int +grault (int x) +{ + return __builtin_mul_overflow_p (-46, x, 0); +} + +__attribute__((noipa)) int +waldo (long int x) +{ + return __builtin_mul_overflow_p (-46, x, 0L); +} --- gcc/testsuite/gcc.c-torture/execute/pr30314.c.jj 2022-06-02 14:32:30.070276332 +0200 +++ gcc/testsuite/gcc.c-torture/execute/pr30314.c 2022-06-02 14:32:22.121360286 +0200 @@ -0,0 +1,29 @@ +/* PR middle-end/30314 */ + +#include "../../gcc.dg/tree-ssa/pr30314.c" + +int +main () +{ + if (foo (0) != 0 + || foo (~0U / 35) != 0 + || foo (~0U / 35 + 1) != 1 + || foo (~0U) != 1) + __builtin_abort (); + if (bar (0) != 0 + || bar (~0UL / 35) != 0 + || bar (~0UL / 35 + 1) != 1 + || bar (~0UL) != 1) + __builtin_abort (); + if (baz (0) != 0 + || baz (~0U / 42) != 0 + || baz (~0U / 42 + 1) != 1 + || baz (~0U) != 1) + __builtin_abort (); + if (qux (0) != 0 + || qux (~0UL / 42) != 0 + || qux (~0UL / 42 + 1) != 1 + || qux (~0UL) != 1) + __builtin_abort (); + return 0; +} --- gcc/testsuite/gcc.c-torture/execute/pr105777.c.jj 2022-06-02 14:32:50.287062810 +0200 +++ gcc/testsuite/gcc.c-torture/execute/pr105777.c 2022-06-02 14:38:36.289409212 +0200 @@ -0,0 +1,73 @@ +/* PR middle-end/105777 */ + +#include "../../gcc.dg/tree-ssa/pr105777.c" + +int +main () +{ + if (foo (0) != 0 + || foo (__INT_MAX__ / 35) != 0 + || foo (__INT_MAX__ / 35 + 1) != 1 + || foo (__INT_MAX__) != 1 + || foo ((-__INT_MAX__ - 1) / 35) != 0 + || foo ((-__INT_MAX__ - 1) / 35 - 1) != 1 + || foo (-__INT_MAX__ - 1) != 1) + __builtin_abort (); + if (bar (0) != 0 + || bar (__LONG_MAX__ / 35) != 0 + || bar (__LONG_MAX__ / 35 + 1) != 1 + || bar (__LONG_MAX__) != 1 + || bar ((-__LONG_MAX__ - 1) / 35) != 0 + || bar ((-__LONG_MAX__ - 1) / 35 - 1) != 1 + || bar (-__LONG_MAX__ - 1) != 1) + __builtin_abort (); + if (baz (0) != 0 + || baz (__INT_MAX__ / 42) != 0 + || baz (__INT_MAX__ / 42 + 1) != 1 + || baz (__INT_MAX__) != 1 + || baz ((-__INT_MAX__ - 1) / 42) != 0 + || baz ((-__INT_MAX__ - 1) / 42 - 1) != 1 + || baz (-__INT_MAX__ - 1) != 1) + __builtin_abort (); + if (qux (0) != 0 + || qux (__LONG_MAX__ / 42) != 0 + || qux (__LONG_MAX__ / 42 + 1) != 1 + || qux (__LONG_MAX__) != 1 + || qux ((-__LONG_MAX__ - 1) / 42) != 0 + || qux ((-__LONG_MAX__ - 1) / 42 - 1) != 1 + || qux (-__LONG_MAX__ - 1) != 1) + __builtin_abort (); + if (corge (0) != 0 + || corge (__INT_MAX__ / -39) != 0 + || corge (__INT_MAX__ / -39 - 1) != 1 + || corge (__INT_MAX__) != 1 + || corge ((-__INT_MAX__ - 1) / -39) != 0 + || corge ((-__INT_MAX__ - 1) / -39 + 1) != 1 + || corge (-__INT_MAX__ - 1) != 1) + __builtin_abort (); + if (garply (0) != 0 + || garply (__LONG_MAX__ / -39) != 0 + || garply (__LONG_MAX__ / -39 - 1) != 1 + || garply (__LONG_MAX__) != 1 + || garply ((-__LONG_MAX__ - 1) / -39) != 0 + || garply ((-__LONG_MAX__ - 1) / -39 + 1) != 1 + || garply (-__LONG_MAX__ - 1) != 1) + __builtin_abort (); + if (grault (0) != 0 + || grault (__INT_MAX__ / -46) != 0 + || grault (__INT_MAX__ / -46 - 1) != 1 + || grault (__INT_MAX__) != 1 + || grault ((-__INT_MAX__ - 1) / -46) != 0 + || grault ((-__INT_MAX__ - 1) / -46 + 1) != 1 + || grault (-__INT_MAX__ - 1) != 1) + __builtin_abort (); + if (waldo (0) != 0 + || waldo (__LONG_MAX__ / -46) != 0 + || waldo (__LONG_MAX__ / -46 - 1) != 1 + || waldo (__LONG_MAX__) != 1 + || waldo ((-__LONG_MAX__ - 1) / -46) != 0 + || waldo ((-__LONG_MAX__ - 1) / -46 + 1) != 1 + || waldo (-__LONG_MAX__ - 1) != 1) + __builtin_abort (); + return 0; +} Jakub