From: "Roger Sayle" <roger@nextmovesoftware.com>
To: "GCC Patches" <gcc-patches@gcc.gnu.org>
Subject: [PATCH] PR tree-optimization/102950: Improved EVRP for signed BIT_XOR_EXPR.
Date: Tue, 1 Feb 2022 08:57:07 -0000

This patch fixes PR tree-optimization/102950, which is a P2 regression,
by providing better range bounds for BIT_XOR_EXPR, BIT_AND_EXPR and
BIT_IOR_EXPR on signed integer types.
In general terms, any binary bitwise operation on sign-extended or
zero-extended integer types produces results that are themselves
sign-extended or zero-extended.  More precisely, we can derive signed
bounds from the number of leading redundant sign bit copies, using the
inequality clrsb(X op Y) >= min (clrsb(X), clrsb(Y)) together with the
property that, for any (signed or unsigned) range [lb, ub],
clrsb([lb, ub]) >= min (clrsb(lb), clrsb(ub)).  These can be used to
show that [-1, 0] op [-1, 0] is [-1, 0], or that [-128, 127] op
[-128, 127] is [-128, 127], even when tracking nonzero bits would
result in VARYING (as every bit can be 0 or 1).  This is equivalent to
determining the minimum type precision in which the operation can be
performed and then sign-extending the result (a small standalone
illustration of this clrsb arithmetic follows the patch below).

One additional refinement is to observe that X ^ Y can never be zero
if the ranges of X and Y don't overlap, i.e. when X can never equal Y.
Previously, the expression "(int)(char)a ^ 233" in the PR was
considered VARYING, but with the above changes it now has the range
[-256, -1][1, 255], which is sufficient to optimize away the call to foo.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures.  Ok for mainline?

2022-02-01  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
	PR tree-optimization/102950
	* range-op.cc (wi_optimize_signed_bitwise_op): New function to
	determine bounds of bitwise operations on signed types.
	(operator_bitwise_and::wi_fold): Call the above function.
	(operator_bitwise_or::wi_fold): Likewise.
	(operator_bitwise_xor::wi_fold): Likewise.  Additionally, the
	result can't be zero if the operands can't be equal.

gcc/testsuite/ChangeLog
	PR tree-optimization/102950
	* gcc.dg/pr102950.c: New test case.
	* gcc.dg/tree-ssa/evrp10.c: New test case.

Thanks in advance (and Happy Chinese New Year),
Roger
--

[Attachment: patchu2b.txt]

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 19bdf30..71264ba 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -2659,6 +2659,29 @@ operator_bitwise_and::fold_range (irange &r, tree type,
 }
 
 
+// Optimize BIT_AND_EXPR, BIT_IOR_EXPR and BIT_XOR_EXPR of signed types
+// by considering the number of leading redundant sign bit copies.
+// clrsb (X op Y) = min (clrsb (X), clrsb (Y)), so for example
+// [-1, 0] op [-1, 0] is [-1, 0] (where nonzero_bits doesn't help).
+static bool
+wi_optimize_signed_bitwise_op (irange &r, tree type,
+			       const wide_int &lh_lb, const wide_int &lh_ub,
+			       const wide_int &rh_lb, const wide_int &rh_ub)
+{
+  int lh_clrsb = MIN (wi::clrsb (lh_lb), wi::clrsb (lh_ub));
+  int rh_clrsb = MIN (wi::clrsb (rh_lb), wi::clrsb (rh_ub));
+  int new_clrsb = MIN (lh_clrsb, rh_clrsb);
+  if (new_clrsb == 0)
+    return false;
+  int type_prec = TYPE_PRECISION (type);
+  int rprec = (type_prec - new_clrsb) - 1;
+  value_range_with_overflow (r, type,
+			     wi::mask (rprec, true, type_prec),
+			     wi::mask (rprec, false, type_prec));
+  return true;
+}
+
+
 // Optimize BIT_AND_EXPR and BIT_IOR_EXPR in terms of a mask if
 // possible.  Basically, see if we can optimize:
 //
@@ -2839,7 +2862,14 @@ operator_bitwise_and::wi_fold (irange &r, tree type,
     }
   // If the limits got swapped around, return varying.
   if (wi::gt_p (new_lb, new_ub,sign))
-    r.set_varying (type);
+    {
+      if (sign == SIGNED
+	  && wi_optimize_signed_bitwise_op (r, type,
+					    lh_lb, lh_ub,
+					    rh_lb, rh_ub))
+	return;
+      r.set_varying (type);
+    }
   else
     value_range_with_overflow (r, type, new_lb, new_ub);
 }
@@ -3093,6 +3123,11 @@ operator_bitwise_or::wi_fold (irange &r, tree type,
       || wi::lt_p (lh_ub, 0, sign)
       || wi::lt_p (rh_ub, 0, sign))
     r.set_nonzero (type);
+  else if (sign == SIGNED
+	   && wi_optimize_signed_bitwise_op (r, type,
+					     lh_lb, lh_ub,
+					     rh_lb, rh_ub))
+    return;
   else
     r.set_varying (type);
   return;
@@ -3180,8 +3215,23 @@ operator_bitwise_xor::wi_fold (irange &r, tree type,
   // is better than VARYING.
   if (wi::lt_p (new_lb, 0, sign) || wi::ge_p (new_ub, 0, sign))
     value_range_with_overflow (r, type, new_lb, new_ub);
+  else if (sign == SIGNED
+	   && wi_optimize_signed_bitwise_op (r, type,
+					     lh_lb, lh_ub,
+					     rh_lb, rh_ub))
+    ; /* Do nothing. */
   else
     r.set_varying (type);
+
+  /* Furthermore, XOR is non-zero if its arguments can't be equal.  */
+  if (wi::lt_p (lh_ub, rh_lb, sign)
+      || wi::lt_p (rh_ub, lh_lb, sign)
+      || wi::ne_p (result_one_bits, 0))
+    {
+      int_range<2> tmp;
+      tmp.set_nonzero (type);
+      r.intersect (tmp);
+    }
 }
 
 bool
diff --git a/gcc/testsuite/gcc.dg/pr102950.c b/gcc/testsuite/gcc.dg/pr102950.c
new file mode 100644
index 0000000..0ab23bd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr102950.c
@@ -0,0 +1,21 @@
+/* { dg-do link } */
+/* { dg-options "-O2" } */
+
+extern void link_error(void);
+
+static char a;
+static short d(unsigned e) {
+  char b;
+  short c;
+  a = b = e;
+  if (b)
+    return 0;
+  if (1 >= e) {
+    c = e == 0;
+    if (c)
+      link_error();
+  }
+  return 0;
+}
+int main() { d(a ^ 233); }
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/evrp10.c b/gcc/testsuite/gcc.dg/tree-ssa/evrp10.c
new file mode 100644
index 0000000..6ca00e4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/evrp10.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp" }*/
+
+typedef __INT32_TYPE__ int32_t;
+
+int32_t and(int32_t x, int32_t y)
+{
+  int32_t tx = x >> 24;
+  int32_t ty = y >> 24;
+  int32_t t = tx & ty;
+  return t;
+}
+
+int32_t ior(int32_t x, int32_t y)
+{
+  int32_t tx = x >> 24;
+  int32_t ty = y >> 24;
+  int32_t t = tx | ty;
+  return t;
+}
+
+int32_t xor(int32_t x, int32_t y)
+{
+  int32_t tx = x >> 24;
+  int32_t ty = y >> 24;
+  int32_t t = tx ^ ty;
+  return t;
+}
+
+/* { dg-final { scan-tree-dump-times "\\\[-128, 127\\\]" 9 "evrp" } } */
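p.s. For anyone who wants to play with the clrsb arithmetic described
above outside of the compiler, the bound computation can be reproduced
with GCC's __builtin_clrsb.  The following is only a rough standalone
sketch for illustration, not part of the patch: it assumes a 32-bit int,
and the helper name range_clrsb is invented for the example.

/* Illustration only: mirror the bound computation of
   wi_optimize_signed_bitwise_op for a 32-bit int, using
   GCC's __builtin_clrsb.  */
#include <assert.h>
#include <limits.h>

/* Leading redundant sign bits common to every value in [lb, ub]:
   clrsb([lb, ub]) >= min (clrsb(lb), clrsb(ub)).  */
static int range_clrsb (int lb, int ub)
{
  int l = __builtin_clrsb (lb);
  int u = __builtin_clrsb (ub);
  return l < u ? l : u;
}

int main (void)
{
  const int prec = sizeof (int) * CHAR_BIT;   /* 32 */
  /* Both operands lie in [-128, 127], e.g. the result of x >> 24.  */
  int new_clrsb = range_clrsb (-128, 127);    /* 24 */
  int rprec = prec - new_clrsb - 1;           /* 7 */
  long long lb = -(1LL << rprec);             /* -128 */
  long long ub = (1LL << rprec) - 1;          /* 127 */
  assert (lb == -128 && ub == 127);
  /* So [-128, 127] op [-128, 127] stays within [-128, 127] for
     op in {&, |, ^}, even though every result bit may be 0 or 1.  */
  return 0;
}

Running this shows that when both operands are known to lie in
[-128, 127], the result of &, | or ^ is also confined to [-128, 127],
which is exactly what the evrp10.c test above expects nine times.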