From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Roger Sayle"
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] PR middle-end/109840: Preserve popcount/parity type in match.pd.
Date: Tue, 23 May 2023 19:30:19 +0100
Message-ID: <075901d98da4$a1fc4dc0$e5f4e940$@nextmovesoftware.com>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="----=_NextPart_000_075A_01D98DAD.03C326C0"

This is a multipart message in MIME format.

------=_NextPart_000_075A_01D98DAD.03C326C0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

PR middle-end/109840 is a regression introduced by my recent patch to
fold popcount(bswap(x)) as popcount(x).  When the bswap and the popcount
have the same precision, everything works fine, but this optimization
also allowed a zero-extension between the two.  The oversight is that we
need to be strict with type conversions, both to avoid accidentally
changing the type of popcount's argument, and also to reflect the effects
of argument/return-value promotion in the call to bswap, so this
zero-extension needs to be preserved and made explicit in the optimized
form.
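As a rough illustration only (this is not part of the patch), the C sketch below mirrors the source pattern from the new tests and a hand-written version of the folded form, in which the bswap is dropped but both conversions are kept. The function names popcount_before, popcount_after and count_mismatches are hypothetical, and the code assumes a compiler that provides the GCC builtins __builtin_bswap16 and __builtin_popcount.

```c
#include <stdint.h>

/* Source-level pattern: popcount of a zero-extended bswap16 result. */
static int popcount_before(uint16_t x)
{
    uint16_t t1 = __builtin_bswap16(x);
    unsigned int t2 = t1;            /* zero-extension to unsigned int */
    return __builtin_popcount(t2);
}

/* Hand-written analogue of the folded form: the bswap is gone, but the
   value is still converted to the 16-bit type and then widened, so the
   argument of popcount keeps the type unsigned int. */
static int popcount_after(uint16_t x)
{
    unsigned int t2 = (unsigned int)(uint16_t)x;
    return __builtin_popcount(t2);
}

/* Exhaustively compare both forms over all 16-bit inputs and return
   the number of inputs on which they disagree. */
static unsigned count_mismatches(void)
{
    unsigned bad = 0;
    for (unsigned v = 0; v <= 0xFFFFu; v++) {
        uint16_t x = (uint16_t)v;
        if (popcount_before(x) != popcount_after(x))
            bad++;
    }
    return bad;
}
```

Calling count_mismatches() from a small driver should report no disagreements, since swapping bytes never changes the number of set bits and the zero-extension adds none.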

Interestingly, match.pd should (in theory) be able to narrow calls to
popcount and parity, removing a zero-extension from their argument, but
that is an independent optimization, which needs to check for IFN_ support.
Many thanks to Andrew Pinski for his help/fixes with these transformations.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2023-05-23  Roger Sayle

gcc/ChangeLog
	PR middle-end/109840
	* match.pd : Preserve zero-extension when
	optimizing popcount((T)bswap(x)) and popcount((T)rotate(x,y)) as
	popcount((T)x), so the popcount's argument keeps the same type.
	: Likewise preserve extensions when
	simplifying parity((T)bswap(x)) and parity((T)rotate(x,y)) as
	parity((T)x), so that the parity's argument type is the same.

gcc/testsuite/ChangeLog
	PR middle-end/109840
	* gcc.dg/fold-parity-8.c: New test.
	* gcc.dg/fold-popcount-11.c: Likewise.


Thanks in advance, and apologies for any inconvenience.
Roger
--

------=_NextPart_000_075A_01D98DAD.03C326C0
Content-Type: text/plain; name="patcha2.txt"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="patcha2.txt"

diff --git a/gcc/match.pd b/gcc/match.pd
index 1fe0559..6e32f47 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -7865,10 +7865,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       (popcount (convert?@0 (bswap:s@1 @2)))
       (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
            && INTEGRAL_TYPE_P (TREE_TYPE (@1)))
-        (with { unsigned int prec0 = TYPE_PRECISION (TREE_TYPE (@0));
-                unsigned int prec1 = TYPE_PRECISION (TREE_TYPE (@1)); }
-          (if (prec0 == prec1 || (prec0 > prec1 && TYPE_UNSIGNED (TREE_TYPE (@1))))
-            (popcount @2)))))))
+        (with { tree type0 = TREE_TYPE (@0);
+                tree type1 = TREE_TYPE (@1);
+                unsigned int prec0 = TYPE_PRECISION (type0);
+                unsigned int prec1 = TYPE_PRECISION (type1); }
+          (if (prec0 == prec1 || (prec0 > prec1 && TYPE_UNSIGNED (type1)))
+            (popcount (convert:type0 (convert:type1 @2)))))))))
 
 /* popcount(rotate(X Y)) is popcount(X). */
 (for popcount (POPCOUNT)
@@ -7878,10 +7880,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
            && INTEGRAL_TYPE_P (TREE_TYPE (@1))
            && (GIMPLE || !TREE_SIDE_EFFECTS (@3)))
-        (with { unsigned int prec0 = TYPE_PRECISION (TREE_TYPE (@0));
-                unsigned int prec1 = TYPE_PRECISION (TREE_TYPE (@1)); }
-          (if (prec0 == prec1 || (prec0 > prec1 && TYPE_UNSIGNED (TREE_TYPE (@1))))
-            (popcount @2)))))))
+        (with { tree type0 = TREE_TYPE (@0);
+                tree type1 = TREE_TYPE (@1);
+                unsigned int prec0 = TYPE_PRECISION (type0);
+                unsigned int prec1 = TYPE_PRECISION (type1); }
+          (if (prec0 == prec1 || (prec0 > prec1 && TYPE_UNSIGNED (type1)))
+            (popcount (convert:type0 @2))))))))
 
 /* Canonicalize POPCOUNT(x)&1 as PARITY(X). */
 (simplify
@@ -7923,7 +7927,9 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
            && INTEGRAL_TYPE_P (TREE_TYPE (@1))
            && TYPE_PRECISION (TREE_TYPE (@0))
               >= TYPE_PRECISION (TREE_TYPE (@1)))
-        (parity @2)))))
+        (with { tree type0 = TREE_TYPE (@0);
+                tree type1 = TREE_TYPE (@1); }
+          (parity (convert:type0 (convert:type1 @2))))))))
 
 /* parity(rotate(X Y)) is parity(X). */
 (for parity (PARITY)
@@ -7935,7 +7941,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
            && (GIMPLE || !TREE_SIDE_EFFECTS (@3))
            && TYPE_PRECISION (TREE_TYPE (@0))
               >= TYPE_PRECISION (TREE_TYPE (@1)))
-        (parity @2)))))
+        (with { tree type0 = TREE_TYPE (@0); }
+          (parity (convert:type0 @2)))))))
 
 /* parity(X)^parity(Y) is parity(X^Y). */
 (simplify
diff --git a/gcc/testsuite/gcc.dg/fold-parity-8.c b/gcc/testsuite/gcc.dg/fold-parity-8.c
new file mode 100644
index 0000000..48e1f7f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-parity-8.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int foo(unsigned short x)
+{
+  unsigned short t1 = __builtin_bswap16(x);
+  unsigned int t2 = t1;
+  return __builtin_parity (t2);
+}
+
+int fool(unsigned short x)
+{
+  unsigned short t1 = __builtin_bswap16(x);
+  unsigned long t2 = t1;
+  return __builtin_parityl (t2);
+}
+
+int fooll(unsigned short x)
+{
+  unsigned short t1 = __builtin_bswap16(x);
+  unsigned long long t2 = t1;
+  return __builtin_parityll (t2);
+}
+
+/* { dg-final { scan-tree-dump-not "bswap" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/fold-popcount-11.c b/gcc/testsuite/gcc.dg/fold-popcount-11.c
new file mode 100644
index 0000000..e59be00
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-popcount-11.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int foo(unsigned short x)
+{
+  unsigned short t1 = __builtin_bswap16(x);
+  unsigned int t2 = t1;
+  return __builtin_popcount (t2);
+}
+
+int fool(unsigned short x)
+{
+  unsigned short t1 = __builtin_bswap16(x);
+  unsigned long t2 = t1;
+  return __builtin_popcountl (t2);
+}
+
+int fooll(unsigned short x)
+{
+  unsigned short t1 = __builtin_bswap16(x);
+  unsigned long long t2 = t1;
+  return __builtin_popcountll (t2);
+}
+
+/* { dg-final { scan-tree-dump-not "bswap" "optimized" } } */

------=_NextPart_000_075A_01D98DAD.03C326C0--