From mboxrd@z Thu Jan  1 00:00:00 1970
Message-ID: <0b9cb9c6-2003-facf-fa2c-998a81d1e26c@linux.ibm.com>
Date: Mon, 24 Apr 2023 13:35:03 +0800
Subject: Ping^2 [PATCH, rs6000] Split TImode for logical operations in expand pass [PR100694]
From: HAO CHEN GUI
To: gcc-patches
Cc: Segher Boessenkool, David, "Kewen.Lin", Peter Bergner
References: <740e9ed6-8730-1dec-ca78-a002df8d431a@linux.ibm.com>

Hi,
  Gently ping this:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html

Thanks
Gui Haochen

On 2023/2/20 10:10, HAO CHEN GUI wrote:
> Hi,
>   Gently ping this:
>   https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html
>
> Gui Haochen
> Thanks
>
> On 2023/2/8 13:08, HAO CHEN GUI wrote:
>> Hi,
>>   The logical operations for TImode are split after the reload pass right
>> now. Some potential optimizations are missed because the split happens too
>> late. This patch removes TImode from the "AND", "IOR", "XOR" and "NOT"
>> expanders so that these logical operations can be split in the expand pass.
>> The new test case illustrates the optimization.
>>
>>   Two test cases of pr92398 are merged into one as all sub-targets generate
>> the same sequence of instructions with the patch.
>>
>>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
>>
>> Thanks
>> Gui Haochen
>>
>>
>> ChangeLog
>> 2023-02-08  Haochen Gui
>>
>> gcc/
>>         PR target/100694
>>         * config/rs6000/rs6000.md (BOOL_128_V): New mode iterator for 128-bit
>>         vector types.
>>         (and<mode>3): Replace BOOL_128 with BOOL_128_V.
>>         (ior<mode>3): Likewise.
>>         (xor<mode>3): Likewise.
>>         (one_cmpl<mode>2 expander): New expander with BOOL_128_V.
>>         (one_cmpl<mode>2 insn_and_split): Rename to ...
>>         (*one_cmpl<mode>2): ... this.
>>
>> gcc/testsuite/
>>         PR target/100694
>>         * gcc.target/powerpc/pr100694.c: New.
>>         * gcc.target/powerpc/pr92398.c: New.
>>         * gcc.target/powerpc/pr92398.h: Remove.
>>         * gcc.target/powerpc/pr92398.p9-.c: Remove.
>>         * gcc.target/powerpc/pr92398.p9+.c: Remove.
>>
>>
>> patch.diff
>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>> index 4bd1dfd3da9..455b7329643 100644
>> --- a/gcc/config/rs6000/rs6000.md
>> +++ b/gcc/config/rs6000/rs6000.md
>> @@ -743,6 +743,15 @@ (define_mode_iterator BOOL_128 [TI
>>                           (V2DF "TARGET_ALTIVEC")
>>                           (V1TI "TARGET_ALTIVEC")])
>>
>> +;; Mode iterator for logical operations on 128-bit vector types
>> +(define_mode_iterator BOOL_128_V [(V16QI "TARGET_ALTIVEC")
>> +                                  (V8HI "TARGET_ALTIVEC")
>> +                                  (V4SI "TARGET_ALTIVEC")
>> +                                  (V4SF "TARGET_ALTIVEC")
>> +                                  (V2DI "TARGET_ALTIVEC")
>> +                                  (V2DF "TARGET_ALTIVEC")
>> +                                  (V1TI "TARGET_ALTIVEC")])
>> +
>>  ;; For the GPRs we use 3 constraints for register outputs, two that are the
>>  ;; same as the output register, and a third where the output register is an
>>  ;; early clobber, so we don't have to deal with register overlaps.  For the
>> @@ -7135,23 +7144,23 @@ (define_expand "subti3"
>>  ;; 128-bit logical operations expanders
>>
>>  (define_expand "and<mode>3"
>> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
>> -        (and:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
>> -                      (match_operand:BOOL_128 2 "vlogical_operand")))]
>> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
>> +        (and:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
>> +                        (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>>    ""
>>    "")
>>
>>  (define_expand "ior<mode>3"
>> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
>> -        (ior:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
>> -                      (match_operand:BOOL_128 2 "vlogical_operand")))]
>> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
>> +        (ior:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
>> +                        (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>>    ""
>>    "")
>>
>>  (define_expand "xor<mode>3"
>> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
>> -        (xor:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
>> -                      (match_operand:BOOL_128 2 "vlogical_operand")))]
>> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
>> +        (xor:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
>> +                        (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>>    ""
>>    "")
>>
>> @@ -7449,7 +7458,14 @@ (define_insn_and_split "*eqv<mode>3_internal2"
>>                   (const_string "16")))])
>>
>>  ;; 128-bit one's complement
>> -(define_insn_and_split "one_cmpl<mode>2"
>> +(define_expand "one_cmpl<mode>2"
>> +[(set (match_operand:BOOL_128_V 0 "vlogical_operand" "=")
>> +      (not:BOOL_128_V
>> +        (match_operand:BOOL_128_V 1 "vlogical_operand" "")))]
>> +  ""
>> +  "")
>> +
>> +(define_insn_and_split "*one_cmpl<mode>2"
>>    [(set (match_operand:BOOL_128 0 "vlogical_operand" "=")
>>          (not:BOOL_128
>>            (match_operand:BOOL_128 1 "vlogical_operand" "")))]
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100694.c b/gcc/testsuite/gcc.target/powerpc/pr100694.c
>> new file mode 100644
>> index 00000000000..96a895d6c44
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr100694.c
>> @@ -0,0 +1,14 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target int128 } */
>> +/* { dg-options "-O2" } */
>> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 3 } } */
>> +
>> +/* It just needs two std and one blr.  */
>> +void foo (unsigned __int128* res, unsigned long long hi, unsigned long long lo)
>> +{
>> +  unsigned __int128 i = hi;
>> +  i <<= 64;
>> +  i |= lo;
>> +  *res = i;
>> +}
>> +
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.c b/gcc/testsuite/gcc.target/powerpc/pr92398.c
>> new file mode 100644
>> index 00000000000..7d6201cc5bb
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr92398.c
>> @@ -0,0 +1,12 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target int128 } */
>> +/* { dg-options "-O2" } */
>> +/* { dg-final { scan-assembler-times {\mnot\M} 2 } } */
>> +/* { dg-final { scan-assembler-times {\mstd\M} 2 } } */
>> +
>> +/* All platforms should generate the same instructions: not;not;std;std.  */
>> +void bar (__int128_t *dst, __int128_t src)
>> +{
>> +  *dst = ~src;
>> +}
>> +
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.h b/gcc/testsuite/gcc.target/powerpc/pr92398.h
>> deleted file mode 100644
>> index 5a4a8bcab80..00000000000
>> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.h
>> +++ /dev/null
>> @@ -1,17 +0,0 @@
>> -/* This test code is included into pr92398.p9-.c and pr92398.p9+.c.
>> -   The two files have the tests for the number of instructions generated for
>> -   P9- versus P9+.
>> -
>> -   store generates difference instructions as below:
>> -   P9+: mtvsrdd;xxlnot;stxv.
>> -   P8/P7/P6 LE: not;not;std;std.
>> -   P8 BE: mtvsrd;mtvsrd;xxpermdi;xxlnor;stxvd2x.
>> -   P7/P6 BE: std;std;addi;lxvd2x;xxlnor;stxvd2x.
>> -   P9+ and P9- LE are expected, P6/P7/P8 BE are unexpected.  */
>> -
>> -void
>> -bar (__int128_t *dst, __int128_t src)
>> -{
>> -  *dst = ~src;
>> -}
>> -
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
>> deleted file mode 100644
>> index 72dd1d9a274..00000000000
>> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
>> +++ /dev/null
>> @@ -1,12 +0,0 @@
>> -/* { dg-do compile { target { lp64 && has_arch_pwr9 } } } */
>> -/* { dg-require-effective-target powerpc_vsx_ok } */
>> -/* { dg-options "-O2 -mvsx" } */
>> -
>> -/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
>> -/* { dg-final { scan-assembler-times {\mxxlnor\M} 1 } } */
>> -/* { dg-final { scan-assembler-times {\mstxv\M} 1 } } */
>> -/* { dg-final { scan-assembler-not {\mld\M} } } */
>> -/* { dg-final { scan-assembler-not {\mnot\M} } } */
>> -
>> -/* Source code for the test in pr92398.h */
>> -#include "pr92398.h"
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
>> deleted file mode 100644
>> index bd7fa98af51..00000000000
>> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
>> +++ /dev/null
>> @@ -1,10 +0,0 @@
>> -/* { dg-do compile { target { lp64 && {! has_arch_pwr9} } } } */
>> -/* { dg-require-effective-target powerpc_vsx_ok } */
>> -/* { dg-options "-O2 -mvsx" } */
>> -
>> -/* { dg-final { scan-assembler-times {\mnot\M} 2 { xfail be } } } */
>> -/* { dg-final { scan-assembler-times {\mstd\M} 2 { xfail { { {! has_arch_pwr9} && has_arch_pwr8 } && be } } } } */
>> -
>> -/* Source code for the test in pr92398.h */
>> -#include "pr92398.h"
>> -
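
For readers skimming the thread, the idea behind splitting TImode logical
operations can be sketched in plain C. This is only an illustration under
stated assumptions, not code from the patch: the names ti_pair, ti_split,
ti_join, ti_ior, ti_not and check_split are hypothetical, and the compiler is
assumed to support unsigned __int128 (the same requirement the test cases
express with the int128 effective target). It shows that a 128-bit IOR or NOT
gives the same result as applying the operation to each 64-bit half
independently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration type: a 128-bit value held as two 64-bit
   halves, similar in spirit to a TImode value living in a GPR pair.  */
typedef struct { uint64_t hi, lo; } ti_pair;

/* Break a 128-bit value into its high and low 64-bit halves.  */
static ti_pair ti_split (unsigned __int128 v)
{
  ti_pair p = { (uint64_t) (v >> 64), (uint64_t) v };
  return p;
}

/* Rebuild the 128-bit value from its halves.  */
static unsigned __int128 ti_join (ti_pair p)
{
  return ((unsigned __int128) p.hi << 64) | p.lo;
}

/* 128-bit IOR expressed as two independent 64-bit IORs.  */
static ti_pair ti_ior (ti_pair a, ti_pair b)
{
  ti_pair r = { a.hi | b.hi, a.lo | b.lo };
  return r;
}

/* 128-bit NOT expressed as two independent 64-bit NOTs.  */
static ti_pair ti_not (ti_pair a)
{
  ti_pair r = { ~a.hi, ~a.lo };
  return r;
}

/* Compare the half-wise results against native __int128 arithmetic.  */
static void check_split (unsigned __int128 a, unsigned __int128 b)
{
  assert (ti_join (ti_ior (ti_split (a), ti_split (b))) == (a | b));
  assert (ti_join (ti_not (ti_split (a))) == ~a);
}
```

Applied to the shape in pr100694.c, (hi << 64) | lo becomes {hi | 0, 0 | lo}
on halves, so each half is just one of the inputs. That is consistent with the
test's expectation of two std and one blr, although the sketch says nothing
about how the RTL passes actually reach that result.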