Date: Mon, 20 Feb 2023 10:10:14 +0800
Subject: Ping [PATCH, rs6000] Split TImode for logical operations in expand pass [PR100694]
From: HAO CHEN GUI
To: gcc-patches
Cc: Segher Boessenkool, David, "Kewen.Lin", Peter Bergner
References: <740e9ed6-8730-1dec-ca78-a002df8d431a@linux.ibm.com>
In-Reply-To: <740e9ed6-8730-1dec-ca78-a002df8d431a@linux.ibm.com>

Hi,
  Gentle ping on this patch:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html

Thanks
Gui Haochen

On 2023/2/8 13:08, HAO CHEN GUI wrote:
> Hi,
>   The logical operations for TImode are currently split after the reload
> pass. Some potential optimizations are missed because the split happens too
> late. This patch removes TImode from the "AND", "IOR", "XOR" and "NOT"
> expanders so that these logical operations can be split at the expand pass.
> The new test case illustrates the optimization.
>
>   The two test cases of pr92398 are merged into one, as all sub-targets
> generate the same sequence of instructions with the patch.
>
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
>
> Thanks
> Gui Haochen
>
>
> ChangeLog
> 2023-02-08  Haochen Gui
>
> gcc/
> 	PR target/100694
> 	* config/rs6000/rs6000.md (BOOL_128_V): New mode iterator for 128-bit
> 	vector types.
> 	(and<mode>3): Replace BOOL_128 with BOOL_128_V.
> 	(ior<mode>3): Likewise.
> 	(xor<mode>3): Likewise.
> 	(one_cmpl<mode>2 expander): New expander with BOOL_128_V.
> 	(one_cmpl<mode>2 insn_and_split): Rename to ...
> 	(*one_cmpl<mode>2): ... this.
>
> gcc/testsuite/
> 	PR target/100694
> 	* gcc.target/powerpc/pr100694.c: New.
> 	* gcc.target/powerpc/pr92398.c: New.
> 	* gcc.target/powerpc/pr92398.h: Remove.
> 	* gcc.target/powerpc/pr92398.p9-.c: Remove.
> 	* gcc.target/powerpc/pr92398.p9+.c: Remove.
>
>
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 4bd1dfd3da9..455b7329643 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -743,6 +743,15 @@ (define_mode_iterator BOOL_128 [TI
>                                   (V2DF "TARGET_ALTIVEC")
>                                   (V1TI "TARGET_ALTIVEC")])
>
> +;; Mode iterator for logical operations on 128-bit vector types
> +(define_mode_iterator BOOL_128_V [(V16QI "TARGET_ALTIVEC")
> +                                  (V8HI "TARGET_ALTIVEC")
> +                                  (V4SI "TARGET_ALTIVEC")
> +                                  (V4SF "TARGET_ALTIVEC")
> +                                  (V2DI "TARGET_ALTIVEC")
> +                                  (V2DF "TARGET_ALTIVEC")
> +                                  (V1TI "TARGET_ALTIVEC")])
> +
>  ;; For the GPRs we use 3 constraints for register outputs, two that are the
>  ;; same as the output register, and a third where the output register is an
>  ;; early clobber, so we don't have to deal with register overlaps.  For the
> @@ -7135,23 +7144,23 @@ (define_expand "subti3"
>  ;; 128-bit logical operations expanders
>
>  (define_expand "and<mode>3"
> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
> -        (and:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
> -                      (match_operand:BOOL_128 2 "vlogical_operand")))]
> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
> +        (and:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
> +                        (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>    ""
>    "")
>
>  (define_expand "ior<mode>3"
> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
> -        (ior:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
> -                      (match_operand:BOOL_128 2 "vlogical_operand")))]
> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
> +        (ior:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
> +                        (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>    ""
>    "")
>
>  (define_expand "xor<mode>3"
> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
> -        (xor:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
> -                      (match_operand:BOOL_128 2 "vlogical_operand")))]
> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
> +        (xor:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
> +                        (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>    ""
>    "")
>
> @@ -7449,7 +7458,14 @@ (define_insn_and_split "*eqv<mode>3_internal2"
>                   (const_string "16")))])
>
>  ;; 128-bit one's complement
> -(define_insn_and_split "one_cmpl<mode>2"
> +(define_expand "one_cmpl<mode>2"
> +[(set (match_operand:BOOL_128_V 0 "vlogical_operand" "=")
> +      (not:BOOL_128_V
> +        (match_operand:BOOL_128_V 1 "vlogical_operand" "")))]
> +  ""
> +  "")
> +
> +(define_insn_and_split "*one_cmpl<mode>2"
>    [(set (match_operand:BOOL_128 0 "vlogical_operand" "=")
>          (not:BOOL_128
>            (match_operand:BOOL_128 1 "vlogical_operand" "")))]
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100694.c b/gcc/testsuite/gcc.target/powerpc/pr100694.c
> new file mode 100644
> index 00000000000..96a895d6c44
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100694.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 3 } } */
> +
> +/* It just needs two std and one blr.  */
> +void foo (unsigned __int128* res, unsigned long long hi, unsigned long long lo)
> +{
> +   unsigned __int128 i = hi;
> +   i <<= 64;
> +   i |= lo;
> +   *res = i;
> +}
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.c b/gcc/testsuite/gcc.target/powerpc/pr92398.c
> new file mode 100644
> index 00000000000..7d6201cc5bb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr92398.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { scan-assembler-times {\mnot\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mstd\M} 2 } } */
> +
> +/* All platforms should generate the same instructions: not;not;std;std.  */
> +void bar (__int128_t *dst, __int128_t src)
> +{
> +  *dst = ~src;
> +}
> +
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.h b/gcc/testsuite/gcc.target/powerpc/pr92398.h
> deleted file mode 100644
> index 5a4a8bcab80..00000000000
> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.h
> +++ /dev/null
> @@ -1,17 +0,0 @@
> -/* This test code is included into pr92398.p9-.c and pr92398.p9+.c.
> -   The two files have the tests for the number of instructions generated for
> -   P9- versus P9+.
> -
> -   store generates difference instructions as below:
> -   P9+: mtvsrdd;xxlnot;stxv.
> -   P8/P7/P6 LE: not;not;std;std.
> -   P8 BE: mtvsrd;mtvsrd;xxpermdi;xxlnor;stxvd2x.
> -   P7/P6 BE: std;std;addi;lxvd2x;xxlnor;stxvd2x.
> -   P9+ and P9- LE are expected, P6/P7/P8 BE are unexpected.  */
> -
> -void
> -bar (__int128_t *dst, __int128_t src)
> -{
> -  *dst = ~src;
> -}
> -
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
> deleted file mode 100644
> index 72dd1d9a274..00000000000
> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
> +++ /dev/null
> @@ -1,12 +0,0 @@
> -/* { dg-do compile { target { lp64 && has_arch_pwr9 } } } */
> -/* { dg-require-effective-target powerpc_vsx_ok } */
> -/* { dg-options "-O2 -mvsx" } */
> -
> -/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
> -/* { dg-final { scan-assembler-times {\mxxlnor\M} 1 } } */
> -/* { dg-final { scan-assembler-times {\mstxv\M} 1 } } */
> -/* { dg-final { scan-assembler-not {\mld\M} } } */
> -/* { dg-final { scan-assembler-not {\mnot\M} } } */
> -
> -/* Source code for the test in pr92398.h */
> -#include "pr92398.h"
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
> deleted file mode 100644
> index bd7fa98af51..00000000000
> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
> +++ /dev/null
> @@ -1,10 +0,0 @@
> -/* { dg-do compile { target { lp64 && {! has_arch_pwr9} } } } */
> -/* { dg-require-effective-target powerpc_vsx_ok } */
> -/* { dg-options "-O2 -mvsx" } */
> -
> -/* { dg-final { scan-assembler-times {\mnot\M} 2 { xfail be } } } */
> -/* { dg-final { scan-assembler-times {\mstd\M} 2 { xfail { { {! has_arch_pwr9} && has_arch_pwr8 } && be } } } } */
> -
> -/* Source code for the test in pr92398.h */
> -#include "pr92398.h"
> -
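
For readers following the thread, here is a rough C model of why splitting TImode logical operations early helps. It is only an illustrative sketch and not code from the patch: the type `ti_halves` and the helpers `ti_ior`, `ti_not` and `ti_combine` are hypothetical names, and the model assumes a TImode value held as two 64-bit halves in a pair of GPRs.

```c
#include <stdint.h>

/* Hypothetical model, not from the patch: a TImode value seen as the two
   64-bit halves that a pair of GPRs would hold.  */
struct ti_halves
{
  uint64_t hi;
  uint64_t lo;
};

/* IOR on a TImode value, split into two independent DImode IORs.  */
static struct ti_halves
ti_ior (struct ti_halves a, struct ti_halves b)
{
  struct ti_halves r = { a.hi | b.hi, a.lo | b.lo };
  return r;
}

/* One's complement, split the same way (two "not" operations).  */
static struct ti_halves
ti_not (struct ti_halves a)
{
  struct ti_halves r = { ~a.hi, ~a.lo };
  return r;
}

/* Models foo from pr100694.c: (hi << 64) | lo.  The shifted value has a
   zero low half and the zero-extended lo has a zero high half, so each
   half-wide IOR has a zero operand and reduces to a plain copy.  */
static struct ti_halves
ti_combine (uint64_t hi, uint64_t lo)
{
  struct ti_halves shifted = { hi, 0 };
  struct ti_halves extended = { 0, lo };
  return ti_ior (shifted, extended);
}
```

Once the operation is expressed per half, each 64-bit IOR in `ti_combine` has a known-zero operand, which is the kind of simplification the earlier split is meant to expose. That is consistent with the new pr100694.c test expecting only two std and one blr.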