Message-ID: <8f03cc2c-d4cd-4251-991d-e790604e6d85@linux.ibm.com>
Date: Wed, 8 May 2024 09:52:00 +0800
Subject: Ping^3 [PATCH, rs6000] Split TImode for logical operations in expand pass [PR100694]
From: HAO CHEN GUI
To: gcc-patches
Cc: Segher Boessenkool, David, "Kewen.Lin", Peter Bergner
References: <740e9ed6-8730-1dec-ca78-a002df8d431a@linux.ibm.com> <0b9cb9c6-2003-facf-fa2c-998a81d1e26c@linux.ibm.com>
In-Reply-To: <0b9cb9c6-2003-facf-fa2c-998a81d1e26c@linux.ibm.com>

Hi,
  As it's stage 1 now, gently pinging this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html

Thanks
Gui Haochen

On 2023/4/24 13:35, HAO CHEN GUI wrote:
> Hi,
>   Gently ping this:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html
>
> Thanks
> Gui Haochen
>
> On 2023/2/20 10:10, HAO CHEN GUI wrote:
>> Hi,
>>   Gently ping this:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html
>>
>> Gui Haochen
>> Thanks
>>
>> On 2023/2/8 13:08, HAO CHEN GUI wrote:
>>> Hi,
>>>   The logical operations for TImode are currently split after the reload
>>> pass. Some potential optimizations are missed because the split happens too
>>> late. This patch removes TImode from the "AND", "IOR", "XOR" and "NOT"
>>> expanders so that these logical operations can be split at the expand pass.
>>> The new test case illustrates the optimization.
>>>
>>>   The two test cases of pr92398 are merged into one, as all sub-targets
>>> generate the same sequence of instructions with the patch.
>>>
>>>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
>>>
>>> Thanks
>>> Gui Haochen
>>>
>>>
>>> ChangeLog
>>> 2023-02-08  Haochen Gui
>>>
>>> gcc/
>>>     PR target/100694
>>>     * config/rs6000/rs6000.md (BOOL_128_V): New mode iterator for 128-bit
>>>     vector types.
>>>     (and<mode>3): Replace BOOL_128 with BOOL_128_V.
>>>     (ior<mode>3): Likewise.
>>>     (xor<mode>3): Likewise.
>>>     (one_cmpl<mode>2 expander): New expander with BOOL_128_V.
>>>     (one_cmpl<mode>2 insn_and_split): Rename to ...
>>>     (*one_cmpl<mode>2): ... this.
>>>
>>> gcc/testsuite/
>>>     PR target/100694
>>>     * gcc.target/powerpc/pr100694.c: New.
>>>     * gcc.target/powerpc/pr92398.c: New.
>>>     * gcc.target/powerpc/pr92398.h: Remove.
>>>     * gcc.target/powerpc/pr92398.p9-.c: Remove.
>>>     * gcc.target/powerpc/pr92398.p9+.c: Remove.
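As background on why an early split is possible at all: a 128-bit logical operation acts on every bit independently, so it can be computed as two independent 64-bit operations on the high and low halves. Below is a minimal C sketch of that identity. It is not code from the patch; the type u128_halves and all helper names are made up for illustration, and unsigned __int128 is assumed to be available (as it is with GCC on powerpc64 and x86_64).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration type: a 128-bit value held as two 64-bit
   halves, roughly how a TImode value looks once it has been split into
   two DImode values.  */
typedef struct
{
  uint64_t hi;
  uint64_t lo;
} u128_halves;

static u128_halves
split_u128 (unsigned __int128 v)
{
  u128_halves h = { (uint64_t) (v >> 64), (uint64_t) v };
  return h;
}

static unsigned __int128
join_u128 (u128_halves h)
{
  return ((unsigned __int128) h.hi << 64) | h.lo;
}

/* Each logical operation is applied to the two halves separately.  */
static u128_halves
and_halves (u128_halves a, u128_halves b)
{
  u128_halves r = { a.hi & b.hi, a.lo & b.lo };
  return r;
}

static u128_halves
ior_halves (u128_halves a, u128_halves b)
{
  u128_halves r = { a.hi | b.hi, a.lo | b.lo };
  return r;
}

static u128_halves
xor_halves (u128_halves a, u128_halves b)
{
  u128_halves r = { a.hi ^ b.hi, a.lo ^ b.lo };
  return r;
}

static u128_halves
not_halves (u128_halves a)
{
  u128_halves r = { ~a.hi, ~a.lo };
  return r;
}

/* Check that the half-wise results equal the full 128-bit results.  */
static void
check_halfwise_ops (unsigned __int128 a, unsigned __int128 b)
{
  u128_halves ha = split_u128 (a);
  u128_halves hb = split_u128 (b);

  assert (join_u128 (and_halves (ha, hb)) == (a & b));
  assert (join_u128 (ior_halves (ha, hb)) == (a | b));
  assert (join_u128 (xor_halves (ha, hb)) == (a ^ b));
  assert (join_u128 (not_halves (ha)) == ~a);
}
```

Calling check_halfwise_ops with any pair of values should pass. Once the halves are visible as separate 64-bit values early on, the usual RTL optimizations can simplify each half independently, which is the opportunity the description above refers to.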
>>>
>>>
>>> patch.diff
>>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
>>> index 4bd1dfd3da9..455b7329643 100644
>>> --- a/gcc/config/rs6000/rs6000.md
>>> +++ b/gcc/config/rs6000/rs6000.md
>>> @@ -743,6 +743,15 @@ (define_mode_iterator BOOL_128 [TI
>>>                                   (V2DF "TARGET_ALTIVEC")
>>>                                   (V1TI "TARGET_ALTIVEC")])
>>>
>>> +;; Mode iterator for logical operations on 128-bit vector types
>>> +(define_mode_iterator BOOL_128_V [(V16QI "TARGET_ALTIVEC")
>>> +                                  (V8HI "TARGET_ALTIVEC")
>>> +                                  (V4SI "TARGET_ALTIVEC")
>>> +                                  (V4SF "TARGET_ALTIVEC")
>>> +                                  (V2DI "TARGET_ALTIVEC")
>>> +                                  (V2DF "TARGET_ALTIVEC")
>>> +                                  (V1TI "TARGET_ALTIVEC")])
>>> +
>>>  ;; For the GPRs we use 3 constraints for register outputs, two that are the
>>>  ;; same as the output register, and a third where the output register is an
>>>  ;; early clobber, so we don't have to deal with register overlaps. For the
>>> @@ -7135,23 +7144,23 @@ (define_expand "subti3"
>>>  ;; 128-bit logical operations expanders
>>>
>>>  (define_expand "and<mode>3"
>>> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
>>> -        (and:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
>>> -                      (match_operand:BOOL_128 2 "vlogical_operand")))]
>>> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
>>> +        (and:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
>>> +                        (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>>>    ""
>>>    "")
>>>
>>>  (define_expand "ior<mode>3"
>>> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
>>> -        (ior:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
>>> -                      (match_operand:BOOL_128 2 "vlogical_operand")))]
>>> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
>>> +        (ior:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
>>> +                        (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>>>    ""
>>>    "")
>>>
>>>  (define_expand "xor<mode>3"
>>> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
>>> -        (xor:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
>>> -                      (match_operand:BOOL_128 2 "vlogical_operand")))]
>>> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
>>> +        (xor:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
>>> +                        (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>>>    ""
>>>    "")
>>>
>>> @@ -7449,7 +7458,14 @@ (define_insn_and_split "*eqv<mode>3_internal2"
>>>           (const_string "16")))])
>>>
>>>  ;; 128-bit one's complement
>>> -(define_insn_and_split "one_cmpl<mode>2"
>>> +(define_expand "one_cmpl<mode>2"
>>> +[(set (match_operand:BOOL_128_V 0 "vlogical_operand" "=<BOOL_REGS_OUTPUT>")
>>> +        (not:BOOL_128_V
>>> +          (match_operand:BOOL_128_V 1 "vlogical_operand" "<BOOL_REGS_UNARY>")))]
>>> +  ""
>>> +  "")
>>> +
>>> +(define_insn_and_split "*one_cmpl<mode>2"
>>>    [(set (match_operand:BOOL_128 0 "vlogical_operand" "=<BOOL_REGS_OUTPUT>")
>>>          (not:BOOL_128
>>>            (match_operand:BOOL_128 1 "vlogical_operand" "<BOOL_REGS_UNARY>")))]
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100694.c b/gcc/testsuite/gcc.target/powerpc/pr100694.c
>>> new file mode 100644
>>> index 00000000000..96a895d6c44
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr100694.c
>>> @@ -0,0 +1,14 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target int128 } */
>>> +/* { dg-options "-O2" } */
>>> +/* { dg-final { scan-assembler-times {(?n)^\s+[a-z]} 3 } } */
>>> +
>>> +/* It just needs two std and one blr.  */
>>> +void foo (unsigned __int128* res, unsigned long long hi, unsigned long long lo)
>>> +{
>>> +  unsigned __int128 i = hi;
>>> +  i <<= 64;
>>> +  i |= lo;
>>> +  *res = i;
>>> +}
>>> +
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.c b/gcc/testsuite/gcc.target/powerpc/pr92398.c
>>> new file mode 100644
>>> index 00000000000..7d6201cc5bb
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr92398.c
>>> @@ -0,0 +1,12 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target int128 } */
>>> +/* { dg-options "-O2" } */
>>> +/* { dg-final { scan-assembler-times {\mnot\M} 2 } } */
>>> +/* { dg-final { scan-assembler-times {\mstd\M} 2 } } */
>>> +
>>> +/* All platforms should generate the same instructions: not;not;std;std.  */
>>> +void bar (__int128_t *dst, __int128_t src)
>>> +{
>>> +  *dst = ~src;
>>> +}
>>> +
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.h b/gcc/testsuite/gcc.target/powerpc/pr92398.h
>>> deleted file mode 100644
>>> index 5a4a8bcab80..00000000000
>>> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.h
>>> +++ /dev/null
>>> @@ -1,17 +0,0 @@
>>> -/* This test code is included into pr92398.p9-.c and pr92398.p9+.c.
>>> -   The two files have the tests for the number of instructions generated for
>>> -   P9- versus P9+.
>>> -
>>> -   store generates difference instructions as below:
>>> -   P9+: mtvsrdd;xxlnot;stxv.
>>> -   P8/P7/P6 LE: not;not;std;std.
>>> -   P8 BE: mtvsrd;mtvsrd;xxpermdi;xxlnor;stxvd2x.
>>> -   P7/P6 BE: std;std;addi;lxvd2x;xxlnor;stxvd2x.
>>> -   P9+ and P9- LE are expected, P6/P7/P8 BE are unexpected.  */
>>> -
>>> -void
>>> -bar (__int128_t *dst, __int128_t src)
>>> -{
>>> -  *dst = ~src;
>>> -}
>>> -
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
>>> deleted file mode 100644
>>> index 72dd1d9a274..00000000000
>>> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9+.c
>>> +++ /dev/null
>>> @@ -1,12 +0,0 @@
>>> -/* { dg-do compile { target { lp64 && has_arch_pwr9 } } } */
>>> -/* { dg-require-effective-target powerpc_vsx_ok } */
>>> -/* { dg-options "-O2 -mvsx" } */
>>> -
>>> -/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 } } */
>>> -/* { dg-final { scan-assembler-times {\mxxlnor\M} 1 } } */
>>> -/* { dg-final { scan-assembler-times {\mstxv\M} 1 } } */
>>> -/* { dg-final { scan-assembler-not {\mld\M} } } */
>>> -/* { dg-final { scan-assembler-not {\mnot\M} } } */
>>> -
>>> -/* Source code for the test in pr92398.h */
>>> -#include "pr92398.h"
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c b/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
>>> deleted file mode 100644
>>> index bd7fa98af51..00000000000
>>> --- a/gcc/testsuite/gcc.target/powerpc/pr92398.p9-.c
>>> +++ /dev/null
>>> @@ -1,10 +0,0 @@
>>> -/* { dg-do compile { target { lp64 && {! has_arch_pwr9} } } } */
>>> -/* { dg-require-effective-target powerpc_vsx_ok } */
>>> -/* { dg-options "-O2 -mvsx" } */
>>> -
>>> -/* { dg-final { scan-assembler-times {\mnot\M} 2 { xfail be } } } */
>>> -/* { dg-final { scan-assembler-times {\mstd\M} 2 { xfail { { {! has_arch_pwr9} && has_arch_pwr8 } && be } } } } */
>>> -
>>> -/* Source code for the test in pr92398.h */
>>> -#include "pr92398.h"
>>> -
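
For reference, the reasoning behind "two std and one blr" in pr100694.c: after the shift, the low half of i is zero and the high half is hi, so the IOR with lo only places lo into the low half. Once the IOR is handled as two 64-bit operations, each half is a plain copy of an incoming argument. The C sketch below spells this out. foo_two_stores is a hypothetical hand-written equivalent, not compiler output, and it assumes the GCC predefined macros __BYTE_ORDER__ and __ORDER_LITTLE_ENDIAN__ to choose the memory order of the two halves.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Same computation as foo in pr100694.c.  */
static void
foo_ref (unsigned __int128 *res, unsigned long long hi, unsigned long long lo)
{
  unsigned __int128 i = hi;
  i <<= 64;
  i |= lo;
  *res = i;
}

/* Hypothetical hand-split version: no shift and no OR, only two 64-bit
   stores.  The half that comes first in memory depends on endianness.  */
static void
foo_two_stores (unsigned __int128 *res, uint64_t hi, uint64_t lo)
{
  uint64_t halves[2];
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
  halves[0] = lo;
  halves[1] = hi;
#else
  halves[0] = hi;
  halves[1] = lo;
#endif
  memcpy (res, halves, sizeof halves);
}

/* Check that both versions store the same 16 bytes.  */
static void
check_two_stores (uint64_t hi, uint64_t lo)
{
  unsigned __int128 a, b;

  foo_ref (&a, hi, lo);
  foo_two_stores (&b, hi, lo);
  assert (memcmp (&a, &b, sizeof a) == 0);
}
```

check_two_stores should pass for any hi and lo. This shows only that the source-level computation reduces to two stores; whether the compiler actually emits just two std and one blr is what the scan-assembler-times directive in the test verifies.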