From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 8F3163858C5E; Wed, 15 Nov 2023 03:02:48 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8F3163858C5E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 8F3163858C5E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=148.163.156.1 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700017370; cv=none; b=Fbs5/HQ0xDp8nVqxhI/wExA4qlzqEbTcFVCadLlfzuaGkh5rIFrCo3tvNRU/Q7yygrOEsfjYrsOeiMFyzN7cvOgDA6aKAlN9eDVdT/GNr4ydAJr+lj9OA6wrRakosqWD1cx+39tvQFjpVvQVL1JU4JNdlQwGlAm3ZjGDteh015o= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1700017370; c=relaxed/simple; bh=mL4qfsGonjw/6k1oEmHRE3xFeA5hmbYHjmw4fveJWic=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=BJADIiGt5JhiCIBd3Azjs2hJ+cyh1YwZv6MTztwo54Qr+MkWluFJVgtYl2UCXTjNXcReMJXgXAre9b1R7At5RlXX+O8gD2YnHvuInOUWYB2WLOJmSwoutVaHrgrYBEPkK6JLD1pSry9tA5wW2jWgxEUcGoG+4B80xq45o/70TB8= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from pps.filterd (m0353726.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3AF32MSc011466; Wed, 15 Nov 2023 03:02:47 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : content-transfer-encoding : mime-version; s=pp1; bh=3eBiHhdI3ZFsoFpnl2RNqVbjD99sJ9dcge9cyY9lCqY=; b=piOPtd/J4Rem0mm5ejC5rQT319lQ0ytXYE+ovGL0C2SGBHP4R0qXzggWLzUj4NKD7DQp ZMRQ7THnGZuABp5HGonOWJqeaoSQStzRmHIPsq5NO1L4LMz37zWD/M3xhE05LdtlT4MJ sSG0ZFpp7VM6VjoKedMi32Q0j3f99TmdUaqj9zxioiqcrCMyKuvAtM1FJXsWdbiAtNq8 d23QbWNXQzywd8SUDr17r94aLYcwktsJO8SlDd436zl6BVbgu4XyemunqHAM80+u6S4u RqoeEnUqOEgZFlwtyzZrBycyyafbxl/iTfxPQsRCiGLIZQ7vHb91hDaZMqfF3sJ6t20p ug== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3ucnqtg0py-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 15 Nov 2023 03:02:47 +0000 Received: from m0353726.ppops.net (m0353726.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 3AF32DLa011175; Wed, 15 Nov 2023 03:02:46 GMT Received: from ppma13.dal12v.mail.ibm.com (dd.9e.1632.ip4.static.sl-reverse.com [50.22.158.221]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3ucnqtg0pj-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 15 Nov 2023 03:02:46 +0000 Received: from pps.filterd (ppma13.dal12v.mail.ibm.com [127.0.0.1]) by ppma13.dal12v.mail.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 3AF1Wowu026532; Wed, 15 Nov 2023 03:02:45 GMT Received: from smtprelay04.fra02v.mail.ibm.com ([9.218.2.228]) by ppma13.dal12v.mail.ibm.com (PPS) with ESMTPS id 3uap5k40d4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 15 Nov 2023 03:02:45 +0000 Received: from smtpav03.fra02v.mail.ibm.com (smtpav03.fra02v.mail.ibm.com [10.20.54.102]) by smtprelay04.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 3AF32hEK47120702 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 15 Nov 2023 03:02:43 GMT Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 004A42004E; Wed, 15 Nov 2023 03:02:43 +0000 (GMT) Received: from smtpav03.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id B1DA320040; Wed, 15 Nov 2023 03:02:41 +0000 (GMT) Received: from genoa.aus.stglabs.ibm.com (unknown [9.40.192.157]) by smtpav03.fra02v.mail.ibm.com (Postfix) with ESMTP; Wed, 15 Nov 2023 03:02:41 +0000 (GMT) From: Jiufu Guo To: gcc-patches@gcc.gnu.org Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org, bergner@linux.ibm.com, guojiufu@linux.ibm.com Subject: [PATCH V2 3/3] split complicate constant to memory Date: Wed, 15 Nov 2023 11:02:37 +0800 Message-Id: <20231115030237.1188073-3-guojiufu@linux.ibm.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20231115030237.1188073-1-guojiufu@linux.ibm.com> References: <20231115030237.1188073-1-guojiufu@linux.ibm.com> X-TM-AS-GCONF: 00 X-Proofpoint-GUID: RC6m_lfBoBRLdF8xIf0hdc1DnaXuYvu7 X-Proofpoint-ORIG-GUID: workAyBZVqHK6arObs9twkyj88s7g4_b Content-Transfer-Encoding: 8bit X-Proofpoint-UnRewURL: 0 URL was un-rewritten MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.987,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-11-15_01,2023-11-14_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 adultscore=0 lowpriorityscore=0 suspectscore=0 mlxscore=0 mlxlogscore=999 clxscore=1011 impostorscore=0 spamscore=0 bulkscore=0 priorityscore=1501 phishscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2311060000 definitions=main-2311150024 X-Spam-Status: No, score=-10.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,KAM_STOCKGEN,RCVD_IN_MSPIKE_H4,RCVD_IN_MSPIKE_WL,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi, Sometimes, a complicated constant is built via 3(or more) instructions to build. Generally speaking, it would not be as faster as loading it from the constant pool (as a few discussions in PR63281): * "ld" is one instruction. If consider "address/toc" adjust, we may count it as 2 instructions (the high part of address computation could be optimized as nop by linker further). And "pld" may need less cycles. * As testing(SPEC2017), it could get better/stable runtime if set the threshold as "> 2" (compare with "> 3"). As tested on spec2017, for visible performance changes, we can find the runtime improvement on 500.perlbench_r about ~1.8% (-O2, P10) with the patch. And for performance downgrade on other benchmarks, as investigation, the recessions are not caused by this patch. Compare with previous version: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634197.html This verion updates commit message. Boostrap & regtest pass on ppc64{,le}. Is this ok for trunk? BR, Jeff (Jiufu Guo) PR target/63281 gcc/ChangeLog: * config/rs6000/rs6000.cc (rs6000_emit_set_const): Update to split complicate constant to memory. gcc/testsuite/ChangeLog: * gcc.target/powerpc/const_anchors.c: Update to test final-rtl. * gcc.target/powerpc/parall_5insn_const.c: Update to keep original test point. * gcc.target/powerpc/pr106550.c: Likewise.. * gcc.target/powerpc/pr106550_1.c: Likewise. * gcc.target/powerpc/pr87870.c: Update according to latest behavior. * gcc.target/powerpc/pr93012.c: Likewise. --- gcc/config/rs6000/rs6000.cc | 16 ++++++++++++++++ .../gcc.target/powerpc/const_anchors.c | 5 ++--- .../gcc.target/powerpc/parall_5insn_const.c | 14 ++++++++++++-- gcc/testsuite/gcc.target/powerpc/pr106550.c | 17 +++++++++++++++-- gcc/testsuite/gcc.target/powerpc/pr106550_1.c | 15 +++++++++++++-- gcc/testsuite/gcc.target/powerpc/pr87870.c | 5 ++++- gcc/testsuite/gcc.target/powerpc/pr93012.c | 4 +++- 7 files changed, 65 insertions(+), 11 deletions(-) diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc index b277c52687b..c878e1030ea 100644 --- a/gcc/config/rs6000/rs6000.cc +++ b/gcc/config/rs6000/rs6000.cc @@ -10271,6 +10271,22 @@ rs6000_emit_set_const (rtx dest, rtx source) c = sext_hwi (c, 32); emit_move_insn (lo, GEN_INT (c)); } + + /* If it can be stored to the constant pool and profitable. */ + else if (base_reg_operand (dest, mode) + && num_insns_constant (source, mode) > 2) + { + rtx sym = force_const_mem (mode, source); + if (TARGET_TOC && SYMBOL_REF_P (XEXP (sym, 0)) + && use_toc_relative_ref (XEXP (sym, 0), mode)) + { + rtx toc = create_TOC_reference (XEXP (sym, 0), copy_rtx (dest)); + sym = gen_const_mem (mode, toc); + set_mem_alias_set (sym, get_TOC_alias_set ()); + } + + emit_insn (gen_rtx_SET (dest, sym)); + } else rs6000_emit_set_long_const (dest, c, NULL); break; diff --git a/gcc/testsuite/gcc.target/powerpc/const_anchors.c b/gcc/testsuite/gcc.target/powerpc/const_anchors.c index 542e2674b12..188744165f2 100644 --- a/gcc/testsuite/gcc.target/powerpc/const_anchors.c +++ b/gcc/testsuite/gcc.target/powerpc/const_anchors.c @@ -1,5 +1,5 @@ /* { dg-do compile { target has_arch_ppc64 } } */ -/* { dg-options "-O2" } */ +/* { dg-options "-O2 -fdump-rtl-final" } */ #define C1 0x2351847027482577ULL #define C2 0x2351847027482578ULL @@ -16,5 +16,4 @@ void __attribute__ ((noinline)) foo1 (long long *a, long long b) if (b) *a++ = C2; } - -/* { dg-final { scan-assembler-times {\maddi\M} 2 } } */ +/* { dg-final { scan-rtl-dump-times {\madddi3\M} 2 "final" } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c index e3a9a7264cf..df0690b90be 100644 --- a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c +++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c @@ -9,8 +9,18 @@ void __attribute__ ((noinline)) foo (unsigned long long *a) { /* 2 lis + 2 ori + 1 rldimi for each constant. */ - *a++ = 0x800aabcdc167fa16ULL; - *a++ = 0x7543a876867f616ULL; + { + register long long d asm("r0") = 0x800aabcdc167fa16ULL; + long long n; + asm("mr %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } + { + register long long d asm("r0") = 0x7543a876867f616ULL; + long long n; + asm("mr %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } } long long A[] = {0x800aabcdc167fa16ULL, 0x7543a876867f616ULL}; diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550.c b/gcc/testsuite/gcc.target/powerpc/pr106550.c index 74e395331ab..5eca2b2f701 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr106550.c +++ b/gcc/testsuite/gcc.target/powerpc/pr106550.c @@ -1,12 +1,25 @@ /* PR target/106550 */ /* { dg-options "-O2 -mdejagnu-cpu=power10" } */ /* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target has_arch_ppc64 } */ void foo (unsigned long long *a) { - *a++ = 0x020805006106003; /* pli+pli+rldimi */ - *a++ = 0x2351847027482577;/* pli+pli+rldimi */ + { + /* pli+pli+rldimi */ + register long long d asm("r0") = 0x020805006106003ULL; + long long n; + asm("mr %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } + { + /* pli+pli+rldimi */ + register long long d asm("r0") = 0x2351847027482577ULL; + long long n; + asm("mr %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } } /* { dg-final { scan-assembler-times {\mpli\M} 4 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr106550_1.c b/gcc/testsuite/gcc.target/powerpc/pr106550_1.c index 7e709fcf9d8..11878d893a4 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr106550_1.c +++ b/gcc/testsuite/gcc.target/powerpc/pr106550_1.c @@ -12,8 +12,19 @@ foo (unsigned long long *a) asm("cntlzd %0, %1" : "=r"(n) : "r"(d)); *a++ = n; - *a++ = 0x235a8470a7480000ULL; /* pli+sldi+oris */ - *a++ = 0x23a184700000b677ULL; /* pli+sldi+ori */ + { + register long long d asm("r0") = 0x235a8470a7480000ULL; /* pli+sldi+oris */ + long long n; + asm("cntlzd %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } + + { + register long long d asm("r0") = 0x23a184700000b677ULL; /* pli+sldi+ori */ + long long n; + asm("cntlzd %0, %1" : "=r"(n) : "r"(d)); + *a++ = n; + } } /* { dg-final { scan-assembler-times {\mpli\M} 3 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr87870.c b/gcc/testsuite/gcc.target/powerpc/pr87870.c index d2108ac3386..5fee06744ae 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr87870.c +++ b/gcc/testsuite/gcc.target/powerpc/pr87870.c @@ -25,4 +25,7 @@ test3 (void) return ((__int128)0xdeadbeefcafebabe << 64) | 0xfacefeedbaaaaaad; } -/* { dg-final { scan-assembler-not {\mld\M} } } */ +/* test3 using "ld" to load the value for r3 and r4. + test0, test1 and test2 are using "li". */ +/* { dg-final { scan-assembler-times {\mp?ld\M} 2 } } */ +/* { dg-final { scan-assembler-times {\mli\M} 6 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/pr93012.c b/gcc/testsuite/gcc.target/powerpc/pr93012.c index a07ff764bbf..ef0f8fabcc6 100644 --- a/gcc/testsuite/gcc.target/powerpc/pr93012.c +++ b/gcc/testsuite/gcc.target/powerpc/pr93012.c @@ -11,4 +11,6 @@ unsigned long long mskl1() { return 0x2bcdffff2bcdffffULL; } unsigned long long mskse() { return 0xffff1234ffff1234ULL; } /* { dg-final { scan-assembler-times {\mpli\M} 4 { target has_arch_pwr10 }} } */ -/* { dg-final { scan-assembler-times {\mrldimi\M} 7 } } */ +/* { dg-final { scan-assembler-times {\mrldimi\M} 7 { target has_arch_pwr10 } } } */ +/* { dg-final { scan-assembler-times {\mrldimi\M} 3 { target { ! has_arch_pwr10 } } } } */ +/* { dg-final { scan-assembler-times {\mld\M} 4 { target { ! has_arch_pwr10 } } } } */ -- 2.25.1