From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id B4C3E385B516 for ; Thu, 23 Mar 2023 08:10:35 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B4C3E385B516 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com Received: from pps.filterd (m0098410.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 32N7gPRJ001517; Thu, 23 Mar 2023 08:10:34 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=message-id : date : mime-version : subject : to : references : cc : from : in-reply-to : content-type : content-transfer-encoding; s=pp1; bh=KKu9jfRcjOvr1c94rQ+6qODqkyOYIls3TMvrVMoUrPo=; b=giwnPZoy1duR+dDUUobUBQmztume1z0dnCCdBcEbe1h8hONfASyWzAtflMMtah1ifYwx b/f9pn/UlUGZZHWvEr4gnCOkBVXSposFWg9fHxZxSMImnRupUD4pTAwHZRreTGkYHBvz 7rAJ89hy5TI+h49vNyycrxWcwTbdhP7arw5XyqAyU8M5USOk0IJsNq/uDMbltM8wyo5i r2eIzmKBUfLXekXM3XS643cDcS/R+Hi6s7QVWgB/f1FaG3TFLtpoJmcMqK1Ey3+r/UlY wvOz4XUXF383YfTuIX6IUKGDBaYPUiBLjQJeUBkEzfFF/tL5R0umKdQRapPT6Js6+j/m fw== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3pgjm1gq9q-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 23 Mar 2023 08:10:33 +0000 Received: from m0098410.ppops.net (m0098410.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 32N86vUj015290; Thu, 23 Mar 2023 08:10:33 GMT Received: from ppma03ams.nl.ibm.com (62.31.33a9.ip4.static.sl-reverse.com [169.51.49.98]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3pgjm1gq8k-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 23 Mar 2023 08:10:33 +0000 Received: from pps.filterd (ppma03ams.nl.ibm.com [127.0.0.1]) by ppma03ams.nl.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 32MHn3vs022403; Thu, 23 Mar 2023 08:10:31 GMT Received: from smtprelay07.fra02v.mail.ibm.com ([9.218.2.229]) by ppma03ams.nl.ibm.com (PPS) with ESMTPS id 3pd4x6f2e3-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Thu, 23 Mar 2023 08:10:31 +0000 Received: from smtpav01.fra02v.mail.ibm.com (smtpav01.fra02v.mail.ibm.com [10.20.54.100]) by smtprelay07.fra02v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 32N8ARPb29622798 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 23 Mar 2023 08:10:27 GMT Received: from smtpav01.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 4001A2004E; Thu, 23 Mar 2023 08:10:27 +0000 (GMT) Received: from smtpav01.fra02v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id BBB0420040; Thu, 23 Mar 2023 08:10:24 +0000 (GMT) Received: from [9.177.74.141] (unknown [9.177.74.141]) by smtpav01.fra02v.mail.ibm.com (Postfix) with ESMTP; Thu, 23 Mar 2023 08:10:24 +0000 (GMT) Message-ID: Date: Thu, 23 Mar 2023 16:10:22 +0800 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.6.1 Subject: Re: [PATCH] PR target/105325, Make load/cmp fusion know about prefixed loads Content-Language: en-US To: Michael Meissner References: Cc: gcc-patches@gcc.gnu.org, Segher Boessenkool , David Edelsohn , Peter Bergner , Will Schmidt From: "Kewen.Lin" In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: Tg0Jo2ZZ2wY8WUkwbnU8pMma3-n9h0M6 X-Proofpoint-GUID: 3gid7EpiijhwOr_045fUiS5U8roLy4iE X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-03-22_21,2023-03-22_01,2023-02-09_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 impostorscore=0 adultscore=0 mlxscore=0 spamscore=0 malwarescore=0 lowpriorityscore=0 phishscore=0 mlxlogscore=999 clxscore=1015 bulkscore=0 priorityscore=1501 suspectscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2303150002 definitions=main-2303230060 X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,GIT_PATCH_0,KAM_SHORT,NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Hi Mike, Thanks for fixing this, some minor comments are inlined below. on 2023/3/22 07:53, Michael Meissner wrote: > The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion > optimization generates illegal assembler code. > > Ultimately the code was dying because the fusion load + compare -1/0/1 patterns > did not handle the possibility that the load might be prefixed. > > The main cause is the constraints for the individual loads in the fusion did not > match the machine. In particular, LWA is a ds format instruction when it is > unprefixed. The code did not also set the prefixed attribute correctly. > > This patch rewrites the genfusion.pl script so that it will have more accurate > constraints for the LWA and LD instructions (which are DS instructions). The > updated genfusion.pl was then run to update fusion.md. Finally, the code for > the "prefixed" attribute is modified so that it considers load + compare > immediate patterns to be like the normal load insns in checking whether > operand[1] is a prefixed instruction. > > I have tested this patch on a little endian power10 system, on a little endian > power9 system, and a big endian power8 system (both -m32 and -m64 tested on > BE). There were no regressions, can I check this into the trunk? > > The same patch applies to the gcc-12 and gcc-11 branches. Can I check this > patch into those branches also after a burn-in period? > > 2023-03-21 Michael Meissner > Aaron Sawdey > > gcc/ > > PR target/105325 > * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation > of the ld and lwa instructions which use the DS encoding instead of D. > Use the YZ constraint for these loads. Handle prefixed loads better. > Set the sign_extend attribute as appropriate. > * gcc/config/rs6000/fusion.md: Regenerate. > * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi > instructions to the list of instructions that might have a prefixed load > instruction. > > gcc/testsuite/ > > PR target/105325 > * g++.target/powerpc/pr105325.C: New test. > * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts. > --- > gcc/config/rs6000/genfusion.pl | 26 ++++++++++++++++--- > gcc/config/rs6000/fusion.md | 17 +++++++----- > gcc/config/rs6000/rs6000.md | 2 +- > gcc/testsuite/g++.target/powerpc/pr105325.C | 24 +++++++++++++++++ > .../gcc.target/powerpc/fusion-p10-ldcmpi.c | 4 +-- > 5 files changed, 59 insertions(+), 14 deletions(-) > create mode 100644 gcc/testsuite/g++.target/powerpc/pr105325.C > > diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl > index e4db352e0ce..4f367cadc52 100755 > --- a/gcc/config/rs6000/genfusion.pl > +++ b/gcc/config/rs6000/genfusion.pl > @@ -56,7 +56,7 @@ sub mode_to_ldst_char > sub gen_ld_cmpi_p10 > { > my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred, > - $mempred, $ccmode, $np, $extend, $resultmode); > + $mempred, $ccmode, $np, $extend, $resultmode, $constraint); > LMODE: foreach $lmode ('DI','SI','HI','QI') { > $ldst = mode_to_ldst_char($lmode); > $clobbermode = $lmode; > @@ -71,21 +71,34 @@ sub gen_ld_cmpi_p10 > CCMODE: foreach $ccmode ('CC','CCUNS') { > $np = "NON_PREFIXED_D"; > $mempred = "non_update_memory_operand"; > + $constraint = "m"; The three assignments on $np $mempred $constraint can be moved to place (a) (see below) and add one explicit assignment for $constraint at place (b), since for the condition ccmode eq 'CC', HI/SI/DI have their own settings (btw QI is skipped), these assignments for default value can be moved to else arm (for CCUNS). > if ( $ccmode eq 'CC' ) { > next CCMODE if $lmode eq 'QI'; > - if ( $lmode eq 'DI' || $lmode eq 'SI' ) { > + if ( $lmode eq 'HI' ) { > + $np = "NON_PREFIXED_D"; > + $mempred = "non_update_memory_operand"; > + $echr = "a"; // place b $constraint = "m"; > + } elsif ( $lmode eq 'SI' ) { > + # ld and lwa are both DS-FORM. we have broken it into two different arms for SI and DI, this comment can be removed? > + $np = "NON_PREFIXED_DS"; > + $mempred = "lwa_operand"; > + $echr = "a"; > + $constraint = "YZ"; > + } elsif ( $lmode eq 'DI' ) { > # ld and lwa are both DS-FORM. ... and this comment. > $np = "NON_PREFIXED_DS"; > $mempred = "ds_form_mem_operand"; > + $echr = ""; > + $constraint = "YZ"; > } > $cmpl = ""; > - $echr = "a"; > $constpred = "const_m1_to_1_operand"; > } else { // place a > if ( $lmode eq 'DI' ) { > # ld is DS-form, but lwz is not. > $np = "NON_PREFIXED_DS"; > $mempred = "ds_form_mem_operand"; > + $constraint = "YZ"; > } > $cmpl = "l"; > $echr = "z"; > @@ -108,7 +121,7 @@ sub gen_ld_cmpi_p10 > > print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n"; > print " [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n"; > - print " (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"m\")\n"; > + print " (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"${constraint}\")\n"; > if ($ccmode eq 'CCUNS') { print " "; } > print " (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n"; > if ($result eq 'clobber') { > @@ -137,6 +150,11 @@ sub gen_ld_cmpi_p10 > print " \"\"\n"; > print " [(set_attr \"type\" \"fused_load_cmpi\")\n"; > print " (set_attr \"cost\" \"8\")\n"; > + > + if ($extend eq "sign") { > + print " (set_attr \"sign_extend\" \"yes\")\n"; > + } > + > print " (set_attr \"length\" \"8\")])\n"; > print "\n"; > } > diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md > index d45fb138a70..da9953d9ad9 100644 > --- a/gcc/config/rs6000/fusion.md > +++ b/gcc/config/rs6000/fusion.md > @@ -22,7 +22,7 @@ > ;; load mode is DI result mode is clobber compare mode is CC extend is none > (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_m1_to_1_operand" "n"))) > (clobber (match_scratch:DI 0 "=r"))] > "(TARGET_P10_FUSION)" > @@ -43,7 +43,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" > ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none > (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" > [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_0_to_1_operand" "n"))) > (clobber (match_scratch:DI 0 "=r"))] > "(TARGET_P10_FUSION)" > @@ -64,7 +64,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" > ;; load mode is DI result mode is DI compare mode is CC extend is none > (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_m1_to_1_operand" "n"))) > (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] > "(TARGET_P10_FUSION)" > @@ -85,7 +85,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" > ;; load mode is DI result mode is DI compare mode is CCUNS extend is none > (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" > [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") > - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") > + (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ") > (match_operand:DI 3 "const_0_to_1_operand" "n"))) > (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] > "(TARGET_P10_FUSION)" > @@ -106,7 +106,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" > ;; load mode is SI result mode is clobber compare mode is CC extend is none > (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") > (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > (clobber (match_scratch:SI 0 "=r"))] > "(TARGET_P10_FUSION)" > @@ -148,7 +148,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none" > ;; load mode is SI result mode is SI compare mode is CC extend is none > (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") > (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))] > "(TARGET_P10_FUSION)" > @@ -190,7 +190,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none" > ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign > (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign" > [(set (match_operand:CC 2 "cc_reg_operand" "=x") > - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") > + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") > (match_operand:SI 3 "const_m1_to_1_operand" "n"))) > (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))] > "(TARGET_P10_FUSION)" > @@ -205,6 +205,7 @@ (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign" > "" > [(set_attr "type" "fused_load_cmpi") > (set_attr "cost" "8") > + (set_attr "sign_extend" "yes") > (set_attr "length" "8")]) > > ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > @@ -247,6 +248,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign" > "" > [(set_attr "type" "fused_load_cmpi") > (set_attr "cost" "8") > + (set_attr "sign_extend" "yes") > (set_attr "length" "8")]) > > ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > @@ -289,6 +291,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign" > "" > [(set_attr "type" "fused_load_cmpi") > (set_attr "cost" "8") > + (set_attr "sign_extend" "yes") > (set_attr "length" "8")]) > > ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index 81bffb04ceb..0f809c3793f 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -302,7 +302,7 @@ (define_attr "prefixed" "no,yes" > (eq_attr "maybe_prefixed" "no")) > (const_string "no") > > - (eq_attr "type" "load,fpload,vecload") > + (eq_attr "type" "load,fpload,vecload,vecload,fused_load_cmpi") > (if_then_else (match_test "prefixed_load_p (insn)") > (const_string "yes") > (const_string "no")) > diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C b/gcc/testsuite/g++.target/powerpc/pr105325.C > new file mode 100644 > index 00000000000..f4ab384daa7 > --- /dev/null > +++ b/gcc/testsuite/g++.target/powerpc/pr105325.C > @@ -0,0 +1,24 @@ > +/* { dg-do assemble } */ > +/* { dg-require-effective-target lp64 } */ Nit: lp64 seems not necessary. The others look good to me, thanks! BR, Kewen