From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx0a-001b2d01.pphosted.com (mx0a-001b2d01.pphosted.com [148.163.156.1]) by sourceware.org (Postfix) with ESMTPS id 2A95D3858D28 for ; Wed, 3 May 2023 16:56:08 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2A95D3858D28 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.ibm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=linux.ibm.com Received: from pps.filterd (m0353727.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 343Gu2ZP020594; Wed, 3 May 2023 16:56:07 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ibm.com; h=date : from : to : cc : subject : message-id : references : mime-version : content-type : in-reply-to; s=pp1; bh=NIiI0nmVhHCLa8AI9gO13fwxVHmaNWWmm2EV4kDhO2o=; b=Jm4nkqZAG6+8e/GLs0qhUz5Hq+W3ONPWLewK8FnlsRDN1uOV7XGIJH/0sqOj8N2hua2A UpVKbPQGUc1oJ+suIY0KHEyOrJh+EgUqs62NxxVwM16v6QiKvCt0DYAfLLxAx3SelFK7 ARkywpbSD7k4mFvZli/II+CGS2zfdqWpoS+gsUwbVSK7efGUDcNGFdkLUSiCw9DHAa49 G0hfwTjb9wWVNrEj943oaJPO7Xh8b9LUW6eiVLa/mvXhIhCxwVke6RKbrvB7jituLI5v MVI2t3kAfbCVVD9/FCt/Bxv9I9zcEpSY/KDF+190Y3R1lW5EGXcaGkbaH9iZBRPCdAcv SA== Received: from pps.reinject (localhost [127.0.0.1]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3qbu7ggthh-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 03 May 2023 16:56:06 +0000 Received: from m0353727.ppops.net (m0353727.ppops.net [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 343Gt3oZ015519; Wed, 3 May 2023 16:56:06 GMT Received: from ppma02wdc.us.ibm.com (aa.5b.37a9.ip4.static.sl-reverse.com [169.55.91.170]) by mx0a-001b2d01.pphosted.com (PPS) with ESMTPS id 3qbu7ggth6-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 03 May 2023 16:56:06 +0000 Received: from pps.filterd (ppma02wdc.us.ibm.com [127.0.0.1]) by ppma02wdc.us.ibm.com (8.17.1.19/8.17.1.19) with ESMTP id 343FFdxm002611; Wed, 3 May 2023 16:56:05 GMT Received: from smtprelay06.dal12v.mail.ibm.com ([9.208.130.100]) by ppma02wdc.us.ibm.com (PPS) with ESMTPS id 3q8tv8de0v-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Wed, 03 May 2023 16:56:05 +0000 Received: from smtpav03.wdc07v.mail.ibm.com (smtpav03.wdc07v.mail.ibm.com [10.39.53.230]) by smtprelay06.dal12v.mail.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id 343Gu3Hk26804596 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Wed, 3 May 2023 16:56:04 GMT Received: from smtpav03.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id BA5775805A; Wed, 3 May 2023 16:56:03 +0000 (GMT) Received: from smtpav03.wdc07v.mail.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 04C9758054; Wed, 3 May 2023 16:56:03 +0000 (GMT) Received: from toto.the-meissners.org (unknown [9.160.59.115]) by smtpav03.wdc07v.mail.ibm.com (Postfix) with ESMTPS; Wed, 3 May 2023 16:56:02 +0000 (GMT) Date: Wed, 3 May 2023 12:56:01 -0400 From: Michael Meissner To: Segher Boessenkool Cc: Michael Meissner , gcc-patches@gcc.gnu.org, "Kewen.Lin" , David Edelsohn , Peter Bergner Subject: Re: [PATCH, V4] PR target/105325, Make load/cmp fusion know about prefixed loads. Message-ID: Mail-Followup-To: Michael Meissner , Segher Boessenkool , gcc-patches@gcc.gnu.org, "Kewen.Lin" , David Edelsohn , Peter Bergner References: <20230502223204.GD19790@gate.crashing.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230502223204.GD19790@gate.crashing.org> X-TM-AS-GCONF: 00 X-Proofpoint-ORIG-GUID: 47tNhfelhoBYASqNsO6_49THZBu7cc1R X-Proofpoint-GUID: lQXTCPzw450fl0KneJnwyxBqHni_7z5N X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-05-03_11,2023-05-03_01,2023-02-09_01 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 malwarescore=0 adultscore=0 spamscore=0 mlxscore=0 phishscore=0 impostorscore=0 suspectscore=0 mlxlogscore=999 bulkscore=0 lowpriorityscore=0 clxscore=1011 priorityscore=1501 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2303200000 definitions=main-2305030141 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Tue, May 02, 2023 at 05:32:04PM -0500, Segher Boessenkool wrote: > On Wed, Apr 26, 2023 at 12:18:36PM -0400, Michael Meissner wrote: > > * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation > > of the ld and lwa instructions which use the DS encoding instead of D. > > Use the YZ constraint for these loads. Handle prefixed loads better. > > Don't use tabs in the middle of a line. > > "Handle prefixed loads better" is not what the patch does, and/or is so > vague as to be useless. Ok. > > --- a/gcc/config/rs6000/genfusion.pl > > +++ b/gcc/config/rs6000/genfusion.pl > > @@ -56,7 +56,7 @@ sub mode_to_ldst_char > > sub gen_ld_cmpi_p10 > > { > > my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred, > > - $mempred, $ccmode, $np, $extend, $resultmode); > > + $mempred, $ccmode, $np, $extend, $resultmode, $constraint); > > LMODE: foreach $lmode ('DI','SI','HI','QI') { > > $ldst = mode_to_ldst_char($lmode); > > $clobbermode = $lmode; > > @@ -71,21 +71,34 @@ sub gen_ld_cmpi_p10 > > CCMODE: foreach $ccmode ('CC','CCUNS') { > > $np = "NON_PREFIXED_D"; > > $mempred = "non_update_memory_operand"; > > + $constraint = "m"; > > if ( $ccmode eq 'CC' ) { > > next CCMODE if $lmode eq 'QI'; > > - if ( $lmode eq 'DI' || $lmode eq 'SI' ) { > > + if ( $lmode eq 'HI' ) { > > + $np = "NON_PREFIXED_D"; > > + $mempred = "non_update_memory_operand"; > > + $echr = "a"; > > + } elsif ( $lmode eq 'SI' ) { > > + # ld and lwa are both DS-FORM. > > + $np = "NON_PREFIXED_DS"; > > + $mempred = "lwa_operand"; > > + $echr = "a"; > > + $constraint = "YZ"; > > + } elsif ( $lmode eq 'DI' ) { > > # ld and lwa are both DS-FORM. > > $np = "NON_PREFIXED_DS"; > > $mempred = "ds_form_mem_operand"; > > + $echr = ""; > > + $constraint = "YZ"; > > } > > $cmpl = ""; > > - $echr = "a"; > > $constpred = "const_m1_to_1_operand"; > > } else { > > if ( $lmode eq 'DI' ) { > > # ld is DS-form, but lwz is not. > > $np = "NON_PREFIXED_DS"; > > $mempred = "ds_form_mem_operand"; > > + $constraint = "YZ"; > > } > > $cmpl = "l"; > > $echr = "z"; > > @@ -108,7 +121,7 @@ sub gen_ld_cmpi_p10 > > > > print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n"; > > print " [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n"; > > - print " (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"m\")\n"; > > + print " (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"${constraint}\")\n"; > > if ($ccmode eq 'CCUNS') { print " "; } > > print " (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n"; > > if ($result eq 'clobber') { > > @@ -137,6 +150,11 @@ sub gen_ld_cmpi_p10 > > print " \"\"\n"; > > print " [(set_attr \"type\" \"fused_load_cmpi\")\n"; > > print " (set_attr \"cost\" \"8\")\n"; > > + > > + if ($extend eq "sign") { > > + print " (set_attr \"sign_extend\" \"yes\")\n"; > > + } > > + > > print " (set_attr \"length\" \"8\")])\n"; > > print "\n"; > > } > > This already was a 90-line function that did too many things. Now it is > bigger and does more things, and the patch is unintelligible. > > Please first factor things. There are many more things terrible Perl > code style here (like all of the quoting), but where to start :-/ Note, I didn't write the original patch nor the original code (Aaron did), but without a lot of rewrites it will take more time to get it done. > I once again spent many hours trying to review this, and once again > failed. Please write better code, and please make better patches. > > > index ec783803820..7d6c94aee5b 100644 > > --- a/gcc/config/rs6000/rs6000.md > > +++ b/gcc/config/rs6000/rs6000.md > > @@ -302,7 +302,7 @@ (define_attr "prefixed" "no,yes" > > (eq_attr "maybe_prefixed" "no")) > > (const_string "no") > > > > - (eq_attr "type" "load,fpload,vecload") > > + (eq_attr "type" "load,fpload,vecload,vecload,fused_load_cmpi") > > Don't duplicate vecload. Ok. > > --- /dev/null > > +++ b/gcc/testsuite/g++.target/powerpc/pr105325.C > > @@ -0,0 +1,25 @@ > > +/* { dg-do assemble } */ > > +/* { dg-require-effective-target lp64 } */ > > +/* { dg-require-effective-target power10_ok } */ > > +/* { dg-require-effective-target powerpc_prefixed_addr } */ > > +/* { dg-options "-O2 -mdejagnu-cpu=power10 -fstack-protector" } */ > > The power10_ok selector still is terribly broken (it allows only some > variants of 64-bit Linux and nothing more, to start with). Do we still > need it in any case? > > Same for powerpc_prefixed_addr. Is there any supported target that does > not have a working assembler? In particular, I was building it on a power8 BE system, and I happened to use the system assembler instead of a new assembler that I had built. The powerpc_prefixed_addr test only tests whether the compiler spits out a prefixed instruction by default if you use -mcpu=power10 (which succeeded) but because the test involved running the assembler, and it failed because the assembler doesn't know about -mpower10. If I build my own assembler, it works fine. > What is -fstack-protector here for? That should be documented, or > better, it should just be removed if possible. While I can add documentation to the test, -fstack-protector is required to show up the bug. Both of the users experiencing the issue had used -fstack-protector. If you do not use -fstack-protector, you get no error. I believe it is due to the fact that -fstack-protector runs much later than expand, and it creates code that doesn't take into account whether the load is prefixed or not. Because it is two insns joined together, you don't get the normal processing that converts a LWA into a PLWA. > > > -/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 4 { target lp64 } } } */ > > +/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 8 { target lp64 } } } */ > > > -/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 9 { target ilp32 } } } */ > > +/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 16 { target ilp32 } } } */ > > Why are these new counts correct? I haven't looked in detail. -- Michael Meissner, IBM PO Box 98, Ayer, Massachusetts, USA, 01432 email: meissner@linux.ibm.com