From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from gate.crashing.org (gate.crashing.org [63.228.1.57]) by sourceware.org (Postfix) with ESMTP id 8B1E73858434 for ; Tue, 2 May 2023 22:33:06 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8B1E73858434 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=kernel.crashing.org Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=kernel.crashing.org Received: from gate.crashing.org (localhost.localdomain [127.0.0.1]) by gate.crashing.org (8.14.1/8.14.1) with ESMTP id 342MW5HW031202; Tue, 2 May 2023 17:32:05 -0500 Received: (from segher@localhost) by gate.crashing.org (8.14.1/8.14.1/Submit) id 342MW5mb031199; Tue, 2 May 2023 17:32:05 -0500 X-Authentication-Warning: gate.crashing.org: segher set sender to segher@kernel.crashing.org using -f Date: Tue, 2 May 2023 17:32:04 -0500 From: Segher Boessenkool To: Michael Meissner , gcc-patches@gcc.gnu.org, "Kewen.Lin" , David Edelsohn , Peter Bergner Subject: Re: [PATCH, V4] PR target/105325, Make load/cmp fusion know about prefixed loads. Message-ID: <20230502223204.GD19790@gate.crashing.org> References: Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.4.2.3i X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,JMQ_SPF_NEUTRAL,KAM_DMARC_STATUS,SPF_HELO_PASS,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: On Wed, Apr 26, 2023 at 12:18:36PM -0400, Michael Meissner wrote: > * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation > of the ld and lwa instructions which use the DS encoding instead of D. > Use the YZ constraint for these loads. Handle prefixed loads better. Don't use tabs in the middle of a line. "Handle prefixed loads better" is not what the patch does, and/or is so vague as to be useless. > --- a/gcc/config/rs6000/genfusion.pl > +++ b/gcc/config/rs6000/genfusion.pl > @@ -56,7 +56,7 @@ sub mode_to_ldst_char > sub gen_ld_cmpi_p10 > { > my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred, > - $mempred, $ccmode, $np, $extend, $resultmode); > + $mempred, $ccmode, $np, $extend, $resultmode, $constraint); > LMODE: foreach $lmode ('DI','SI','HI','QI') { > $ldst = mode_to_ldst_char($lmode); > $clobbermode = $lmode; > @@ -71,21 +71,34 @@ sub gen_ld_cmpi_p10 > CCMODE: foreach $ccmode ('CC','CCUNS') { > $np = "NON_PREFIXED_D"; > $mempred = "non_update_memory_operand"; > + $constraint = "m"; > if ( $ccmode eq 'CC' ) { > next CCMODE if $lmode eq 'QI'; > - if ( $lmode eq 'DI' || $lmode eq 'SI' ) { > + if ( $lmode eq 'HI' ) { > + $np = "NON_PREFIXED_D"; > + $mempred = "non_update_memory_operand"; > + $echr = "a"; > + } elsif ( $lmode eq 'SI' ) { > + # ld and lwa are both DS-FORM. > + $np = "NON_PREFIXED_DS"; > + $mempred = "lwa_operand"; > + $echr = "a"; > + $constraint = "YZ"; > + } elsif ( $lmode eq 'DI' ) { > # ld and lwa are both DS-FORM. > $np = "NON_PREFIXED_DS"; > $mempred = "ds_form_mem_operand"; > + $echr = ""; > + $constraint = "YZ"; > } > $cmpl = ""; > - $echr = "a"; > $constpred = "const_m1_to_1_operand"; > } else { > if ( $lmode eq 'DI' ) { > # ld is DS-form, but lwz is not. > $np = "NON_PREFIXED_DS"; > $mempred = "ds_form_mem_operand"; > + $constraint = "YZ"; > } > $cmpl = "l"; > $echr = "z"; > @@ -108,7 +121,7 @@ sub gen_ld_cmpi_p10 > > print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n"; > print " [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n"; > - print " (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"m\")\n"; > + print " (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"${constraint}\")\n"; > if ($ccmode eq 'CCUNS') { print " "; } > print " (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n"; > if ($result eq 'clobber') { > @@ -137,6 +150,11 @@ sub gen_ld_cmpi_p10 > print " \"\"\n"; > print " [(set_attr \"type\" \"fused_load_cmpi\")\n"; > print " (set_attr \"cost\" \"8\")\n"; > + > + if ($extend eq "sign") { > + print " (set_attr \"sign_extend\" \"yes\")\n"; > + } > + > print " (set_attr \"length\" \"8\")])\n"; > print "\n"; > } This already was a 90-line function that did too many things. Now it is bigger and does more things, and the patch is unintelligible. Please first factor things. There are many more things terrible Perl code style here (like all of the quoting), but where to start :-/ I once again spent many hours trying to review this, and once again failed. Please write better code, and please make better patches. > index ec783803820..7d6c94aee5b 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -302,7 +302,7 @@ (define_attr "prefixed" "no,yes" > (eq_attr "maybe_prefixed" "no")) > (const_string "no") > > - (eq_attr "type" "load,fpload,vecload") > + (eq_attr "type" "load,fpload,vecload,vecload,fused_load_cmpi") Don't duplicate vecload. > --- /dev/null > +++ b/gcc/testsuite/g++.target/powerpc/pr105325.C > @@ -0,0 +1,25 @@ > +/* { dg-do assemble } */ > +/* { dg-require-effective-target lp64 } */ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-require-effective-target powerpc_prefixed_addr } */ > +/* { dg-options "-O2 -mdejagnu-cpu=power10 -fstack-protector" } */ The power10_ok selector still is terribly broken (it allows only some variants of 64-bit Linux and nothing more, to start with). Do we still need it in any case? Same for powerpc_prefixed_addr. Is there any supported target that does not have a working assembler? What is -fstack-protector here for? That should be documented, or better, it should just be removed if possible. > -/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 4 { target lp64 } } } */ > +/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 8 { target lp64 } } } */ > -/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 9 { target ilp32 } } } */ > +/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 16 { target ilp32 } } } */ Why are these new counts correct? Segher