Date: Wed, 10 May 2023 11:38:55 -0400
From: Michael Meissner
To: Michael Meissner, gcc-patches@gcc.gnu.org, Segher Boessenkool, "Kewen.Lin", David Edelsohn, Peter Bergner
Subject: Re: [PATCH V5, 1/2] PR target/105325: Rewrite genfusion.pl's gen_ld_cmpi_p10 function.

This patch rewrites the gen_ld_cmpi_p10 function in genfusion.pl to be clearer.
The resulting fusion.md file that this patch generates is exactly the same
output that the previous version of genfusion.pl generated.

The next patch in this series will fix PR target/105325 (provide correct
predicates and constraints for power10 fusion of load and compare immediate).

This patch has been tested on:

    *   Little endian power9 with both IEEE and IBM long double
    *   Little endian power10
    *   Big endian power8 using both 32-bit and 64-bit code generation.

Can I check this into the master branch?  Assuming I can check this in, I will
also commit to the active GCC branches after a burn-in period.

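As a rough illustration only (this is not part of the patch), the table-driven loop in the rewritten gen_ld_cmpi_p10 can be reduced to a sketch that lists the insn names it would emit. The tables and the extend rule are copied from the patch; the helpers pattern_name and ld_cmpi_p10_names are hypothetical names introduced here and do not exist in genfusion.pl.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Tables copied from the patch (load instruction and result modes per mode).
my %signed_load   = ("DI" => "ld", "SI" => "lwa", "HI" => "lha");
my %unsigned_load = ("DI" => "ld", "SI" => "lwz", "HI" => "lhz", "QI" => "lbz");
my %result_modes  = ("DI" => ["clobber", "DI"],
                     "SI" => ["clobber", "SI", "EXTSI"],
                     "HI" => ["clobber", "EXTHI"],
                     "QI" => ["clobber", "GPR"]);

# Hypothetical helper, not in genfusion.pl: build one insn name the same way
# print_ld_cmpi_p10 builds $name.
sub pattern_name
{
  my ($load, $cmpl, $lmode, $result, $ccmode, $extend) = @_;
  return "${load}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}";
}

# Hypothetical helper: walk the same loops as gen_ld_cmpi_p10, but collect
# the pattern names instead of printing the full define_insn_and_split.
sub ld_cmpi_p10_names
{
  my @names;
  foreach my $lmode ("DI", "SI", "HI", "QI")
    {
      foreach my $result (@{ $result_modes{$lmode} })
        {
          # Same condition the patch uses to decide that no extension is needed.
          my $no_extend = ($lmode eq "DI"
                           || $lmode eq $result
                           || ($lmode eq "SI" && $result eq "clobber"));

          # Signed compares (CC): there is no sign-extending byte load,
          # so QImode is skipped.
          if ($lmode ne "QI")
            {
              push @names, pattern_name ($signed_load{$lmode}, "", $lmode,
                                         $result, "CC",
                                         $no_extend ? "none" : "sign");
            }

          # Unsigned compares (CCUNS).
          push @names, pattern_name ($unsigned_load{$lmode}, "l", $lmode,
                                     $result, "CCUNS",
                                     $no_extend ? "none" : "zero");
        }
    }
  return @names;
}

print "$_\n" foreach ld_cmpi_p10_names ();
```

Under these assumptions the sketch prints 16 names, starting with ld_cmpdi_cr0_DI_clobber_CC_none and ending with lbz_cmpldi_cr0_QI_GPR_CCUNS_zero. Comparing such a list against the define_insn_and_split names in fusion.md is one cheap way to spot-check that a rewrite did not add or drop a pattern.
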
2023-05-10  Michael Meissner

gcc/

        PR target/105325
        * config/rs6000/genfusion.pl (mode_to_ldst_char): Delete.
        (print_ld_cmpi_p10): New function, split off from gen_ld_cmpi_p10.
        (gen_ld_cmpi_p10): Rewrite completely.
---
 gcc/config/rs6000/genfusion.pl | 248 +++++++++++++++++++++------------
 1 file changed, 157 insertions(+), 91 deletions(-)

diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index e4db352e0ce..81ba4b33940 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -45,103 +45,169 @@ print <<'EOF';
 
 EOF
 
-sub mode_to_ldst_char
+# Print the insns for load and compare with -1/0/1.
+# Arguments:
+#       lmode -- Integer mode ("DI", "SI", "HI", or "QI").
+#       result -- "clobber", "GPR", or $lmode
+#       ccmode -- Sign vs. unsigned ("CC" or "CCUNS").
+#       mem_format -- Memory format ("d" or "ds").
+#       cmpl -- Suffix for compare ("l" or "")
+#       const_pred -- Predicate for constant (i.e. -1/0/1 or 0/1).
+#       extend -- "sign", "zero", or "none".
+#       echr -- Suffix for load ("a", "z", or "").
+#       load -- Load instruction (i.e. "ld", "lwa", "lwz", etc.)
+#       np -- enum non_prefixed_form for memory type
+#       constraint -- constraint to use
+#       mem_pred -- predicate for the memory operation
+
+sub print_ld_cmpi_p10
 {
-    my ($mode) = @_;
-    my %x = (DI => 'd', SI => 'w', HI => 'h', QI => 'b');
-    return $x{$mode} if exists $x{$mode};
-    return '?';
+  my ($lmode, $result, $ccmode, $cmpl, $const_pred,
+      $extend, $load, $np, $constraint, $mem_pred) = @_;
+
+  # For clobber, we need a SI/DI reg in case we split because we have to
+  # sign/zero extend.
+  my $clobbermode = ($lmode =~ m/^[HQ]I$/) ? "GPR" : $lmode;
+
+  # Break long print statements into smaller lines.
+  my $info = join (" ",
+                   "load mode is ${lmode} result mode is ${result}",
+                   "compare mode is ${ccmode} extend is ${extend}");
+
+  my $name = join ("",
+                   "${load}_cmp${cmpl}di_cr0_${lmode}",
+                   "_${result}_${ccmode}_${extend}");
+
+  my $cmp_op1 = "(match_operand:${lmode} 1 \"${mem_pred}\" \"${constraint}\")";
+
+  my $spaces = " " x (length ($ccmode) + 18);
+
+  print ";; load-cmpi fusion pattern generated by gen_ld_cmpi_p10\n";
+  print ";; ${info}\n";
+  print "(define_insn_and_split \"*${name}\"\n";
+  print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
+  print "        (compare:${ccmode} ${cmp_op1}\n";
+  print "${spaces}(match_operand:${lmode} 3 \"${const_pred}\" \"n\")))\n";
+
+  if ($result eq "clobber")
+    {
+      print "   (clobber (match_scratch:${clobbermode} 0 \"=r\"))]\n";
+    }
+
+  else
+    {
+      my $load_op0 = "(match_operand:${result} 0 \"gpc_reg_operand\" \"=r\")";
+      my $load_op1 = (($result eq $lmode)
+                      ? "(match_dup 1)"
+                      : "(${extend}_extend:${result} (match_dup 1))");
+      print "   (set ${load_op0} ${load_op1})]\n";
+    }
+
+  # Do not match prefixed loads.  The machine only fuses non-prefixed loads
+  # with compare immediate.  Take into account whether the load is a ds-form
+  # or a d-form instruction.
+  print "  \"(TARGET_P10_FUSION)\"\n";
+  print "  \"${load}%X1 %0,%1\\;cmp${cmpl}di %2,%0,%3\"\n";
+  print "  \"&& reload_completed\n";
+  print "   && (cc_reg_not_cr0_operand (operands[2], CCmode)\n";
+  print "       || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),\n";
+  print "                                      ${lmode}mode, ${np}))\"\n";
+
+  if ($extend eq "none")
+    {
+      print "  [(set (match_dup 0) (match_dup 1))\n";
+    }
+
+  else
+    {
+      my $resultmode = ($result eq "clobber") ? $clobbermode : $result;
+      print "  [(set (match_dup 0) (${extend}_extend:${resultmode} (match_dup 1)))\n";
+    }
+
+  print "   (set (match_dup 2)\n";
+  print "        (compare:${ccmode} (match_dup 0) (match_dup 3)))]\n";
+  print "  \"\"\n";
+  print "  [(set_attr \"type\" \"fused_load_cmpi\")\n";
+  print "   (set_attr \"cost\" \"8\")\n";
+  print "   (set_attr \"length\" \"8\")])\n";
+  print "\n";
 }
 
 sub gen_ld_cmpi_p10
 {
-    my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred,
-        $mempred, $ccmode, $np, $extend, $resultmode);
-  LMODE: foreach $lmode ('DI','SI','HI','QI') {
-      $ldst = mode_to_ldst_char($lmode);
-      $clobbermode = $lmode;
-      # For clobber, we need a SI/DI reg in case we
-      # split because we have to sign/zero extend.
-      if ($lmode eq 'HI' || $lmode eq 'QI') { $clobbermode = "GPR"; }
-    RESULT: foreach $result ('clobber', $lmode, "EXT".$lmode) {
-          # EXTDI does not exist, and we cannot directly produce HI/QI results.
-          next RESULT if $result eq "EXTDI" || $result eq "HI" || $result eq "QI";
-          # Don't allow EXTQI because that would allow HI result which we can't do.
-          $result = "GPR" if $result eq "EXTQI";
-        CCMODE: foreach $ccmode ('CC','CCUNS') {
-          $np = "NON_PREFIXED_D";
-          $mempred = "non_update_memory_operand";
-          if ( $ccmode eq 'CC' ) {
-              next CCMODE if $lmode eq 'QI';
-              if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
-                  # ld and lwa are both DS-FORM.
-                  $np = "NON_PREFIXED_DS";
-                  $mempred = "ds_form_mem_operand";
-              }
-              $cmpl = "";
-              $echr = "a";
-              $constpred = "const_m1_to_1_operand";
-          } else {
-              if ( $lmode eq 'DI' ) {
-                  # ld is DS-form, but lwz is not.
-                  $np = "NON_PREFIXED_DS";
-                  $mempred = "ds_form_mem_operand";
-              }
-              $cmpl = "l";
-              $echr = "z";
-              $constpred = "const_0_to_1_operand";
-          }
-          if ($lmode eq 'DI') { $echr = ""; }
-          if ($result =~ m/^EXT/ || $result eq 'GPR' || $clobbermode eq 'GPR') {
-              # We always need extension if result > lmode.
-              if ( $ccmode eq 'CC' ) {
-                  $extend = "sign";
-              } else {
-                  $extend = "zero";
-              }
-          } else {
-              # Result of SI/DI does not need sign extension.
-              $extend = "none";
-          }
-          print ";; load-cmpi fusion pattern generated by gen_ld_cmpi_p10\n";
-          print ";; load mode is $lmode result mode is $result compare mode is $ccmode extend is $extend\n";
-
-          print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";
-          print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
-          print "        (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"m\")\n";
-          if ($ccmode eq 'CCUNS') { print "   "; }
-          print "                    (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n";
-          if ($result eq 'clobber') {
-              print "   (clobber (match_scratch:${clobbermode} 0 \"=r\"))]\n";
-          } elsif ($result eq $lmode) {
-              print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (match_dup 1))]\n";
-          } else {
-              print "   (set (match_operand:${result} 0 \"gpc_reg_operand\" \"=r\") (${extend}_extend:${result} (match_dup 1)))]\n";
-          }
-          print "  \"(TARGET_P10_FUSION)\"\n";
-          print "  \"l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}di %2,%0,%3\"\n";
-          print "  \"&& reload_completed\n";
-          print "   && (cc_reg_not_cr0_operand (operands[2], CCmode)\n";
-          print "       || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),\n";
-          print "                                      ${lmode}mode, ${np}))\"\n";
-
-          if ($extend eq "none") {
-              print "  [(set (match_dup 0) (match_dup 1))\n";
-          } else {
-              $resultmode = $result;
-              if ( $result eq 'clobber' ) { $resultmode = $clobbermode }
-              print "  [(set (match_dup 0) (${extend}_extend:${resultmode} (match_dup 1)))\n";
-          }
-          print "   (set (match_dup 2)\n";
-          print "        (compare:${ccmode} (match_dup 0) (match_dup 3)))]\n";
-          print "  \"\"\n";
-          print "  [(set_attr \"type\" \"fused_load_cmpi\")\n";
-          print "   (set_attr \"cost\" \"8\")\n";
-          print "   (set_attr \"length\" \"8\")])\n";
-          print "\n";
-      }
+  my ($lmode, $result, $mem_format, $extend);
+
+  # Map mode to load instruction
+  my %signed_load = ("DI" => "ld",
+                     "SI" => "lwa",
+                     "HI" => "lha");
+
+  my %unsigned_load = ("DI" => "ld",
+                       "SI" => "lwz",
+                       "HI" => "lhz",
+                       "QI" => "lbz");
+
+  # Memory predicate to use.
+  my %signed_memory_predicate = ("DI" => "ds_form_mem_operand",
+                                 "SI" => "ds_form_mem_operand",
+                                 "HI" => "non_update_memory_operand");
+
+  my %unsigned_memory_predicate = ("DI" => "ds_form_mem_operand",
+                                   "SI" => "non_update_memory_operand",
+                                   "HI" => "non_update_memory_operand",
+                                   "QI" => "non_update_memory_operand");
+
+  # Internal format of the memory instruction (enum non_prefixed_form) to use.
+  my %np = ("ds" => "NON_PREFIXED_DS",
+            "d" => "NON_PREFIXED_D");
+
+  # Result modes to use.  Clobber is used when you are comparing the load to
+  # -1/0/1, but you are not using it otherwise.  EXTDI does not exist.  We
+  # cannot directly use HI/QI results because we only have word and double word
+  # compared.  For promotion, don't allow EXTQI because that would allow HI
+  # results which we can't do (use GPR instead).
+  my %result_modes = ("DI" => ["clobber", "DI"],
+                      "SI" => ["clobber", "SI", "EXTSI" ],
+                      "HI" => ["clobber", "EXTHI" ],
+                      "QI" => ["clobber", "GPR" ]);
+
+  foreach $lmode ("DI", "SI", "HI", "QI")
+    {
+      foreach $result (@{ $result_modes{$lmode} })
+        {
+          # Handle CCmode (sign extended compares to -1, 0, or 1).  We don't
+          # have a LBA instruction, so skip QImode.  Both LD and LWA are
+          # DS-form instructions for signed loads.
+          if ($lmode ne "QI")
+            {
+              $mem_format = ($lmode =~ m/^[DS]I$/) ? "ds" : "d";
+              $extend = (($lmode eq "DI"
+                          || $lmode eq $result
+                          || ($lmode eq "SI" && $result eq "clobber"))
+                         ? "none"
+                         : "sign");
+
+              print_ld_cmpi_p10 ($lmode, $result, "CC", "",
+                                 "const_m1_to_1_operand", $extend,
+                                 $signed_load{$lmode}, $np{$mem_format}, "m",
+                                 $signed_memory_predicate{$lmode});
+            }
+
+          # Handle CCUNS mode (zero extended compares to 0 or 1).
+          # LD is DS-form, but LWZ is not for unsigned loads.
+          $mem_format = ($lmode eq "DI") ? "ds" : "d";
+          $extend = (($lmode eq "DI"
+                      || $lmode eq $result
+                      || ($lmode eq "SI" && $result eq "clobber"))
+                     ? "none"
+                     : "zero");
+
+          print_ld_cmpi_p10 ($lmode, $result, "CCUNS", "l",
+                             "const_0_to_1_operand", $extend,
+                             $unsigned_load{$lmode}, $np{$mem_format}, "m",
+                             $unsigned_memory_predicate{$lmode});
+        }
     }
-  }
 }
 
 sub gen_logical_addsubf
-- 
2.40.0

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meissner@linux.ibm.com