From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1005)
	id A9C083857400; Wed,  7 Jun 2023 20:58:09 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org A9C083857400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
	s=default; t=1686171489;
	bh=84YMh3B7wMUmenlkpwJYmITGh8cJinID+OS9ulm4AKo=;
	h=From:To:Subject:Date:From;
	b=m7OGQ1O8keAqkVsdyUApAOjH+QLGF+b7OkH2DwMjJk5eBvasuFWHmAXpg1gFHorUS
	 lEW7lpQ6oSA8SF0j1zJLnpxzzM9yB2XyZg+U7EyJytDTHZvSkCwAIMSDqrM8n8eCYf
	 d3efm/dS4PLIM1Tf5q+g4VDnDsTTng33uapxYGO4=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Michael Meissner
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work122)] Fix power10 fusion and -fstack-protector, PR target/105325
X-Act-Checkin: gcc
X-Git-Author: Michael Meissner
X-Git-Refname: refs/users/meissner/heads/work122
X-Git-Oldrev: b8ebc9386a08c492f557e893cf6489d089ecc32a
X-Git-Newrev: 8c02d7e251f65cc70ec3d0ac81aa4145dea5943e
Message-Id: <20230607205809.A9C083857400@sourceware.org>
Date: Wed,  7 Jun 2023 20:58:09 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:8c02d7e251f65cc70ec3d0ac81aa4145dea5943e

commit 8c02d7e251f65cc70ec3d0ac81aa4145dea5943e
Author: Michael Meissner
Date:   Wed Jun 7 16:57:48 2023 -0400

    Fix power10 fusion and -fstack-protector, PR target/105325

    This patch fixes an issue where if you use the -fstack-protector and
    -mcpu=power10 options and you have a large stack frame, the GCC compiler
    will generate a LWA instruction with a large offset.

    There are several problems:

    1) The prefixed attribute was not checking insns with the type
       fused_load_cmpi for being load insns.
    2) The recognition of LWA for being prefixed looks at the "sign_extend"
       attribute and whether the register mode was different than the memory
       load (i.e. does it have a sign_extend wrapper around the load).
    3) The constraints in fusion.md (generated by genfusion.pl) use "m" for LWA
       and LD, when they should use "YZ".
    4) There is a lwa_operand that should be used instead of
       ds_form_mem_operand for the LWA instruction.

    The main fix is to modify genfusion.pl so that it sets the appropriate
    predicates and constraints.

    I also added support in genfusion.pl so that if we are doing a LWA
    operation and just setting the CC bits (throwing away the result of the
    load after the comparison), it generates a LWZ instruction and does a CMPWI
    instead of CMPDI.  This way those loads can use the normal D-form
    restrictions instead of the DS-form ones.

    I set the "sign_extend" attribute on the cases that generate LWA.

    I modified the "prefixed" attribute so that it also checks fused_load_cmpi.

    2023-06-06  Michael Meissner

    gcc/

            * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): Fix constraints
            and predicates for LD and LWA.  Optimize LWA/CMPDI to generate
            LWZ/CMPWI if we don't need the result of the load after the
            comparison.  Change the name of the insn pattern to reflect whether
            a DImode or SImode register is loaded.  Set sign_extend attribute
            for LWA instruction.
            * config/rs6000/fusion.md: Regenerate.
            * config/rs6000/rs6000.md (prefixed attribute): Treat
            fused_load_cmpi insns as being load insns.

    gcc/testsuite/

            * g++.target/powerpc/pr105325.C: New test.
            * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust names and insn
            counts for PR target/105325 fix.
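For readers unfamiliar with the D-form versus DS-form distinction the commit message relies on, here is a minimal sketch in C. It is not GCC code: the helper names (fits_d_form, fits_ds_form, fits_prefixed, lwa_encoding) are hypothetical, and the ranges are the usual Power ISA ones (16-bit signed displacement for D-form, the same range restricted to multiples of 4 for DS-form, 34-bit signed displacement for power10 prefixed loads). GCC's real checks live in the rs6000 back end and consider more than the raw offset.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: D-form loads such as LWZ take a 16-bit signed
   displacement.  */
static bool fits_d_form(int64_t off)
{
    return off >= -32768 && off <= 32767;
}

/* Hypothetical helper: DS-form loads such as LD and LWA use the low two
   bits of the displacement field as part of the opcode, so the offset
   must also be a multiple of 4.  */
static bool fits_ds_form(int64_t off)
{
    return fits_d_form(off) && off % 4 == 0;
}

/* Hypothetical helper: power10 prefixed loads (e.g. PLWA) take a 34-bit
   signed displacement with no alignment requirement.  */
static bool fits_prefixed(int64_t off)
{
    return off >= -(INT64_C(1) << 33) && off <= (INT64_C(1) << 33) - 1;
}

/* Rough classification of how an LWA at base+off could be encoded.
   This is an illustration only, not the logic GCC uses.  */
static const char *lwa_encoding(int64_t off)
{
    if (fits_ds_form(off))
        return "DS-form";
    if (fits_prefixed(off))
        return "prefixed";
    return "needs-register";
}
```

Under these assumptions, a stack-protector slot at a large frame offset (say 65536) does not fit DS-form and needs the prefixed encoding, which is why the "prefixed" attribute has to recognize the fused load. An offset such as 32766 fits D-form (LWZ) but not DS-form (LWA), which is the case the LWZ/CMPWI substitution relaxes.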
Diff:
---
 gcc/config/rs6000/fusion.md                        | 44 +++++++++++-----------
 gcc/config/rs6000/genfusion.pl                     | 36 ++++++++++++++----
 gcc/config/rs6000/rs6000.md                        |  2 +-
 gcc/testsuite/g++.target/powerpc/pr105325.C        | 26 +++++++++++++
 .../gcc.target/powerpc/fusion-p10-ldcmpi.c         | 31 ++++++++-------
 5 files changed, 95 insertions(+), 44 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index d45fb138a70..4444e3315dd 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -22,7 +22,7 @@
 ;; load mode is DI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -43,7 +43,7 @@
 ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -64,7 +64,7 @@
 ;; load mode is DI result mode is DI compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -85,7 +85,7 @@
 ;; load mode is DI result mode is DI compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -104,17 +104,17 @@
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is SI result mode is clobber compare mode is CC extend is none
-(define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
+(define_insn_and_split "*lwz_cmpsi_cr0_SI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (clobber (match_scratch:SI 0 "=r"))]
   "(TARGET_P10_FUSION)"
-  "lwa%X1 %0,%1\;cmpdi %2,%0,%3"
+  "lwz%X1 %0,%1\;cmpwi %2,%0,%3"
   "&& reload_completed
    && (cc_reg_not_cr0_operand (operands[2], CCmode)
        || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),
-                                      SImode, NON_PREFIXED_DS))"
+                                      SImode, NON_PREFIXED_D))"
   [(set (match_dup 0) (match_dup 1))
    (set (match_dup 2)
         (compare:CC (match_dup 0) (match_dup 3)))]
@@ -125,7 +125,7 @@
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is SI result mode is clobber compare mode is CCUNS extend is none
-(define_insn_and_split "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none"
+(define_insn_and_split "*lwz_cmplsi_cr0_SI_clobber_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
         (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
                        (match_operand:SI 3 "const_0_to_1_operand" "n")))
@@ -146,9 +146,9 @@
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is SI result mode is SI compare mode is CC extend is none
-(define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
+(define_insn_and_split "*lwa_cmpsi_cr0_SI_SI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -163,11 +163,12 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is SI result mode is SI compare mode is CCUNS extend is none
-(define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none"
+(define_insn_and_split "*lwz_cmplsi_cr0_SI_SI_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
         (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
                        (match_operand:SI 3 "const_0_to_1_operand" "n")))
@@ -188,9 +189,9 @@
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
-(define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
+(define_insn_and_split "*lwa_cmp_cr0_SI_EXTSI_CC_sign"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
   "(TARGET_P10_FUSION)"
@@ -205,11 +206,12 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is SI result mode is EXTSI compare mode is CCUNS extend is zero
-(define_insn_and_split "*lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"
+(define_insn_and_split "*lwz_cmpl_cr0_SI_EXTSI_CCUNS_zero"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
         (compare:CCUNS (match_operand:SI 1 "non_update_memory_operand" "m")
                        (match_operand:SI 3 "const_0_to_1_operand" "n")))
@@ -230,7 +232,7 @@
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is HI result mode is clobber compare mode is CC extend is sign
-(define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign"
+(define_insn_and_split "*lha_cmp_cr0_HI_clobber_CC_sign"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
         (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
                     (match_operand:HI 3 "const_m1_to_1_operand" "n")))
@@ -251,7 +253,7 @@
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is HI result mode is clobber compare mode is CCUNS extend is zero
-(define_insn_and_split "*lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"
+(define_insn_and_split "*lhz_cmpl_cr0_HI_clobber_CCUNS_zero"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
         (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
                        (match_operand:HI 3 "const_0_to_1_operand" "n")))
@@ -272,7 +274,7 @@
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is HI result mode is EXTHI compare mode is CC extend is sign
-(define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign"
+(define_insn_and_split "*lha_cmp_cr0_HI_EXTHI_CC_sign"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
         (compare:CC (match_operand:HI 1 "non_update_memory_operand" "m")
                     (match_operand:HI 3 "const_m1_to_1_operand" "n")))
@@ -293,7 +295,7 @@
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is HI result mode is EXTHI compare mode is CCUNS extend is zero
-(define_insn_and_split "*lhz_cmpldi_cr0_HI_EXTHI_CCUNS_zero"
+(define_insn_and_split "*lhz_cmpl_cr0_HI_EXTHI_CCUNS_zero"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
         (compare:CCUNS (match_operand:HI 1 "non_update_memory_operand" "m")
                        (match_operand:HI 3 "const_0_to_1_operand" "n")))
@@ -314,7 +316,7 @@
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is QI result mode is clobber compare mode is CCUNS extend is zero
-(define_insn_and_split "*lbz_cmpldi_cr0_QI_clobber_CCUNS_zero"
+(define_insn_and_split "*lbz_cmpl_cr0_QI_clobber_CCUNS_zero"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
         (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
                        (match_operand:QI 3 "const_0_to_1_operand" "n")))
@@ -335,7 +337,7 @@
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is QI result mode is GPR compare mode is CCUNS extend is zero
-(define_insn_and_split "*lbz_cmpldi_cr0_QI_GPR_CCUNS_zero"
+(define_insn_and_split "*lbz_cmpl_cr0_QI_GPR_CCUNS_zero"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
         (compare:CCUNS (match_operand:QI 1 "non_update_memory_operand" "m")
                        (match_operand:QI 3 "const_0_to_1_operand" "n")))
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 82e8f863b02..42517b3bce7 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -57,14 +57,26 @@ sub gen_ld_cmpi_p10_one
 {
   my ($lmode, $result, $ccmode) = @_;
 
+  my $cmp_size = "d";
+  my $echr = ($ccmode eq "CC") ? "a" : "z";
+  if ($lmode eq "DI") { $echr = ""; }
   my $np = "NON_PREFIXED_D";
   my $mempred = "non_update_memory_operand";
   my $extend;
 
   if ($ccmode eq "CC") {
-    # ld and lwa are both DS-FORM.
-    ($lmode =~ /^[SD]I$/) and $np = "NON_PREFIXED_DS";
-    ($lmode =~ /^[SD]I$/) and $mempred = "ds_form_mem_operand";
+    # if we would generate lwa and just want the CC value, generate lwz instead
+    # and use cmpwi.  This allows us to avoid the ds-form restrictions on lwa.
+    if ($lmode eq "SI" && $result eq "clobber") {
+      $echr = "z";
+      $cmp_size = "w";
+
+    } else {
+      # ld and lwa are both DS-FORM, use LWA_OPERAND for LWA
+      ($lmode =~ /^[SD]I$/) and $np = "NON_PREFIXED_DS";
+      ($lmode eq "DI") and $mempred = "ds_form_mem_operand";
+      ($lmode eq "SI") and $mempred = "lwa_operand";
+    }
   } else {
     if ($lmode eq "DI") {
       # ld is DS-form, but lwz is not.
@@ -74,8 +86,6 @@ sub gen_ld_cmpi_p10_one
   }
 
   my $cmpl = ($ccmode eq "CC") ? "" : "l";
-  my $echr = ($ccmode eq "CC") ? "a" : "z";
-  if ($lmode eq "DI") { $echr = ""; }
   my $constpred = ($ccmode eq "CC") ? "const_m1_to_1_operand"
                                     : "const_0_to_1_operand";
 
@@ -90,13 +100,17 @@ sub gen_ld_cmpi_p10_one
     $extend = "none";
   }
 
+  my $load_mode = (($clobbermode eq "GPR" || $result =~ /^EXT/)
+                   ? ""
+                   : lc ($clobbermode));
   my $ldst = mode_to_ldst_char($lmode);
+  my $constraint = ($np eq "NON_PREFIXED_DS") ? "YZ" : "m";
   print <