From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: by sourceware.org (Postfix, from userid 1005)
 id 2B2CB3858D38; Tue, 13 Jun 2023 17:21:33 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 2B2CB3858D38
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org;
 s=default; t=1686676893;
 bh=/EuNPHJvK1NRdoSuG4lrLTPfXVsKBXfOhJxDqUVLy8U=;
 h=From:To:Subject:Date:From;
 b=DVE70cJz6Cbce3xYJ1mNQh4O/kKidsW3s90boB3GR+aSI0O8AXew6R4SDaNZ+w2/r
 /fHdo0kmLBO5+anZyVHrLVks2U14LZy5gTdslwjlwK9bCk+pY9VVL9SuKwOjWJk/ch
 A02nEgmgbQdPoZBH8C1WKXh1DW5T333lg+NQWjXI=
Content-Type: text/plain; charset="us-ascii"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
From: Michael Meissner
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work122)] Fix power10 fusion and -fstack-protector, PR target/105325
X-Act-Checkin: gcc
X-Git-Author: Michael Meissner
X-Git-Refname: refs/users/meissner/heads/work122
X-Git-Oldrev: 936b5b082e70feb8ad2fb99023348490b003df38
X-Git-Newrev: 816ce136d5776e4d82ac3beca98989ce58e77f1a
Message-Id: <20230613172133.2B2CB3858D38@sourceware.org>
Date: Tue, 13 Jun 2023 17:21:33 +0000 (GMT)
List-Id:

https://gcc.gnu.org/g:816ce136d5776e4d82ac3beca98989ce58e77f1a

commit 816ce136d5776e4d82ac3beca98989ce58e77f1a
Author: Michael Meissner
Date:   Tue Jun 13 13:21:15 2023 -0400

    Fix power10 fusion and -fstack-protector, PR target/105325

    This patch fixes an issue where if you use the -fstack-protector and
    -mcpu=power10 options and you have a large stack frame, the GCC compiler
    will generate an LWA instruction with a large offset.  In the bug report
    the trigger is -fstack-protector, but the same failure could potentially
    happen with a fused load-compare to any stack location when the stack
    frame is larger than 32K, even without -fstack-protector.
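To make the offset limits concrete, here is a small Python sketch (not part of the patch) that checks whether a displacement fits a non-prefixed D-form or DS-form address. It assumes the usual Power ISA encoding: a signed 16-bit displacement, with the low two bits required to be zero for DS-form. The helper names `fits_d_form` and `fits_ds_form` are hypothetical and do not correspond to functions in rs6000.cc.

```python
# Illustrative sketch only: displacement limits for non-prefixed PowerPC
# loads, assuming a signed 16-bit field (D-form) and a signed 16-bit field
# whose low two bits must be zero (DS-form, used by ld and lwa).

D_MIN, D_MAX = -32768, 32767


def fits_d_form(offset: int) -> bool:
    """Return True if offset fits a D-form displacement (e.g. lwz)."""
    return D_MIN <= offset <= D_MAX


def fits_ds_form(offset: int) -> bool:
    """Return True if offset fits a DS-form displacement (e.g. ld, lwa)."""
    return fits_d_form(offset) and offset % 4 == 0


if __name__ == "__main__":
    # 40044 is the stack offset that appears in this report.
    for off in (-4, 32764, 32766, 40044):
        print(off, "D-form:", fits_d_form(off), "DS-form:", fits_ds_form(off))
```

Under these assumptions, an offset such as 40044 fits neither form, so a non-prefixed lwa cannot address it directly; the insn has to be treated as prefixed or split.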

    What happens is the initial insn that is created is:

        (insn 6 5 7 2 (parallel [
              (set (reg:CC 119)
                   (compare:CC (mem/c:SI (plus:DI (reg/f:DI 110 sfp)
                                                  (const_int -4))
                               (const_int 0 [0])))
              (clobber (scratch:DI))
            ])
          (nil))

    After the stack size is finalized, the frame pointer removed, and the
    post reload phase is run, the insn is now:

        (insn 6 5 7 2 (parallel [
              (set (reg:CC 100 0 [119])
                   (compare:CC (mem/c:SI (plus:DI (reg/f:DI 1 1)
                                                  (const_int 40044))
                               (const_int 0 [0])))
              (clobber (reg:DI 9 9 [120]))
            ])
          (nil))

    When the split2 pass is run after reload has finished, the
    ds_form_mem_operand predicate that is used for lwa and ld no longer
    returns true.  Since the operand no longer satisfies its predicate, the
    insn is not recognized and it won't be split.

    The solution involves:

        1)  Don't use ds_form_mem_operand for ld and lwa, always use
            non_update_memory_operand.

        2)  Delete ds_form_mem_operand since it is no longer used.

        3)  Use the "YZ" constraints for ld/lwa instead of "m".

        4)  Ensure that the insn will be recognized as having a prefixed
            operand (and hence the instruction length is 16 bytes instead
            of 8 bytes).

            4a) Set the prefixed and maybe_prefix attributes to know that
                fused_load_cmpi are also load insns;

            4b) In the case where we are just setting CC and not using the
                memory afterward, set the clobber to use a DI register, and
                put an explicit sign_extend operation in the split;

            4c) Set the sign_extend attribute to "yes".

            4d) 4a-4c are the things that prefixed_load_p in rs6000.cc
                checks to ensure that lwa is treated as a ds-form
                instruction and not as a d-form instruction (i.e. lwz).

        5)  Add a new test case for this case.

        6)  Adjust the insn counts in fusion-p10-ldcmpi.c.  Because we are
            no longer using ds_form_mem_operand, the ld and lwa
            instructions will fuse x-form (reg+reg) addresses in addition
            to ds-form (reg+offset or reg).

    2023-06-12  Michael Meissner

    gcc/

            * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): Fix problems
            that allowed prefixed lwa to be generated.
            * config/rs6000/fusion.md: Regenerate.
            * config/rs6000/predicates.md (ds_form_mem_operand): Delete.
            * config/rs6000/rs6000.md (prefixed attribute): Add support for
            load plus compare immediate fused insns.
            (maybe_prefixed): Likewise.

    gcc/testsuite/

            * g++.target/powerpc/pr105325.C: New test.
            * gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c: Update
            insn counts.

Diff:
---
 gcc/config/rs6000/fusion.md                        | 23 ++++++++------
 gcc/config/rs6000/genfusion.pl                     | 37 +++++++++++++++++++---
 gcc/config/rs6000/predicates.md                    | 14 --------
 gcc/config/rs6000/rs6000.md                        |  4 +--
 gcc/testsuite/g++.target/powerpc/pr105325.C        | 26 +++++++++++++++
 .../gcc.target/powerpc/fusion-p10-ldcmpi.c         | 14 ++++----
 6 files changed, 81 insertions(+), 37 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index d45fb138a70..9eefae22a1a 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -22,7 +22,7 @@
 ;; load mode is DI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -43,7 +43,7 @@
 ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -64,7 +64,7 @@
 ;; load mode is DI result mode is DI compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "non_update_memory_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -85,7 +85,7 @@
 ;; load mode is DI result mode is DI compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -106,21 +106,22 @@
 ;; load mode is SI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
-   (clobber (match_scratch:SI 0 "=r"))]
+   (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
   "lwa%X1 %0,%1\;cmpdi %2,%0,%3"
   "&& reload_completed
    && (cc_reg_not_cr0_operand (operands[2], CCmode)
        || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),
                                       SImode, NON_PREFIXED_DS))"
-  [(set (match_dup 0) (match_dup 1))
+  [(set (match_dup 0) (sign_extend:DI (match_dup 1)))
    (set (match_dup 2)
         (compare:CC (match_dup 0) (match_dup 3)))]
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
@@ -148,7 +149,7 @@
 ;; load mode is SI result mode is SI compare mode is CC extend is none
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -157,12 +158,13 @@
    && (cc_reg_not_cr0_operand (operands[2], CCmode)
        || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),
                                       SImode, NON_PREFIXED_DS))"
-  [(set (match_dup 0) (match_dup 1))
+  [(set (match_dup 0) (sign_extend:DI (match_dup 1)))
    (set (match_dup 2)
         (compare:CC (match_dup 0) (match_dup 3)))]
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
@@ -190,7 +192,7 @@
 ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
   "(TARGET_P10_FUSION)"
@@ -205,6 +207,7 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 82e8f863b02..31ee54aea93 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -61,15 +61,23 @@ sub gen_ld_cmpi_p10_one
   my $mempred = "non_update_memory_operand";
   my $extend;
 
+  # We need to special case lwa.  The prefixed_load_p function in rs6000.cc
+  # (which determines if a load instruction is prefixed) uses the fact that the
+  # register mode is different from the memory mode, and that the sign_extend
+  # attribute is set to use DS-form rules for the address instead of D-form.
+  # If the register size is the same, prefixed_load_p assumes we are doing a
+  # lwz.
+  my $lwa_insn = ($lmode eq "SI" && $ccmode eq "CC");
+
   if ($ccmode eq "CC") {
     # ld and lwa are both DS-FORM.
     ($lmode =~ /^[SD]I$/) and $np = "NON_PREFIXED_DS";
-    ($lmode =~ /^[SD]I$/) and $mempred = "ds_form_mem_operand";
+#    ($lmode =~ /^[SD]I$/) and $mempred = "ds_form_mem_operand";
   } else {
     if ($lmode eq "DI") {
       # ld is DS-form, but lwz is not.
       $np = "NON_PREFIXED_DS";
-      $mempred = "ds_form_mem_operand";
+      # $mempred = "ds_form_mem_operand";
     }
   }
 
@@ -81,7 +89,9 @@ sub gen_ld_cmpi_p10_one
 
   # For clobber, we need a SI/DI reg in case we
   # split because we have to sign/zero extend.
-  my $clobbermode = ($lmode =~ /^[QH]I$/) ? "GPR" : $lmode;
+  my $clobbermode = (($lmode =~ /^[QH]I$/)
+                     ? "GPR"
+                     : ($lwa_insn ? "DI" : $lmode));
   if ($result =~ /^EXT/ || $result eq "GPR" || $clobbermode eq "GPR") {
     # We always need extension if result > lmode.
     $extend = ($ccmode eq "CC") ? "sign" : "zero";
@@ -91,12 +101,15 @@ sub gen_ld_cmpi_p10_one
   }
 
   my $ldst = mode_to_ldst_char($lmode);
+
+  # DS-form addresses need YZ, and not m.
+  my $constraint = ($np eq "NON_PREFIXED_DS") ? "YZ" : "m";
   print <