public inbox for gcc-cvs@sourceware.org help / color / mirror / Atom feed
From: Michael Meissner <meissner@gcc.gnu.org> To: gcc-cvs@gcc.gnu.org Subject: [gcc(refs/users/meissner/heads/work122)] Fix power10 fusion and -fstack-protector, PR target/105325 Date: Tue, 13 Jun 2023 17:21:33 +0000 (GMT) [thread overview] Message-ID: <20230613172133.2B2CB3858D38@sourceware.org> (raw) https://gcc.gnu.org/g:816ce136d5776e4d82ac3beca98989ce58e77f1a commit 816ce136d5776e4d82ac3beca98989ce58e77f1a Author: Michael Meissner <meissner@linux.ibm.com> Date: Tue Jun 13 13:21:15 2023 -0400 Fix power10 fusion and -fstack-protector, PR target/105325 This patch fixes an issue where if you use the -fstack-protector and -mcpu=power10 options and you have a large stack frame, the GCC compiler will generate a LWA instruction with a large offset. The important thing in the bug is that -fstack-protector is used, but it could potentially happen with fused load-compare to any stack location when the stack frame is larger than 32K without -fstack-protector. What happens is the initial insn that is created is: (insn 6 5 7 2 (parallel [ (set (reg:CC 119) (compare:CC (mem/c:SI (plus:DI (reg/f:DI 110 sfp) (const_int -4)) (const_int 0 [0]))) (clobber (scratch:DI)) ]) (nil)) After the stack size is finalized, the frame pointer removed, and the post reload phase is run, the insn is now: (insn 6 5 7 2 (parallel [ (set (reg:CC 100 0 [119]) (compare:CC (mem/c:SI (plus:DI (reg/f:DI 1 1) (const_int 40044)) (const_int 0 [0]))) (clobber (reg:DI 9 9 [120])) ]) (nil)) When the split2 pass is run after reload has finished the ds_form_mem_operand predicate that used for lwa and ld no longer returns true. This means that since the operand predicates aren't recognized, it won't be split. The solution involves: 1) Don't use ds_form_mem_operand for ld and lwa, always use non_update_memory_operand. 2) Delete ds_form_mem_operand since it is no longer use. 3) Use the "YZ" constraints for ld/lwa instead of "m". 4) Insure that the insn will be recognized as having a prefixed operand (and hence the instruction length is 16 bytes instead of 8 bytes). 4a) Set the prefixed and maybe_prefix attributes to know that fused_load_cmpi are also load insns; 4b) In the case where we are just setting CC and not using the memory afterward, set the clobber to use a DI register, and put an explicit sign_extend operation in the split; 4c) Set the sign_extend attribute to "yes". 4d) 4a-4c are the things that prefixed_load_p in rs6000.cc checks to ensure that lwa is treated as a ds-form instruction and not as a d-form instruction (i.e. lwz). 5) Add a new test case for this case. 6) Adjust the insn counts in fusion-p10-ldcmpi.c. Because we are no longer using ds_form_mem_operand, the ld and lwa instructions will fuse x-form (reg+reg) addresses in addition ds-form (reg+offset or reg). 2023-06-12 Michael Meissner <meissner@linux.ibm.com> gcc/ * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): Fix problems that allowed prefixed lwa to be generated. * config/rs6000/fusion.md: Regenerate. * config/rs6000/predicates.md (ds_form_mem_operand): Delete. * config/rs6000/rs6000.md (prefixed attribute): Add support for load plus compare immediate fused insns. (maybe_prefixed): Likewise. gcc/testsuite/ * g++.target/powerpc/pr105325.C: New test. * gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c: Update insn counts. Diff: --- gcc/config/rs6000/fusion.md | 23 ++++++++------ gcc/config/rs6000/genfusion.pl | 37 +++++++++++++++++++--- gcc/config/rs6000/predicates.md | 14 -------- gcc/config/rs6000/rs6000.md | 4 +-- gcc/testsuite/g++.target/powerpc/pr105325.C | 26 +++++++++++++++ .../gcc.target/powerpc/fusion-p10-ldcmpi.c | 14 ++++---- 6 files changed, 81 insertions(+), 37 deletions(-) diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md index d45fb138a70..9eefae22a1a 100644 --- a/gcc/config/rs6000/fusion.md +++ b/gcc/config/rs6000/fusion.md @@ -22,7 +22,7 @@ ;; load mode is DI result mode is clobber compare mode is CC extend is none (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:DI 1 "non_update_memory_operand" "YZ") (match_operand:DI 3 "const_m1_to_1_operand" "n"))) (clobber (match_scratch:DI 0 "=r"))] "(TARGET_P10_FUSION)" @@ -43,7 +43,7 @@ ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "YZ") (match_operand:DI 3 "const_0_to_1_operand" "n"))) (clobber (match_scratch:DI 0 "=r"))] "(TARGET_P10_FUSION)" @@ -64,7 +64,7 @@ ;; load mode is DI result mode is DI compare mode is CC extend is none (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:DI 1 "non_update_memory_operand" "YZ") (match_operand:DI 3 "const_m1_to_1_operand" "n"))) (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] "(TARGET_P10_FUSION)" @@ -85,7 +85,7 @@ ;; load mode is DI result mode is DI compare mode is CCUNS extend is none (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "YZ") (match_operand:DI 3 "const_0_to_1_operand" "n"))) (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] "(TARGET_P10_FUSION)" @@ -106,21 +106,22 @@ ;; load mode is SI result mode is clobber compare mode is CC extend is none (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:SI 1 "non_update_memory_operand" "YZ") (match_operand:SI 3 "const_m1_to_1_operand" "n"))) - (clobber (match_scratch:SI 0 "=r"))] + (clobber (match_scratch:DI 0 "=r"))] "(TARGET_P10_FUSION)" "lwa%X1 %0,%1\;cmpdi %2,%0,%3" "&& reload_completed && (cc_reg_not_cr0_operand (operands[2], CCmode) || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0), SImode, NON_PREFIXED_DS))" - [(set (match_dup 0) (match_dup 1)) + [(set (match_dup 0) (sign_extend:DI (match_dup 1))) (set (match_dup 2) (compare:CC (match_dup 0) (match_dup 3)))] "" [(set_attr "type" "fused_load_cmpi") (set_attr "cost" "8") + (set_attr "sign_extend" "yes") (set_attr "length" "8")]) ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 @@ -148,7 +149,7 @@ ;; load mode is SI result mode is SI compare mode is CC extend is none (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:SI 1 "non_update_memory_operand" "YZ") (match_operand:SI 3 "const_m1_to_1_operand" "n"))) (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))] "(TARGET_P10_FUSION)" @@ -157,12 +158,13 @@ && (cc_reg_not_cr0_operand (operands[2], CCmode) || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0), SImode, NON_PREFIXED_DS))" - [(set (match_dup 0) (match_dup 1)) + [(set (match_dup 0) (sign_extend:DI (match_dup 1))) (set (match_dup 2) (compare:CC (match_dup 0) (match_dup 3)))] "" [(set_attr "type" "fused_load_cmpi") (set_attr "cost" "8") + (set_attr "sign_extend" "yes") (set_attr "length" "8")]) ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 @@ -190,7 +192,7 @@ ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:SI 1 "non_update_memory_operand" "YZ") (match_operand:SI 3 "const_m1_to_1_operand" "n"))) (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))] "(TARGET_P10_FUSION)" @@ -205,6 +207,7 @@ "" [(set_attr "type" "fused_load_cmpi") (set_attr "cost" "8") + (set_attr "sign_extend" "yes") (set_attr "length" "8")]) ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl index 82e8f863b02..31ee54aea93 100755 --- a/gcc/config/rs6000/genfusion.pl +++ b/gcc/config/rs6000/genfusion.pl @@ -61,15 +61,23 @@ sub gen_ld_cmpi_p10_one my $mempred = "non_update_memory_operand"; my $extend; + # We need to special case lwa. The prefixed_load_p function in rs6000.cc + # (which determines if a load instruction is prefixed) uses the fact that the + # register mode is different from the memory mode, and that the sign_extend + # attribute is set to use DS-form rules for the address instead of D-form. + # If the register size is the same, prefixed_load_p assumes we are doing a + # lwz. + my $lwa_insn = ($lmode eq "SI" && $ccmode eq "CC"); + if ($ccmode eq "CC") { # ld and lwa are both DS-FORM. ($lmode =~ /^[SD]I$/) and $np = "NON_PREFIXED_DS"; - ($lmode =~ /^[SD]I$/) and $mempred = "ds_form_mem_operand"; +# ($lmode =~ /^[SD]I$/) and $mempred = "ds_form_mem_operand"; } else { if ($lmode eq "DI") { # ld is DS-form, but lwz is not. $np = "NON_PREFIXED_DS"; - $mempred = "ds_form_mem_operand"; + # $mempred = "ds_form_mem_operand"; } } @@ -81,7 +89,9 @@ sub gen_ld_cmpi_p10_one # For clobber, we need a SI/DI reg in case we # split because we have to sign/zero extend. - my $clobbermode = ($lmode =~ /^[QH]I$/) ? "GPR" : $lmode; + my $clobbermode = (($lmode =~ /^[QH]I$/) + ? "GPR" + : ($lwa_insn ? "DI" : $lmode)); if ($result =~ /^EXT/ || $result eq "GPR" || $clobbermode eq "GPR") { # We always need extension if result > lmode. $extend = ($ccmode eq "CC") ? "sign" : "zero"; @@ -91,12 +101,15 @@ sub gen_ld_cmpi_p10_one } my $ldst = mode_to_ldst_char($lmode); + + # DS-form addresses need YZ, and not m. + my $constraint = ($np eq "NON_PREFIXED_DS") ? "YZ" : "m"; print <<HERE; ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 ;; load mode is $lmode result mode is $result compare mode is $ccmode extend is $extend (define_insn_and_split "*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}" [(set (match_operand:${ccmode} 2 "cc_reg_operand" "=x") - (compare:${ccmode} (match_operand:${lmode} 1 "${mempred}" "m") + (compare:${ccmode} (match_operand:${lmode} 1 "${mempred}" "${constraint}") HERE print " " if $ccmode eq "CCUNS"; print <<HERE; @@ -126,7 +139,12 @@ HERE ${lmode}mode, ${np}))" HERE - if ($extend eq "none") { + # prefixed_load_p needs to see the register mode being different than the + # memory insn in order to validate lwa as a DS-form instruction and not a + # D-form instruction. + if ($lwa_insn && $extend eq "none") { + print " [(set (match_dup 0) (sign_extend:${clobbermode} (match_dup 1)))\n"; + } elsif ($extend eq "none") { print " [(set (match_dup 0) (match_dup 1))\n"; } elsif ($result eq "clobber") { print " [(set (match_dup 0) (${extend}_extend:${clobbermode} (match_dup 1)))\n"; @@ -140,6 +158,15 @@ HERE "" [(set_attr "type" "fused_load_cmpi") (set_attr "cost" "8") +HERE + + if ($lwa_insn) { + # prefixed_load_p needs the sign_extend attribute to validate lwa as a + # DS-form instruction instead of D-form. + print " (set_attr \"sign_extend\" \"yes\")\n"; + } + + print <<HERE (set_attr "length" "8")]) HERE diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index a16ee30f0c0..6b564837c6e 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -1125,20 +1125,6 @@ return INTVAL (offset) % 4 == 0; }) -;; Return 1 if the operand is a memory operand that has a valid address for -;; a DS-form instruction. I.e. the address has to be either just a register, -;; or register + const where the two low order bits of const are zero. -(define_predicate "ds_form_mem_operand" - (match_code "subreg,mem") -{ - if (!any_memory_operand (op, mode)) - return false; - - rtx addr = XEXP (op, 0); - - return address_to_insn_form (addr, mode, NON_PREFIXED_DS) == INSN_FORM_DS; -}) - ;; Return 1 if the operand, used inside a MEM, is a SYMBOL_REF. (define_predicate "symbol_ref_operand" (and (match_code "symbol_ref") diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index b0db8ae508d..75c5e5fc93d 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -287,7 +287,7 @@ ;; Whether this insn has a prefixed form and a non-prefixed form. (define_attr "maybe_prefixed" "no,yes" (if_then_else (eq_attr "type" "load,fpload,vecload,store,fpstore,vecstore, - integer,add") + integer,add,fused_load_cmpi") (const_string "yes") (const_string "no"))) @@ -302,7 +302,7 @@ (eq_attr "maybe_prefixed" "no")) (const_string "no") - (eq_attr "type" "load,fpload,vecload") + (eq_attr "type" "load,fpload,vecload,fused_load_cmpi") (if_then_else (match_test "prefixed_load_p (insn)") (const_string "yes") (const_string "no")) diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C b/gcc/testsuite/g++.target/powerpc/pr105325.C new file mode 100644 index 00000000000..d0e66a0b897 --- /dev/null +++ b/gcc/testsuite/g++.target/powerpc/pr105325.C @@ -0,0 +1,26 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target lp64 } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target powerpc_prefixed_addr } */ +/* { dg-options "-O2 -mdejagnu-cpu=power10 -fstack-protector" } */ + +/* Test that power10 fusion does not generate an LWA/CMPDI instruction pair + instead of PLWZ/CMPWI. Ultimately the code was dying because the fusion + load + compare -1/0/1 patterns did not handle the possibility that the load + might be prefixed. The -fstack-protector option is needed to show the + bug. */ + +struct Ath__array1D { + int _current; + int getCnt() { return _current; } +}; +struct extMeasure { + int _mapTable[10000]; + Ath__array1D _metRCTable; +}; +void measureRC() { + extMeasure m; + for (; m._metRCTable.getCnt();) + for (;;) + ; +} diff --git a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c index 526a026d874..3efbb34f2b4 100644 --- a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c +++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c @@ -54,14 +54,14 @@ TEST(uint8_t) TEST(int8_t) /* { dg-final { scan-assembler-times "lbz_cmpldi_cr0_QI_clobber_CCUNS_zero" 4 { target lp64 } } } */ -/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_DI_CC_none" 4 { target lp64 } } } */ -/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_clobber_CC_none" 4 { target lp64 } } } */ -/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_DI_CCUNS_none" 1 { target lp64 } } } */ -/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_clobber_CCUNS_none" 1 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_DI_CC_none" 24 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_clobber_CC_none" 8 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_DI_CCUNS_none" 2 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_clobber_CCUNS_none" 2 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign" 16 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero" 4 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign" 0 { target lp64 } } } */ -/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 4 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 8 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero" 0 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none" 2 { target lp64 } } } */ @@ -73,6 +73,8 @@ TEST(int8_t) /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign" 8 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero" 2 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign" 0 { target ilp32 } } } */ -/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 9 { target ilp32 } } } */ +/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_SI_CC_none" 36 { target ilp32 } } } */ +/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 16 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero" 0 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none" 6 { target ilp32 } } } */ +/* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_SI_CCUNS_none" 0 { target ilp32 } } } */
next reply other threads:[~2023-06-13 17:21 UTC|newest] Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top 2023-06-13 17:21 Michael Meissner [this message] -- strict thread matches above, loose matches on Subject: below -- 2023-06-13 23:24 Michael Meissner 2023-06-13 2:57 Michael Meissner 2023-06-09 21:04 Michael Meissner 2023-06-09 6:08 Michael Meissner 2023-06-09 4:17 Michael Meissner 2023-06-09 1:32 Michael Meissner 2023-06-08 21:20 Michael Meissner 2023-06-08 20:23 Michael Meissner 2023-06-08 16:53 Michael Meissner 2023-06-08 14:38 Michael Meissner 2023-06-07 20:58 Michael Meissner 2023-06-07 18:56 Michael Meissner
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20230613172133.2B2CB3858D38@sourceware.org \ --to=meissner@gcc.gnu.org \ --cc=gcc-cvs@gcc.gnu.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).