public inbox for gcc-cvs@sourceware.org help / color / mirror / Atom feed
From: Michael Meissner <meissner@gcc.gnu.org> To: gcc-cvs@gcc.gnu.org Subject: [gcc(refs/users/meissner/heads/work122)] Fix power10 fusion and -fstack-protector, PR target/105325 Date: Tue, 13 Jun 2023 23:24:53 +0000 (GMT) [thread overview] Message-ID: <20230613232453.CEA3F3858CDA@sourceware.org> (raw) https://gcc.gnu.org/g:4070564c15613cbb757ebb9b2debd3508458aa0c commit 4070564c15613cbb757ebb9b2debd3508458aa0c Author: Michael Meissner <meissner@linux.ibm.com> Date: Tue Jun 13 19:24:28 2023 -0400 Fix power10 fusion and -fstack-protector, PR target/105325 This patch fixes an issue where if you use the -fstack-protector and -mcpu=power10 options and you have a large stack frame, the GCC compiler will generate a LWA instruction with a large offset. The important thing in the bug is that -fstack-protector is used, but it could potentially happen with fused load-compare to any stack location when the stack frame is larger than 32K without -fstack-protector. What happens is the initial insn that is created is: (insn 6 5 7 2 (parallel [ (set (reg:CC 119) (compare:CC (mem/c:SI (plus:DI (reg/f:DI 110 sfp) (const_int -4)) (const_int 0 [0]))) (clobber (scratch:DI)) ]) (nil)) After the stack size is finalized, the frame pointer removed, and the post reload phase is run, the insn is now: (insn 6 5 7 2 (parallel [ (set (reg:CC 100 0 [119]) (compare:CC (mem/c:SI (plus:DI (reg/f:DI 1 1) (const_int 40044)) (const_int 0 [0]))) (clobber (reg:DI 9 9 [120])) ]) (nil)) When the split2 pass is run after reload has finished the ds_form_mem_operand predicate that used for lwa and ld no longer returns true. This means that since the operand predicates aren't recognized, it won't be split. The solution involves: 1) Don't use ds_form_mem_operand for ld and lwa, always use non_update_memory_operand. 2) Delete ds_form_mem_operand since it is no longer used. 3) Use the "YZ" constraints for ld/lwa instead of "m". 4) If we don't need to sign extend the lwa, convert it to lwz, and use cmpwi instead of cmpdi. Adjust the insn name to reflect the code generate. 5) Insure that the insn using lwa will be recognized as having a prefixed operand (and hence the instruction length is 16 bytes instead of 8 bytes). 5a) Set the prefixed and maybe_prefix attributes to know that fused_load_cmpi are also load insns; 5b) In the case where we are just setting CC and not using the memory afterward, set the clobber to use a DI register, and put an explicit sign_extend operation in the split; 5c) Set the sign_extend attribute to "yes". 5d) 5a-5c are the things that prefixed_load_p in rs6000.cc checks to ensure that lwa is treated as a ds-form instruction and not as a d-form instruction (i.e. lwz). 6) Add a new test case for this case. 7) Adjust the insn counts in fusion-p10-ldcmpi.c. Because we are no longer using ds_form_mem_operand, the ld and lwa instructions will fuse x-form (reg+reg) addresses in addition ds-form (reg+offset or reg). 2023-06-12 Michael Meissner <meissner@linux.ibm.com> gcc/ * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): Fix problems that allowed prefixed lwa to be generated. * config/rs6000/fusion.md: Regenerate. * config/rs6000/predicates.md (ds_form_mem_operand): Delete. * config/rs6000/rs6000.md (prefixed attribute): Add support for load plus compare immediate fused insns. (maybe_prefixed): Likewise. gcc/testsuite/ * g++.target/powerpc/pr105325.C: New test. * gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c: Update insn counts. Diff: --- gcc/config/rs6000/fusion.md | 27 ++++++++-------- gcc/config/rs6000/genfusion.pl | 36 +++++++++++++++++----- gcc/config/rs6000/predicates.md | 14 --------- gcc/config/rs6000/rs6000.md | 4 +-- gcc/testsuite/g++.target/powerpc/pr105325.C | 26 ++++++++++++++++ .../gcc.target/powerpc/fusion-p10-ldcmpi.c | 16 ++++++---- 6 files changed, 81 insertions(+), 42 deletions(-) diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md index d45fb138a70..e286bf526a2 100644 --- a/gcc/config/rs6000/fusion.md +++ b/gcc/config/rs6000/fusion.md @@ -22,7 +22,7 @@ ;; load mode is DI result mode is clobber compare mode is CC extend is none (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:DI 1 "non_update_memory_operand" "YZ") (match_operand:DI 3 "const_m1_to_1_operand" "n"))) (clobber (match_scratch:DI 0 "=r"))] "(TARGET_P10_FUSION)" @@ -43,7 +43,7 @@ ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "YZ") (match_operand:DI 3 "const_0_to_1_operand" "n"))) (clobber (match_scratch:DI 0 "=r"))] "(TARGET_P10_FUSION)" @@ -64,7 +64,7 @@ ;; load mode is DI result mode is DI compare mode is CC extend is none (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:DI 1 "non_update_memory_operand" "YZ") (match_operand:DI 3 "const_m1_to_1_operand" "n"))) (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] "(TARGET_P10_FUSION)" @@ -85,7 +85,7 @@ ;; load mode is DI result mode is DI compare mode is CCUNS extend is none (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CCUNS (match_operand:DI 1 "non_update_memory_operand" "YZ") (match_operand:DI 3 "const_0_to_1_operand" "n"))) (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] "(TARGET_P10_FUSION)" @@ -104,17 +104,17 @@ ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 ;; load mode is SI result mode is clobber compare mode is CC extend is none -(define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none" +(define_insn_and_split "*lwz_cmpwi_cr0_SI_clobber_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m") (match_operand:SI 3 "const_m1_to_1_operand" "n"))) (clobber (match_scratch:SI 0 "=r"))] "(TARGET_P10_FUSION)" - "lwa%X1 %0,%1\;cmpdi %2,%0,%3" + "lwz%X1 %0,%1\;cmpwi %2,%0,%3" "&& reload_completed && (cc_reg_not_cr0_operand (operands[2], CCmode) || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0), - SImode, NON_PREFIXED_DS))" + SImode, NON_PREFIXED_D))" [(set (match_dup 0) (match_dup 1)) (set (match_dup 2) (compare:CC (match_dup 0) (match_dup 3)))] @@ -146,17 +146,17 @@ ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 ;; load mode is SI result mode is SI compare mode is CC extend is none -(define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none" +(define_insn_and_split "*lwz_cmpwi_cr0_SI_SI_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m") (match_operand:SI 3 "const_m1_to_1_operand" "n"))) (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))] "(TARGET_P10_FUSION)" - "lwa%X1 %0,%1\;cmpdi %2,%0,%3" + "lwz%X1 %0,%1\;cmpwi %2,%0,%3" "&& reload_completed && (cc_reg_not_cr0_operand (operands[2], CCmode) || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0), - SImode, NON_PREFIXED_DS))" + SImode, NON_PREFIXED_D))" [(set (match_dup 0) (match_dup 1)) (set (match_dup 2) (compare:CC (match_dup 0) (match_dup 3)))] @@ -190,7 +190,7 @@ ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:SI 1 "non_update_memory_operand" "YZ") (match_operand:SI 3 "const_m1_to_1_operand" "n"))) (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))] "(TARGET_P10_FUSION)" @@ -205,6 +205,7 @@ "" [(set_attr "type" "fused_load_cmpi") (set_attr "cost" "8") + (set_attr "sign_extend" "yes") (set_attr "length" "8")]) ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl index 82e8f863b02..c8b232847a8 100755 --- a/gcc/config/rs6000/genfusion.pl +++ b/gcc/config/rs6000/genfusion.pl @@ -61,20 +61,30 @@ sub gen_ld_cmpi_p10_one my $mempred = "non_update_memory_operand"; my $extend; + # We need to special case lwa. The prefixed_load_p function in rs6000.cc + # (which determines if a load instruction is prefixed) uses the fact that the + # register mode is different from the memory mode, and that the sign_extend + # attribute is set to use DS-form rules for the address instead of D-form. + # If the register size is the same, prefixed_load_p assumes we are doing a + # lwz. We change to use an lwz and word compare if we don't need to sign + # extend the SImode value. Otherwise if we need the value, we need to + # make sure the insn is marked as ds-form. + my $lwa_insn = ($lmode eq "SI" && $ccmode eq "CC"); + my $cmp_size = ($lwa_insn && $result !~ /^EXT|^DI$/) ? "w" : "d"; + if ($ccmode eq "CC") { # ld and lwa are both DS-FORM. - ($lmode =~ /^[SD]I$/) and $np = "NON_PREFIXED_DS"; - ($lmode =~ /^[SD]I$/) and $mempred = "ds_form_mem_operand"; + ($lmode eq "DI") and $np = "NON_PREFIXED_DS"; + ($lmode eq "SI" && $cmp_size eq "d") and $np = "NON_PREFIXED_DS"; } else { if ($lmode eq "DI") { # ld is DS-form, but lwz is not. $np = "NON_PREFIXED_DS"; - $mempred = "ds_form_mem_operand"; } } my $cmpl = ($ccmode eq "CC") ? "" : "l"; - my $echr = ($ccmode eq "CC") ? "a" : "z"; + my $echr = ($ccmode eq "CC" && $cmp_size eq "d") ? "a" : "z"; if ($lmode eq "DI") { $echr = ""; } my $constpred = ($ccmode eq "CC") ? "const_m1_to_1_operand" : "const_0_to_1_operand"; @@ -91,12 +101,15 @@ sub gen_ld_cmpi_p10_one } my $ldst = mode_to_ldst_char($lmode); + + # DS-form addresses need YZ, and not m. + my $constraint = ($np eq "NON_PREFIXED_DS") ? "YZ" : "m"; print <<HERE; ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 ;; load mode is $lmode result mode is $result compare mode is $ccmode extend is $extend -(define_insn_and_split "*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}" +(define_insn_and_split "*l${ldst}${echr}_cmp${cmpl}${cmp_size}i_cr0_${lmode}_${result}_${ccmode}_${extend}" [(set (match_operand:${ccmode} 2 "cc_reg_operand" "=x") - (compare:${ccmode} (match_operand:${lmode} 1 "${mempred}" "m") + (compare:${ccmode} (match_operand:${lmode} 1 "${mempred}" "${constraint}") HERE print " " if $ccmode eq "CCUNS"; print <<HERE; @@ -119,7 +132,7 @@ HERE print <<HERE; "(TARGET_P10_FUSION)" - "l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}di %2,%0,%3" + "l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}${cmp_size}i %2,%0,%3" "&& reload_completed && (cc_reg_not_cr0_operand (operands[2], CCmode) || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0), @@ -140,6 +153,15 @@ HERE "" [(set_attr "type" "fused_load_cmpi") (set_attr "cost" "8") +HERE + + if ($lwa_insn && $cmp_size eq "d") { + # prefixed_load_p needs the sign_extend attribute to validate lwa as a + # DS-form instruction instead of D-form. + print " (set_attr \"sign_extend\" \"yes\")\n"; + } + + print <<HERE (set_attr "length" "8")]) HERE diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md index a16ee30f0c0..6b564837c6e 100644 --- a/gcc/config/rs6000/predicates.md +++ b/gcc/config/rs6000/predicates.md @@ -1125,20 +1125,6 @@ return INTVAL (offset) % 4 == 0; }) -;; Return 1 if the operand is a memory operand that has a valid address for -;; a DS-form instruction. I.e. the address has to be either just a register, -;; or register + const where the two low order bits of const are zero. -(define_predicate "ds_form_mem_operand" - (match_code "subreg,mem") -{ - if (!any_memory_operand (op, mode)) - return false; - - rtx addr = XEXP (op, 0); - - return address_to_insn_form (addr, mode, NON_PREFIXED_DS) == INSN_FORM_DS; -}) - ;; Return 1 if the operand, used inside a MEM, is a SYMBOL_REF. (define_predicate "symbol_ref_operand" (and (match_code "symbol_ref") diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index b0db8ae508d..75c5e5fc93d 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -287,7 +287,7 @@ ;; Whether this insn has a prefixed form and a non-prefixed form. (define_attr "maybe_prefixed" "no,yes" (if_then_else (eq_attr "type" "load,fpload,vecload,store,fpstore,vecstore, - integer,add") + integer,add,fused_load_cmpi") (const_string "yes") (const_string "no"))) @@ -302,7 +302,7 @@ (eq_attr "maybe_prefixed" "no")) (const_string "no") - (eq_attr "type" "load,fpload,vecload") + (eq_attr "type" "load,fpload,vecload,fused_load_cmpi") (if_then_else (match_test "prefixed_load_p (insn)") (const_string "yes") (const_string "no")) diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C b/gcc/testsuite/g++.target/powerpc/pr105325.C new file mode 100644 index 00000000000..d0e66a0b897 --- /dev/null +++ b/gcc/testsuite/g++.target/powerpc/pr105325.C @@ -0,0 +1,26 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target lp64 } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-require-effective-target powerpc_prefixed_addr } */ +/* { dg-options "-O2 -mdejagnu-cpu=power10 -fstack-protector" } */ + +/* Test that power10 fusion does not generate an LWA/CMPDI instruction pair + instead of PLWZ/CMPWI. Ultimately the code was dying because the fusion + load + compare -1/0/1 patterns did not handle the possibility that the load + might be prefixed. The -fstack-protector option is needed to show the + bug. */ + +struct Ath__array1D { + int _current; + int getCnt() { return _current; } +}; +struct extMeasure { + int _mapTable[10000]; + Ath__array1D _metRCTable; +}; +void measureRC() { + extMeasure m; + for (; m._metRCTable.getCnt();) + for (;;) + ; +} diff --git a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c index 526a026d874..165bd9a07ad 100644 --- a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c +++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c @@ -54,15 +54,17 @@ TEST(uint8_t) TEST(int8_t) /* { dg-final { scan-assembler-times "lbz_cmpldi_cr0_QI_clobber_CCUNS_zero" 4 { target lp64 } } } */ -/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_DI_CC_none" 4 { target lp64 } } } */ -/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_clobber_CC_none" 4 { target lp64 } } } */ -/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_DI_CCUNS_none" 1 { target lp64 } } } */ -/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_clobber_CCUNS_none" 1 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_DI_CC_none" 24 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_clobber_CC_none" 8 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_DI_CCUNS_none" 2 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_clobber_CCUNS_none" 2 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign" 16 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero" 4 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign" 0 { target lp64 } } } */ -/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 4 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "lwz_cmpwi_cr0_SI_clobber_CC_none" 8 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "lwz_cmpwi_cr0_SI_SI_CC_none" 8 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero" 0 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_SI_CCUNS_none" 2 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none" 2 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lbz_cmpldi_cr0_QI_clobber_CCUNS_zero" 2 { target ilp32 } } } */ @@ -73,6 +75,8 @@ TEST(int8_t) /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign" 8 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero" 2 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign" 0 { target ilp32 } } } */ -/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 9 { target ilp32 } } } */ +/* { dg-final { scan-assembler-times "lwz_cmpwi_cr0_SI_SI_CC_none" 36 { target ilp32 } } } */ +/* { dg-final { scan-assembler-times "lwz_cmpwi_cr0_SI_clobber_CC_none" 16 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero" 0 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none" 6 { target ilp32 } } } */ +/* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_SI_CCUNS_none" 2 { target ilp32 } } } */
next reply other threads:[~2023-06-13 23:24 UTC|newest] Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top 2023-06-13 23:24 Michael Meissner [this message] -- strict thread matches above, loose matches on Subject: below -- 2023-06-13 17:21 Michael Meissner 2023-06-13 2:57 Michael Meissner 2023-06-09 21:04 Michael Meissner 2023-06-09 6:08 Michael Meissner 2023-06-09 4:17 Michael Meissner 2023-06-09 1:32 Michael Meissner 2023-06-08 21:20 Michael Meissner 2023-06-08 20:23 Michael Meissner 2023-06-08 16:53 Michael Meissner 2023-06-08 14:38 Michael Meissner 2023-06-07 20:58 Michael Meissner 2023-06-07 18:56 Michael Meissner
Reply instructions: You may reply publicly to this message via plain-text email using any one of the following methods: * Save the following mbox file, import it into your mail client, and reply-to-all from there: mbox Avoid top-posting and favor interleaved quoting: https://en.wikipedia.org/wiki/Posting_style#Interleaved_style * Reply using the --to, --cc, and --in-reply-to switches of git-send-email(1): git send-email \ --in-reply-to=20230613232453.CEA3F3858CDA@sourceware.org \ --to=meissner@gcc.gnu.org \ --cc=gcc-cvs@gcc.gnu.org \ /path/to/YOUR_REPLY https://kernel.org/pub/software/scm/git/docs/git-send-email.html * If your mail client supports setting the In-Reply-To header via mailto: links, try the mailto: linkBe sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).