From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: by sourceware.org (Postfix, from userid 1005) id 89B933858CDB; Thu, 23 Mar 2023 20:50:56 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 89B933858CDB DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1679604656; bh=cXUEmDtKDQ1UBkHMpdMWMXowHyfG8hE3zxZaagzHSPg=; h=From:To:Subject:Date:From; b=La9HGl5v5ktqRF1y5dBuBfmVtY64cINj5rLn5UuZvWULKqsR9E1uV3atK7w9QbGnf /eRGwV53z8qEslBPkW3WzrFO6bAJ1DYqgl8RVQqZlIrBsfJSibfj2DVoPlPQFmF+7W iGpYDK/y9bIgC+KtQPPcfgGd2BxxYP62Wdpx5U5g= Content-Type: text/plain; charset="us-ascii" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit From: Michael Meissner To: gcc-cvs@gcc.gnu.org Subject: [gcc(refs/users/meissner/heads/work115)] Make load/cmp fusion know about prefixed loads. X-Act-Checkin: gcc X-Git-Author: Michael Meissner X-Git-Refname: refs/users/meissner/heads/work115 X-Git-Oldrev: c2fbcd1a1b48ab0a9efab3adb88580998ff3e6d8 X-Git-Newrev: 9fd92085fd74a01e09d911f8fab7e12b751527e3 Message-Id: <20230323205056.89B933858CDB@sourceware.org> Date: Thu, 23 Mar 2023 20:50:56 +0000 (GMT) List-Id: https://gcc.gnu.org/g:9fd92085fd74a01e09d911f8fab7e12b751527e3 commit 9fd92085fd74a01e09d911f8fab7e12b751527e3 Author: Michael Meissner Date: Thu Mar 23 16:50:06 2023 -0400 Make load/cmp fusion know about prefixed loads. The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion optimization generates illegal assembler code. Ultimately the code was dying because the fusion load + compare -1/0/1 patterns did not handle the possibility that the load might be prefixed. The main cause is the prefixed attribute did not consider that fused_load_cmpi insns are essentially load instructions, and to check whether the load is prefixed. This code ensures that the prefixed attribute is correctly set for the fusion load plus compare immediate insns combined instruction. This means it will split the insn before final is called, and the load instruction will use a prefixed load. The original patch by Aaron reworked the insns generated by genfusion.pl so that they had constraints that limited the load to be YZ, which are constraints that restrict the load to offsets that the non-prefixed LWA instruction can handle. I will submit that patch as a second patch. However, just setting the prefixed attribute correctly will correctly split the insns. 2023-03-23 Michael Meissner gcc/ PR target/105325 * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi instructions to the list of instructions that might have a prefixed load instruction. gcc/testsuite/ PR target/105325 * g++.target/powerpc/pr105325.C: New test. Diff: --- gcc/config/rs6000/fusion.md | 17 ++++++++------ gcc/config/rs6000/genfusion.pl | 26 ++++++++++++++++++---- gcc/config/rs6000/rs6000.md | 2 +- gcc/testsuite/g++.target/powerpc/pr105325.C | 24 ++++++++++++++++++++ .../gcc.target/powerpc/fusion-p10-ldcmpi.c | 4 ++-- 5 files changed, 59 insertions(+), 14 deletions(-) diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md index d45fb138a70..da9953d9ad9 100644 --- a/gcc/config/rs6000/fusion.md +++ b/gcc/config/rs6000/fusion.md @@ -22,7 +22,7 @@ ;; load mode is DI result mode is clobber compare mode is CC extend is none (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") (match_operand:DI 3 "const_m1_to_1_operand" "n"))) (clobber (match_scratch:DI 0 "=r"))] "(TARGET_P10_FUSION)" @@ -43,7 +43,7 @@ ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none" [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ") (match_operand:DI 3 "const_0_to_1_operand" "n"))) (clobber (match_scratch:DI 0 "=r"))] "(TARGET_P10_FUSION)" @@ -64,7 +64,7 @@ ;; load mode is DI result mode is DI compare mode is CC extend is none (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ") (match_operand:DI 3 "const_m1_to_1_operand" "n"))) (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] "(TARGET_P10_FUSION)" @@ -85,7 +85,7 @@ ;; load mode is DI result mode is DI compare mode is CCUNS extend is none (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none" [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x") - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m") + (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ") (match_operand:DI 3 "const_0_to_1_operand" "n"))) (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))] "(TARGET_P10_FUSION)" @@ -106,7 +106,7 @@ ;; load mode is SI result mode is clobber compare mode is CC extend is none (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") (match_operand:SI 3 "const_m1_to_1_operand" "n"))) (clobber (match_scratch:SI 0 "=r"))] "(TARGET_P10_FUSION)" @@ -148,7 +148,7 @@ ;; load mode is SI result mode is SI compare mode is CC extend is none (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") (match_operand:SI 3 "const_m1_to_1_operand" "n"))) (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))] "(TARGET_P10_FUSION)" @@ -190,7 +190,7 @@ ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign" [(set (match_operand:CC 2 "cc_reg_operand" "=x") - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m") + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ") (match_operand:SI 3 "const_m1_to_1_operand" "n"))) (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))] "(TARGET_P10_FUSION)" @@ -205,6 +205,7 @@ "" [(set_attr "type" "fused_load_cmpi") (set_attr "cost" "8") + (set_attr "sign_extend" "yes") (set_attr "length" "8")]) ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 @@ -247,6 +248,7 @@ "" [(set_attr "type" "fused_load_cmpi") (set_attr "cost" "8") + (set_attr "sign_extend" "yes") (set_attr "length" "8")]) ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 @@ -289,6 +291,7 @@ "" [(set_attr "type" "fused_load_cmpi") (set_attr "cost" "8") + (set_attr "sign_extend" "yes") (set_attr "length" "8")]) ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10 diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl index e4db352e0ce..4f367cadc52 100755 --- a/gcc/config/rs6000/genfusion.pl +++ b/gcc/config/rs6000/genfusion.pl @@ -56,7 +56,7 @@ sub mode_to_ldst_char sub gen_ld_cmpi_p10 { my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred, - $mempred, $ccmode, $np, $extend, $resultmode); + $mempred, $ccmode, $np, $extend, $resultmode, $constraint); LMODE: foreach $lmode ('DI','SI','HI','QI') { $ldst = mode_to_ldst_char($lmode); $clobbermode = $lmode; @@ -71,21 +71,34 @@ sub gen_ld_cmpi_p10 CCMODE: foreach $ccmode ('CC','CCUNS') { $np = "NON_PREFIXED_D"; $mempred = "non_update_memory_operand"; + $constraint = "m"; if ( $ccmode eq 'CC' ) { next CCMODE if $lmode eq 'QI'; - if ( $lmode eq 'DI' || $lmode eq 'SI' ) { + if ( $lmode eq 'HI' ) { + $np = "NON_PREFIXED_D"; + $mempred = "non_update_memory_operand"; + $echr = "a"; + } elsif ( $lmode eq 'SI' ) { + # ld and lwa are both DS-FORM. + $np = "NON_PREFIXED_DS"; + $mempred = "lwa_operand"; + $echr = "a"; + $constraint = "YZ"; + } elsif ( $lmode eq 'DI' ) { # ld and lwa are both DS-FORM. $np = "NON_PREFIXED_DS"; $mempred = "ds_form_mem_operand"; + $echr = ""; + $constraint = "YZ"; } $cmpl = ""; - $echr = "a"; $constpred = "const_m1_to_1_operand"; } else { if ( $lmode eq 'DI' ) { # ld is DS-form, but lwz is not. $np = "NON_PREFIXED_DS"; $mempred = "ds_form_mem_operand"; + $constraint = "YZ"; } $cmpl = "l"; $echr = "z"; @@ -108,7 +121,7 @@ sub gen_ld_cmpi_p10 print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n"; print " [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n"; - print " (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"m\")\n"; + print " (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"${constraint}\")\n"; if ($ccmode eq 'CCUNS') { print " "; } print " (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n"; if ($result eq 'clobber') { @@ -137,6 +150,11 @@ sub gen_ld_cmpi_p10 print " \"\"\n"; print " [(set_attr \"type\" \"fused_load_cmpi\")\n"; print " (set_attr \"cost\" \"8\")\n"; + + if ($extend eq "sign") { + print " (set_attr \"sign_extend\" \"yes\")\n"; + } + print " (set_attr \"length\" \"8\")])\n"; print "\n"; } diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index 44f7dd509cb..d836a8a58b3 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -302,7 +302,7 @@ (eq_attr "maybe_prefixed" "no")) (const_string "no") - (eq_attr "type" "load,fpload,vecload") + (eq_attr "type" "load,fpload,vecload,vecload,fused_load_cmpi") (if_then_else (match_test "prefixed_load_p (insn)") (const_string "yes") (const_string "no")) diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C b/gcc/testsuite/g++.target/powerpc/pr105325.C new file mode 100644 index 00000000000..f4ab384daa7 --- /dev/null +++ b/gcc/testsuite/g++.target/powerpc/pr105325.C @@ -0,0 +1,24 @@ +/* { dg-do assemble } */ +/* { dg-require-effective-target lp64 } */ +/* { dg-require-effective-target powerpc_prefixed_addr } */ +/* { dg-options "-O2 -mdejagnu-cpu=power10 -fstack-protector" } */ + +/* Test that power10 fusion does not generate an LWA/CMPDI instruction pair + instead of PLWZ/CMPWI. Ultimately the code was dying because the fusion + load + compare -1/0/1 patterns did not handle the possibility that the load + might be prefixed. */ + +struct Ath__array1D { + int _current; + int getCnt() { return _current; } +}; +struct extMeasure { + int _mapTable[10000]; + Ath__array1D _metRCTable; +}; +void measureRC() { + extMeasure m; + for (; m._metRCTable.getCnt();) + for (;;) + ; +} diff --git a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c index 526a026d874..ca7297375a4 100644 --- a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c +++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c @@ -61,7 +61,7 @@ TEST(int8_t) /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign" 16 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero" 4 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign" 0 { target lp64 } } } */ -/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 4 { target lp64 } } } */ +/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 8 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero" 0 { target lp64 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none" 2 { target lp64 } } } */ @@ -73,6 +73,6 @@ TEST(int8_t) /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign" 8 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero" 2 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign" 0 { target ilp32 } } } */ -/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 9 { target ilp32 } } } */ +/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none" 16 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero" 0 { target ilp32 } } } */ /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none" 6 { target ilp32 } } } */