public inbox for gcc-cvs@sourceware.org
From: Michael Meissner <meissner@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work122)] Fix power10 fusion and -fstack-protector, PR target/105325
Date: Thu,  8 Jun 2023 14:38:57 +0000 (GMT)
Message-ID: <20230608143857.B3CB5385702B@sourceware.org>

https://gcc.gnu.org/g:6c1d18a1db648ccce383a6c1ecdb49f08c5e2136

commit 6c1d18a1db648ccce383a6c1ecdb49f08c5e2136
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Thu Jun 8 10:38:25 2023 -0400

    Fix power10 fusion and -fstack-protector, PR target/105325
    
    This patch fixes an issue where, if you use the -fstack-protector and
    -mcpu=power10 options and have a large stack frame, GCC generates an LWA
    instruction with an offset too large for the instruction to encode.
    
    There are several problems:
    
        1)  The prefixed attribute was not checking insns with the type
            fused_load_cmpi for being load insns.
    
        2)  The recognition of LWA for being prefixed looks at the "sign_extend"
            attribute and whether the register mode was different than the memory
            load (i.e. does it have a sign_extend wrapper around the load).
    
        3)  The constraints in fusion.md (generated by genfusion.pl) use "m" for LWA
            and LD, when they should use "YZ".
    
        4)  There is a lwa_operand that should be used instead of
            ds_form_mem_operand for the LWA instruction.
    
    The main fix is to modify genfusion.pl so that it sets the appropriate
    predicates and constraints.
    
    I also added support in genfusion.pl so that if we are doing an LWA operation
    and just setting the CC bits (throwing away the result of the load after the
    comparison), it generates an LWZ instruction and does a CMPWI instead of CMPDI.
    This way those loads can use the normal D-form restrictions instead of DS-form.
    
    I set the "sign_extend" attribute on the cases that generate LWA.
    
    I modified the "prefixed" attribute so that it also checks fused_load_cmpi.
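
The difference between the D-form, DS-form, and prefixed displacements that the message above relies on can be sketched as follows. This is a minimal illustration, not GCC code: the function names are hypothetical, and the ranges are the usual Power ISA ones (signed 16-bit for D-form, signed 16-bit with the low two bits clear for DS-form, signed 34-bit for prefixed loads).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers (not part of GCC): which load encodings can hold
// a given byte displacement?

// D-form (for example lwz): signed 16-bit displacement.
bool fits_d_form(std::int64_t off) {
  return off >= -32768 && off <= 32767;
}

// DS-form (for example ld, lwa): signed 16-bit displacement that must
// also be a multiple of 4, because the low two bits are not encoded.
bool fits_ds_form(std::int64_t off) {
  return fits_d_form(off) && (off % 4) == 0;
}

// Prefixed loads (for example plwz, plwa, pld): signed 34-bit displacement.
bool fits_prefixed(std::int64_t off) {
  const std::int64_t limit = std::int64_t{1} << 33;
  return off >= -limit && off < limit;
}

// Small self-check using the layout from the new test case, assuming a
// 4-byte int: _metRCTable sits 40000 bytes into struct extMeasure.
void self_check() {
  const std::int64_t member_offset = 10000 * 4;
  assert(!fits_ds_form(member_offset));  // plain lwa cannot reach it
  assert(!fits_d_form(member_offset));   // plain lwz cannot reach it either
  assert(fits_prefixed(member_offset));  // a prefixed load can
}
```

Under these assumptions, an offset such as 6 is acceptable to LWZ but not to LWA, which is why switching the compare-only case to LWZ/CMPWI relaxes the addressing restrictions.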
    
    2023-06-07   Michael Meissner  <meissner@linux.ibm.com>
    
    gcc/
    
            * config/rs6000/genfusion.pl (gen_ld_cmpi_p10_one): Fix constraints and
            predicates for LD and LWA.  Optimize LWA/CMPDI to generate LWZ/CMPWI if
            we don't need the result of the load after the comparison.  Change the
            name of the insn pattern to reflect whether a DImode or SImode register
            is loaded.  Set sign_extend attribute for LWA instruction.
            * config/rs6000/fusion.md: Regenerate.
            * config/rs6000/rs6000.md (prefixed attribute): Treat fused_load_cmpi
            insns as being load insns.
    
    gcc/testsuite/
    
            * g++.target/powerpc/pr105325.C: New test.
            * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust names and insn counts
            for PR target/105325 fix.
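
The selection logic that the ChangeLog entry for gen_ld_cmpi_p10_one describes can be mirrored in a short C++ sketch. This is an illustration only: the real generator is the Perl script genfusion.pl, the struct and function names here are hypothetical, only SImode and DImode loads are covered, and the CCUNS/DImode case is inferred from the regenerated fusion.md rather than from the part of the script shown in the diff.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of what the generator picks for one pattern.
struct FusionChoice {
  std::string load;        // load mnemonic, e.g. "lwz"
  std::string compare;     // compare mnemonic, e.g. "cmpwi"
  std::string predicate;   // memory operand predicate
  std::string constraint;  // memory operand constraint
  bool sign_extend;        // whether the sign_extend attribute is set
};

// lmode is "SI" or "DI"; result is "clobber" or a register mode name;
// ccmode is "CC" (signed compare) or "CCUNS" (unsigned compare).
FusionChoice choose(const std::string& lmode, const std::string& result,
                    const std::string& ccmode) {
  const bool is_signed = (ccmode == "CC");
  std::string ext = (lmode == "DI") ? "" : (is_signed ? "a" : "z");
  std::string cmp_size = "d";
  std::string predicate = "non_update_memory_operand";
  bool ds_form = false;

  if (is_signed && lmode == "SI" && result == "clobber") {
    // Only the CC value is needed: use lwz/cmpwi and avoid DS-form limits.
    ext = "z";
    cmp_size = "w";
  } else if (is_signed) {
    // ld and lwa are both DS-form; lwa gets its own predicate.
    ds_form = true;
    predicate = (lmode == "DI") ? "ds_form_mem_operand" : "lwa_operand";
  } else if (lmode == "DI") {
    // Unsigned compare: ld is DS-form, lwz is not.
    ds_form = true;
    predicate = "ds_form_mem_operand";
  }

  const std::string ldst = (lmode == "DI") ? "d" : "w";
  return {"l" + ldst + ext,
          std::string("cmp") + (is_signed ? "" : "l") + cmp_size + "i",
          predicate,
          ds_form ? "YZ" : "m",
          predicate == "lwa_operand"};
}
```

For example, `choose("SI", "clobber", "CC")` yields lwz/cmpwi with the "m" constraint, while `choose("SI", "SI", "CC")` yields lwa/cmpdi with lwa_operand, "YZ", and sign_extend set, matching the regenerated patterns in fusion.md.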

Diff:
---
 gcc/config/rs6000/fusion.md                        | 22 ++++++++-------
 gcc/config/rs6000/genfusion.pl                     | 33 ++++++++++++++++------
 gcc/config/rs6000/rs6000.md                        |  2 +-
 gcc/testsuite/g++.target/powerpc/pr105325.C        | 26 +++++++++++++++++
 .../gcc.target/powerpc/fusion-p10-ldcmpi.c         | 15 ++++++----
 5 files changed, 73 insertions(+), 25 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index d45fb138a70..5b82c61e959 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -22,7 +22,7 @@
 ;; load mode is DI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -43,7 +43,7 @@
 ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -64,7 +64,7 @@
 ;; load mode is DI result mode is DI compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -85,7 +85,7 @@
 ;; load mode is DI result mode is DI compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -104,17 +104,17 @@
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is SI result mode is clobber compare mode is CC extend is none
-(define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
+(define_insn_and_split "*lwz_cmpwi_cr0_SI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "non_update_memory_operand" "m")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (clobber (match_scratch:SI 0 "=r"))]
   "(TARGET_P10_FUSION)"
-  "lwa%X1 %0,%1\;cmpdi %2,%0,%3"
+  "lwz%X1 %0,%1\;cmpwi %2,%0,%3"
   "&& reload_completed
    && (cc_reg_not_cr0_operand (operands[2], CCmode)
        || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),
-                                      SImode, NON_PREFIXED_DS))"
+                                      SImode, NON_PREFIXED_D))"
   [(set (match_dup 0) (match_dup 1))
    (set (match_dup 2)
         (compare:CC (match_dup 0) (match_dup 3)))]
@@ -148,7 +148,7 @@
 ;; load mode is SI result mode is SI compare mode is CC extend is none
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -163,6 +163,7 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
@@ -190,7 +191,7 @@
 ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
   "(TARGET_P10_FUSION)"
@@ -205,6 +206,7 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index 82e8f863b02..310ac5f359a 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -57,14 +57,26 @@ sub gen_ld_cmpi_p10_one
 {
   my ($lmode, $result, $ccmode) = @_;
 
+  my $cmp_size = "d";
+  my $echr = ($ccmode eq "CC") ? "a" : "z";
+  if ($lmode eq "DI") { $echr = ""; }
   my $np = "NON_PREFIXED_D";
   my $mempred = "non_update_memory_operand";
   my $extend;
 
   if ($ccmode eq "CC") {
-    # ld and lwa are both DS-FORM.
-    ($lmode =~ /^[SD]I$/) and $np = "NON_PREFIXED_DS";
-    ($lmode =~ /^[SD]I$/) and $mempred = "ds_form_mem_operand";
+    # if we would generate lwa and just want the CC value, generate lwz instead
+    # and use cmpwi.  This allows us to avoid the ds-form restrictions on lwa.
+    if ($lmode eq "SI" && $result eq "clobber") {
+      $echr = "z";
+      $cmp_size = "w";
+
+    } else {
+      # ld and lwa are both DS-FORM, use LWA_OPERAND for LWA
+      ($lmode =~ /^[SD]I$/) and $np = "NON_PREFIXED_DS";
+      ($lmode eq "DI") and $mempred = "ds_form_mem_operand";
+      ($lmode eq "SI") and $mempred = "lwa_operand";
+    }
   } else {
     if ($lmode eq "DI") {
       # ld is DS-form, but lwz is not.
@@ -74,8 +86,6 @@ sub gen_ld_cmpi_p10_one
   }
 
   my $cmpl = ($ccmode eq "CC") ? "" : "l";
-  my $echr = ($ccmode eq "CC") ? "a" : "z";
-  if ($lmode eq "DI") { $echr = ""; }
   my $constpred = ($ccmode eq "CC") ? "const_m1_to_1_operand"
   				    : "const_0_to_1_operand";
 
@@ -91,12 +101,13 @@ sub gen_ld_cmpi_p10_one
   }
 
   my $ldst = mode_to_ldst_char($lmode);
+  my $constraint = ($np eq "NON_PREFIXED_DS") ? "YZ" : "m";
   print <<HERE;
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
 ;; load mode is $lmode result mode is $result compare mode is $ccmode extend is $extend
-(define_insn_and_split "*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}"
+(define_insn_and_split "*l${ldst}${echr}_cmp${cmpl}${cmp_size}i_cr0_${lmode}_${result}_${ccmode}_${extend}"
   [(set (match_operand:${ccmode} 2 "cc_reg_operand" "=x")
-        (compare:${ccmode} (match_operand:${lmode} 1 "${mempred}" "m")
+        (compare:${ccmode} (match_operand:${lmode} 1 "${mempred}" "${constraint}")
 HERE
   print "   " if $ccmode eq "CCUNS";
 print <<HERE;
@@ -119,7 +130,7 @@ HERE
 
   print <<HERE;
   "(TARGET_P10_FUSION)"
-  "l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}di %2,%0,%3"
+  "l${ldst}${echr}%X1 %0,%1\\;cmp${cmpl}${cmp_size}i %2,%0,%3"
   "&& reload_completed
    && (cc_reg_not_cr0_operand (operands[2], CCmode)
        || !address_is_non_pfx_d_or_x (XEXP (operands[1], 0),
@@ -140,6 +151,12 @@ HERE
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+HERE
+  # prefixed_load_p looks at sign_extend to deal with lwa.
+  if ($mempred eq "lwa_operand") {
+    print "   (set_attr \"sign_extend\" \"yes\")\n";
+  }
+  print <<HERE;
    (set_attr "length" "8")])
 
 HERE
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b0db8ae508d..ed2f06e56ac 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -302,7 +302,7 @@
 	      (eq_attr "maybe_prefixed" "no"))
 	 (const_string "no")
 
-	 (eq_attr "type" "load,fpload,vecload")
+	 (eq_attr "type" "load,fpload,vecload,fused_load_cmpi")
 	 (if_then_else (match_test "prefixed_load_p (insn)")
 		       (const_string "yes")
 		       (const_string "no"))
diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C b/gcc/testsuite/g++.target/powerpc/pr105325.C
new file mode 100644
index 00000000000..d0e66a0b897
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr105325.C
@@ -0,0 +1,26 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target power10_ok } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -fstack-protector" } */
+
+/* Test that power10 fusion does not generate an LWA/CMPDI instruction pair
+   instead of PLWZ/CMPWI.  Ultimately the code was dying because the fusion
+   load + compare -1/0/1 patterns did not handle the possibility that the load
+   might be prefixed.  The -fstack-protector option is needed to show the
+   bug.  */
+
+struct Ath__array1D {
+  int _current;
+  int getCnt() { return _current; }
+};
+struct extMeasure {
+  int _mapTable[10000];
+  Ath__array1D _metRCTable;
+};
+void measureRC() {
+  extMeasure m;
+  for (; m._metRCTable.getCnt();)
+    for (;;)
+      ;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
index 526a026d874..3a7a16be5d0 100644
--- a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
+++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
@@ -60,19 +60,22 @@ TEST(int8_t)
 /* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_clobber_CCUNS_none"    1 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign"      16 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"   4 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign"         0 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"       4 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"     0 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_SI_CC_none"            8 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_SI_CCUNS_none"        2 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none"   2 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "lwz_cmpdi_cr0_SI_clobber_CC_none"       8 { target lp64 } } } */
 
 /* { dg-final { scan-assembler-times "lbz_cmpldi_cr0_QI_clobber_CCUNS_zero"   2 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "lbz_cmpldi_cr0_QI_GPR_CCUNS_zero"       2 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_DI_CC_none"             0 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "ld_cmpdi_cr0_DI_clobber_CC_none"        0 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_DI_CCUNS_none"         0 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "ld_cmpldi_cr0_DI_clobber_CCUNS_none"    0 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign"       8 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_EXTHI_CC_sign"         4 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"   2 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign"         0 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"       9 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"     0 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_EXTHI_CCUNS_zero"     2 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_SI_CC_none"           36 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "lwz_cmpdi_cr0_SI_clobber_CC_none"      16 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none"   6 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_SI_CCUNS_none"        2 { target ilp32 } } } */
