public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc(refs/users/meissner/heads/work114)] Make load/cmp fusion know about prefixed loads.
@ 2023-03-23 19:35 Michael Meissner
  0 siblings, 0 replies; only message in thread
From: Michael Meissner @ 2023-03-23 19:35 UTC (permalink / raw)
  To: gcc-cvs

https://gcc.gnu.org/g:235da76df84024ceec07142599ad998b5ac62b3c

commit 235da76df84024ceec07142599ad998b5ac62b3c
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Thu Mar 23 15:34:38 2023 -0400

    Make load/cmp fusion know about prefixed loads.
    
    The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion
    optimization generates illegal assembler code.
    
    Ultimately the code was dying because the fusion load + compare -1/0/1 patterns
    did not handle the possibility that the load might be prefixed.
    
    The main cause is the prefixed attribute did not consider that fused_load_cmpi
    insns are essentially load instructions, and to check whether the load is
    prefixed.
    
    This code ensures that the prefixed attribute is correctly set for the fusion
    load plus compare immediate insns combined instruction.  This means it will
    split the insn before final is called, and the load instruction will use a
    prefixed load.
    
    The original patch by Aaron reworked the insns generated by genfusion.pl so
    that they had constraints that limited the load to be YZ, which are constraints
    that restrict the load to offsets that the non-prefixed LWA instruction can
    handle.  I will submit that patch as a second patch.  However, just setting the
    prefixed attribute correctly will correctly split the insns.
    
    2023-03-23   Michael Meissner  <meissner@linux.ibm.com>
    
    gcc/
    
            PR target/105325
            * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi
            instructions to the list of instructions that might have a prefixed load
            instruction.
    
    gcc/testsuite/
    
            PR target/105325
            * g++.target/powerpc/pr105325.C: New test.

Diff:
---
 gcc/config/rs6000/fusion.md                        | 17 ++++++++------
 gcc/config/rs6000/genfusion.pl                     | 26 ++++++++++++++++++----
 gcc/config/rs6000/rs6000.md                        |  2 +-
 gcc/testsuite/g++.target/powerpc/pr105325.C        | 24 ++++++++++++++++++++
 .../gcc.target/powerpc/fusion-p10-ldcmpi.c         |  4 ++--
 5 files changed, 59 insertions(+), 14 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index d45fb138a70..da9953d9ad9 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -22,7 +22,7 @@
 ;; load mode is DI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -43,7 +43,7 @@
 ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -64,7 +64,7 @@
 ;; load mode is DI result mode is DI compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -85,7 +85,7 @@
 ;; load mode is DI result mode is DI compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -106,7 +106,7 @@
 ;; load mode is SI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (clobber (match_scratch:SI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -148,7 +148,7 @@
 ;; load mode is SI result mode is SI compare mode is CC extend is none
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -190,7 +190,7 @@
 ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
   "(TARGET_P10_FUSION)"
@@ -205,6 +205,7 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
@@ -247,6 +248,7 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
@@ -289,6 +291,7 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index e4db352e0ce..4f367cadc52 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -56,7 +56,7 @@ sub mode_to_ldst_char
 sub gen_ld_cmpi_p10
 {
     my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred,
-	$mempred, $ccmode, $np, $extend, $resultmode);
+	$mempred, $ccmode, $np, $extend, $resultmode, $constraint);
   LMODE: foreach $lmode ('DI','SI','HI','QI') {
       $ldst = mode_to_ldst_char($lmode);
       $clobbermode = $lmode;
@@ -71,21 +71,34 @@ sub gen_ld_cmpi_p10
       CCMODE: foreach $ccmode ('CC','CCUNS') {
 	  $np = "NON_PREFIXED_D";
 	  $mempred = "non_update_memory_operand";
+	  $constraint = "m";
 	  if ( $ccmode eq 'CC' ) {
 	      next CCMODE if $lmode eq 'QI';
-	      if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
+	      if ( $lmode eq 'HI' ) {
+		  $np = "NON_PREFIXED_D";
+		  $mempred = "non_update_memory_operand";
+		  $echr = "a";
+	      } elsif ( $lmode eq 'SI' ) {
+		  # ld and lwa are both DS-FORM.
+		  $np = "NON_PREFIXED_DS";
+		  $mempred = "lwa_operand";
+		  $echr = "a";
+		  $constraint = "YZ";
+	      } elsif ( $lmode eq 'DI' ) {
 		  # ld and lwa are both DS-FORM.
 		  $np = "NON_PREFIXED_DS";
 		  $mempred = "ds_form_mem_operand";
+		  $echr = "";
+		  $constraint = "YZ";
 	      }
 	      $cmpl = "";
-	      $echr = "a";
 	      $constpred = "const_m1_to_1_operand";
 	  } else {
 	      if ( $lmode eq 'DI' ) {
 		  # ld is DS-form, but lwz is not.
 		  $np = "NON_PREFIXED_DS";
 		  $mempred = "ds_form_mem_operand";
+		  $constraint = "YZ";
 	      }
 	      $cmpl = "l";
 	      $echr = "z";
@@ -108,7 +121,7 @@ sub gen_ld_cmpi_p10
 
 	  print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";
 	  print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
-	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"m\")\n";
+	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"${constraint}\")\n";
 	  if ($ccmode eq 'CCUNS') { print "   "; }
 	  print "                    (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n";
 	  if ($result eq 'clobber') {
@@ -137,6 +150,11 @@ sub gen_ld_cmpi_p10
 	  print "  \"\"\n";
 	  print "  [(set_attr \"type\" \"fused_load_cmpi\")\n";
 	  print "   (set_attr \"cost\" \"8\")\n";
+
+	  if ($extend eq "sign") {
+		  print "   (set_attr \"sign_extend\" \"yes\")\n";
+	  }
+
 	  print "   (set_attr \"length\" \"8\")])\n";
 	  print "\n";
       }
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 81bffb04ceb..0f809c3793f 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -302,7 +302,7 @@
 	      (eq_attr "maybe_prefixed" "no"))
 	 (const_string "no")
 
-	 (eq_attr "type" "load,fpload,vecload")
+	 (eq_attr "type" "load,fpload,vecload,vecload,fused_load_cmpi")
 	 (if_then_else (match_test "prefixed_load_p (insn)")
 		       (const_string "yes")
 		       (const_string "no"))
diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C b/gcc/testsuite/g++.target/powerpc/pr105325.C
new file mode 100644
index 00000000000..f4ab384daa7
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr105325.C
@@ -0,0 +1,24 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -fstack-protector" } */
+
+/* Test that power10 fusion does not generate an LWA/CMPDI instruction pair
+   instead of PLWZ/CMPWI.  Ultimately the code was dying because the fusion
+   load + compare -1/0/1 patterns did not handle the possibility that the load
+   might be prefixed.  */
+
+struct Ath__array1D {
+  int _current;
+  int getCnt() { return _current; }
+};
+struct extMeasure {
+  int _mapTable[10000];
+  Ath__array1D _metRCTable;
+};
+void measureRC() {
+  extMeasure m;
+  for (; m._metRCTable.getCnt();)
+    for (;;)
+      ;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
index 526a026d874..ca7297375a4 100644
--- a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
+++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
@@ -61,7 +61,7 @@ TEST(int8_t)
 /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign"      16 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"   4 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign"         0 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"       4 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"       8 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"     0 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none"   2 { target lp64 } } } */
 
@@ -73,6 +73,6 @@ TEST(int8_t)
 /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign"       8 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"   2 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign"         0 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"       9 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"      16 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"     0 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none"   6 { target ilp32 } } } */

^ permalink raw reply	[flat|nested] only message in thread

only message in thread, other threads:[~2023-03-23 19:35 UTC | newest]

Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-03-23 19:35 [gcc(refs/users/meissner/heads/work114)] Make load/cmp fusion know about prefixed loads Michael Meissner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).