public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
From: Michael Meissner <meissner@gcc.gnu.org>
To: gcc-cvs@gcc.gnu.org
Subject: [gcc(refs/users/meissner/heads/work116)] Make load/cmp fusion know about prefixed loads.
Date: Fri, 24 Mar 2023 21:29:34 +0000 (GMT)	[thread overview]
Message-ID: <20230324212934.C9D7F3858D28@sourceware.org> (raw)

https://gcc.gnu.org/g:cfc165a045c7907a563cb0d5798a4994770ca166

commit cfc165a045c7907a563cb0d5798a4994770ca166
Author: Michael Meissner <meissner@linux.ibm.com>
Date:   Fri Mar 24 17:29:04 2023 -0400

    Make load/cmp fusion know about prefixed loads.
    
    The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion
    optimization generates illegal assembler code.
    
    Ultimately the code was dying because the fusion load + compare -1/0/1 patterns
    did not handle the possibility that the load might be prefixed.
    
    The main cause is the constraints for the individual loads in the fusion did not
    match the machine.  In particular, LWA is a ds format instruction when it is
    unprefixed.  The code did not also set the prefixed attribute correctly.
    
    This patch rewrites the genfusion.pl script so that it will have more accurate
    constraints for the LWA and LD instructions (which are DS instructions).  The
    updated genfusion.pl was then run to update fusion.md.  Finally, the code for
    the "prefixed" attribute is modified so that it considers load + compare
    immediate patterns to be like the normal load insns in checking whether
    operand[1] is a prefixed instruction.
    
    I posted a version of patch on March 21st.  This patch makes some code changes
    suggested in the genfusion.pl code.
    
    2023-03-21   Michael Meissner  <meissner@linux.ibm.com>
    
    gcc/
    
            PR target/105325
            * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation
            of the ld and lwa instructions which use the DS encoding instead of D.
            Use the YZ constraint for these loads.  Handle prefixed loads better.
            Set the sign_extend attribute as appropriate.
            * gcc/config/rs6000/fusion.md: Regenerate.
            * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi
            instructions to the list of instructions that might have a prefixed load
            instruction.
    
    gcc/testsuite/
    
            PR target/105325
            * g++.target/powerpc/pr105325.C: New test.
            * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts.

Diff:
---
 gcc/config/rs6000/fusion.md                        | 17 +++++++-----
 gcc/config/rs6000/genfusion.pl                     | 32 +++++++++++++++++-----
 gcc/config/rs6000/rs6000.md                        |  2 +-
 gcc/testsuite/g++.target/powerpc/pr105325.C        | 24 ++++++++++++++++
 .../gcc.target/powerpc/fusion-p10-ldcmpi.c         |  4 +--
 5 files changed, 62 insertions(+), 17 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index d45fb138a70..da9953d9ad9 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -22,7 +22,7 @@
 ;; load mode is DI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -43,7 +43,7 @@
 ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (clobber (match_scratch:DI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -64,7 +64,7 @@
 ;; load mode is DI result mode is DI compare mode is CC extend is none
 (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                     (match_operand:DI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -85,7 +85,7 @@
 ;; load mode is DI result mode is DI compare mode is CCUNS extend is none
 (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
   [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
-        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
+        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
                        (match_operand:DI 3 "const_0_to_1_operand" "n")))
    (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -106,7 +106,7 @@
 ;; load mode is SI result mode is clobber compare mode is CC extend is none
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (clobber (match_scratch:SI 0 "=r"))]
   "(TARGET_P10_FUSION)"
@@ -148,7 +148,7 @@
 ;; load mode is SI result mode is SI compare mode is CC extend is none
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
   "(TARGET_P10_FUSION)"
@@ -190,7 +190,7 @@
 ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
 (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
   [(set (match_operand:CC 2 "cc_reg_operand" "=x")
-        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
+        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
                     (match_operand:SI 3 "const_m1_to_1_operand" "n")))
    (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
   "(TARGET_P10_FUSION)"
@@ -205,6 +205,7 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
@@ -247,6 +248,7 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
@@ -289,6 +291,7 @@
   ""
   [(set_attr "type" "fused_load_cmpi")
    (set_attr "cost" "8")
+   (set_attr "sign_extend" "yes")
    (set_attr "length" "8")])
 
 ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index e4db352e0ce..fa9ab7e9704 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -56,7 +56,7 @@ sub mode_to_ldst_char
 sub gen_ld_cmpi_p10
 {
     my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred,
-	$mempred, $ccmode, $np, $extend, $resultmode);
+	$ccmode, $extend, $resultmode);
   LMODE: foreach $lmode ('DI','SI','HI','QI') {
       $ldst = mode_to_ldst_char($lmode);
       $clobbermode = $lmode;
@@ -69,23 +69,36 @@ sub gen_ld_cmpi_p10
 	# Don't allow EXTQI because that would allow HI result which we can't do.
 	$result = "GPR" if $result eq "EXTQI";
       CCMODE: foreach $ccmode ('CC','CCUNS') {
-	  $np = "NON_PREFIXED_D";
-	  $mempred = "non_update_memory_operand";
+	  my $np = "NON_PREFIXED_D";
+	  my $mempred = "non_update_memory_operand";
+	  my $constraint = "m";
 	  if ( $ccmode eq 'CC' ) {
 	      next CCMODE if $lmode eq 'QI';
-	      if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
-		  # ld and lwa are both DS-FORM.
+	      if ( $lmode eq 'HI' ) {
+		  $np = "NON_PREFIXED_D";
+		  $mempred = "non_update_memory_operand";
+		  $echr = "a";
+	      } elsif ( $lmode eq 'SI' ) {
+		  # ld is DS-FORM.
+		  $np = "NON_PREFIXED_DS";
+		  $mempred = "lwa_operand";
+		  $echr = "a";
+		  $constraint = "YZ";
+	      } elsif ( $lmode eq 'DI' ) {
+		  # lwa is DS-FORM.
 		  $np = "NON_PREFIXED_DS";
 		  $mempred = "ds_form_mem_operand";
+		  $echr = "";
+		  $constraint = "YZ";
 	      }
 	      $cmpl = "";
-	      $echr = "a";
 	      $constpred = "const_m1_to_1_operand";
 	  } else {
 	      if ( $lmode eq 'DI' ) {
 		  # ld is DS-form, but lwz is not.
 		  $np = "NON_PREFIXED_DS";
 		  $mempred = "ds_form_mem_operand";
+		  $constraint = "YZ";
 	      }
 	      $cmpl = "l";
 	      $echr = "z";
@@ -108,7 +121,7 @@ sub gen_ld_cmpi_p10
 
 	  print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";
 	  print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
-	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"m\")\n";
+	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"${constraint}\")\n";
 	  if ($ccmode eq 'CCUNS') { print "   "; }
 	  print "                    (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n";
 	  if ($result eq 'clobber') {
@@ -137,6 +150,11 @@ sub gen_ld_cmpi_p10
 	  print "  \"\"\n";
 	  print "  [(set_attr \"type\" \"fused_load_cmpi\")\n";
 	  print "   (set_attr \"cost\" \"8\")\n";
+
+	  if ($extend eq "sign") {
+		  print "   (set_attr \"sign_extend\" \"yes\")\n";
+	  }
+
 	  print "   (set_attr \"length\" \"8\")])\n";
 	  print "\n";
       }
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 44f7dd509cb..d836a8a58b3 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -302,7 +302,7 @@
 	      (eq_attr "maybe_prefixed" "no"))
 	 (const_string "no")
 
-	 (eq_attr "type" "load,fpload,vecload")
+	 (eq_attr "type" "load,fpload,vecload,vecload,fused_load_cmpi")
 	 (if_then_else (match_test "prefixed_load_p (insn)")
 		       (const_string "yes")
 		       (const_string "no"))
diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C b/gcc/testsuite/g++.target/powerpc/pr105325.C
new file mode 100644
index 00000000000..f4ab384daa7
--- /dev/null
+++ b/gcc/testsuite/g++.target/powerpc/pr105325.C
@@ -0,0 +1,24 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target lp64 } */
+/* { dg-require-effective-target powerpc_prefixed_addr } */
+/* { dg-options "-O2 -mdejagnu-cpu=power10 -fstack-protector" } */
+
+/* Test that power10 fusion does not generate an LWA/CMPDI instruction pair
+   instead of PLWZ/CMPWI.  Ultimately the code was dying because the fusion
+   load + compare -1/0/1 patterns did not handle the possibility that the load
+   might be prefixed.  */
+
+struct Ath__array1D {
+  int _current;
+  int getCnt() { return _current; }
+};
+struct extMeasure {
+  int _mapTable[10000];
+  Ath__array1D _metRCTable;
+};
+void measureRC() {
+  extMeasure m;
+  for (; m._metRCTable.getCnt();)
+    for (;;)
+      ;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
index 526a026d874..ca7297375a4 100644
--- a/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
+++ b/gcc/testsuite/gcc.target/powerpc/fusion-p10-ldcmpi.c
@@ -61,7 +61,7 @@ TEST(int8_t)
 /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign"      16 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"   4 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign"         0 { target lp64 } } } */
-/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"       4 { target lp64 } } } */
+/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"       8 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"     0 { target lp64 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none"   2 { target lp64 } } } */
 
@@ -73,6 +73,6 @@ TEST(int8_t)
 /* { dg-final { scan-assembler-times "lha_cmpdi_cr0_HI_clobber_CC_sign"       8 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lhz_cmpldi_cr0_HI_clobber_CCUNS_zero"   2 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_EXTSI_CC_sign"         0 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"       9 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times "lwa_cmpdi_cr0_SI_clobber_CC_none"      16 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_EXTSI_CCUNS_zero"     0 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times "lwz_cmpldi_cr0_SI_clobber_CCUNS_none"   6 { target ilp32 } } } */

             reply	other threads:[~2023-03-24 21:29 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-24 21:29 Michael Meissner [this message]
2023-03-27 20:03 Michael Meissner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230324212934.C9D7F3858D28@sourceware.org \
    --to=meissner@gcc.gnu.org \
    --cc=gcc-cvs@gcc.gnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).