public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Michael Meissner <meissner@linux.ibm.com>
Cc: gcc-patches@gcc.gnu.org,
	Segher Boessenkool <segher@kernel.crashing.org>,
	David Edelsohn <dje.gcc@gmail.com>,
	Peter Bergner <bergner@linux.ibm.com>,
	Will Schmidt <will_schmidt@vnet.ibm.com>
Subject: Re: [PATCH] PR target/105325, Make load/cmp fusion know about prefixed loads
Date: Thu, 23 Mar 2023 16:10:22 +0800	[thread overview]
Message-ID: <f641dfc7-46a7-b323-5b82-fb18f05af8a8@linux.ibm.com> (raw)
In-Reply-To: <ZBpDaXeXCMBNxNWk@toto.the-meissners.org>

Hi Mike,

Thanks for fixing this, some minor comments are inlined below.

on 2023/3/22 07:53, Michael Meissner wrote:
> The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion
> optimization generates illegal assembler code.
> 
> Ultimately the code was dying because the fusion load + compare -1/0/1 patterns
> did not handle the possibility that the load might be prefixed.
> 
> The main cause is the constraints for the individual loads in the fusion did not
> match the machine.  In particular, LWA is a ds format instruction when it is
> unprefixed.  The code did not also set the prefixed attribute correctly.
> 
> This patch rewrites the genfusion.pl script so that it will have more accurate
> constraints for the LWA and LD instructions (which are DS instructions).  The
> updated genfusion.pl was then run to update fusion.md.  Finally, the code for
> the "prefixed" attribute is modified so that it considers load + compare
> immediate patterns to be like the normal load insns in checking whether
> operand[1] is a prefixed instruction.
> 
> I have tested this patch on a little endian power10 system, on a little endian
> power9 system, and a big endian power8 system (both -m32 and -m64 tested on
> BE).  There were no regressions, can I check this into the trunk?
> 
> The same patch applies to the gcc-12 and gcc-11 branches.  Can I check this
> patch into those branches also after a burn-in period?
> 
> 2023-03-21   Michael Meissner  <meissner@linux.ibm.com>
>              Aaron Sawdey  <acsawdey@linux.ibm.com>
> 
> gcc/
> 
> 	PR target/105325
> 	* gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation
> 	of the ld and lwa instructions which use the DS encoding instead of D.
> 	Use the YZ constraint for these loads.	Handle prefixed loads better.
> 	Set the sign_extend attribute as appropriate.
> 	* gcc/config/rs6000/fusion.md: Regenerate.
> 	* gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi
> 	instructions to the list of instructions that might have a prefixed load
> 	instruction.
> 
> gcc/testsuite/
> 
> 	PR target/105325
> 	* g++.target/powerpc/pr105325.C: New test.
> 	* gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts.
> ---
>  gcc/config/rs6000/genfusion.pl                | 26 ++++++++++++++++---
>  gcc/config/rs6000/fusion.md                   | 17 +++++++-----
>  gcc/config/rs6000/rs6000.md                   |  2 +-
>  gcc/testsuite/g++.target/powerpc/pr105325.C   | 24 +++++++++++++++++
>  .../gcc.target/powerpc/fusion-p10-ldcmpi.c    |  4 +--
>  5 files changed, 59 insertions(+), 14 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/powerpc/pr105325.C
> 
> diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
> index e4db352e0ce..4f367cadc52 100755
> --- a/gcc/config/rs6000/genfusion.pl
> +++ b/gcc/config/rs6000/genfusion.pl
> @@ -56,7 +56,7 @@ sub mode_to_ldst_char
>  sub gen_ld_cmpi_p10
>  {
>      my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred,
> -	$mempred, $ccmode, $np, $extend, $resultmode);
> +	$mempred, $ccmode, $np, $extend, $resultmode, $constraint);
>    LMODE: foreach $lmode ('DI','SI','HI','QI') {
>        $ldst = mode_to_ldst_char($lmode);
>        $clobbermode = $lmode;
> @@ -71,21 +71,34 @@ sub gen_ld_cmpi_p10
>        CCMODE: foreach $ccmode ('CC','CCUNS') {
>  	  $np = "NON_PREFIXED_D";
>  	  $mempred = "non_update_memory_operand";
> +	  $constraint = "m";

The three assignments on $np $mempred $constraint can be moved
to place (a) (see below) and add one explicit assignment for
$constraint at place (b), since for the condition ccmode eq 'CC',
HI/SI/DI have their own settings (btw QI is skipped), these
assignments for default value can be moved to else arm (for CCUNS).

>  	  if ( $ccmode eq 'CC' ) {
>  	      next CCMODE if $lmode eq 'QI';
> -	      if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
> +	      if ( $lmode eq 'HI' ) {
> +		  $np = "NON_PREFIXED_D";
> +		  $mempred = "non_update_memory_operand";
> +		  $echr = "a";
                  // place b
                  $constraint = "m";
                   
> +	      } elsif ( $lmode eq 'SI' ) {
> +		  # ld and lwa are both DS-FORM.
  
we have broken it into two different arms for SI and DI, this
comment can be removed?

> +		  $np = "NON_PREFIXED_DS";
> +		  $mempred = "lwa_operand";
> +		  $echr = "a";
> +		  $constraint = "YZ";
> +	      } elsif ( $lmode eq 'DI' ) {
>  		  # ld and lwa are both DS-FORM.

... and this comment.

>  		  $np = "NON_PREFIXED_DS";
>  		  $mempred = "ds_form_mem_operand";
> +		  $echr = "";
> +		  $constraint = "YZ";
>  	      }
>  	      $cmpl = "";
> -	      $echr = "a";
>  	      $constpred = "const_m1_to_1_operand";
>  	  } else {

              // place a

>  	      if ( $lmode eq 'DI' ) {
>  		  # ld is DS-form, but lwz is not.
>  		  $np = "NON_PREFIXED_DS";
>  		  $mempred = "ds_form_mem_operand";
> +		  $constraint = "YZ";
>  	      }
>  	      $cmpl = "l";
>  	      $echr = "z";
> @@ -108,7 +121,7 @@ sub gen_ld_cmpi_p10
>  
>  	  print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";
>  	  print "  [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
> -	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"m\")\n";
> +	  print "        (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"${constraint}\")\n";
>  	  if ($ccmode eq 'CCUNS') { print "   "; }
>  	  print "                    (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n";
>  	  if ($result eq 'clobber') {
> @@ -137,6 +150,11 @@ sub gen_ld_cmpi_p10
>  	  print "  \"\"\n";
>  	  print "  [(set_attr \"type\" \"fused_load_cmpi\")\n";
>  	  print "   (set_attr \"cost\" \"8\")\n";
> +
> +	  if ($extend eq "sign") {
> +		  print "   (set_attr \"sign_extend\" \"yes\")\n";
> +	  }
> +
>  	  print "   (set_attr \"length\" \"8\")])\n";
>  	  print "\n";
>        }
> diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
> index d45fb138a70..da9953d9ad9 100644
> --- a/gcc/config/rs6000/fusion.md
> +++ b/gcc/config/rs6000/fusion.md
> @@ -22,7 +22,7 @@
>  ;; load mode is DI result mode is clobber compare mode is CC extend is none
>  (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
>    [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> -        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
> +        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
>                      (match_operand:DI 3 "const_m1_to_1_operand" "n")))
>     (clobber (match_scratch:DI 0 "=r"))]
>    "(TARGET_P10_FUSION)"
> @@ -43,7 +43,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
>  ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
>  (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
>    [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> -        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
> +        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
>                         (match_operand:DI 3 "const_0_to_1_operand" "n")))
>     (clobber (match_scratch:DI 0 "=r"))]
>    "(TARGET_P10_FUSION)"
> @@ -64,7 +64,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
>  ;; load mode is DI result mode is DI compare mode is CC extend is none
>  (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
>    [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> -        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
> +        (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
>                      (match_operand:DI 3 "const_m1_to_1_operand" "n")))
>     (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
>    "(TARGET_P10_FUSION)"
> @@ -85,7 +85,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
>  ;; load mode is DI result mode is DI compare mode is CCUNS extend is none
>  (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
>    [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> -        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
> +        (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
>                         (match_operand:DI 3 "const_0_to_1_operand" "n")))
>     (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
>    "(TARGET_P10_FUSION)"
> @@ -106,7 +106,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
>  ;; load mode is SI result mode is clobber compare mode is CC extend is none
>  (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
>    [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> -        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
> +        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
>                      (match_operand:SI 3 "const_m1_to_1_operand" "n")))
>     (clobber (match_scratch:SI 0 "=r"))]
>    "(TARGET_P10_FUSION)"
> @@ -148,7 +148,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none"
>  ;; load mode is SI result mode is SI compare mode is CC extend is none
>  (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
>    [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> -        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
> +        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
>                      (match_operand:SI 3 "const_m1_to_1_operand" "n")))
>     (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
>    "(TARGET_P10_FUSION)"
> @@ -190,7 +190,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none"
>  ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
>  (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
>    [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> -        (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
> +        (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
>                      (match_operand:SI 3 "const_m1_to_1_operand" "n")))
>     (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
>    "(TARGET_P10_FUSION)"
> @@ -205,6 +205,7 @@ (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
>    ""
>    [(set_attr "type" "fused_load_cmpi")
>     (set_attr "cost" "8")
> +   (set_attr "sign_extend" "yes")
>     (set_attr "length" "8")])
>  
>  ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> @@ -247,6 +248,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign"
>    ""
>    [(set_attr "type" "fused_load_cmpi")
>     (set_attr "cost" "8")
> +   (set_attr "sign_extend" "yes")
>     (set_attr "length" "8")])
>  
>  ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> @@ -289,6 +291,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign"
>    ""
>    [(set_attr "type" "fused_load_cmpi")
>     (set_attr "cost" "8")
> +   (set_attr "sign_extend" "yes")
>     (set_attr "length" "8")])
>  
>  ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 81bffb04ceb..0f809c3793f 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -302,7 +302,7 @@ (define_attr "prefixed" "no,yes"
>  	      (eq_attr "maybe_prefixed" "no"))
>  	 (const_string "no")
>  
> -	 (eq_attr "type" "load,fpload,vecload")
> +	 (eq_attr "type" "load,fpload,vecload,vecload,fused_load_cmpi")
>  	 (if_then_else (match_test "prefixed_load_p (insn)")
>  		       (const_string "yes")
>  		       (const_string "no"))
> diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C b/gcc/testsuite/g++.target/powerpc/pr105325.C
> new file mode 100644
> index 00000000000..f4ab384daa7
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/powerpc/pr105325.C
> @@ -0,0 +1,24 @@
> +/* { dg-do assemble } */
> +/* { dg-require-effective-target lp64 } */

Nit: lp64 seems not necessary.

The others look good to me, thanks!

BR,
Kewen

  reply	other threads:[~2023-03-23  8:10 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-21 23:53 Michael Meissner
2023-03-23  8:10 ` Kewen.Lin [this message]
2023-03-24 23:09   ` Michael Meissner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=f641dfc7-46a7-b323-5b82-fb18f05af8a8@linux.ibm.com \
    --to=linkw@linux.ibm.com \
    --cc=bergner@linux.ibm.com \
    --cc=dje.gcc@gmail.com \
    --cc=gcc-patches@gcc.gnu.org \
    --cc=meissner@linux.ibm.com \
    --cc=segher@kernel.crashing.org \
    --cc=will_schmidt@vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).