From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Michael Meissner <meissner@linux.ibm.com>
Cc: gcc-patches@gcc.gnu.org,
Segher Boessenkool <segher@kernel.crashing.org>,
David Edelsohn <dje.gcc@gmail.com>,
Peter Bergner <bergner@linux.ibm.com>,
Will Schmidt <will_schmidt@vnet.ibm.com>
Subject: Re: [PATCH] PR target/105325, Make load/cmp fusion know about prefixed loads
Date: Thu, 23 Mar 2023 16:10:22 +0800 [thread overview]
Message-ID: <f641dfc7-46a7-b323-5b82-fb18f05af8a8@linux.ibm.com> (raw)
In-Reply-To: <ZBpDaXeXCMBNxNWk@toto.the-meissners.org>
Hi Mike,
Thanks for fixing this, some minor comments are inlined below.
on 2023/3/22 07:53, Michael Meissner wrote:
> The issue with the bug is the power10 load GPR + cmpi -1/0/1 fusion
> optimization generates illegal assembler code.
>
> Ultimately the code was dying because the fusion load + compare -1/0/1 patterns
> did not handle the possibility that the load might be prefixed.
>
> The main cause is the constraints for the individual loads in the fusion did not
> match the machine. In particular, LWA is a ds format instruction when it is
> unprefixed. The code did not also set the prefixed attribute correctly.
>
> This patch rewrites the genfusion.pl script so that it will have more accurate
> constraints for the LWA and LD instructions (which are DS instructions). The
> updated genfusion.pl was then run to update fusion.md. Finally, the code for
> the "prefixed" attribute is modified so that it considers load + compare
> immediate patterns to be like the normal load insns in checking whether
> operand[1] is a prefixed instruction.
>
> I have tested this patch on a little endian power10 system, on a little endian
> power9 system, and a big endian power8 system (both -m32 and -m64 tested on
> BE). There were no regressions, can I check this into the trunk?
>
> The same patch applies to the gcc-12 and gcc-11 branches. Can I check this
> patch into those branches also after a burn-in period?
>
> 2023-03-21 Michael Meissner <meissner@linux.ibm.com>
> Aaron Sawdey <acsawdey@linux.ibm.com>
>
> gcc/
>
> PR target/105325
> * gcc/config/rs6000/genfusion.pl (gen_ld_cmpi_p10): Improve generation
> of the ld and lwa instructions which use the DS encoding instead of D.
> Use the YZ constraint for these loads. Handle prefixed loads better.
> Set the sign_extend attribute as appropriate.
> * gcc/config/rs6000/fusion.md: Regenerate.
> * gcc/config/rs6000/rs6000.md (prefixed attribute): Add fused_load_cmpi
> instructions to the list of instructions that might have a prefixed load
> instruction.
>
> gcc/testsuite/
>
> PR target/105325
> * g++.target/powerpc/pr105325.C: New test.
> * gcc.target/powerpc/fusion-p10-ldcmpi.c: Adjust insn counts.
> ---
> gcc/config/rs6000/genfusion.pl | 26 ++++++++++++++++---
> gcc/config/rs6000/fusion.md | 17 +++++++-----
> gcc/config/rs6000/rs6000.md | 2 +-
> gcc/testsuite/g++.target/powerpc/pr105325.C | 24 +++++++++++++++++
> .../gcc.target/powerpc/fusion-p10-ldcmpi.c | 4 +--
> 5 files changed, 59 insertions(+), 14 deletions(-)
> create mode 100644 gcc/testsuite/g++.target/powerpc/pr105325.C
>
> diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
> index e4db352e0ce..4f367cadc52 100755
> --- a/gcc/config/rs6000/genfusion.pl
> +++ b/gcc/config/rs6000/genfusion.pl
> @@ -56,7 +56,7 @@ sub mode_to_ldst_char
> sub gen_ld_cmpi_p10
> {
> my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred,
> - $mempred, $ccmode, $np, $extend, $resultmode);
> + $mempred, $ccmode, $np, $extend, $resultmode, $constraint);
> LMODE: foreach $lmode ('DI','SI','HI','QI') {
> $ldst = mode_to_ldst_char($lmode);
> $clobbermode = $lmode;
> @@ -71,21 +71,34 @@ sub gen_ld_cmpi_p10
> CCMODE: foreach $ccmode ('CC','CCUNS') {
> $np = "NON_PREFIXED_D";
> $mempred = "non_update_memory_operand";
> + $constraint = "m";
The three assignments on $np $mempred $constraint can be moved
to place (a) (see below) and add one explicit assignment for
$constraint at place (b), since for the condition ccmode eq 'CC',
HI/SI/DI have their own settings (btw QI is skipped), these
assignments for default value can be moved to else arm (for CCUNS).
> if ( $ccmode eq 'CC' ) {
> next CCMODE if $lmode eq 'QI';
> - if ( $lmode eq 'DI' || $lmode eq 'SI' ) {
> + if ( $lmode eq 'HI' ) {
> + $np = "NON_PREFIXED_D";
> + $mempred = "non_update_memory_operand";
> + $echr = "a";
// place b
$constraint = "m";
> + } elsif ( $lmode eq 'SI' ) {
> + # ld and lwa are both DS-FORM.
we have broken it into two different arms for SI and DI, this
comment can be removed?
> + $np = "NON_PREFIXED_DS";
> + $mempred = "lwa_operand";
> + $echr = "a";
> + $constraint = "YZ";
> + } elsif ( $lmode eq 'DI' ) {
> # ld and lwa are both DS-FORM.
... and this comment.
> $np = "NON_PREFIXED_DS";
> $mempred = "ds_form_mem_operand";
> + $echr = "";
> + $constraint = "YZ";
> }
> $cmpl = "";
> - $echr = "a";
> $constpred = "const_m1_to_1_operand";
> } else {
// place a
> if ( $lmode eq 'DI' ) {
> # ld is DS-form, but lwz is not.
> $np = "NON_PREFIXED_DS";
> $mempred = "ds_form_mem_operand";
> + $constraint = "YZ";
> }
> $cmpl = "l";
> $echr = "z";
> @@ -108,7 +121,7 @@ sub gen_ld_cmpi_p10
>
> print "(define_insn_and_split \"*l${ldst}${echr}_cmp${cmpl}di_cr0_${lmode}_${result}_${ccmode}_${extend}\"\n";
> print " [(set (match_operand:${ccmode} 2 \"cc_reg_operand\" \"=x\")\n";
> - print " (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"m\")\n";
> + print " (compare:${ccmode} (match_operand:${lmode} 1 \"${mempred}\" \"${constraint}\")\n";
> if ($ccmode eq 'CCUNS') { print " "; }
> print " (match_operand:${lmode} 3 \"${constpred}\" \"n\")))\n";
> if ($result eq 'clobber') {
> @@ -137,6 +150,11 @@ sub gen_ld_cmpi_p10
> print " \"\"\n";
> print " [(set_attr \"type\" \"fused_load_cmpi\")\n";
> print " (set_attr \"cost\" \"8\")\n";
> +
> + if ($extend eq "sign") {
> + print " (set_attr \"sign_extend\" \"yes\")\n";
> + }
> +
> print " (set_attr \"length\" \"8\")])\n";
> print "\n";
> }
> diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
> index d45fb138a70..da9953d9ad9 100644
> --- a/gcc/config/rs6000/fusion.md
> +++ b/gcc/config/rs6000/fusion.md
> @@ -22,7 +22,7 @@
> ;; load mode is DI result mode is clobber compare mode is CC extend is none
> (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
> [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
> + (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
> (match_operand:DI 3 "const_m1_to_1_operand" "n")))
> (clobber (match_scratch:DI 0 "=r"))]
> "(TARGET_P10_FUSION)"
> @@ -43,7 +43,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_clobber_CC_none"
> ;; load mode is DI result mode is clobber compare mode is CCUNS extend is none
> (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
> [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
> + (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
> (match_operand:DI 3 "const_0_to_1_operand" "n")))
> (clobber (match_scratch:DI 0 "=r"))]
> "(TARGET_P10_FUSION)"
> @@ -64,7 +64,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_clobber_CCUNS_none"
> ;; load mode is DI result mode is DI compare mode is CC extend is none
> (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
> [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> - (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "m")
> + (compare:CC (match_operand:DI 1 "ds_form_mem_operand" "YZ")
> (match_operand:DI 3 "const_m1_to_1_operand" "n")))
> (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> "(TARGET_P10_FUSION)"
> @@ -85,7 +85,7 @@ (define_insn_and_split "*ld_cmpdi_cr0_DI_DI_CC_none"
> ;; load mode is DI result mode is DI compare mode is CCUNS extend is none
> (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
> [(set (match_operand:CCUNS 2 "cc_reg_operand" "=x")
> - (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "m")
> + (compare:CCUNS (match_operand:DI 1 "ds_form_mem_operand" "YZ")
> (match_operand:DI 3 "const_0_to_1_operand" "n")))
> (set (match_operand:DI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> "(TARGET_P10_FUSION)"
> @@ -106,7 +106,7 @@ (define_insn_and_split "*ld_cmpldi_cr0_DI_DI_CCUNS_none"
> ;; load mode is SI result mode is clobber compare mode is CC extend is none
> (define_insn_and_split "*lwa_cmpdi_cr0_SI_clobber_CC_none"
> [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
> + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
> (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> (clobber (match_scratch:SI 0 "=r"))]
> "(TARGET_P10_FUSION)"
> @@ -148,7 +148,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_clobber_CCUNS_none"
> ;; load mode is SI result mode is SI compare mode is CC extend is none
> (define_insn_and_split "*lwa_cmpdi_cr0_SI_SI_CC_none"
> [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
> + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
> (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> (set (match_operand:SI 0 "gpc_reg_operand" "=r") (match_dup 1))]
> "(TARGET_P10_FUSION)"
> @@ -190,7 +190,7 @@ (define_insn_and_split "*lwz_cmpldi_cr0_SI_SI_CCUNS_none"
> ;; load mode is SI result mode is EXTSI compare mode is CC extend is sign
> (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
> [(set (match_operand:CC 2 "cc_reg_operand" "=x")
> - (compare:CC (match_operand:SI 1 "ds_form_mem_operand" "m")
> + (compare:CC (match_operand:SI 1 "lwa_operand" "YZ")
> (match_operand:SI 3 "const_m1_to_1_operand" "n")))
> (set (match_operand:EXTSI 0 "gpc_reg_operand" "=r") (sign_extend:EXTSI (match_dup 1)))]
> "(TARGET_P10_FUSION)"
> @@ -205,6 +205,7 @@ (define_insn_and_split "*lwa_cmpdi_cr0_SI_EXTSI_CC_sign"
> ""
> [(set_attr "type" "fused_load_cmpi")
> (set_attr "cost" "8")
> + (set_attr "sign_extend" "yes")
> (set_attr "length" "8")])
>
> ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> @@ -247,6 +248,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_clobber_CC_sign"
> ""
> [(set_attr "type" "fused_load_cmpi")
> (set_attr "cost" "8")
> + (set_attr "sign_extend" "yes")
> (set_attr "length" "8")])
>
> ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> @@ -289,6 +291,7 @@ (define_insn_and_split "*lha_cmpdi_cr0_HI_EXTHI_CC_sign"
> ""
> [(set_attr "type" "fused_load_cmpi")
> (set_attr "cost" "8")
> + (set_attr "sign_extend" "yes")
> (set_attr "length" "8")])
>
> ;; load-cmpi fusion pattern generated by gen_ld_cmpi_p10
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 81bffb04ceb..0f809c3793f 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -302,7 +302,7 @@ (define_attr "prefixed" "no,yes"
> (eq_attr "maybe_prefixed" "no"))
> (const_string "no")
>
> - (eq_attr "type" "load,fpload,vecload")
> + (eq_attr "type" "load,fpload,vecload,vecload,fused_load_cmpi")
> (if_then_else (match_test "prefixed_load_p (insn)")
> (const_string "yes")
> (const_string "no"))
> diff --git a/gcc/testsuite/g++.target/powerpc/pr105325.C b/gcc/testsuite/g++.target/powerpc/pr105325.C
> new file mode 100644
> index 00000000000..f4ab384daa7
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/powerpc/pr105325.C
> @@ -0,0 +1,24 @@
> +/* { dg-do assemble } */
> +/* { dg-require-effective-target lp64 } */
Nit: lp64 seems not necessary.
The others look good to me, thanks!
BR,
Kewen
next prev parent reply other threads:[~2023-03-23 8:10 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-21 23:53 Michael Meissner
2023-03-23 8:10 ` Kewen.Lin [this message]
2023-03-24 23:09 ` Michael Meissner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f641dfc7-46a7-b323-5b82-fb18f05af8a8@linux.ibm.com \
--to=linkw@linux.ibm.com \
--cc=bergner@linux.ibm.com \
--cc=dje.gcc@gmail.com \
--cc=gcc-patches@gcc.gnu.org \
--cc=meissner@linux.ibm.com \
--cc=segher@kernel.crashing.org \
--cc=will_schmidt@vnet.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).