public inbox for gcc-patches@gcc.gnu.org
From: "Kewen.Lin" <linkw@linux.ibm.com>
To: Peter Bergner <bergner@linux.ibm.com>,
	Segher Boessenkool <segher@kernel.crashing.org>
Cc: Michael Meissner <meissner@linux.ibm.com>,
	GCC Patches <gcc-patches@gcc.gnu.org>,
	David Edelsohn <dje.gcc@gmail.com>
Subject: Re: [PATCH] rs6000: Disassemble opaque modes using subregs to allow optimizations [PR109116]
Date: Fri, 24 Nov 2023 17:28:59 +0800
Message-ID: <710b8ada-87de-2947-4050-6a307adff783@linux.ibm.com>
In-Reply-To: <1f32e2bf-83c2-4664-b7f3-4a6996978a5e@linux.ibm.com>

Hi Peter,

on 2023/11/16 07:50, Peter Bergner wrote:
> PR109116 exposes an issue where using unspecs to access each vector component
> of an opaque mode variable leads to unneeded register copies, because our rtl
> optimizers cannot handle unspecs.  Instead, use subregs to access each vector
> component of the opaque mode variable, which our optimizers know how to handle.
> 
> I did not include a test case with the patch, since writing a test case that
> attempts to ensure we don't emit unneeded register copies is nearly impossible
> since those copies can still be generated for reasons other than the causes
> in this patch.  I have verified that this patch does improve code generation
> for some unit tests and our AI libraries team has confirmed that performance
> of their tests improved when using this patch.
> 
> This passed bootstrap and regtesting with no regressions on powerpc64le-linux
> and powerpc64-linux.  Ok for trunk?
> 
> Peter
> 
> 
> gcc/
> 	PR target/109116
> 	* config/rs6000/mma.md (vsx_disassemble_pair): Expand into a vector
> 	register sized subreg.
> 	* config/rs6000/mma.md (*vsx_disassemble_pair): Delete.
> 	(mma_disassemble_acc): Expand into a vector register sized subreg.
> 	(*mma_disassemble_acc): Delete.
> 	* config/rs6000/rs6000.cc (rs6000_modes_tieable_p): Allow vector modes
> 	to tie with OOmode.
> 
> diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
> index 575751d477e..2ca405469e2 100644
> --- a/gcc/config/rs6000/mma.md
> +++ b/gcc/config/rs6000/mma.md
> @@ -398,29 +398,8 @@ (define_expand "vsx_disassemble_pair"
>     (match_operand 2 "const_0_to_1_operand")]
>    "TARGET_MMA"
>  {
> -  rtx src;
> -  int regoff = INTVAL (operands[2]);
> -  src = gen_rtx_UNSPEC (V16QImode,
> -			gen_rtvec (2, operands[1], GEN_INT (regoff)),
> -			UNSPEC_MMA_EXTRACT);
> -  emit_move_insn (operands[0], src);
> -  DONE;
> -})
> -
> -(define_insn_and_split "*vsx_disassemble_pair"
> -  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
> -       (unspec:V16QI [(match_operand:OO 1 "vsx_register_operand" "wa")
> -		      (match_operand 2 "const_0_to_1_operand")]
> -		      UNSPEC_MMA_EXTRACT))]
> -  "TARGET_MMA
> -   && vsx_register_operand (operands[1], OOmode)"
> -  "#"
> -  "&& reload_completed"
> -  [(const_int 0)]
> -{
> -  int reg = REGNO (operands[1]);
> -  int regoff = INTVAL (operands[2]);
> -  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
> +  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);

Is it intentional to keep GET_MODE_SIZE (V16QImode) instead of 16?
GET_MODE_SIZE returns a poly_int, so if one day NUM_POLY_INT_COEFFS
isn't 1 on rs6000 any more, we would have to add an explicit
.to_constant () here.  I'd prefer to use 16 directly, with a comment
above it explaining what the value 16 stands for.
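
To make the suggestion concrete, here is a minimal standalone sketch of
the offset computation with the size spelled out as a plain constant.
It is not GCC code: VECTOR_REG_BYTES and disassemble_byte_offset are
hypothetical names used only for illustration, and the sketch assumes
each vector register is 16 bytes wide.

```c
#include <assert.h>

/* Assumption: a VSX vector register is 16 bytes wide, which is what
   GET_MODE_SIZE (V16QImode) evaluates to on rs6000 today.  */
enum { VECTOR_REG_BYTES = 16 };

/* Hypothetical helper, not a GCC function: byte offset of vector INDEX
   inside an opaque value made of NUM_VECTORS vector registers
   (2 for OOmode, 4 for XOmode).  */
int
disassemble_byte_offset (int index, int num_vectors)
{
  assert (index >= 0 && index < num_vectors);
  return index * VECTOR_REG_BYTES;
}
```

In the expanders the result would play the role of regoff, the byte
offset passed to simplify_gen_subreg.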

> +  rtx src = simplify_gen_subreg (V16QImode, operands[1], OOmode, regoff);
>    emit_move_insn (operands[0], src);
>    DONE;
>  })
> @@ -472,29 +451,8 @@ (define_expand "mma_disassemble_acc"
>     (match_operand 2 "const_0_to_3_operand")]
>    "TARGET_MMA"
>  {
> -  rtx src;
> -  int regoff = INTVAL (operands[2]);
> -  src = gen_rtx_UNSPEC (V16QImode,
> -			gen_rtvec (2, operands[1], GEN_INT (regoff)),
> -			UNSPEC_MMA_EXTRACT);
> -  emit_move_insn (operands[0], src);
> -  DONE;
> -})
> -
> -(define_insn_and_split "*mma_disassemble_acc"
> -  [(set (match_operand:V16QI 0 "mma_disassemble_output_operand" "=mwa")
> -       (unspec:V16QI [(match_operand:XO 1 "fpr_reg_operand" "d")
> -		      (match_operand 2 "const_0_to_3_operand")]
> -		      UNSPEC_MMA_EXTRACT))]
> -  "TARGET_MMA
> -   && fpr_reg_operand (operands[1], XOmode)"
> -  "#"
> -  "&& reload_completed"
> -  [(const_int 0)]
> -{
> -  int reg = REGNO (operands[1]);
> -  int regoff = INTVAL (operands[2]);
> -  rtx src = gen_rtx_REG (V16QImode, reg + regoff);
> +  int regoff = INTVAL (operands[2]) * GET_MODE_SIZE (V16QImode);

Likewise.

> +  rtx src = simplify_gen_subreg (V16QImode, operands[1], XOmode, regoff);
>    emit_move_insn (operands[0], src);
>    DONE;
>  })
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 5f56c3ed85b..f2efa46c147 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -1964,9 +1964,12 @@ rs6000_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
>  static bool
>  rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2)
>  {
> -  if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
> -      || mode2 == PTImode || mode2 == OOmode || mode2 == XOmode)
> -    return mode1 == mode2;
> +   if (mode1 == PTImode || mode1 == OOmode || mode1 == XOmode
> +       || mode2 == PTImode || mode2 == XOmode)
> +     return mode1 == mode2;
> + 
> +  if (mode2 == OOmode)
> +    return ALTIVEC_OR_VSX_VECTOR_MODE (mode1);

I vaguely remember Segher mentioning that opaque modes are not expected
to be tieable with any mode other than themselves, but if this is the
only way to get rid of those extra moves, I guess we can special-case
them here.  Looking forward to Segher's comments on this part.
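
For reference, the special case can be modelled outside of GCC roughly
as below.  This is only a toy sketch: the toy_mode enum and the toy_*
functions are hypothetical stand-ins for machine_mode,
ALTIVEC_OR_VSX_VECTOR_MODE and rs6000_modes_tieable_p, and everything
after the first two tests is simplified.

```c
#include <stdbool.h>

/* Toy stand-ins for a few rs6000 machine modes; not GCC's machine_mode.  */
enum toy_mode
{
  TOY_DImode, TOY_PTImode, TOY_V16QImode, TOY_V2DImode,
  TOY_OOmode, TOY_XOmode
};

/* Stand-in for ALTIVEC_OR_VSX_VECTOR_MODE, covering only the toy
   vector modes above.  */
bool
toy_vector_mode_p (enum toy_mode mode)
{
  return mode == TOY_V16QImode || mode == TOY_V2DImode;
}

/* Mirrors the first two tests of the patched function as quoted.  */
bool
toy_modes_tieable_p (enum toy_mode mode1, enum toy_mode mode2)
{
  if (mode1 == TOY_PTImode || mode1 == TOY_OOmode || mode1 == TOY_XOmode
      || mode2 == TOY_PTImode || mode2 == TOY_XOmode)
    return mode1 == mode2;

  /* The new special case: vector modes may tie with OOmode.  */
  if (mode2 == TOY_OOmode)
    return toy_vector_mode_p (mode1);

  /* Simplification: the real function has further checks here.  */
  return toy_vector_mode_p (mode1) == toy_vector_mode_p (mode2);
}
```

In this model (V16QImode, OOmode) is tieable while (OOmode, V16QImode)
is not, so the result depends on the argument order.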

BR,
Kewen

>  
>    if (ALTIVEC_OR_VSX_VECTOR_MODE (mode1))
>      return ALTIVEC_OR_VSX_VECTOR_MODE (mode2);

Thread overview: 3+ messages
2023-11-15 23:50 Peter Bergner
2023-11-24  9:28 ` Kewen.Lin [this message]
2023-12-13 22:01   ` Peter Bergner