public inbox for gcc-patches@gcc.gnu.org
From: chenglulu <chenglulu@loongson.cn>
To: Xi Ruoyao <xry111@xry111.site>, gcc-patches@gcc.gnu.org
Cc: i@xen0n.name, xuchenghua@loongson.cn
Subject: Re: [PATCH] LoongArch: Optimize LSX vector shuffle on floating-point vector
Date: Wed, 22 Nov 2023 14:41:33 +0800
Message-ID: <ad0c527f-fa39-3ec7-0e84-507cd967b7a6@loongson.cn>
In-Reply-To: <20231119070102.3053-2-xry111@xry111.site>


On 2023/11/19 3:01 PM, Xi Ruoyao wrote:
> The vec_perm expander was wrongly defined.  The GCC Internals manual says:
>
> Operand 3 is the “selector”.  It is an integral mode vector of the same
> width and number of elements as mode M.
>
> Because of this mistake, the generic code has to work around the
> mismatch, and it ends up generating some very nasty code for a simple
> __builtin_shuffle (a, b, c), where a and b are V4SF and c is V4SI:
>
>      la.local    $r12,.LANCHOR0
>      la.local    $r13,.LANCHOR1
>      vld $vr1,$r12,48
>      vslli.w $vr1,$vr1,2
>      vld $vr2,$r12,16
>      vld $vr0,$r13,0
>      vld $vr3,$r13,16
>      vshuf.b $vr0,$vr1,$vr1,$vr0
>      vld $vr1,$r12,32
>      vadd.b  $vr0,$vr0,$vr3
>      vandi.b $vr0,$vr0,31
>      vshuf.b $vr0,$vr1,$vr2,$vr0
>      vst $vr0,$r12,0
>      jr  $r1
>
> This is obviously stupid.  Fix the expander definition and adjust
> loongarch_expand_vec_perm to handle it correctly.
>
> gcc/ChangeLog:
>
> 	* config/loongarch/lsx.md (vec_perm<mode:LSX>): Make the
> 	selector VIMODE.
> 	* config/loongarch/loongarch.cc (loongarch_expand_vec_perm):
> 	Use the mode of the selector (instead of the shuffled vector)
> 	for truncating it.  Operate on subregs in the selector mode if
> 	the shuffled vector has a different mode (i.e. it's a
> 	floating-point vector).
>
> gcc/testsuite/ChangeLog:
>
> 	* gcc.target/loongarch/vect-shuf-fp.c: New test.
> ---
>
> Bootstrapped & regtested on loongarch64-linux-gnu.  Ok for trunk?
LGTM. Thanks!
>
>   gcc/config/loongarch/loongarch.cc              | 18 ++++++++++--------
>   gcc/config/loongarch/lsx.md                    |  2 +-
>   .../gcc.target/loongarch/vect-shuf-fp.c        | 16 ++++++++++++++++
>   3 files changed, 27 insertions(+), 9 deletions(-)
>   create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-shuf-fp.c
>
> diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
> index ce601a331f7..33357c670e1 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -8607,8 +8607,9 @@ void
>   loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
>   {
>     machine_mode vmode = GET_MODE (target);
> +  machine_mode vimode = GET_MODE (sel);
>     auto nelt = GET_MODE_NUNITS (vmode);
> -  auto round_reg = gen_reg_rtx (vmode);
> +  auto round_reg = gen_reg_rtx (vimode);
>     rtx round_data[MAX_VECT_LEN];
>   
>     for (int i = 0; i < nelt; i += 1)
> @@ -8616,9 +8617,16 @@ loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
>         round_data[i] = GEN_INT (0x1f);
>       }
>   
> -  rtx round_data_rtx = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, round_data));
> +  rtx round_data_rtx = gen_rtx_CONST_VECTOR (vimode, gen_rtvec_v (nelt, round_data));
>     emit_move_insn (round_reg, round_data_rtx);
>   
> +  if (vmode != vimode)
> +    {
> +      target = lowpart_subreg (vimode, target, vmode);
> +      op0 = lowpart_subreg (vimode, op0, vmode);
> +      op1 = lowpart_subreg (vimode, op1, vmode);
> +    }
> +
>     switch (vmode)
>       {
>       case E_V16QImode:
> @@ -8626,17 +8634,11 @@ loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
>         emit_insn (gen_lsx_vshuf_b (target, op1, op0, sel));
>         break;
>       case E_V2DFmode:
> -      emit_insn (gen_andv2di3 (sel, sel, round_reg));
> -      emit_insn (gen_lsx_vshuf_d_f (target, sel, op1, op0));
> -      break;
>       case E_V2DImode:
>         emit_insn (gen_andv2di3 (sel, sel, round_reg));
>         emit_insn (gen_lsx_vshuf_d (target, sel, op1, op0));
>         break;
>       case E_V4SFmode:
> -      emit_insn (gen_andv4si3 (sel, sel, round_reg));
> -      emit_insn (gen_lsx_vshuf_w_f (target, sel, op1, op0));
> -      break;
>       case E_V4SImode:
>         emit_insn (gen_andv4si3 (sel, sel, round_reg));
>         emit_insn (gen_lsx_vshuf_w (target, sel, op1, op0));
> diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
> index 8ea41c85b01..5e8d8d74b43 100644
> --- a/gcc/config/loongarch/lsx.md
> +++ b/gcc/config/loongarch/lsx.md
> @@ -837,7 +837,7 @@ (define_expand "vec_perm<mode>"
>    [(match_operand:LSX 0 "register_operand")
>     (match_operand:LSX 1 "register_operand")
>     (match_operand:LSX 2 "register_operand")
> -  (match_operand:LSX 3 "register_operand")]
> +  (match_operand:<VIMODE> 3 "register_operand")]
>     "ISA_HAS_LSX"
>   {
>     loongarch_expand_vec_perm (operands[0], operands[1],
> diff --git a/gcc/testsuite/gcc.target/loongarch/vect-shuf-fp.c b/gcc/testsuite/gcc.target/loongarch/vect-shuf-fp.c
> new file mode 100644
> index 00000000000..7acc2113afe
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/vect-shuf-fp.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mlasx -O3" } */
> +/* { dg-final { scan-assembler "vshuf\.w" } } */
> +
> +#define V __attribute__ ((vector_size (16)))
> +
> +int a V;
> +float b V;
> +float c V;
> +float d V;
> +
> +void
> +test (void)
> +{
> +  d = __builtin_shuffle (b, c, a);
> +}

