public inbox for gcc-patches@gcc.gnu.org
From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
To: gcc-patches@gcc.gnu.org
Cc: Juzhe-Zhong <juzhe.zhong@rivai.ai>
Subject: [Committed] RISC-V: Add simplification of dummy len and dummy mask COND_LEN_xxx pattern
Date: Tue,  2 Jan 2024 15:26:55 +0800
Message-ID: <20240102072655.1533350-1-juzhe.zhong@rivai.ai>

In https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d1eacedc6d9ba9f5522f2c8d49ccfdf7939ad72d
I optimized the COND_LEN_xxx pattern with dummy len and dummy mask using an overly simple
solution, which causes a redundant vsetvli in the following case:

	vsetvli	a5,a2,e8,m1,ta,ma
	vle32.v	v8,0(a0)
	vsetivli	zero,16,e32,m4,tu,mu   ----> We should apply VLMAX instead of a CONST_INT AVL
	slli	a4,a5,2
	vand.vv	v0,v8,v16
	vand.vv	v4,v8,v12
	vmseq.vi	v0,v0,0
	sub	a2,a2,a5
	vneg.v	v4,v8,v0.t
	vsetvli	zero,a5,e32,m4,ta,ma

The root cause is the following code:

is_vlmax_len_p (...)
   return poly_int_rtx_p (len, &value)
        && known_eq (value, GET_MODE_NUNITS (mode))
        && !satisfies_constraint_K (len);            ---> incorrect check.

Actually, we should not exclude the VLMAX case when the AVL is in the range [0,31].
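
To illustrate the point, here is a minimal standalone sketch of the check before and
after the change. It is not the code in riscv-v.cc: it assumes the length and the number
of units are plain compile-time integers (the real code works on poly_int64 via
poly_int_rtx_p and known_eq), and fits_vsetivli_imm is a hypothetical stand-in for
satisfies_constraint_K, assumed here to mean "AVL is in [0,31]".

```cpp
#include <cstdint>

// Hypothetical stand-in for satisfies_constraint_K (len): assumed to be true
// when the AVL is a constant in the range [0, 31].
constexpr bool fits_vsetivli_imm (int64_t avl)
{
  return avl >= 0 && avl <= 31;
}

// Model of the old check: a length equal to the number of units of the mode
// is rejected as VLMAX whenever it also fits the small immediate range.
constexpr bool is_vlmax_len_old (int64_t len, int64_t nunits)
{
  return len == nunits && !fits_vsetivli_imm (len);
}

// Model of the check after this patch: only equality with the number of
// units matters.
constexpr bool is_vlmax_len_new (int64_t len, int64_t nunits)
{
  return len == nunits;
}

// len == 16 with a 16-unit mode (as in the e32,m4 case above) is VLMAX,
// but the old check did not treat it as such.
static_assert (!is_vlmax_len_old (16, 16), "old check misses small VLMAX");
static_assert (is_vlmax_len_new (16, 16), "new check accepts small VLMAX");
```

In this model the old check only recognizes VLMAX once the length exceeds 31, which is
why the small fixed-length case fell through to a CONST_INT AVL.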

After removing the check above, we have the following issue:

        vsetivli        zero,4,e32,m1,ta,ma
        vlseg4e32.v     v4,(a5)
        vlseg4e32.v     v12,(a3)
        vsetvli a5,zero,e32,m1,tu,ma             ---> This is redundant since VLMAX AVL = 4 when it is fixed-vlmax
        vfadd.vf        v3,v13,fa0
        vfadd.vf        v1,v12,fa1
        vfmul.vv        v17,v3,v5
        vfmul.vv        v16,v1,v5

Since all the following operations (vfadd.vf, etc.) are COND_LEN_xxx with dummy len and dummy mask,
we add a simplification of operations with dummy len and dummy mask into the VLMAX TA and MA policy.
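
The resulting policy selection can be sketched as the standalone function below. This is
only an illustration of the branch order in expand_cond_len_op: the flag names are taken
from the patch, but their bit values here are made up, select_policy is a hypothetical
helper, and the final fallback (TU plus MU when neither len nor mask is dummy) is an
assumption, since that branch is outside the hunk.

```cpp
// Flag names follow the patch; the bit values are illustrative assumptions.
constexpr unsigned TDEFAULT_POLICY_P = 1u << 0;
constexpr unsigned TU_POLICY_P       = 1u << 1;
constexpr unsigned MDEFAULT_POLICY_P = 1u << 2;
constexpr unsigned MU_POLICY_P       = 1u << 3;

// Hypothetical helper mirroring the if/else chain after this patch.
constexpr unsigned select_policy (bool is_dummy_mask, bool is_vlmax_len)
{
  return (is_dummy_mask && is_vlmax_len)
	   ? (TDEFAULT_POLICY_P | MDEFAULT_POLICY_P) /* New: dummy len and mask.  */
	 : is_dummy_mask
	   ? (TU_POLICY_P | MDEFAULT_POLICY_P)       /* Real len, dummy mask.  */
	 : is_vlmax_len
	   ? (TDEFAULT_POLICY_P | MU_POLICY_P)       /* Dummy len, real mask.  */
	   : (TU_POLICY_P | MU_POLICY_P);            /* Assumed fallback.  */
}

// With both len and mask dummy, neither TU nor MU is requested any more.
static_assert (select_policy (true, true)
	       == (TDEFAULT_POLICY_P | MDEFAULT_POLICY_P),
	       "dummy len + dummy mask uses default tail and mask policy");
```

The first branch has to come before the other two; otherwise a dummy mask with a VLMAX
length would still pick TU, which is what produced the extra vsetvli.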

So, after this patch, both cases have optimal codegen:

case 1:
	vsetvli	a5,a2,e32,m1,ta,mu
	vle32.v	v2,0(a0)
	slli	a4,a5,2
	vand.vv	v1,v2,v3
	vand.vv	v0,v2,v4
	sub	a2,a2,a5
	vmseq.vi	v0,v0,0
	vneg.v	v1,v2,v0.t
	vse32.v	v1,0(a1)

case 2:
	vsetivli zero,4,e32,m1,tu,ma
	addi a4,a5,400
	vlseg4e32.v v12,(a3)
	vfadd.vf v3,v13,fa0
	vfadd.vf v1,v12,fa1
	vlseg4e32.v v4,(a4)
	vfadd.vf v2,v14,fa1
	vfmul.vv v17,v3,v5
	vfmul.vv v16,v1,v5

This patch is just an additional fix to the previously approved patch.
Tested on both RV32 and RV64 newlib with no regressions. Committed.

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (is_vlmax_len_p): Remove satisfies_constraint_K.
	(expand_cond_len_op): Add simplification of dummy len and dummy mask.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/vf_avl-3.c: New test.

---
 gcc/config/riscv/riscv-v.cc                        | 11 ++++++++---
 gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-3.c | 11 +++++++++++
 2 files changed, 19 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-3.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index b4c7e0f0126..3c83be35715 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -74,8 +74,7 @@ is_vlmax_len_p (machine_mode mode, rtx len)
 {
   poly_int64 value;
   return poly_int_rtx_p (len, &value)
-	 && known_eq (value, GET_MODE_NUNITS (mode))
-	 && !satisfies_constraint_K (len);
+	 && known_eq (value, GET_MODE_NUNITS (mode));
 }
 
 /* Helper functions for insn_flags && insn_types */
@@ -3855,7 +3854,13 @@ expand_cond_len_op (unsigned icode, insn_flags op_type, rtx *ops, rtx len)
   bool is_vlmax_len = is_vlmax_len_p (mode, len);
 
   unsigned insn_flags = HAS_DEST_P | HAS_MASK_P | HAS_MERGE_P | op_type;
-  if (is_dummy_mask)
+  /* FIXME: We don't support simplification of COND_LEN_NEG (..., dummy len,
+     dummy mask) into NEG_EXPR in GIMPLE FOLD yet.  So, we do such
+     simplification in RISC-V backend and may do that in middle-end in the
+     future.  */
+  if (is_dummy_mask && is_vlmax_len)
+    insn_flags |= TDEFAULT_POLICY_P | MDEFAULT_POLICY_P;
+  else if (is_dummy_mask)
     insn_flags |= TU_POLICY_P | MDEFAULT_POLICY_P;
   else if (is_vlmax_len)
     insn_flags |= TDEFAULT_POLICY_P | MU_POLICY_P;
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-3.c
new file mode 100644
index 00000000000..116b5b538cc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-3.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d --param riscv-autovec-preference=fixed-vlmax" } */
+
+void foo (int *src, int *dst, int size) {
+ int i;
+ for (i = 0; i < size; i++)
+  *dst++ = *src & 0x80 ? (*src++ & 0x7f) : -*src++;
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*[a-x0-9]+,\s*e32,\s*m1,\s*t[au],\s*mu} 1 } } */
+/* { dg-final { scan-assembler-times {vsetvli} 1 } } */
-- 
2.36.3

