From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: Juzhe-Zhong
Subject: [Committed] RISC-V: Add simplification of dummy len and dummy mask COND_LEN_xxx pattern
Date: Tue, 2 Jan 2024 15:26:55 +0800
Message-Id: <20240102072655.1533350-1-juzhe.zhong@rivai.ai>
X-Mailer: git-send-email 2.36.3
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

In https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d1eacedc6d9ba9f5522f2c8d49ccfdf7939ad72d
I optimized the COND_LEN_xxx pattern with dummy len and dummy mask using too
simple a solution, which causes a redundant vsetvli in the following case:

        vsetvli         a5,a2,e8,m1,ta,ma
        vle32.v         v8,0(a0)
        vsetivli        zero,16,e32,m4,tu,mu   ----> We should apply VLMAX instead of a CONST_INT AVL
        slli            a4,a5,2
        vand.vv         v0,v8,v16
        vand.vv         v4,v8,v12
        vmseq.vi        v0,v0,0
        sub             a2,a2,a5
        vneg.v          v4,v8,v0.t
        vsetvli         zero,a5,e32,m4,ta,ma

The root cause is the following code:

is_vlmax_len_p (...)
   return poly_int_rtx_p (len, &value)
        && known_eq (value, GET_MODE_NUNITS (mode))
        && !satisfies_constraint_K (len);            ---> incorrect check.

Actually, we should not exclude the VLMAX case when the AVL is in the range [0,31].

After removing the check above, we have the following issue:

        vsetivli        zero,4,e32,m1,ta,ma
        vlseg4e32.v     v4,(a5)
        vlseg4e32.v     v12,(a3)
        vsetvli         a5,zero,e32,m1,tu,ma   ---> This is redundant since VLMAX AVL = 4 when it is fixed-vlmax
        vfadd.vf        v3,v13,fa0
        vfadd.vf        v1,v12,fa1
        vfmul.vv        v17,v3,v5
        vfmul.vv        v16,v1,v5

Since all the following operations (vfadd.vf ...
etc) are COND_LEN_xxx with dummy len and dummy mask, we simplify operations
with dummy len and dummy mask to use VLMAX with the TA and MA policy.

After this patch, both cases have optimal codegen:

case 1:
        vsetvli         a5,a2,e32,m1,ta,mu
        vle32.v         v2,0(a0)
        slli            a4,a5,2
        vand.vv         v1,v2,v3
        vand.vv         v0,v2,v4
        sub             a2,a2,a5
        vmseq.vi        v0,v0,0
        vneg.v          v1,v2,v0.t
        vse32.v         v1,0(a1)

case 2:
        vsetivli        zero,4,e32,m1,tu,ma
        addi            a4,a5,400
        vlseg4e32.v     v12,(a3)
        vfadd.vf        v3,v13,fa0
        vfadd.vf        v1,v12,fa1
        vlseg4e32.v     v4,(a4)
        vfadd.vf        v2,v14,fa1
        vfmul.vv        v17,v3,v5
        vfmul.vv        v16,v1,v5

This patch is just an additional fix to the previously approved patch.
Tested on both RV32 and RV64 newlib with no regressions. Committed.

gcc/ChangeLog:

	* config/riscv/riscv-v.cc (is_vlmax_len_p): Remove satisfies_constraint_K.
	(expand_cond_len_op): Add simplification of dummy len and dummy mask.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/base/vf_avl-3.c: New test.

---
 gcc/config/riscv/riscv-v.cc                        | 11 ++++++++---
 gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-3.c | 11 +++++++++++
 2 files changed, 19 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-3.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index b4c7e0f0126..3c83be35715 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -74,8 +74,7 @@ is_vlmax_len_p (machine_mode mode, rtx len)
 {
   poly_int64 value;
   return poly_int_rtx_p (len, &value)
-        && known_eq (value, GET_MODE_NUNITS (mode))
-        && !satisfies_constraint_K (len);
+        && known_eq (value, GET_MODE_NUNITS (mode));
 }
 
 /* Helper functions for insn_flags && insn_types */
@@ -3855,7 +3854,13 @@ expand_cond_len_op (unsigned icode, insn_flags op_type, rtx *ops, rtx len)
   bool is_vlmax_len = is_vlmax_len_p (mode, len);
 
   unsigned insn_flags = HAS_DEST_P | HAS_MASK_P | HAS_MERGE_P | op_type;
-  if (is_dummy_mask)
+  /* FIXME: We don't support simplification of COND_LEN_NEG (..., dummy len,
+     dummy mask) into NEG_EXPR in GIMPLE FOLD yet.  So, we do such
+     simplification in RISC-V backend and may do that in middle-end in the
+     future.  */
+  if (is_dummy_mask && is_vlmax_len)
+    insn_flags |= TDEFAULT_POLICY_P | MDEFAULT_POLICY_P;
+  else if (is_dummy_mask)
     insn_flags |= TU_POLICY_P | MDEFAULT_POLICY_P;
   else if (is_vlmax_len)
     insn_flags |= TDEFAULT_POLICY_P | MU_POLICY_P;
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-3.c b/gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-3.c
new file mode 100644
index 00000000000..116b5b538cc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/base/vf_avl-3.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d --param riscv-autovec-preference=fixed-vlmax" } */
+
+void foo (int *src, int *dst, int size) {
+  int i;
+  for (i = 0; i < size; i++)
+    *dst++ = *src & 0x80 ? (*src++ & 0x7f) : -*src++;
+}
+
+/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*[a-x0-9]+,\s*e32,\s*m1,\s*t[au],\s*mu} 1 } } */
+/* { dg-final { scan-assembler-times {vsetvli} 1 } } */
-- 
2.36.3
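
For readers who want to see the effect of the two hunks without building GCC, here is a minimal standalone C++ sketch of the logic. It is not the code in riscv-v.cc. All names in it (TailPolicy, MaskPolicy, Policy, is_vlmax_len_before, is_vlmax_len_after, select_policy) are hypothetical, lengths are plain integers rather than poly_int values, and satisfies_constraint_K is modelled as "len fits in [0,31]" as described in the message. The final fallback branch is not visible in the diff hunk, so it is an assumption.

```cpp
// Standalone model of the two checks touched by this patch.  Every name in
// this file is hypothetical and exists only for illustration.
#include <cstdint>

// "Default" stands for TDEFAULT_POLICY_P / MDEFAULT_POLICY_P, which the
// commit message describes as the TA and MA policy.
enum class TailPolicy { Default, Undisturbed };
enum class MaskPolicy { Default, Undisturbed };

struct Policy
{
  TailPolicy tail;
  MaskPolicy mask;
};

constexpr bool operator== (Policy a, Policy b)
{
  return a.tail == b.tail && a.mask == b.mask;
}

// Before the patch: a length equal to the number of units was rejected as
// VLMAX when it also fit the [0,31] immediate range.
constexpr bool is_vlmax_len_before (std::uint64_t len, std::uint64_t nunits)
{
  return len == nunits && len > 31;
}

// After the patch: only equality with the number of units matters.
constexpr bool is_vlmax_len_after (std::uint64_t len, std::uint64_t nunits)
{
  return len == nunits;
}

// Policy selection after the patch.  The first branch is the new one.
constexpr Policy select_policy (bool is_dummy_mask, bool is_vlmax_len)
{
  if (is_dummy_mask && is_vlmax_len)
    return {TailPolicy::Default, MaskPolicy::Default};
  if (is_dummy_mask)
    return {TailPolicy::Undisturbed, MaskPolicy::Default};
  if (is_vlmax_len)
    return {TailPolicy::Default, MaskPolicy::Undisturbed};
  // ASSUMPTION: this fallback is outside the hunk shown in the patch.
  return {TailPolicy::Undisturbed, MaskPolicy::Undisturbed};
}

// Case 2 from the message: fixed-vlmax, e32/m1, so VLMAX is 4 units.
static_assert (!is_vlmax_len_before (4, 4), "old check rejects small VLMAX");
static_assert (is_vlmax_len_after (4, 4), "new check accepts small VLMAX");
static_assert (select_policy (true, is_vlmax_len_after (4, 4))
	       == Policy{TailPolicy::Default, MaskPolicy::Default},
	       "dummy mask with VLMAX len selects the default policies");
```

The static_assert lines encode case 2: with a dummy mask and a length of 4 units, the old check did not recognise VLMAX, so the sketch's second branch (tail undisturbed) would have been taken, while the new check reaches the first branch.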