From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from smtpbg150.qq.com (smtpbg150.qq.com [18.132.163.193]) by sourceware.org (Postfix) with ESMTPS id 2CE503858D20 for ; Tue, 29 Aug 2023 10:07:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2CE503858D20 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=rivai.ai Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=rivai.ai X-QQ-mid: bizesmtp78t1693303660t82ojhma Received: from rios-cad122.hadoop.rioslab.org ( [58.60.1.26]) by bizesmtp.qq.com (ESMTP) with id ; Tue, 29 Aug 2023 18:07:39 +0800 (CST) X-QQ-SSF: 01400000000000G0V000000A0000000 X-QQ-FEAT: mRz6/7wsmIiKJcGwxZCE50ud1vN1qBtiv56Rc+lvGDJBy4uuuZWrzIrKFGpsj xoaKdziuh2Xo1xlhBD/g9Iw1srOxu43XANx29ufTkxI3FQJAqbfUhCYf6viaerHy2U4VSaO CAjDYuaFzYAp3fZJVtnnMce4vNt6m41wsf9mxef2UMG8O0KGN/OGkv5OWh9EDdGfnq/6GgZ FuyvPok/GZ6PGhmYaZ/2OSHpy5/sD766ZvDw/vuENQAGl5ZQjm9hwgPyCnqyPUePQVKFV1O /WuSwSt8gCoeq5oOQQ4XI8+8Hk54iHlTMS+P9TZWQ45hgt6B4FyMa0oXa0PHD56yaJl/ANJ FFSjDSCEbk8eHYjzRIHH/uAYetdJervHsLw2kIartxAjvhOIEiIZUV/RuBjaURxvZb/dQI6 dySNsA19db8OXUy4kI1w3A== X-QQ-GoodBg: 2 X-BIZMAIL-ID: 6102343631398061659 From: Juzhe-Zhong To: gcc-patches@gcc.gnu.org Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong Subject: [PATCH] RISC-V: Enable movmisalign for VLS modes Date: Tue, 29 Aug 2023 18:07:38 +0800 Message-Id: <20230829100738.2479550-1-juzhe.zhong@rivai.ai> X-Mailer: git-send-email 2.36.3 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-QQ-SENDSIZE: 520 Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz7a-one-0 X-Spam-Status: No, score=-9.8 required=5.0 tests=BAYES_00,GIT_PATCH_0,KAM_ASCII_DIVIDERS,KAM_DMARC_STATUS,KAM_SHORT,RCVD_IN_BARRACUDACENTRAL,RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_PASS,SPF_PASS,TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Prevous patch (which removed VLA modes movmisalign pattern) to fix run-time bug. Such patch disable vectorization for misalign data movement. After I check LLVM codes, LLVM supports misalign for VLS modes. Before this patch: sll a5,a4,0x1 add a5,a5,a1 lhu a3,64(a5) lbu a5,66(a5) addw a4,a4,1 srl a3,a3,0x8 sll a5,a5,0x8 or a5,a5,a3 sh a5,0(a2) add a2,a2,2 bne a4,a0,101f8 After this patch: foo: lui a0,%hi(.LANCHOR0) addi a0,a0,%lo(.LANCHOR0) addi sp,sp,-16 addi a1,a0,1 li a2,64 sd ra,8(sp) vsetvli zero,a2,e8,m4,ta,ma addi a0,a0,128 vle8.v v4,0(a1) vse8.v v4,0(a0) call memcmp bne a0,zero,.L6 ld ra,8(sp) addi sp,sp,16 jr ra .L6: call abort Note this patch has passed all testcases in "vect" which are related to alignment. gcc/ChangeLog: * config/riscv/autovec-vls.md (movmisalign): New pattern. * config/riscv/riscv.cc (riscv_support_vector_misalignment): Support VLS misalign. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/autovec/vls/misalign-1.c: New test. --- gcc/config/riscv/autovec-vls.md | 19 +++++++++++++ gcc/config/riscv/riscv.cc | 16 +++++++---- .../riscv/rvv/autovec/vls/misalign-1.c | 27 +++++++++++++++++++ 3 files changed, 57 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/misalign-1.c diff --git a/gcc/config/riscv/autovec-vls.md b/gcc/config/riscv/autovec-vls.md index 1b1d940d779..0c988d75d67 100644 --- a/gcc/config/riscv/autovec-vls.md +++ b/gcc/config/riscv/autovec-vls.md @@ -140,6 +140,25 @@ [(set_attr "type" "vmov") (set_attr "mode" "")]) +(define_expand "movmisalign" + [(set (match_operand:VLS 0 "nonimmediate_operand") + (match_operand:VLS 1 "general_operand"))] + "TARGET_VECTOR" + { + /* To support misalign data movement, we should use + minimum element alignment load/store. */ + unsigned int size = GET_MODE_SIZE (GET_MODE_INNER (mode)); + poly_int64 nunits = GET_MODE_NUNITS (mode) * size; + machine_mode mode = riscv_vector::get_vector_mode (QImode, nunits).require (); + operands[0] = gen_lowpart (mode, operands[0]); + operands[1] = gen_lowpart (mode, operands[1]); + if (MEM_P (operands[0]) && !register_operand (operands[1], mode)) + operands[1] = force_reg (mode, operands[1]); + riscv_vector::emit_vlmax_insn (code_for_pred_mov (mode), riscv_vector::RVV_UNOP, operands); + DONE; + } +) + ;; ----------------------------------------------------------------- ;; ---- Duplicate Operations ;; ----------------------------------------------------------------- diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 1d6e278ea90..907b6584275 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -8063,12 +8063,18 @@ riscv_support_vector_misalignment (machine_mode mode, int misalignment, bool is_packed ATTRIBUTE_UNUSED) { - /* TODO: For RVV scalable vector auto-vectorization, we should allow - movmisalign pattern to handle misalign data movement to unblock - possible auto-vectorization. + /* Only enable misalign data movements for VLS modes. */ + if (TARGET_VECTOR_VLS && STRICT_ALIGNMENT) + { + /* Return if movmisalign pattern is not supported for this mode. */ + if (optab_handler (movmisalign_optab, mode) == CODE_FOR_nothing) + return false; - RVV VLS auto-vectorization or SIMD auto-vectorization can be supported here - in the future. */ + /* Misalignment factor is unknown at compile time. */ + if (misalignment == -1) + return false; + } + /* Disable movmisalign for VLA auto-vectorization. */ return default_builtin_support_vector_misalignment (mode, type, misalignment, is_packed); } diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/misalign-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/misalign-1.c new file mode 100644 index 00000000000..b602ffd69bb --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vls/misalign-1.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-schedule-insns -fno-schedule-insns2 --param=riscv-autovec-lmul=m4 -fno-tree-loop-distribute-patterns" } */ + +#include + +typedef union U { unsigned short s; unsigned char c; } __attribute__((packed)) U; +struct S { char e __attribute__((aligned (64))); U s[32]; }; +struct S t = {0, {{1}, {2}, {3}, {4}, {5}, {6}, {7}, {8}, + {9}, {10}, {11}, {12}, {13}, {14}, {15}, {16}, + {17}, {18}, {19}, {20}, {21}, {22}, {23}, {24}, + {25}, {26}, {27}, {28}, {29}, {30}, {31}, {32}}}; +unsigned short d[32] = { 1 }; + +__attribute__((noinline, noclone)) void +foo () +{ + int i; + for (i = 0; i < 32; i++) + d[i] = t.s[i].s; + if (__builtin_memcmp (d, t.s, sizeof d)) + abort (); +} + +/* { dg-final { scan-assembler-times {vle8\.v} 1 } } */ +/* { dg-final { scan-assembler-times {vle8\.v} 1 } } */ +/* { dg-final { scan-assembler-not {vle16\.v} } } */ +/* { dg-final { scan-assembler-not {vle16\.v} } } */ -- 2.36.3