From: Juzhe-Zhong
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong
Subject: [PATCH v2] RISC-V: Suppress the vsetvl fusion for conflict successors
Date: Thu, 1 Feb 2024 17:02:52 +0800
Message-Id: <20240201090252.24414-1-juzhe.zhong@rivai.ai>

Update in v2: Add dump information.

This patch fixes the following ineffective vsetvl insertion:

#include "riscv_vector.h"

void f (int32_t * restrict in, int32_t * restrict out, size_t n, size_t cond, size_t cond2)
{
  for (size_t i = 0; i < n; i++)
    {
      if (i == cond) {
        vint8mf8_t v = *(vint8mf8_t*)(in + i + 100);
        *(vint8mf8_t*)(out + i + 100) = v;
      } else if (i == cond2) {
        vfloat32mf2_t v = *(vfloat32mf2_t*)(in + i + 200);
        *(vfloat32mf2_t*)(out + i + 200) = v;
      } else if (i == (cond2 - 1)) {
        vuint16mf2_t v = *(vuint16mf2_t*)(in + i + 300);
        *(vuint16mf2_t*)(out + i + 300) = v;
      } else {
        vint8mf4_t v = *(vint8mf4_t*)(in + i + 400);
        *(vint8mf4_t*)(out + i + 400) = v;
      }
    }
}

Before this patch:

f:
.LFB0:
        .cfi_startproc
        beq     a2,zero,.L12
        addi    a7,a0,400
        addi    a6,a1,400
        addi    a0,a0,1600
        addi    a1,a1,1600
        li      a5,0
        addi    t6,a4,-1
        vsetvli t3,zero,e8,mf8,ta,ma     ---> ineffective uplift
.L7:
        beq     a3,a5,.L15
        beq     a4,a5,.L16
        beq     t6,a5,.L17
        vsetvli t1,zero,e8,mf4,ta,ma
        vle8.v  v1,0(a0)
        vse8.v  v1,0(a1)
        vsetvli t3,zero,e8,mf8,ta,ma
.L4:
        addi    a5,a5,1
        addi    a7,a7,4
        addi    a6,a6,4
        addi    a0,a0,4
        addi    a1,a1,4
        bne     a2,a5,.L7
.L12:
        ret
.L15:
        vle8.v  v1,0(a7)
        vse8.v  v1,0(a6)
        j       .L4
.L17:
        vsetvli t1,zero,e8,mf4,ta,ma
        addi    t5,a0,-400
        addi    t4,a1,-400
        vle16.v v1,0(t5)
        vse16.v v1,0(t4)
        vsetvli t3,zero,e8,mf8,ta,ma
        j       .L4
.L16:
        addi    t5,a0,-800
        addi    t4,a1,-800
        vle32.v v1,0(t5)
        vse32.v v1,0(t4)
        j       .L4

The e8mf8 vsetvl is hoisted to the top, ahead of the loop.  The hoist is
ineffective because e8mf8 comes from a low-probability block, the one
guarded by if (i == cond).  This patch disables the fusion in this case.

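To make the cost concrete, the vsetvli instructions executed on each path of the two listings in this message (before and after the patch) can be tallied by hand.  The sketch below only encodes that tally; the function names and the parameters rare_l15 and rare_l16 (how many iterations take the .L15 and .L16 paths) are hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <cstddef>

// Dynamic vsetvli count for the "before" listing: one hoisted vsetvli ahead
// of the loop, plus a switch-and-restore pair on the fall-through path and
// on the .L17 path.  The .L15 and .L16 paths execute none.
// rare_l15 / rare_l16 are hypothetical inputs: iterations taking those paths.
std::size_t vsetvli_before(std::size_t n, std::size_t rare_l15,
                           std::size_t rare_l16) {
    assert(rare_l15 + rare_l16 <= n);
    if (n == 0) return 0;  // beq a2,zero,.L12 branches past the hoisted vsetvli
    return 1 + 2 * (n - rare_l15 - rare_l16);
}

// Dynamic vsetvli count for the "after" listing: every path through the
// loop body executes exactly one vsetvli.
std::size_t vsetvli_after(std::size_t n) {
    return n;
}
```

For a long-running loop this tally suggests the patched code executes roughly half as many vsetvli instructions, since the fall-through path no longer has to restore e8mf8.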
After this patch:

f:
        beq     a2,zero,.L12
        addi    a7,a0,400
        addi    a6,a1,400
        addi    a0,a0,1600
        addi    a1,a1,1600
        li      a5,0
        addi    t6,a4,-1
.L7:
        beq     a3,a5,.L15
        beq     a4,a5,.L16
        beq     t6,a5,.L17
        vsetvli t1,zero,e8,mf4,ta,ma
        vle8.v  v1,0(a0)
        vse8.v  v1,0(a1)
.L4:
        addi    a5,a5,1
        addi    a7,a7,4
        addi    a6,a6,4
        addi    a0,a0,4
        addi    a1,a1,4
        bne     a2,a5,.L7
.L12:
        ret
.L15:
        vsetvli t3,zero,e8,mf8,ta,ma
        vle8.v  v1,0(a7)
        vse8.v  v1,0(a6)
        j       .L4
.L17:
        addi    t5,a0,-400
        addi    t4,a1,-400
        vsetvli t1,zero,e8,mf4,ta,ma
        vle16.v v1,0(t5)
        vse16.v v1,0(t4)
        j       .L4
.L16:
        addi    t5,a0,-800
        addi    t4,a1,-800
        vsetvli t3,zero,e32,mf2,ta,ma
        vle32.v v1,0(t5)
        vse32.v v1,0(t4)
        j       .L4

Tested on both RV32 and RV64 with no regressions.  OK for trunk?

        PR target/113696

gcc/ChangeLog:

        * config/riscv/riscv-vsetvl.cc (pre_vsetvl::earliest_fuse_vsetvl_info):
        Suppress vsetvl fusion.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/vsetvl/pr113696.c: New test.

---
 gcc/config/riscv/riscv-vsetvl.cc              | 25 ++++++++++++++++++
 .../gcc.target/riscv/rvv/vsetvl/pr113696.c    | 26 +++++++++++++++++++
 2 files changed, 51 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr113696.c

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index cec862329c5..28b7534d970 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -2959,6 +2959,31 @@ pre_vsetvl::earliest_fuse_vsetvl_info (int iter)
               src_block_info.set_empty_info ();
               src_block_info.probability
                 = profile_probability::uninitialized ();
+              /* See PR113696, we should reset immediate dominator to
+                 empty since we may uplift ineffective vsetvl which
+                 locate at low probability block.  */
+              basic_block dom
+                = get_immediate_dominator (CDI_DOMINATORS, eg->src);
+              auto &dom_block_info = get_block_info (dom);
+              if (dom_block_info.has_info ()
+                  && !m_dem.compatible_p (
+                    dom_block_info.get_exit_info (), curr_info))
+                {
+                  dom_block_info.set_empty_info ();
+                  dom_block_info.probability
+                    = profile_probability::uninitialized ();
+                  if (dump_file && (dump_flags & TDF_DETAILS))
+                    {
+                      fprintf (dump_file,
+                               " Reset dominator bb %u:",
+                               dom->index);
+                      prev_info.dump (dump_file, " ");
+                      fprintf (dump_file,
+                               " due to (same probability or no "
+                               "compatible reaching):");
+                      curr_info.dump (dump_file, " ");
+                    }
+                }
               changed = true;
             }
           /* Choose the one with higher probability.  */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr113696.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr113696.c
new file mode 100644
index 00000000000..5d7c5f52ead
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr113696.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "--param=riscv-autovec-preference=scalable -march=rv64gcv -mabi=lp64d -O3" } */
+
+#include "riscv_vector.h"
+
+void f (int32_t * restrict in, int32_t * restrict out, size_t n, size_t cond, size_t cond2)
+{
+  for (size_t i = 0; i < n; i++)
+    {
+      if (i == cond) {
+        vint8mf8_t v = *(vint8mf8_t*)(in + i + 100);
+        *(vint8mf8_t*)(out + i + 100) = v;
+      } else if (i == cond2) {
+        vfloat32mf2_t v = *(vfloat32mf2_t*)(in + i + 200);
+        *(vfloat32mf2_t*)(out + i + 200) = v;
+      } else if (i == (cond2 - 1)) {
+        vuint16mf2_t v = *(vuint16mf2_t*)(in + i + 300);
+        *(vuint16mf2_t*)(out + i + 300) = v;
+      } else {
+        vint8mf4_t v = *(vint8mf4_t*)(in + i + 400);
+        *(vint8mf4_t*)(out + i + 400) = v;
+      }
+    }
+}
+
+/* { dg-final { scan-assembler-times {vsetvli} 4 { target { no-opts "-O0" no-opts "-Os" no-opts "-Oz" no-opts "-funroll-loops" no-opts "-g" } } } } */
-- 
2.36.1
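
The dominator reset added in the hunk above can be illustrated with a small standalone model.  This is only a sketch under stated assumptions: ToyInfo, ToyBlock, toy_compatible and reset_incompatible_dominator are hypothetical names, compatibility is reduced to "same SEW and same LMUL" (the real m_dem.compatible_p compares demands in a richer way), and the probability bookkeeping and dump output are left out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a vsetvl info: SEW plus LMUL as a fraction.
struct ToyInfo {
    int sew;
    int lmul_num;
    int lmul_den;
};

// Toy compatibility check (an assumption, not the GCC rule): two infos are
// compatible only when SEW and LMUL are equal.
bool toy_compatible(const ToyInfo& a, const ToyInfo& b) {
    return a.sew == b.sew
           && a.lmul_num * b.lmul_den == b.lmul_num * a.lmul_den;
}

// Hypothetical basic block: optional exit info and immediate dominator index
// (-1 when the block has no dominator in this toy CFG).
struct ToyBlock {
    bool has_info;
    ToyInfo exit_info;
    int idom;
};

// Mirrors the shape of the new check: if the immediate dominator of `src`
// carries an info that is incompatible with `curr`, clear it so that it is
// not kept above the loop.  Returns true when the dominator was reset.
bool reset_incompatible_dominator(std::vector<ToyBlock>& blocks,
                                  std::size_t src, const ToyInfo& curr) {
    int dom = blocks[src].idom;
    if (dom < 0) return false;
    ToyBlock& d = blocks[static_cast<std::size_t>(dom)];
    if (d.has_info && !toy_compatible(d.exit_info, curr)) {
        d.has_info = false;
        return true;
    }
    return false;
}
```

In this model, a preheader holding e8mf8 that dominates a loop body demanding e8mf4 is cleared, which corresponds to the hoisted vsetvli disappearing from the "after" listing.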