From: Kito Cheng
Date: Fri, 5 May 2023 14:21:24 +0800
Subject: Re: [PATCH] RISC-V: Fix PR109615
To: juzhe.zhong@rivai.ai
Cc: gcc-patches@gcc.gnu.org, palmer@dabbelt.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com
In-Reply-To: <20230505052120.1074528-1-juzhe.zhong@rivai.ai>

Could you add more description, rather than just posting the code gen
results in the comment?

On Fri, May 5, 2023 at 1:21 PM wrote:
>
> From: Juzhe-Zhong
>
> Before this patch:
>
> ...
> .L2:
>         addi a4,a1,100
>         add t1,a0,a2
>         mv t0,a0
>         beq a2,zero,.L1
>         vsetvli zero,a3,e8,mf8,tu,mu
> .L4:
>         addi a6,t0,100
>         addi a7,a4,-100
>         vle8.v v1,0(t0)
>         addi t0,t0,1
>         vse8.v v1,0(a7)
>         vlm.v v0,0(a6)
>         vle8.v v1,0(a6),v0.t
>         vse8.v v1,0(a4)
>         addi a4,a4,1
>         bne t0,t1,.L4
>         addi a0,a0,300
>         addi a1,a1,300
>         add a2,a0,a2
>         vsetvli zero,a3,e8,mf8,ta,ma
> .L5:
>         vle8.v v2,0(a0)
>         addi a0,a0,1
>         vse8.v v2,0(a1)
>         addi a1,a1,1
>         bne a2,a0,.L5
> .L1:
>         ret
>
> After this patch:
>
> ...
> .L2:
>         addi a4,a1,100
>         add t1,a0,a2
>         mv t0,a0
>         beq a2,zero,.L1
>         vsetvli zero,a3,e8,mf8,tu,mu
> .L4:
>         addi a6,t0,100
>         addi a7,a4,-100
>         vle8.v v1,0(t0)
>         addi t0,t0,1
>         vse8.v v1,0(a7)
>         vlm.v v0,0(a6)
>         vle8.v v1,0(a6),v0.t
>         vse8.v v1,0(a4)
>         addi a4,a4,1
>         bne t0,t1,.L4
>         addi a0,a0,300
>         addi a1,a1,300
>         add a2,a0,a2
> .L5:
>         vle8.v v2,0(a0)
>         addi a0,a0,1
>         vse8.v v2,0(a1)
>         addi a1,a1,1
>         bne a2,a0,.L5
> .L1:
>         ret
>
>         PR target/109615
>
> gcc/ChangeLog:
>
>         * config/riscv/riscv-vsetvl.cc (avl_info::multiple_source_equal_p): Add denegrate PHI optmization.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/vsetvl/avl_single-74.c: Adapt testcase.
>         * gcc.target/riscv/rvv/vsetvl/vsetvl-11.c: Ditto.
>         * gcc.target/riscv/rvv/vsetvl/pr109615.c: New test.
>
> ---
>  gcc/config/riscv/riscv-vsetvl.cc              | 81 +++++--------------
>  .../riscv/rvv/vsetvl/avl_single-74.c          |  4 +-
>  .../gcc.target/riscv/rvv/vsetvl/pr109615.c    | 33 ++++++++
>  .../gcc.target/riscv/rvv/vsetvl/vsetvl-11.c   |  2 +-
>  4 files changed, 54 insertions(+), 66 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr109615.c
>
> diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
> index 609f86d8704..39b4d21210b 100644
> --- a/gcc/config/riscv/riscv-vsetvl.cc
> +++ b/gcc/config/riscv/riscv-vsetvl.cc
> @@ -1676,72 +1676,27 @@ avl_info::single_source_equal_p (const avl_info &other) const
>  bool
>  avl_info::multiple_source_equal_p (const avl_info &other) const
>  {
> -  /* TODO: We don't do too much optimization here since it's
> -     too complicated in case of analyzing the PHI node.
> +  /* When the def info is same in RTL_SSA namespace, it's safe
> +     to consider they are avl compatible.  */
> +  if (m_source == other.get_source ())
> +    return true;
>
> -     For example:
> -       void f (void * restrict in, void * restrict out, int n, int m, int cond)
> -       {
> -         size_t vl;
> -         switch (cond)
> -         {
> -         case 1:
> -           vl = 100;
> -           break;
> -         case 2:
> -           vl = *(size_t*)(in + 100);
> -           break;
> -         case 3:
> -           {
> -             size_t new_vl = *(size_t*)(in + 500);
> -             size_t new_vl2 = *(size_t*)(in + 600);
> -             vl = new_vl + new_vl2 + 777;
> -             break;
> -           }
> -         default:
> -           vl = 4000;
> -           break;
> -         }
> -         for (size_t i = 0; i < n; i++)
> -           {
> -             vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
> -             __riscv_vse8_v_i8mf8 (out + i, v, vl);
> +  /* We only consider handle PHI node.  */
> +  if (!m_source->insn ()->is_phi () || !other.get_source ()->insn ()->is_phi ())
> +    return false;
>
> -             vint8mf8_t v2 = __riscv_vle8_v_i8mf8_tu (v, in + i + 100, vl);
> -             __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
> -           }
> +  phi_info *phi1 = as_a<phi_info *> (m_source);
> +  phi_info *phi2 = as_a<phi_info *> (other.get_source ());
>
> -         size_t vl2;
> -         switch (cond)
> -         {
> -         case 1:
> -           vl2 = 100;
> -           break;
> -         case 2:
> -           vl2 = *(size_t*)(in + 100);
> -           break;
> -         case 3:
> -           {
> -             size_t new_vl = *(size_t*)(in + 500);
> -             size_t new_vl2 = *(size_t*)(in + 600);
> -             vl2 = new_vl + new_vl2 + 777;
> -             break;
> -           }
> -         default:
> -           vl2 = 4000;
> -           break;
> -         }
> -         for (size_t i = 0; i < m; i++)
> -           {
> -             vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, vl2);
> -             __riscv_vse8_v_i8mf8 (out + i + 300, v, vl2);
> -             vint8mf8_t v2 = __riscv_vle8_v_i8mf8_tu (v, in + i + 200, vl2);
> -             __riscv_vse8_v_i8mf8 (out + i + 200, v2, vl2);
> -           }
> -       }
> -     Such case may not be necessary to optimize since the codes of defining
> -     vl and vl2 are redundant.  */
> -  return m_source == other.get_source ();
> +  if (phi1->is_degenerate () && phi2->is_degenerate ())
> +    {
> +      /* Case 1: If both PHI nodes have the same single input in use list.
> +         We consider they are AVL compatible.  */
> +      if (phi1->input_value (0) == phi2->input_value (0))
> +        return true;
> +    }
> +  /* TODO: We can support more optimization cases in the future.  */
> +  return false;
>  }
>
>  avl_info &
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-74.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-74.c
> index ff540ec792d..cc4f88be888 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-74.c
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/avl_single-74.c
> @@ -23,5 +23,5 @@ void f (int8_t * restrict in, int8_t * restrict out, int n, int cond, size_t vl,
>      }
>  }
>
> -/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*t[au],\s*m[au]} 2 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */
> -/* { dg-final { scan-assembler-times {vsetvli} 2 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */
> +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */
> +/* { dg-final { scan-assembler-times {vsetvli} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr109615.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr109615.c
> new file mode 100644
> index 00000000000..90b0bb79937
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr109615.c
> @@ -0,0 +1,33 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32 -fno-tree-vectorize -fno-schedule-insns -fno-schedule-insns2" } */
> +
> +#include "riscv_vector.h"
> +
> +void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int cond)
> +{
> +  size_t vl = 101;
> +  if (cond)
> +    vl = m * 2;
> +  else
> +    vl = m * 2 * vl;
> +
> +  for (size_t i = 0; i < n; i++)
> +    {
> +      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i, vl);
> +      __riscv_vse8_v_i8mf8 (out + i, v, vl);
> +
> +      vbool64_t mask = __riscv_vlm_v_b64 (in + i + 100, vl);
> +
> +      vint8mf8_t v2 = __riscv_vle8_v_i8mf8_tumu (mask, v, in + i + 100, vl);
> +      __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl);
> +    }
> +
> +  for (size_t i = 0; i < n; i++)
> +    {
> +      vint8mf8_t v = __riscv_vle8_v_i8mf8 (in + i + 300, vl);
> +      __riscv_vse8_v_i8mf8 (out + i + 300, v, vl);
> +    }
> +}
> +
> +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,\s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */
> +/* { dg-final { scan-assembler-times {vsetvli} 1 { target { no-opts "-O0" no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-11.c b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-11.c
> index fa825f031f9..3ef0fdcb66d 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-11.c
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/vsetvl-11.c
> @@ -18,4 +18,4 @@ void foo(int32_t *in1, int32_t *in2, int32_t *in3, int32_t *out, size_t n, int c
>      }
>  }
>
> -/* { dg-final { scan-assembler-times {vsetvli} 3 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
> +/* { dg-final { scan-assembler-times {vsetvli} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funroll-loops" } } } } */
> --
> 2.36.3
>
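
For reference, the rule the new multiple_source_equal_p applies can be
sketched outside GCC roughly as follows. This is only a toy model, not
the RTL-SSA API: `Def` and `avl_sources_equal` are hypothetical names
made up for illustration, and a "degenerate" PHI is modelled here simply
as a PHI with exactly one input, following the "same single input"
wording in the patch comment.

```cpp
// Toy model of the AVL-source comparison added by the patch.
// Hypothetical types: `Def` stands in for an RTL-SSA definition
// (set_info / phi_info); none of this is GCC's real interface.
#include <cassert>
#include <vector>

struct Def
{
  bool is_phi;                      // true if this definition is a PHI node
  std::vector<const Def *> inputs;  // PHI inputs; empty for non-PHI defs

  // Assumption: a degenerate PHI is one with a single input.
  bool is_degenerate () const { return is_phi && inputs.size () == 1; }
};

// Returns true when two AVL sources can be treated as the same value.
bool
avl_sources_equal (const Def *a, const Def *b)
{
  // Identical definitions are trivially compatible.
  if (a == b)
    return true;

  // Beyond identity, only PHI nodes are analysed.
  if (!a->is_phi || !b->is_phi)
    return false;

  // Case 1: two degenerate PHIs fed by the same single input.
  if (a->is_degenerate () && b->is_degenerate ())
    return a->inputs[0] == b->inputs[0];

  // Anything else is conservatively treated as different.
  return false;
}
```

In this model, two distinct degenerate PHIs that forward the same
definition compare equal, which corresponds to the case where the second
vsetvli in the "before" code gen becomes redundant. PHIs with several
inputs still compare unequal, matching the TODO left in the patch.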