From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-vk1-xa31.google.com (mail-vk1-xa31.google.com [IPv6:2607:f8b0:4864:20::a31]) by sourceware.org (Postfix) with ESMTPS id 4FBF03858D3C for ; Mon, 24 Apr 2023 06:25:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 4FBF03858D3C Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-vk1-xa31.google.com with SMTP id 71dfb90a1353d-44087536177so2842091e0c.2 for ; Sun, 23 Apr 2023 23:25:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1682317539; x=1684909539; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Q+GUmR2UMEnmGy9YquSfldySTLyZyr3U4kAxECcm4TM=; b=Vli/+xXYM8PSzWLU5N5Sw5vbsh+vHuVVuc4psJ7S8i/FPwEev02zJi4NzoB9aOSe81 ylREmJluj37g6aU3YiTXhZTTOyHv63Tq+0ocj/x3TpOyJmUzvako+hNcFIWmQ+VQoKHK e9oPKMMz7fW5w/QMxE13d6arH+osMNhKfmG+DRV9UGZtCjlJe7K+t/D5TeVn02qknND8 r81/xc9WsKEH6kw3JXcG6e8CGlD7TO46/46gyPEgQ0lq0VwCXsZBafYTjjIXX2bDdakw OTh3HkS76NUlTWg5FW2nkT/KBh5HCFlGH7iasr2SGKOxisGu+PJmHR/mxT9ODKrv3t+b r22A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1682317539; x=1684909539; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Q+GUmR2UMEnmGy9YquSfldySTLyZyr3U4kAxECcm4TM=; b=bMGVrUEutdWI8jU/qGr7c0cnQHbjR9Pqw5rDGOcIMYjqoONoO1lb9nv9+gvuw66AVK t+hJ5VXe71fYCqgwDeNJvV7GXdvHDh3mrMHWl+FgknOATgig7lSL4sqfcGrkcvbJPpHn 80y2Vb9pzsYV1ieRuKNsXubv1VxGQqoZti912KtFYpiC8mDvBsXwZZpQoXCH1Fr9KMEM uCtvfebo7G4CSB/gG0qXxccq+2J7p2o1OLfZJiNbtg3hysQqy1HuZigF5W4GRzXq6Yos szcGyKTGUbuPmZxwy1oP8jcHdX7QpAHynojc0JqEQUrUI3mySkuwIIMYgHP8gvz0gf+r rgqQ== X-Gm-Message-State: AAQBX9f2hYBaLm6uN1G/+ntHEI5hDifw7OP5VrHP4RJFVes9f22HFvdj ib7Q24Pqp5SygkyYe9EU0vuE9GrB/Gyc70rlgiM= X-Google-Smtp-Source: AKy350Zi0ClP17wfH+5I1J9mFb9zxVVQz49mphm0TJ0Vom29cjf4MIe/dW1Ms5YewNgnpYSqRx5tvdiYciY5wMmhziI= X-Received: by 2002:a1f:d904:0:b0:43f:c805:e0a7 with SMTP id q4-20020a1fd904000000b0043fc805e0a7mr3541160vkg.9.1682317539353; Sun, 23 Apr 2023 23:25:39 -0700 (PDT) MIME-Version: 1.0 References: <20230423111752.101308-1-juzhe.zhong@rivai.ai> In-Reply-To: <20230423111752.101308-1-juzhe.zhong@rivai.ai> From: Kito Cheng Date: Mon, 24 Apr 2023 14:25:28 +0800 Message-ID: Subject: Re: [PATCH V2] RISC-V: Optimize fault only first load To: juzhe.zhong@rivai.ai Cc: gcc-patches@gcc.gnu.org, palmer@dabbelt.com, jeffreyalaw@gmail.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-8.2 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Committed, thanks :) On Sun, Apr 23, 2023 at 7:18=E2=80=AFPM wrote: > > From: Juzhe-Zhong > > V2 patch for: https://patchwork.sourceware.org/project/gcc/patch/20230330= 012804.110539-1-juzhe.zhong@rivai.ai/ > which has been reviewed. > > This patch address Jeff's comment, refine ChangeLog to give more > clear information. > > gcc/ChangeLog: > > * config/riscv/vector-iterators.md: New unspec to refine fault fi= rst load pattern. > * config/riscv/vector.md: Refine fault first load pattern to eras= e avl from instructions > with the fault first load property. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/rvv/vsetvl/ffload-1.c: New test. > * gcc.target/riscv/rvv/vsetvl/ffload-2.c: New test. > * gcc.target/riscv/rvv/vsetvl/ffload-3.c: New test. > * gcc.target/riscv/rvv/vsetvl/ffload-5.c: New test. > * gcc.target/riscv/rvv/vsetvl/ffload-6.c: New test. > * gcc.target/riscv/rvv/vsetvl/ffload-7.c: New test. > > --- > gcc/config/riscv/vector-iterators.md | 1 + > gcc/config/riscv/vector.md | 10 +++++- > .../gcc.target/riscv/rvv/vsetvl/ffload-1.c | 21 ++++++++++++ > .../gcc.target/riscv/rvv/vsetvl/ffload-2.c | 28 ++++++++++++++++ > .../gcc.target/riscv/rvv/vsetvl/ffload-3.c | 28 ++++++++++++++++ > .../gcc.target/riscv/rvv/vsetvl/ffload-5.c | 29 +++++++++++++++++ > .../gcc.target/riscv/rvv/vsetvl/ffload-6.c | 29 +++++++++++++++++ > .../gcc.target/riscv/rvv/vsetvl/ffload-7.c | 32 +++++++++++++++++++ > 8 files changed, 177 insertions(+), 1 deletion(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c > > diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vect= or-iterators.md > index 3c6575208be..a8e856161d3 100644 > --- a/gcc/config/riscv/vector-iterators.md > +++ b/gcc/config/riscv/vector-iterators.md > @@ -80,6 +80,7 @@ > UNSPEC_VRGATHEREI16 > UNSPEC_VCOMPRESS > UNSPEC_VLEFF > + UNSPEC_MODIFY_VL > ]) > > (define_mode_iterator V [ > diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md > index 0fda11ed67d..959afac2283 100644 > --- a/gcc/config/riscv/vector.md > +++ b/gcc/config/riscv/vector.md > @@ -7414,7 +7414,15 @@ > (unspec:V > [(match_operand:V 3 "memory_operand" " m, m, = m, m")] UNSPEC_VLEFF) > (match_operand:V 2 "vector_merge_operand" " vu, 0, = vu, 0"))) > - (set (reg:SI VL_REGNUM) (unspec:SI [(match_dup 0)] UNSPEC_VLEFF))] > + (set (reg:SI VL_REGNUM) > + (unspec:SI > + [(if_then_else:V > + (unspec: > + [(match_dup 1) (match_dup 4) (match_dup 5) > + (match_dup 6) (match_dup 7) > + (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDIC= ATE) > + (unspec:V [(match_dup 3)] UNSPEC_VLEFF) > + (match_dup 2))] UNSPEC_MODIFY_VL))] > "TARGET_VECTOR" > "vleff.v\t%0,%3%p1" > [(set_attr "type" "vldff") > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c b/gcc/t= estsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c > new file mode 100644 > index 00000000000..b2b7eafa945 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-1.c > @@ -0,0 +1,21 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=3Drv32gcv -mabi=3Dilp32 -fno-tree-vectorize -fno= -schedule-insns -fno-schedule-insns2" } */ > + > +#include "riscv_vector.h" > + > +void f (int8_t * restrict in, int8_t * restrict out, int n, int cond,siz= e_t *new_vl,size_t *new_vl2) > +{ > + size_t vl =3D 101; > + > + vint8mf8_t v =3D __riscv_vle8_v_i8mf8 (in, vl); > + __riscv_vse8_v_i8mf8 (out, v, vl); > + vbool64_t mask =3D __riscv_vlm_v_b64 (in + 100, vl); > + vint8mf8_t v2 =3D __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + 100, new_= vl, vl); > + __riscv_vse8_v_i8mf8 (out + 100, v2, *new_vl); > + v2 =3D __riscv_vle8ff_v_i8mf8_tumu (mask, v2, in + 200, new_vl2, vl); > + __riscv_vse8_v_i8mf8 (out + 200, v2, *new_vl2); > +} > + > +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,= \s*mf8,\s*tu,\s*mu} 2 { target { no-opts "-O0" no-opts "-g" no-opts "-funro= ll-loops" } } } } */ > +/* { dg-final { scan-assembler-times {csrr} 2 { target { no-opts "-O0" n= o-opts "-g" no-opts "-funroll-loops" } } } } */ > +/* { dg-final { scan-assembler-not {vmv} { target { no-opts "-O0" no-opt= s "-g" no-opts "-funroll-loops" } } } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c b/gcc/t= estsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c > new file mode 100644 > index 00000000000..c0e21d461e7 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-2.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=3Drv32gcv -mabi=3Dilp32 -fno-tree-vectorize -fno= -schedule-insns -fno-schedule-insns2" } */ > + > +#include "riscv_vector.h" > + > +void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int c= ond) > +{ > + size_t vl =3D 101; > + > + for (size_t i =3D 0; i < n; i++) > + { > + vint8mf8_t v =3D __riscv_vle8_v_i8mf8 (in + i, vl); > + __riscv_vse8_v_i8mf8 (out + i, v, vl); > + > + vbool64_t mask =3D __riscv_vlm_v_b64 (in + i + 100, vl); > + > + vint8mf8_t v2 =3D __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 1= 00, &vl, vl); > + __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl); > + } > + > + for (size_t i =3D 0; i < n; i++) > + { > + vint8mf8_t v =3D __riscv_vle8_v_i8mf8 (in + i + 300, vl); > + __riscv_vse8_v_i8mf8 (out + i + 300, v, vl); > + } > +} > + > +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,= \s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funro= ll-loops" } } } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c b/gcc/t= estsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c > new file mode 100644 > index 00000000000..9e90b189bd6 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-3.c > @@ -0,0 +1,28 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=3Drv32gcv -mabi=3Dilp32 -fno-tree-vectorize -fno= -schedule-insns -fno-schedule-insns2" } */ > + > +#include "riscv_vector.h" > + > +void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int c= ond) > +{ > + size_t vl =3D 101; > + > + for (size_t i =3D 0; i < n; i++) > + { > + vint8mf8_t v =3D __riscv_vle8_v_i8mf8 (in + i, vl); > + __riscv_vse8_v_i8mf8 (out + i, v, vl); > + > + vbool64_t mask =3D __riscv_vlm_v_b64 (in + i + 100, vl); > + > + vint8mf8_t v2 =3D __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 1= 00, &vl, vl); > + __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl); > + } > + > + for (size_t i =3D 0; i < m; i++) > + { > + vint8mf8_t v =3D __riscv_vle8_v_i8mf8 (in + i + 300, vl); > + __riscv_vse8_v_i8mf8 (out + i + 300, v, vl); > + } > +} > + > +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,= \s*mf8,\s*t[au],\s*m[au]} 2 { target { no-opts "-O0" no-opts "-g" no-opts "= -funroll-loops" } } } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c b/gcc/t= estsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c > new file mode 100644 > index 00000000000..895180cc54e > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-5.c > @@ -0,0 +1,29 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=3Drv32gcv -mabi=3Dilp32 -fno-tree-vectorize -fno= -schedule-insns -fno-schedule-insns2" } */ > + > +#include "riscv_vector.h" > + > +void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int c= ond) > +{ > + size_t vl =3D 101; > + size_t new_vl; > + > + for (size_t i =3D 0; i < n; i++) > + { > + vint8mf8_t v =3D __riscv_vle8_v_i8mf8 (in + i, vl); > + __riscv_vse8_v_i8mf8 (out + i, v, vl); > + > + vbool64_t mask =3D __riscv_vlm_v_b64 (in + i + 100, vl); > + > + vint8mf8_t v2 =3D __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 1= 00, &new_vl, vl); > + __riscv_vse8_v_i8mf8 (out + i + 100, v2, new_vl); > + } > + > + for (size_t i =3D 0; i < n; i++) > + { > + vint8mf8_t v =3D __riscv_vle8_v_i8mf8 (in + i + 300, new_vl); > + __riscv_vse8_v_i8mf8 (out + i + 300, v, new_vl); > + } > +} > + > +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,= \s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funro= ll-loops" } } } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c b/gcc/t= estsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c > new file mode 100644 > index 00000000000..1b32f4ab24b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-6.c > @@ -0,0 +1,29 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=3Drv32gcv -mabi=3Dilp32 -fno-tree-vectorize -fno= -schedule-insns -fno-schedule-insns2" } */ > + > +#include "riscv_vector.h" > + > +void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int c= ond) > +{ > + size_t vl =3D 101; > + size_t new_vl; > + > + for (size_t i =3D 0; i < n; i++) > + { > + vint8mf8_t v =3D __riscv_vle8_v_i8mf8 (in + i, vl); > + __riscv_vse8_v_i8mf8 (out + i, v, vl); > + > + vbool64_t mask =3D __riscv_vlm_v_b64 (in + i + 100, vl); > + > + vint8mf8_t v2 =3D __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 1= 00, &new_vl, vl); > + __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl); > + } > + > + for (size_t i =3D 0; i < n; i++) > + { > + vint8mf8_t v =3D __riscv_vle8_v_i8mf8 (in + i + 300, new_vl); > + __riscv_vse8_v_i8mf8 (out + i + 300, v, new_vl); > + } > +} > + > +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,= \s*mf8,\s*t[au],\s*m[au]} 3 { target { no-opts "-O0" no-opts "-g" no-opts "= -funroll-loops" } } } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c b/gcc/t= estsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c > new file mode 100644 > index 00000000000..1c08b75873d > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/ffload-7.c > @@ -0,0 +1,32 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=3Drv32gcv -mabi=3Dilp32 -fno-tree-vectorize -fno= -schedule-insns -fno-schedule-insns2" } */ > + > +#include "riscv_vector.h" > + > +void f (int8_t * restrict in, int8_t * restrict out, int n, int m, int c= ond) > +{ > + size_t vl =3D 101; > + if (cond) > + vl =3D m * 2; > + else > + vl =3D m * 2 * vl; > + > + for (size_t i =3D 0; i < n; i++) > + { > + vint8mf8_t v =3D __riscv_vle8_v_i8mf8 (in + i, vl); > + __riscv_vse8_v_i8mf8 (out + i, v, vl); > + > + vbool64_t mask =3D __riscv_vlm_v_b64 (in + i + 100, vl); > + > + vint8mf8_t v2 =3D __riscv_vle8ff_v_i8mf8_tumu (mask, v, in + i + 1= 00, &vl, vl); > + __riscv_vse8_v_i8mf8 (out + i + 100, v2, vl); > + } > + > + for (size_t i =3D 0; i < n; i++) > + { > + vint8mf8_t v =3D __riscv_vle8_v_i8mf8 (in + i + 300, vl); > + __riscv_vse8_v_i8mf8 (out + i + 300, v, vl); > + } > +} > + > +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e8,= \s*mf8,\s*tu,\s*mu} 1 { target { no-opts "-O0" no-opts "-g" no-opts "-funro= ll-loops" } } } } */ > -- > 2.36.1 >