From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-ua1-x934.google.com (mail-ua1-x934.google.com [IPv6:2607:f8b0:4864:20::934]) by sourceware.org (Postfix) with ESMTPS id 378EB3858D35 for ; Sat, 6 May 2023 02:29:14 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 378EB3858D35 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com Received: by mail-ua1-x934.google.com with SMTP id a1e0cc1a2514c-77d46c7dd10so17176740241.0 for ; Fri, 05 May 2023 19:29:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20221208; t=1683340153; x=1685932153; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=W9T9jhl6rKBRwNwnnH/57yc0fyyEC+pCe1NqPNW6r9g=; b=qmyFCkra9U2irfPNqrCdadnwbWL8TZfFsOpX3v4yUZb8kP1qKSwb0eRwOpQRro6SGj 0lbAQsgyuaAtgCulgWMzosJ3iisqxfomsjkb+BBWYAZDQFDsvJpz3Ukz3xZArw014om3 h8bnmz3VZVcIHgGYa3Al9Tf7vXVxMjTrxbcKKlXK/1uluq+MzwQL6M9BL1k2VaW3nSVx bGon19PZAmjVj9Agz5XqAQQW08Mgb8ZfRKy/gQ8Z3/vgQC8IzlOKYZiR8OJrYceCt6v/ nAX2lwoz+931zfhlyhipYa+GThHD0sbpYXUfO6PCn5fXlMO9xg1Hm2Xjb/YTQEbgUR0m y9+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683340153; x=1685932153; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=W9T9jhl6rKBRwNwnnH/57yc0fyyEC+pCe1NqPNW6r9g=; b=aK7XaeXjNpT6Vy8IOKuDTDkYHSX3So+Kae9/SrGrGD1RI1S0p8dw/5CKd50L+sHtDm ccO+LuB+/pqqbyTV1Ioz93Bg2ALLJ1s4wPNVjCQUhnCbEph2ASM2UipVfKZnOxfKrjGx LKJhdtuYDY2iAxcf8qnEFzriC3FF/jtiw0EDWdXQCWWHgs4P/I8tEuVip7y/A/yaMnxY bkpzHiVq5HfqI9U/E67uxmREjNg4r9TuHIwcss95QPTRzjS0osY2YtEmGSh5LXy/ATkX kpPNS8HvcjSyC0wE5IPpU7gXE83ZmdieLyxgJMKtLs/o29UUqdthcdcbLuwez99jut9P nlBA== X-Gm-Message-State: AC+VfDwDGktwVOlAYEESmkG36QGuRqlhAtat5hsUqQHHN5u6dysS3h8i OqLCxToTKKGlHOCaeAOxtLWeuhP05laTgtj+3B1PvIMa X-Google-Smtp-Source: ACHHUZ64cqXCqKLXAd+PcuwtXWb+W/tf7p0rfohr+3SpPSlxhC1woKxzDpIpWvzZ0y8Elm/yC/bE6n/mQp5Jcxt2YeI= X-Received: by 2002:a05:6122:dd:b0:440:4c82:6508 with SMTP id h29-20020a05612200dd00b004404c826508mr1122466vkc.3.1683340153355; Fri, 05 May 2023 19:29:13 -0700 (PDT) MIME-Version: 1.0 References: <20230505141239.1323841-1-juzhe.zhong@rivai.ai> In-Reply-To: <20230505141239.1323841-1-juzhe.zhong@rivai.ai> From: Kito Cheng Date: Sat, 6 May 2023 10:29:02 +0800 Message-ID: Subject: Re: [PATCH V2] RISC-V: Fix incorrect demand info merge in local vsetvli optimization [PR109748] To: juzhe.zhong@rivai.ai Cc: gcc-patches@gcc.gnu.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Spam-Status: No, score=-7.4 required=5.0 tests=BAYES_00,DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM,GIT_PATCH_0,KAM_SHORT,LIKELY_SPAM_BODY,RCVD_IN_DNSWL_NONE,SCC_5_SHORT_WORD_LINES,SPF_HELO_NONE,SPF_PASS,TXREP,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org List-Id: Thanks, committed to trunk! On Fri, May 5, 2023 at 10:13=E2=80=AFPM wrote: > > From: Juzhe-Zhong > > This patch is fixing my recent optimization patch: > https://github.com/gcc-mirror/gcc/commit/d51f2456ee51bd59a79b4725ca0e488c= 25260bbf > > In that patch, the new_info =3D parse_insn (i) is not correct. > Since consider the following case: > > vsetvli a5,a4, e8,m1 > .. > vsetvli zero,a5, e32, m4 > vle8.v > vmacc.vv > ... > > Since we have backward demand fusion in Phase 1, so the real demand of "v= le8.v" is e32, m4. > However, if we use parse_insn (vle8.v) =3D e8, m1 which is not correct. > > So this patch we change new_info =3D new_info.parse_insn (i) > into: > > vector_insn_info new_info =3D m_vector_manager->vector_insn_infos[i->uid = ()]; > > So that, we can correctly optimize codes into: > > vsetvli a5,a4, e32, m4 > .. > .. (vsetvli zero,a5, e32, m4 is removed) > vle8.v > vmacc.vv > > Since m_vector_manager->vector_insn_infos is the member variable of pass_= vsetvl class. > We remove static void function "local_eliminate_vsetvl_insn", and make it= as the member function > of pass_vsetvl class. > > PR target/109748 > > gcc/ChangeLog: > > * config/riscv/riscv-vsetvl.cc (local_eliminate_vsetvl_insn): Rem= ove it. > (pass_vsetvl::local_eliminate_vsetvl_insn): New function. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/rvv/vsetvl/pr109748.c: New test. > > --- > gcc/config/riscv/riscv-vsetvl.cc | 102 ++++++++++-------- > .../gcc.target/riscv/rvv/vsetvl/pr109748.c | 36 +++++++ > 2 files changed, 93 insertions(+), 45 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr109748.c > > diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vs= etvl.cc > index 39b4d21210b..e1efd7b1c40 100644 > --- a/gcc/config/riscv/riscv-vsetvl.cc > +++ b/gcc/config/riscv/riscv-vsetvl.cc > @@ -1056,51 +1056,6 @@ change_vsetvl_insn (const insn_info *insn, const v= ector_insn_info &info) > change_insn (rinsn, new_pat); > } > > -static void > -local_eliminate_vsetvl_insn (const vector_insn_info &dem) > -{ > - const insn_info *insn =3D dem.get_insn (); > - if (!insn || insn->is_artificial ()) > - return; > - rtx_insn *rinsn =3D insn->rtl (); > - const bb_info *bb =3D insn->bb (); > - if (vsetvl_insn_p (rinsn)) > - { > - rtx vl =3D get_vl (rinsn); > - for (insn_info *i =3D insn->next_nondebug_insn (); > - real_insn_and_same_bb_p (i, bb); i =3D i->next_nondebug_insn (= )) > - { > - if (i->is_call () || i->is_asm () > - || find_access (i->defs (), VL_REGNUM) > - || find_access (i->defs (), VTYPE_REGNUM)) > - return; > - > - if (has_vtype_op (i->rtl ())) > - { > - if (!vsetvl_discard_result_insn_p (PREV_INSN (i->rtl ()))) > - return; > - rtx avl =3D get_avl (i->rtl ()); > - if (avl !=3D vl) > - return; > - set_info *def =3D find_access (i->uses (), REGNO (avl))->de= f (); > - if (def->insn () !=3D insn) > - return; > - > - vector_insn_info new_info; > - new_info.parse_insn (i); > - if (!new_info.skip_avl_compatible_p (dem)) > - return; > - > - new_info.set_avl_info (dem.get_avl_info ()); > - new_info =3D dem.merge (new_info, LOCAL_MERGE); > - change_vsetvl_insn (insn, new_info); > - eliminate_insn (PREV_INSN (i->rtl ())); > - return; > - } > - } > - } > -} > - > static bool > source_equal_p (insn_info *insn1, insn_info *insn2) > { > @@ -2672,6 +2627,7 @@ private: > void pre_vsetvl (void); > > /* Phase 5. */ > + void local_eliminate_vsetvl_insn (const vector_insn_info &) const; > void cleanup_insns (void) const; > > /* Phase 6. */ > @@ -3993,6 +3949,62 @@ pass_vsetvl::pre_vsetvl (void) > commit_edge_insertions (); > } > > +/* Local user vsetvl optimizaiton: > + > + Case 1: > + vsetvl a5,a4,e8,mf8 > + ... > + vsetvl zero,a5,e8,mf8 --> Eliminate directly. > + > + Case 2: > + vsetvl a5,a4,e8,mf8 --> vsetvl a5,a4,e32,mf2 > + ... > + vsetvl zero,a5,e32,mf2 --> Eliminate directly. */ > +void > +pass_vsetvl::local_eliminate_vsetvl_insn (const vector_insn_info &dem) c= onst > +{ > + const insn_info *insn =3D dem.get_insn (); > + if (!insn || insn->is_artificial ()) > + return; > + rtx_insn *rinsn =3D insn->rtl (); > + const bb_info *bb =3D insn->bb (); > + if (vsetvl_insn_p (rinsn)) > + { > + rtx vl =3D get_vl (rinsn); > + for (insn_info *i =3D insn->next_nondebug_insn (); > + real_insn_and_same_bb_p (i, bb); i =3D i->next_nondebug_insn (= )) > + { > + if (i->is_call () || i->is_asm () > + || find_access (i->defs (), VL_REGNUM) > + || find_access (i->defs (), VTYPE_REGNUM)) > + return; > + > + if (has_vtype_op (i->rtl ())) > + { > + if (!vsetvl_discard_result_insn_p (PREV_INSN (i->rtl ()))) > + return; > + rtx avl =3D get_avl (i->rtl ()); > + if (avl !=3D vl) > + return; > + set_info *def =3D find_access (i->uses (), REGNO (avl))->de= f (); > + if (def->insn () !=3D insn) > + return; > + > + vector_insn_info new_info > + =3D m_vector_manager->vector_insn_infos[i->uid ()]; > + if (!new_info.skip_avl_compatible_p (dem)) > + return; > + > + new_info.set_avl_info (dem.get_avl_info ()); > + new_info =3D dem.merge (new_info, LOCAL_MERGE); > + change_vsetvl_insn (insn, new_info); > + eliminate_insn (PREV_INSN (i->rtl ())); > + return; > + } > + } > + } > +} > + > /* Before VSETVL PASS, RVV instructions pattern is depending on AVL oper= and > implicitly. Since we will emit VSETVL instruction and make RVV instru= ctions > depending on VL/VTYPE global status registers, we remove the such AVL= operand > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr109748.c b/gcc/t= estsuite/gcc.target/riscv/rvv/vsetvl/pr109748.c > new file mode 100644 > index 00000000000..81c42c5a82a > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/vsetvl/pr109748.c > @@ -0,0 +1,36 @@ > +/* { dg-do compile } */ > +/* { dg-options "-march=3Drv32gcv -mabi=3Dilp32 -fno-tree-vectorize -fno= -schedule-insns -fno-schedule-insns2" } */ > + > +#include "riscv_vector.h" > + > +int byte_mac_vec(unsigned char *a, unsigned char *b, int len) { > + size_t vlmax =3D __riscv_vsetvlmax_e8m1(); > + vint32m4_t vec_s =3D __riscv_vmv_v_x_i32m4(0, vlmax); > + vint32m1_t vec_zero =3D __riscv_vmv_v_x_i32m1(0, vlmax); > + int k =3D len; > + > + for (size_t vl; k > 0; k -=3D vl, a +=3D vl, b +=3D vl) { > + vl =3D __riscv_vsetvl_e8m1(k); > + > + vuint8m1_t a8s =3D __riscv_vle8_v_u8m1(a, vl); > + vuint8m1_t b8s =3D __riscv_vle8_v_u8m1(b, vl); > + vuint32m4_t a8s_extended =3D __riscv_vzext_vf4_u32m4(a8s, vl); > + vuint32m4_t b8s_extended =3D __riscv_vzext_vf4_u32m4(a8s, vl); > + > + vint32m4_t a8s_as_i32 =3D __riscv_vreinterpret_v_u32m4_i32m4(a8s_e= xtended); > + vint32m4_t b8s_as_i32 =3D __riscv_vreinterpret_v_u32m4_i32m4(b8s_e= xtended); > + > + vec_s =3D __riscv_vmacc_vv_i32m4_tu(vec_s, a8s_as_i32, b8s_as_i32,= vl); > + } > + > + vint32m1_t vec_sum =3D __riscv_vredsum_vs_i32m4_i32m1(vec_s, vec_zero,= __riscv_vsetvl_e32m4(len)); > + int sum =3D __riscv_vmv_x_s_i32m1_i32(vec_sum); > + > + return sum; > +} > + > +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e32= ,\s*m1,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-O1" no-opts = "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ > +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*zero,\s*e32= ,\s*m4,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-O1" no-opts = "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ > +/* { dg-final { scan-assembler-times {vsetvli\s+zero,\s*[a-x0-9]+,\s*e32= ,\s*m4,\s*t[au],\s*m[au]} 1 { target { no-opts "-O0" no-opts "-O1" no-opts = "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ > +/* { dg-final { scan-assembler-times {vsetvli\s+[a-x0-9]+,\s*[a-x0-9]+,\= s*e32,\s*m4,\s*tu,\s*m[au]} 1 { target { no-opts "-O0" no-opts "-O1" no-opt= s "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-loops" } } } } */ > +/* { dg-final { scan-assembler-times {vsetvli} 4 { target { no-opts "-O0= " no-opts "-O1" no-opts "-Os" no-opts "-Oz" no-opts "-g" no-opts "-funroll-= loops" } } } } */ > -- > 2.36.3 >