From: Richard Biener
Date: Thu, 9 Nov 2023 13:16:20 +0100
Subject: Re: [PATCH] Middle-end: Fix bug of induction variable vectorization for RVV
To: Juzhe-Zhong
Cc: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com, rguenther@suse.de, kito.cheng@gmail.com, kito.cheng@sifive.com
In-Reply-To: <20231108105317.1786716-1-juzhe.zhong@rivai.ai>

On Wed, Nov 8, 2023 at 11:53 AM Juzhe-Zhong wrote:
>
> PR: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112438
>
> SELECT_VL result is not necessary always VF in non-final iteration.
>
> Current GIMPLE IR is wrong:
>
> # vect_vec_iv_.21_25 = PHI <_24(4), { 0, 1, 2, ... }(3)>
> ...
> _24 = vect_vec_iv_.21_25 + { POLY_INT_CST [4, 4], ... };
>
> After this patch which is correct for SELECT_VL:
>
> # vect_vec_iv_.8_22 = PHI <_21(4), { 0, 1, 2, ... }(3)>
> ...
> _35 = .SELECT_VL (ivtmp_33, POLY_INT_CST [4, 4]);
> _21 = vect_vec_iv_.8_22 + { POLY_INT_CST [4, 4], ... };
>
> kito, could you give more explanation ?
>
>         PR middle/112438
>
> gcc/ChangeLog:
>
>         * tree-vect-loop.cc (vectorizable_induction): Fix bug.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/autovec/pr112438.c: New test.
>
> ---
>  .../gcc.target/riscv/rvv/autovec/pr112438.c   | 35 +++++++++++++++++
>  gcc/tree-vect-loop.cc                         | 39 +++++++++++++++----
>  2 files changed, 67 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
>
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
> new file mode 100644
> index 00000000000..b326d56a52c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112438.c
> @@ -0,0 +1,35 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
> +
> +void
> +foo (int n, int *__restrict in, int *__restrict out)
> +{
> +  for (int i = 0; i < n; i += 1)
> +    {
> +      out[i] = in[i] + i;
> +    }
> +}
> +
> +void
> +foo2 (int n, float * __restrict in,
> +float * __restrict out)
> +{
> +  for (int i = 0; i < n; i += 1)
> +    {
> +      out[i] = in[i] + i;
> +    }
> +}
> +
> +void
> +foo3 (int n, float * __restrict in,
> +float * __restrict out, float x)
> +{
> +  for (int i = 0; i < n; i += 1)
> +    {
> +      out[i] = in[i] + i* i;
> +    }
> +}
> +
> +/* We don't want to see vect_vec_iv_.21_25 + { POLY_INT_CST [4, 4], ... }. */
> +/* { dg-final { scan-tree-dump-not "\\+ \{ POLY_INT_CST" "optimized" } } */
> +
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index a544bc9b059..3e103946168 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -10309,10 +10309,30 @@ vectorizable_induction (loop_vec_info loop_vinfo,
>         new_name = step_expr;
>        else
>         {
> +         gimple_seq seq = NULL;
> +         if (LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo))
> +           {
> +             /* When we're using loop_len produced by SELEC_VL, the non-final
> +                iterations are not always processing VF elements. So vectorize
> +                induction variable instead of
> +
> +                  _21 = vect_vec_iv_.6_22 + { VF, ... };
> +
> +                We should generate:
> +
> +                  _35 = .SELECT_VL (ivtmp_33, VF);
> +                  vect_cst__22 = [vec_duplicate_expr] _35;
> +                  _21 = vect_vec_iv_.6_22 + vect_cst__22; */
> +             vec_loop_lens *lens = &LOOP_VINFO_LENS (loop_vinfo);
> +             tree len
> +               = vect_get_loop_len (loop_vinfo, NULL, lens, 1, vectype, 0, 0);
> +             expr = force_gimple_operand (fold_convert (TREE_TYPE (step_expr),
> +                                                        unshare_expr (len)),
> +                                          &seq, true, NULL_TREE);
> +           }

I think it would be better to split out building a tree from VF from both arms
and avoid using 'vf' when LOOP_VINFO_USING_SELECT_VL_P.

Btw, you are not patching the SLP path here which I believe has the same
problem but is currently exempt from non-constant VF at least.

Richard.

>           /* iv_loop is the loop to be vectorized. Generate:
>               vec_step = [VF*S, VF*S, VF*S, VF*S] */
> -         gimple_seq seq = NULL;
> -         if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (step_expr)))
> +         else if (SCALAR_FLOAT_TYPE_P (TREE_TYPE (step_expr)))
>             {
>               expr = build_int_cst (integer_type_node, vf);
>               expr = gimple_build (&seq, FLOAT_EXPR, TREE_TYPE (step_expr), expr);
> @@ -10323,8 +10343,13 @@ vectorizable_induction (loop_vec_info loop_vinfo,
>                                    expr, step_expr);
>           if (seq)
>             {
> -             new_bb = gsi_insert_seq_on_edge_immediate (pe, seq);
> -             gcc_assert (!new_bb);
> +             if (LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo))
> +               gsi_insert_seq_before (&si, seq, GSI_SAME_STMT);
> +             else
> +               {
> +                 new_bb = gsi_insert_seq_on_edge_immediate (pe, seq);
> +                 gcc_assert (!new_bb);
> +               }
>             }
>         }
>
> @@ -10332,9 +10357,9 @@ vectorizable_induction (loop_vec_info loop_vinfo,
>        gcc_assert (CONSTANT_CLASS_P (new_name)
>                   || TREE_CODE (new_name) == SSA_NAME);
>        new_vec = build_vector_from_val (step_vectype, t);
> -      vec_step = vect_init_vector (loop_vinfo, stmt_info,
> -                                  new_vec, step_vectype, NULL);
> -
> +      vec_step
> +       = vect_init_vector (loop_vinfo, stmt_info, new_vec, step_vectype,
> +                           LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo) ? &si : NULL);
>
>        /* Create the following def-use cycle:
>           loop prolog:
> --
> 2.36.3
>
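
For readers following the thread, the problem in the quoted description (the SELECT_VL result is not always VF in a non-final iteration) can be illustrated with a small scalar model of the foo loop from the test case. This is only a sketch under stated assumptions, not GCC code: select_vl, run_model and the fixed VF of 4 are hypothetical names and values chosen for illustration, and select_vl models just one length choice that a target might make (roughly half of the remaining elements when fewer than 2*VF remain).

```c
#include <assert.h>

#define VF 4 /* hypothetical fixed vectorization factor, for illustration */

/* Hypothetical model of .SELECT_VL: returns how many elements the
   current vector iteration processes.  When between VF and 2*VF
   elements remain, this model splits them roughly evenly, so a
   non-final iteration can process fewer than VF elements.  */
static int
select_vl (int remaining, int vf)
{
  if (remaining <= vf)
    return remaining;
  if (remaining < 2 * vf)
    return (remaining + 1) / 2;
  return vf;
}

/* Model of the vectorized loop for out[i] = in[i] + i.
   If step_by_len is zero, the induction vector advances by VF each
   iteration (the behavior the patch description calls wrong);
   otherwise it advances by the length SELECT_VL returned.
   Returns the number of elements that differ from the scalar loop.  */
static int
run_model (int n, const int *in, int *out, int step_by_len)
{
  int iv[VF];
  for (int k = 0; k < VF; k++)
    iv[k] = k; /* initial vector { 0, 1, 2, ... } */

  int done = 0;
  while (done < n)
    {
      int len = select_vl (n - done, VF);
      assert (len > 0 && len <= VF);

      /* Only the first len lanes are active in this iteration.  */
      for (int k = 0; k < len; k++)
        out[done + k] = in[done + k] + iv[k];

      int step = step_by_len ? len : VF;
      for (int k = 0; k < VF; k++)
        iv[k] += step;

      done += len;
    }

  int mismatches = 0;
  for (int i = 0; i < n; i++)
    if (out[i] != in[i] + i)
      mismatches++;
  return mismatches;
}
```

With n = 6, this model processes 3 elements in each of two iterations. Stepping the induction vector by VF makes the second iteration use { 4, 5, 6, 7 } for elements 3 to 5, so run_model (6, in, out, 0) reports 3 mismatches, while stepping by the selected length gives { 3, 4, 5, 6 } and run_model (6, in, out, 1) reports none.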