From: Kito Cheng
Date: Thu, 20 Jul 2023 16:59:13 +0800
Subject: Re: [PATCH V2] RISC-V: Support in-order floating-point reduction
To: Juzhe-Zhong
Cc: gcc-patches@gcc.gnu.org, kito.cheng@gmail.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com
References: <20230720085103.159227-1-juzhe.zhong@rivai.ai>
In-Reply-To: <20230720085103.159227-1-juzhe.zhong@rivai.ai>

LGTM, but I would like to make sure Robin is OK with it too.

On Thu, Jul 20, 2023 at 4:51 PM Juzhe-Zhong wrote:
>
> This patch depends on:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624995.html
>
> Consider the following case:
> float foo (float *__restrict a, int n)
> {
>   float result = 1.0;
>   for (int i = 0; i < n; i++)
>     result += a[i];
>   return result;
> }
>
> Compile with **NO** -ffast-math:
>
> Before this patch:
> :4:21: missed: couldn't vectorize loop
> :1:7: missed: not vectorized: relevant phi not supported: result_14 = PHI
>
> After this patch:
> foo:
>         lui     a5,%hi(.LC0)
>         flw     fa0,%lo(.LC0)(a5)
>         ble     a1,zero,.L4
> .L3:
>         vsetvli a5,a1,e32,m1,ta,ma
>         vle32.v v1,0(a0)
>         slli    a4,a5,2
>         sub     a1,a1,a5
>         vfmv.s.f        v2,fa0
>         add     a0,a0,a4
>         vfredosum.vs    v1,v1,v2  ----------> FOLD_LEFT_PLUS
>         vfmv.f.s        fa0,v1
>         bne     a1,zero,.L3
>         ret
> .L4:
>         ret
>
> gcc/ChangeLog:
>
>         * config/riscv/autovec.md (fold_left_plus_): New pattern.
>         (mask_len_fold_left_plus_): Ditto.
>         * config/riscv/riscv-protos.h (enum insn_type): New enum.
>         (enum reduction_type): Ditto.
>         (expand_reduction): Add in-order reduction.
>         * config/riscv/riscv-v.cc (emit_nonvlmax_fp_reduction_insn): New function.
>         (expand_reduction): Add in-order reduction.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c: New test.
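
For readers wondering why the loop could not be vectorized without -ffast-math before: floating-point addition is not associative, so a reduction that regroups the additions can return a different value than the scalar loop. The sketch below is only an illustration of that point and is not taken from the patch; the function names sum_in_order and sum_two_lanes are made up here, and the volatile accumulators are an assumption to force rounding to float after every addition on targets that would otherwise keep excess precision.

```c
#include <stddef.h>

/* Strict left-to-right sum: the order the scalar loop specifies, and the
   order an ordered reduction such as vfredosum.vs has to preserve.  */
float
sum_in_order (const float *a, size_t n, float init)
{
  volatile float acc = init;   /* volatile: round to float after each add */
  for (size_t i = 0; i < n; i++)
    acc = acc + a[i];
  return acc;
}

/* Reassociated sum with two partial accumulators (even and odd elements).
   This kind of regrouping needs -ffast-math (or -fassociative-math).  */
float
sum_two_lanes (const float *a, size_t n, float init)
{
  volatile float lane0 = 0.0f;
  volatile float lane1 = 0.0f;
  for (size_t i = 0; i < n; i++)
    {
      if (i % 2 == 0)
        lane0 = lane0 + a[i];
      else
        lane1 = lane1 + a[i];
    }
  volatile float acc = init;
  acc = acc + lane0;
  acc = acc + lane1;
  return acc;
}
```

With IEEE 754 binary32 floats, the input {1e8f, 1.0f, -1e8f, 1.0f} and an initial value of 0, sum_in_order returns 1.0f while sum_two_lanes returns 2.0f, because 1e8f + 1.0f rounds back to 1e8f in the first case but the two 1.0f values are added to each other in the second.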
>
> ---
>  gcc/config/riscv/autovec.md                  | 39 ++++++++++++++
>  gcc/config/riscv/riscv-protos.h              | 13 ++++-
>  gcc/config/riscv/riscv-v.cc                  | 53 +++++++++++++++----
>  .../riscv/rvv/autovec/reduc/reduc_strict-1.c | 28 ++++++++++
>  .../riscv/rvv/autovec/reduc/reduc_strict-2.c | 26 +++++++++
>  .../riscv/rvv/autovec/reduc/reduc_strict-3.c | 18 +++++++
>  .../riscv/rvv/autovec/reduc/reduc_strict-4.c | 24 +++++++++
>  .../riscv/rvv/autovec/reduc/reduc_strict-5.c | 28 ++++++++++
>  .../riscv/rvv/autovec/reduc/reduc_strict-6.c | 18 +++++++
>  .../riscv/rvv/autovec/reduc/reduc_strict-7.c | 21 ++++++++
>  .../rvv/autovec/reduc/reduc_strict_run-1.c   | 29 ++++++++++
>  .../rvv/autovec/reduc/reduc_strict_run-2.c   | 31 +++++++++++
>  12 files changed, 317 insertions(+), 11 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 00947207f3f..667a877d009 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -1687,3 +1687,42 @@
>    riscv_vector::expand_reduction (SMIN, operands, f);
>    DONE;
>  })
> +
> +;; -------------------------------------------------------------------------
> +;; ---- [FP] Left-to-right reductions
> +;; -------------------------------------------------------------------------
> +;; Includes:
> +;; - vfredosum.vs
> +;; -------------------------------------------------------------------------
> +
> +;; Unpredicated in-order FP reductions.
> +(define_expand "fold_left_plus_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand: 1 "register_operand")
> +   (match_operand:VF 2 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::expand_reduction (PLUS, operands,
> +                                  operands[1],
> +                                  riscv_vector::reduction_type::FOLD_LEFT);
> +  DONE;
> +})
> +
> +;; Predicated in-order FP reductions.
> +(define_expand "mask_len_fold_left_plus_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand: 1 "register_operand")
> +   (match_operand:VF 2 "register_operand")
> +   (match_operand: 3 "vector_mask_operand")
> +   (match_operand 4 "autovec_length_operand")
> +   (match_operand 5 "const_0_operand")]
> +  "TARGET_VECTOR"
> +{
> +  if (rtx_equal_p (operands[4], const0_rtx))
> +    emit_move_insn (operands[0], operands[1]);
> +  else
> +    riscv_vector::expand_reduction (PLUS, operands,
> +                                    operands[1],
> +                                    riscv_vector::reduction_type::MASK_LEN_FOLD_LEFT);
> +  DONE;
> +})
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 16fb8dabca0..c9520f689e2 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -199,6 +199,7 @@ enum insn_type
>    RVV_GATHER_M_OP = 5,
>    RVV_SCATTER_M_OP = 4,
>    RVV_REDUCTION_OP = 3,
> +  RVV_REDUCTION_TU_OP = RVV_REDUCTION_OP + 2,
>  };
>  enum vlmul_type
>  {
> @@ -247,7 +248,7 @@ void emit_vlmax_merge_insn (unsigned, int, rtx *);
>  void emit_vlmax_cmp_insn (unsigned, rtx *);
>  void emit_vlmax_cmp_mu_insn (unsigned, rtx *);
>  void emit_vlmax_masked_mu_insn (unsigned, int, rtx *);
> -void emit_scalar_move_insn (unsigned, rtx *);
> +void emit_scalar_move_insn (unsigned, rtx *, rtx = 0);
>  void emit_nonvlmax_integer_move_insn (unsigned, rtx *, rtx);
>  enum vlmul_type get_vlmul (machine_mode);
>  unsigned int get_ratio (machine_mode);
> @@ -270,6 +271,13 @@ enum mask_policy
>    MASK_AGNOSTIC = 1,
>    MASK_ANY = 2,
>  };
> +
> +enum class reduction_type
> +{
> +  UNORDERED,
> +  FOLD_LEFT,
> +  MASK_LEN_FOLD_LEFT,
> +};
>  enum tail_policy get_prefer_tail_policy ();
>  enum mask_policy get_prefer_mask_policy ();
>  rtx get_avl_type_rtx (enum avl_type);
> @@ -282,7 +290,8 @@ bool has_vi_variant_p (rtx_code, rtx);
>  void expand_vec_cmp (rtx, rtx_code, rtx, rtx);
>  bool expand_vec_cmp_float (rtx, rtx_code, rtx, rtx, bool);
>  void expand_cond_len_binop (rtx_code, rtx *);
> -void expand_reduction (rtx_code, rtx *, rtx);
> +void expand_reduction (rtx_code, rtx *, rtx,
> +                       reduction_type = reduction_type::UNORDERED);
>  #endif
>  bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode,
>                            bool, void (*)(rtx *, rtx));
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index 53088edf909..e338be151d3 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -1023,11 +1023,11 @@ emit_nonvlmax_fp_tu_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
>  /* Emit vmv.s.x instruction. */
>
>  void
> -emit_scalar_move_insn (unsigned icode, rtx *ops)
> +emit_scalar_move_insn (unsigned icode, rtx *ops, rtx len)
>  {
>    machine_mode dest_mode = GET_MODE (ops[0]);
>    machine_mode mask_mode = get_mask_mode (dest_mode).require ();
> -  insn_expander e (riscv_vector::RVV_SCALAR_MOV_OP,
> +  insn_expander e (RVV_SCALAR_MOV_OP,
>                     /* HAS_DEST_P */ true,
>                     /* FULLY_UNMASKED_P */ false,
>                     /* USE_REAL_MERGE_P */ true,
> @@ -1038,7 +1038,7 @@ emit_scalar_move_insn (unsigned icode, rtx *ops)
>
>    e.set_policy (TAIL_ANY);
>    e.set_policy (MASK_ANY);
> -  e.set_vl (CONST1_RTX (Pmode));
> +  e.set_vl (len ? len : CONST1_RTX (Pmode));
>    e.emit_insn ((enum insn_code) icode, ops);
>  }
>
> @@ -1196,6 +1196,26 @@ emit_vlmax_fp_reduction_insn (unsigned icode, int op_num, rtx *ops)
>    e.emit_insn ((enum insn_code) icode, ops);
>  }
>
> +/* Emit reduction instruction. */
> +static void
> +emit_nonvlmax_fp_reduction_insn (unsigned icode, int op_num, rtx *ops, rtx vl)
> +{
> +  machine_mode dest_mode = GET_MODE (ops[0]);
> +  machine_mode mask_mode = get_mask_mode (GET_MODE (ops[1])).require ();
> +  insn_expander e (op_num,
> +                   /* HAS_DEST_P */ true,
> +                   /* FULLY_UNMASKED_P */ false,
> +                   /* USE_REAL_MERGE_P */ true,
> +                   /* HAS_AVL_P */ true,
> +                   /* VLMAX_P */ false, dest_mode,
> +                   mask_mode);
> +
> +  e.set_policy (TAIL_ANY);
> +  e.set_rounding_mode (FRM_DYN);
> +  e.set_vl (vl);
> +  e.emit_insn ((enum insn_code) icode, ops);
> +}
> +
>  /* Emit merge instruction. */
>
>  static machine_mode
> @@ -3343,9 +3363,10 @@ expand_cond_len_ternop (unsigned icode, rtx *ops)
>
>  /* Expand reduction operations. */
>  void
> -expand_reduction (rtx_code code, rtx *ops, rtx init)
> +expand_reduction (rtx_code code, rtx *ops, rtx init, reduction_type type)
>  {
> -  machine_mode vmode = GET_MODE (ops[1]);
> +  rtx vector = type == reduction_type::UNORDERED ? ops[1] : ops[2];
> +  machine_mode vmode = GET_MODE (vector);
>    machine_mode m1_mode = get_m1_mode (vmode).require ();
>    machine_mode m1_mmode = get_mask_mode (m1_mode).require ();
>
> @@ -3353,16 +3374,30 @@ expand_reduction (rtx_code code, rtx *ops, rtx init)
>    rtx m1_mask = gen_scalar_move_mask (m1_mmode);
>    rtx m1_undef = RVV_VUNDEF (m1_mode);
>    rtx scalar_move_ops[] = {m1_tmp, m1_mask, m1_undef, init};
> -  emit_scalar_move_insn (code_for_pred_broadcast (m1_mode), scalar_move_ops);
> +  rtx len = type == reduction_type::MASK_LEN_FOLD_LEFT ? ops[4] : NULL_RTX;
> +  emit_scalar_move_insn (code_for_pred_broadcast (m1_mode), scalar_move_ops,
> +                         len);
>
>    rtx m1_tmp2 = gen_reg_rtx (m1_mode);
> -  rtx reduc_ops[] = {m1_tmp2, ops[1], m1_tmp};
> +  rtx reduc_ops[] = {m1_tmp2, vector, m1_tmp};
>
>    if (FLOAT_MODE_P (vmode) && code == PLUS)
>      {
>        insn_code icode
> -        = code_for_pred_reduc_plus (UNSPEC_UNORDERED, vmode, m1_mode);
> -      emit_vlmax_fp_reduction_insn (icode, RVV_REDUCTION_OP, reduc_ops);
> +        = code_for_pred_reduc_plus (type == reduction_type::UNORDERED
> +                                      ? UNSPEC_UNORDERED
> +                                      : UNSPEC_ORDERED,
> +                                    vmode, m1_mode);
> +      if (type == reduction_type::MASK_LEN_FOLD_LEFT)
> +        {
> +          rtx mask = ops[3];
> +          rtx mask_len_reduc_ops[]
> +            = {m1_tmp2, mask, RVV_VUNDEF (m1_mode), vector, m1_tmp};
> +          emit_nonvlmax_fp_reduction_insn (icode, RVV_REDUCTION_TU_OP,
> +                                           mask_len_reduc_ops, len);
> +        }
> +      else
> +        emit_vlmax_fp_reduction_insn (icode, RVV_REDUCTION_OP, reduc_ops);
>      }
>    else
>      {
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c
> new file mode 100644
> index 00000000000..c293e9ae746
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-1.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include
> +
> +#define NUM_ELEMS(TYPE) ((int)(5 * (256 / sizeof (TYPE)) + 3))
> +
> +#define DEF_REDUC_PLUS(TYPE) \
> +  TYPE __attribute__ ((noinline, noclone)) \
> +  reduc_plus_##TYPE (TYPE *a, TYPE *b) \
> +  { \
> +    TYPE r = 0, q = 3; \
> +    for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
> +      { \
> +        r += a[i]; \
> +        q -= b[i]; \
> +      } \
> +    return r * q; \
> +  }
> +
> +#define TEST_ALL(T) \
> +  T (_Float16) \
> +  T (float) \
> +  T (double)
> +
> +TEST_ALL (DEF_REDUC_PLUS)
> +
> +/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 6 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c
> new file mode 100644
> index 00000000000..2e1e7ab674d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-2.c
> @@ -0,0 +1,26 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#define NUM_ELEMS(TYPE) ((int) (5 * (256 / sizeof (TYPE)) + 3))
> +
> +#define DEF_REDUC_PLUS(TYPE) \
> +void __attribute__ ((noinline, noclone)) \
> +reduc_plus_##TYPE (TYPE (*restrict a)[NUM_ELEMS (TYPE)], \
> +                   TYPE *restrict r, int n) \
> +{ \
> +  for (int i = 0; i < n; i++) \
> +    { \
> +      r[i] = 0; \
> +      for (int j = 0; j < NUM_ELEMS (TYPE); j++) \
> +        r[i] += a[i][j]; \
> +    } \
> +}
> +
> +#define TEST_ALL(T) \
> +  T (_Float16) \
> +  T (float) \
> +  T (double)
> +
> +TEST_ALL (DEF_REDUC_PLUS)
> +
> +/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c
> new file mode 100644
> index 00000000000..f559d40e60f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-3.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +double mat[100][2];
> +
> +double
> +slp_reduc_plus (int n)
> +{
> +  double tmp = 0.0;
> +  for (int i = 0; i < n; i++)
> +    {
> +      tmp = tmp + mat[i][0];
> +      tmp = tmp + mat[i][1];
> +    }
> +  return tmp;
> +}
> +
> +/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c
> new file mode 100644
> index 00000000000..428d371d9cf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-4.c
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +double mat[100][8];
> +
> +double
> +slp_reduc_plus (int n)
> +{
> +  double tmp = 0.0;
> +  for (int i = 0; i < n; i++)
> +    {
> +      tmp = tmp + mat[i][0];
> +      tmp = tmp + mat[i][1];
> +      tmp = tmp + mat[i][2];
> +      tmp = tmp + mat[i][3];
> +      tmp = tmp + mat[i][4];
> +      tmp = tmp + mat[i][5];
> +      tmp = tmp + mat[i][6];
> +      tmp = tmp + mat[i][7];
> +    }
> +  return tmp;
> +}
> +
> +/* { dg-final { scan-assembler {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c
> new file mode 100644
> index 00000000000..24add2291f1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-5.c
> @@ -0,0 +1,28 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +double mat[100][12];
> +
> +double
> +slp_reduc_plus (int n)
> +{
> +  double tmp = 0.0;
> +  for (int i = 0; i < n; i++)
> +    {
> +      tmp = tmp + mat[i][0];
> +      tmp = tmp + mat[i][1];
> +      tmp = tmp + mat[i][2];
> +      tmp = tmp + mat[i][3];
> +      tmp = tmp + mat[i][4];
> +      tmp = tmp + mat[i][5];
> +      tmp = tmp + mat[i][6];
> +      tmp = tmp + mat[i][7];
> +      tmp = tmp + mat[i][8];
> +      tmp = tmp + mat[i][9];
> +      tmp = tmp + mat[i][10];
> +      tmp = tmp + mat[i][11];
> +    }
> +  return tmp;
> +}
> +
> +/* { dg-final { scan-assembler {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c
> new file mode 100644
> index 00000000000..c1567b067ba
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-6.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model -fdump-tree-vect-details" } */
> +
> +float
> +double_reduc (float (*i)[16])
> +{
> +  float l = 0;
> +
> +#pragma GCC unroll 0
> +  for (int a = 0; a < 8; a++)
> +    for (int b = 0; b < 100; b++)
> +      l += i[b][a];
> +  return l;
> +}
> +
> +/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-tree-dump "Detected double reduction" "vect" } } */
> +/* { dg-final { scan-tree-dump-not "OUTER LOOP VECTORIZED" "vect" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c
> new file mode 100644
> index 00000000000..f742a824bb2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict-7.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -fno-vect-cost-model -fdump-tree-vect-details" } */
> +
> +float
> +double_reduc (float *i, float *j)
> +{
> +  float k = 0, l = 0;
> +
> +  for (int a = 0; a < 8; a++)
> +    for (int b = 0; b < 100; b++)
> +      {
> +        k += i[b];
> +        l += j[b];
> +      }
> +  return l * k;
> +}
> +
> +/* { dg-final { scan-assembler-times {vle32\.v} 2 } } */
> +/* { dg-final { scan-assembler-times {vfredosum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 2 } } */
> +/* { dg-final { scan-tree-dump "Detected double reduction" "vect" } } */
> +/* { dg-final { scan-tree-dump-not "OUTER LOOP VECTORIZED" "vect" } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c
> new file mode 100644
> index 00000000000..516be97e9eb
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-1.c
> @@ -0,0 +1,29 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "--param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include "reduc_strict-1.c"
> +
> +#define TEST_REDUC_PLUS(TYPE) \
> +  { \
> +    TYPE a[NUM_ELEMS (TYPE)]; \
> +    TYPE b[NUM_ELEMS (TYPE)]; \
> +    TYPE r = 0, q = 3; \
> +    for (int i = 0; i < NUM_ELEMS (TYPE); i++) \
> +      { \
> +        a[i] = (i * 0.1) * (i & 1 ? 1 : -1); \
> +        b[i] = (i * 0.3) * (i & 1 ? 1 : -1); \
> +        r += a[i]; \
> +        q -= b[i]; \
> +        asm volatile ("" ::: "memory"); \
> +      } \
> +    TYPE res = reduc_plus_##TYPE (a, b); \
> +    if (res != r * q) \
> +      __builtin_abort (); \
> +  }
> +
> +int __attribute__ ((optimize (1)))
> +main ()
> +{
> +  TEST_ALL (TEST_REDUC_PLUS);
> +  return 0;
> +}
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c
> new file mode 100644
> index 00000000000..0a4238d96f3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_strict_run-2.c
> @@ -0,0 +1,31 @@
> +/* { dg-do run { target { riscv_vector } } } */
> +/* { dg-additional-options "--param=riscv-autovec-preference=scalable -fno-vect-cost-model" } */
> +
> +#include "reduc_strict-2.c"
> +
> +#define NROWS 5
> +
> +#define TEST_REDUC_PLUS(TYPE) \
> +  { \
> +    TYPE a[NROWS][NUM_ELEMS (TYPE)]; \
> +    TYPE r[NROWS]; \
> +    TYPE expected[NROWS] = {}; \
> +    for (int i = 0; i < NROWS; ++i) \
> +      for (int j = 0; j < NUM_ELEMS (TYPE); ++j) \
> +        { \
> +          a[i][j] = (i * 0.1 + j * 0.6) * (j & 1 ? 1 : -1); \
> +          expected[i] += a[i][j]; \
> +          asm volatile ("" ::: "memory"); \
> +        } \
> +    reduc_plus_##TYPE (a, r, NROWS); \
> +    for (int i = 0; i < NROWS; ++i) \
> +      if (r[i] != expected[i]) \
> +        __builtin_abort (); \
> +  }
> +
> +int __attribute__ ((optimize (1)))
> +main ()
> +{
> +  TEST_ALL (TEST_REDUC_PLUS);
> +  return 0;
> +}
> --
> 2.36.1
>
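
For reference, here is a scalar model of what the two new expanders compute, as I read the patterns: operand 1 is the initial scalar, operand 2 the vector, operand 3 the mask and operand 4 the length, with the bias (operand 5) required to be zero by const_0_operand. This is a hedged sketch only. The names fold_left_plus_model and strip_mined_sum are hypothetical and do not exist in GCC, and taking min (n, vlmax) as the chunk size is a simplification of what vsetvli may return.

```c
#include <stdbool.h>
#include <stddef.h>

/* Scalar model of an in-order reduction over the first LEN elements of
   VEC, skipping elements whose MASK entry is false (MASK == NULL means
   all elements are active).  LEN == 0 returns INIT unchanged, which
   mirrors the emit_move_insn shortcut in the predicated expander.  */
float
fold_left_plus_model (float init, const float *vec, const bool *mask,
                      size_t len)
{
  float acc = init;
  for (size_t i = 0; i < len; i++)
    if (mask == NULL || mask[i])
      acc += vec[i];
  return acc;
}

/* Strip-mined driver shaped like the .L3 loop in the commit message:
   each iteration reduces at most VLMAX elements and carries the scalar
   accumulator into the next iteration.  Assumes VLMAX > 0.  */
float
strip_mined_sum (const float *a, size_t n, float init, size_t vlmax)
{
  float acc = init;
  while (n > 0)
    {
      size_t vl = n < vlmax ? n : vlmax;   /* simplified vsetvli result */
      acc = fold_left_plus_model (acc, a, NULL, vl);
      a += vl;
      n -= vl;
    }
  return acc;
}
```

Because every chunk continues from the previous accumulator and adds its elements left to right, the sequence of additions is the same as in the scalar loop regardless of the chunk size, which is the property the reduc_strict_run tests check with an exact comparison.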