From: Kito Cheng
Date: Mon, 17 Jul 2023 17:33:20 +0800
Subject: Re: [PATCH V2] RISC-V: Support non-SLP unordered reduction
To: Juzhe-Zhong
Cc: gcc-patches@gcc.gnu.org, kito.cheng@gmail.com, palmer@dabbelt.com, palmer@rivosinc.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com
References: <20230717081946.187709-1-juzhe.zhong@rivai.ai>
In-Reply-To: <20230717081946.187709-1-juzhe.zhong@rivai.ai>

LGTM, thanks :)

On Mon, Jul 17, 2023 at 4:20 PM Juzhe-Zhong wrote:
>
> This patch adds reduc_*_scal patterns to support reduction auto-vectorization.
>
> Use COND_LEN_* + reduc_*_scal to support unordered non-SLP auto-vectorization.
>
> Consider the following case:
> int __attribute__((noipa))
> and_loop (int32_t * __restrict x,
>           int32_t n, int res)
> {
>   for (int i = 0; i < n; ++i)
>     res &= x[i];
>   return res;
> }
>
> ASM:
> and_loop:
>         ble a1,zero,.L4
>         vsetvli a3,zero,e32,m1,ta,ma
>         vmv.v.i v1,-1
> .L3:
>         vsetvli a5,a1,e32,m1,tu,ma ------------> MUST BE "TU".
>         slli a4,a5,2
>         sub a1,a1,a5
>         vle32.v v2,0(a0)
>         add a0,a0,a4
>         vand.vv v1,v2,v1
>         bne a1,zero,.L3
>         vsetivli zero,1,e32,m1,ta,ma
>         vmv.v.i v2,-1
>         vsetvli a3,zero,e32,m1,ta,ma
>         vredand.vs v1,v1,v2
>         vmv.x.s a5,v1
>         and a0,a2,a5
>         ret
> .L4:
>         mv a0,a2
>         ret
>
> Fix a bug in the VSETVL pass that is exposed by the reduction testcases.
>
> SLP reduction and floating-point in-order reduction are not supported yet.
>
> gcc/ChangeLog:
>
>         * config/riscv/autovec.md (reduc_plus_scal_): New pattern.
>         (reduc_smax_scal_): Ditto.
>         (reduc_umax_scal_): Ditto.
>         (reduc_smin_scal_): Ditto.
>         (reduc_umin_scal_): Ditto.
>         (reduc_and_scal_): Ditto.
>         (reduc_ior_scal_): Ditto.
>         (reduc_xor_scal_): Ditto.
>         * config/riscv/riscv-protos.h (enum insn_type): Add reduction.
>         (expand_reduction): New function.
>         * config/riscv/riscv-v.cc (emit_vlmax_reduction_insn): Ditto.
>         (emit_vlmax_fp_reduction_insn): Ditto.
>         (get_m1_mode): Ditto.
>         (expand_cond_len_binop): Fix name.
>         (expand_reduction): New function.
>         * config/riscv/riscv-vsetvl.cc (gen_vsetvl_pat): Fix VSETVL BUG.
>         (validate_change_or_fail): New function.
>         (change_insn): Fix VSETVL BUG.
>         (change_vsetvl_insn): Ditto.
>         (pass_vsetvl::backward_demand_fusion): Ditto.
>         (pass_vsetvl::df_post_optimization): Ditto.
>
> gcc/testsuite/ChangeLog:
>
>         * gcc.target/riscv/rvv/rvv.exp: Add reduction tests.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc-1.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc-2.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc-3.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc-4.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_run-1.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_run-3.c: New test.
>         * gcc.target/riscv/rvv/autovec/reduc/reduc_run-4.c: New test.
>
> ---
>  gcc/config/riscv/autovec.md                   | 138 ++++++++++++++++++
>  gcc/config/riscv/riscv-protos.h               |   2 +
>  gcc/config/riscv/riscv-v.cc                   |  84 ++++++++++-
>  gcc/config/riscv/riscv-vsetvl.cc              |  57 ++++++--
>  .../riscv/rvv/autovec/reduc/reduc-1.c         | 118 +++++++++++++++
>  .../riscv/rvv/autovec/reduc/reduc-2.c         | 129 ++++++++++++++++
>  .../riscv/rvv/autovec/reduc/reduc-3.c         |  65 +++++++++
>  .../riscv/rvv/autovec/reduc/reduc-4.c         |  59 ++++++++
>  .../riscv/rvv/autovec/reduc/reduc_run-1.c     |  56 +++++++
>  .../riscv/rvv/autovec/reduc/reduc_run-2.c     |  79 ++++++++++
>  .../riscv/rvv/autovec/reduc/reduc_run-3.c     |  49 +++++++
>  .../riscv/rvv/autovec/reduc/reduc_run-4.c     |  66 +++++++++
>  gcc/testsuite/gcc.target/riscv/rvv/rvv.exp    |   2 +
>  13 files changed, 887 insertions(+), 17 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-4.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-3.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-4.c
>
> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
> index 64a41bd7101..8cdec75bacf 100644
> --- a/gcc/config/riscv/autovec.md
> +++ b/gcc/config/riscv/autovec.md
> @@ -1554,3 +1554,141 @@
>    riscv_vector::expand_cond_len_ternop (icode, operands);
>    DONE;
>  })
> +
> +;; =========================================================================
> +;; == Reductions
> +;; =========================================================================
> +
> +;; -------------------------------------------------------------------------
> +;; ---- [INT] Tree reductions
> +;; -------------------------------------------------------------------------
> +;; Includes:
> +;; - vredsum.vs
> +;; - vredmaxu.vs
> +;; - vredmax.vs
> +;; - vredminu.vs
> +;; - vredmin.vs
> +;; - vredand.vs
> +;; - vredor.vs
> +;; - vredxor.vs
> +;; -------------------------------------------------------------------------
> +
> +(define_expand "reduc_plus_scal_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand:VI 1 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::expand_reduction (PLUS, operands, CONST0_RTX (mode));
> +  DONE;
> +})
> +
> +(define_expand "reduc_smax_scal_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand:VI 1 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  int prec = GET_MODE_PRECISION (mode);
> +  rtx min = immed_wide_int_const (wi::min_value (prec, SIGNED), mode);
> +  riscv_vector::expand_reduction (SMAX, operands, min);
> +  DONE;
> +})
> +
> +(define_expand "reduc_umax_scal_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand:VI 1 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::expand_reduction (UMAX, operands, CONST0_RTX (mode));
> +  DONE;
> +})
> +
> +(define_expand "reduc_smin_scal_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand:VI 1 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  int prec = GET_MODE_PRECISION (mode);
> +  rtx max = immed_wide_int_const (wi::max_value (prec, SIGNED), mode);
> +  riscv_vector::expand_reduction (SMIN, operands, max);
> +  DONE;
> +})
> +
> +(define_expand "reduc_umin_scal_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand:VI 1 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  int prec = GET_MODE_PRECISION (mode);
> +  rtx max = immed_wide_int_const (wi::max_value (prec, UNSIGNED), mode);
> +  riscv_vector::expand_reduction (UMIN, operands, max);
> +  DONE;
> +})
> +
> +(define_expand "reduc_and_scal_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand:VI 1 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::expand_reduction (AND, operands, CONSTM1_RTX (mode));
> +  DONE;
> +})
> +
> +(define_expand "reduc_ior_scal_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand:VI 1 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::expand_reduction (IOR, operands, CONST0_RTX (mode));
> +  DONE;
> +})
> +
> +(define_expand "reduc_xor_scal_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand:VI 1 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::expand_reduction (XOR, operands, CONST0_RTX (mode));
> +  DONE;
> +})
> +
> +;; -------------------------------------------------------------------------
> +;; ---- [FP] Tree reductions
> +;; -------------------------------------------------------------------------
> +;; Includes:
> +;; - vfredusum.vs
> +;; - vfredmax.vs
> +;; - vfredmin.vs
> +;; -------------------------------------------------------------------------
> +
> +(define_expand "reduc_plus_scal_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand:VF 1 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  riscv_vector::expand_reduction (PLUS, operands, CONST0_RTX (mode));
> +  DONE;
> +})
> +
> +(define_expand "reduc_smax_scal_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand:VF 1 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  REAL_VALUE_TYPE rv;
> +  real_inf (&rv, true);
> +  rtx f = const_double_from_real_value (rv, mode);
> +  riscv_vector::expand_reduction (SMAX, operands, f);
> +  DONE;
> +})
> +
> +(define_expand "reduc_smin_scal_"
> +  [(match_operand: 0 "register_operand")
> +   (match_operand:VF 1 "register_operand")]
> +  "TARGET_VECTOR"
> +{
> +  REAL_VALUE_TYPE rv;
> +  real_inf (&rv, false);
> +  rtx f = const_double_from_real_value (rv, mode);
> +  riscv_vector::expand_reduction (SMIN, operands, f);
> +  DONE;
> +})
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index f91c2d51c3c..16fb8dabca0 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -198,6 +198,7 @@ enum insn_type
>    RVV_COMPRESS_OP = 4,
>    RVV_GATHER_M_OP = 5,
>    RVV_SCATTER_M_OP = 4,
> +  RVV_REDUCTION_OP = 3,
>  };
>  enum vlmul_type
>  {
> @@ -281,6 +282,7 @@ bool has_vi_variant_p (rtx_code, rtx);
>  void expand_vec_cmp (rtx, rtx_code, rtx, rtx);
>  bool expand_vec_cmp_float (rtx, rtx_code, rtx, rtx, bool);
>  void expand_cond_len_binop (rtx_code, rtx *);
> +void expand_reduction (rtx_code, rtx *, rtx);
>  #endif
>  bool sew64_scalar_helper (rtx *, rtx *, rtx, machine_mode,
>                            bool, void (*)(rtx *, rtx));
> diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
> index c3fd4a1b03b..b4884a30872 100644
> --- a/gcc/config/riscv/riscv-v.cc
> +++ b/gcc/config/riscv/riscv-v.cc
> @@ -1159,6 +1159,43 @@ emit_vlmax_compress_insn (unsigned icode, rtx *ops)
>    e.emit_insn ((enum insn_code) icode, ops);
>  }
>
> +/* Emit reduction instruction.  */
> +static void
> +emit_vlmax_reduction_insn (unsigned icode, int op_num, rtx *ops)
> +{
> +  machine_mode dest_mode = GET_MODE (ops[0]);
> +  machine_mode mask_mode = get_mask_mode (GET_MODE (ops[1])).require ();
> +  insn_expander e (op_num,
> +                    /* HAS_DEST_P */ true,
> +                    /* FULLY_UNMASKED_P */ true,
> +                    /* USE_REAL_MERGE_P */ false,
> +                    /* HAS_AVL_P */ true,
> +                    /* VLMAX_P */ true, dest_mode,
> +                    mask_mode);
> +
> +  e.set_policy (TAIL_ANY);
> +  e.emit_insn ((enum insn_code) icode, ops);
> +}
> +
> +/* Emit reduction instruction.  */
> +static void
> +emit_vlmax_fp_reduction_insn (unsigned icode, int op_num, rtx *ops)
> +{
> +  machine_mode dest_mode = GET_MODE (ops[0]);
> +  machine_mode mask_mode = get_mask_mode (GET_MODE (ops[1])).require ();
> +  insn_expander e (op_num,
> +                    /* HAS_DEST_P */ true,
> +                    /* FULLY_UNMASKED_P */ true,
> +                    /* USE_REAL_MERGE_P */ false,
> +                    /* HAS_AVL_P */ true,
> +                    /* VLMAX_P */ true, dest_mode,
> +                    mask_mode);
> +
> +  e.set_policy (TAIL_ANY);
> +  e.set_rounding_mode (FRM_DYN);
> +  e.emit_insn ((enum insn_code) icode, ops);
> +}
> +
>  /* Emit merge instruction.  */
>
>  static machine_mode
> @@ -1651,6 +1688,17 @@ get_mask_mode (machine_mode mode)
>    return get_vector_mode (BImode, GET_MODE_NUNITS (mode));
>  }
>
> +/* Return the appropriate M1 mode for MODE.  */
> +
> +static opt_machine_mode
> +get_m1_mode (machine_mode mode)
> +{
> +  scalar_mode smode = GET_MODE_INNER (mode);
> +  unsigned int bytes = GET_MODE_SIZE (smode);
> +  poly_uint64 m1_nunits = exact_div (BYTES_PER_RISCV_VECTOR, bytes);
> +  return get_vector_mode (smode, m1_nunits);
> +}
> +
>  /* Return the RVV vector mode that has NUNITS elements of mode INNER_MODE.
>     This function is not only used by builtins, but also will be used by
>     auto-vectorization in the future.  */
> @@ -3121,9 +3169,9 @@ expand_cond_len_binop (rtx_code code, rtx *ops)
>        rtx ops[] = {dest, mask, merge, src1, src2};
>        insn_code icode = code_for_pred (code, mode);
>        if (needs_fp_rounding (code, mode))
> -        emit_nonvlmax_fp_tu_insn (icode, RVV_BINOP_MU, ops, len);
> +        emit_nonvlmax_fp_tu_insn (icode, RVV_BINOP_TU, ops, len);
>        else
> -        emit_nonvlmax_tu_insn (icode, RVV_BINOP_MU, ops, len);
> +        emit_nonvlmax_tu_insn (icode, RVV_BINOP_TU, ops, len);
>      }
>    else
>      /* FIXME: Enable this case when we support it in the middle-end.  */
> @@ -3316,4 +3364,36 @@ expand_cond_len_ternop (unsigned icode, rtx *ops)
>      gcc_unreachable ();
>  }
>
> +/* Expand reduction operations.  */
> +void
> +expand_reduction (rtx_code code, rtx *ops, rtx init)
> +{
> +  machine_mode vmode = GET_MODE (ops[1]);
> +  machine_mode m1_mode = get_m1_mode (vmode).require ();
> +  machine_mode m1_mmode = get_mask_mode (m1_mode).require ();
> +
> +  rtx m1_tmp = gen_reg_rtx (m1_mode);
> +  rtx m1_mask = gen_scalar_move_mask (m1_mmode);
> +  rtx m1_undef = RVV_VUNDEF (m1_mode);
> +  rtx scalar_move_ops[] = {m1_tmp, m1_mask, m1_undef, init};
> +  emit_scalar_move_insn (code_for_pred_broadcast (m1_mode), scalar_move_ops);
> +
> +  rtx m1_tmp2 = gen_reg_rtx (m1_mode);
> +  rtx reduc_ops[] = {m1_tmp2, ops[1], m1_tmp};
> +
> +  if (FLOAT_MODE_P (vmode) && code == PLUS)
> +    {
> +      insn_code icode
> +        = code_for_pred_reduc_plus (UNSPEC_UNORDERED, vmode, m1_mode);
> +      emit_vlmax_fp_reduction_insn (icode, RVV_REDUCTION_OP, reduc_ops);
> +    }
> +  else
> +    {
> +      insn_code icode = code_for_pred_reduc (code, vmode, m1_mode);
> +      emit_vlmax_reduction_insn (icode, RVV_REDUCTION_OP, reduc_ops);
> +    }
> +
> +  emit_insn (gen_pred_extract_first (m1_mode, ops[0], m1_tmp2));
> +}
> +
>  } // namespace riscv_vector
> diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
> index 586dc8e5379..bb7ba129a5d 100644
> --- a/gcc/config/riscv/riscv-vsetvl.cc
> +++ b/gcc/config/riscv/riscv-vsetvl.cc
> @@ -646,7 +646,8 @@ gen_vsetvl_pat (enum vsetvl_type insn_type, const vl_vtype_info &info, rtx vl)
>  }
>
>  static rtx
> -gen_vsetvl_pat (rtx_insn *rinsn, const vector_insn_info &info)
> +gen_vsetvl_pat (rtx_insn *rinsn, const vector_insn_info &info,
> +                rtx vl = NULL_RTX)
>  {
>    rtx new_pat;
>    vl_vtype_info new_info = info;
> @@ -654,15 +655,17 @@ gen_vsetvl_pat (rtx_insn *rinsn, const vector_insn_info &info)
>        && fault_first_load_p (info.get_insn ()->rtl ()))
>      new_info.set_avl_info (
>        avl_info (get_avl (info.get_insn ()->rtl ()), nullptr));
> -  if (vsetvl_insn_p (rinsn) || vlmax_avl_p (info.get_avl ()))
> +  if (vl)
> +    new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, vl);
> +  else
>      {
> -      rtx dest = get_vl (rinsn);
> -      new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, dest);
> +      if (vsetvl_insn_p (rinsn) || vlmax_avl_p (info.get_avl ()))
> +        new_pat = gen_vsetvl_pat (VSETVL_NORMAL, new_info, get_vl (rinsn));
> +      else if (INSN_CODE (rinsn) == CODE_FOR_vsetvl_vtype_change_only)
> +        new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, new_info, NULL_RTX);
> +      else
> +        new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, new_info, NULL_RTX);
>      }
> -  else if (INSN_CODE (rinsn) == CODE_FOR_vsetvl_vtype_change_only)
> -    new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY, new_info, NULL_RTX);
> -  else
> -    new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, new_info, NULL_RTX);
>    return new_pat;
>  }
>
> @@ -805,6 +808,14 @@ get_vl_vtype_info (const insn_info *insn)
>    return info;
>  }
>
> +/* Change insn and Assert the change always happens.  */
> +static void
> +validate_change_or_fail (rtx object, rtx *loc, rtx new_rtx, bool in_group)
> +{
> +  bool change_p = validate_change (object, loc, new_rtx, in_group);
> +  gcc_assert (change_p);
> +}
> +
>  static void
>  change_insn (rtx_insn *rinsn, rtx new_pat)
>  {
> @@ -818,7 +829,7 @@ change_insn (rtx_insn *rinsn, rtx new_pat)
>        print_rtl_single (dump_file, PATTERN (rinsn));
>      }
>
> -  validate_change (rinsn, &PATTERN (rinsn), new_pat, false);
> +  validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, false);
>
>    if (dump_file)
>      {
> @@ -874,7 +885,7 @@ change_insn (function_info *ssa, insn_change change, insn_info *insn,
>      }
>
>    insn_change_watermark watermark;
> -  validate_change (rinsn, &PATTERN (rinsn), new_pat, true);
> +  validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat, true);
>
>    /* These routines report failures themselves.  */
>    if (!recog (attempt, change) || !change_is_worthwhile (change, false))
> @@ -931,7 +942,8 @@ change_insn (function_info *ssa, insn_change change, insn_info *insn,
>  }
>
>  static void
> -change_vsetvl_insn (const insn_info *insn, const vector_insn_info &info)
> +change_vsetvl_insn (const insn_info *insn, const vector_insn_info &info,
> +                    rtx vl = NULL_RTX)
>  {
>    rtx_insn *rinsn;
>    if (vector_config_insn_p (insn->rtl ()))
> @@ -945,7 +957,7 @@ change_vsetvl_insn (const insn_info *insn, const vector_insn_info &info)
>        rinsn = PREV_INSN (insn->rtl ());
>        gcc_assert (vector_config_insn_p (rinsn));
>      }
> -  rtx new_pat = gen_vsetvl_pat (rinsn, info);
> +  rtx new_pat = gen_vsetvl_pat (rinsn, info, vl);
>    change_insn (rinsn, new_pat);
>  }
>
> @@ -3377,7 +3389,20 @@ pass_vsetvl::backward_demand_fusion (void)
>                                                   new_info))
>              continue;
>
> -          change_vsetvl_insn (new_info.get_insn (), new_info);
> +          rtx vl = NULL_RTX;
> +          /* Backward VLMAX VL:
> +               bb 3:
> +                 vsetivli zero, 1 ... -> vsetvli t1, zero
> +                 vmv.s.x
> +               bb 5:
> +                 vsetvli t1, zero ... -> to be elided.
> +                 vlse16.v
> +
> +             We should forward "t1".  */
> +          if (!block_info.reaching_out.has_avl_reg ()
> +              && vlmax_avl_p (new_info.get_avl ()))
> +            vl = get_vl (prop.get_insn ()->rtl ());
> +          change_vsetvl_insn (new_info.get_insn (), new_info, vl);
>            if (block_info.local_dem == block_info.reaching_out)
>              block_info.local_dem = new_info;
>            block_info.reaching_out = new_info;
> @@ -4524,13 +4549,15 @@ pass_vsetvl::df_post_optimization (void) const
>                  {
>                    rtx new_pat = gen_vsetvl_pat (VSETVL_VTYPE_CHANGE_ONLY,
>                                                  info, NULL_RTX);
> -                  validate_change (rinsn, &PATTERN (rinsn), new_pat, false);
> +                  validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat,
> +                                           false);
>                  }
>                else if (!vlmax_avl_p (info.get_avl ()))
>                  {
>                    rtx new_pat = gen_vsetvl_pat (VSETVL_DISCARD_RESULT, info,
>                                                  NULL_RTX);
> -                  validate_change (rinsn, &PATTERN (rinsn), new_pat, false);
> +                  validate_change_or_fail (rinsn, &PATTERN (rinsn), new_pat,
> +                                           false);
>                  }
>              }
>          }
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-1.c
> new file mode 100644
> index 00000000000..0d543af13ca
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-1.c
> @@ -0,0 +1,118 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math -fno-vect-cost-model" } */
> +
> +#include
> +
> +#define DEF_REDUC_PLUS(TYPE) \
> +TYPE __attribute__ ((noinline, noclone)) \
> +reduc_plus_##TYPE (TYPE *a, int n) \
> +{ \
> +  TYPE r = 0; \
> +  for (int i = 0; i < n; ++i) \
> +    r += a[i]; \
> +  return r; \
> +}
> +
> +#define TEST_PLUS(T) \
> +  T (int8_t) \
> +  T (int16_t) \
> +  T (int32_t) \
> +  T (int64_t) \
> +  T (uint8_t) \
> +  T (uint16_t) \
> +  T (uint32_t) \
> +  T (uint64_t) \
> +  T (_Float16) \
> +  T (float) \
> +  T (double)
> +
> +TEST_PLUS (DEF_REDUC_PLUS)
> +
> +#define DEF_REDUC_MAXMIN(TYPE, NAME, CMP_OP) \
> +TYPE __attribute__ ((noinline, noclone)) \
> +reduc_##NAME##_##TYPE (TYPE *a, int n) \
> +{ \
> +  TYPE r = 13; \
> +  for (int i = 0; i < n; ++i) \
> +    r = a[i] CMP_OP r ? a[i] : r; \
> +  return r; \
> +}
> +
> +#define TEST_MAXMIN(T) \
> +  T (int8_t, max, >) \
> +  T (int16_t, max, >) \
> +  T (int32_t, max, >) \
> +  T (int64_t, max, >) \
> +  T (uint8_t, max, >) \
> +  T (uint16_t, max, >) \
> +  T (uint32_t, max, >) \
> +  T (uint64_t, max, >) \
> +  T (_Float16, max, >) \
> +  T (float, max, >) \
> +  T (double, max, >) \
> +  \
> +  T (int8_t, min, <) \
> +  T (int16_t, min, <) \
> +  T (int32_t, min, <) \
> +  T (int64_t, min, <) \
> +  T (uint8_t, min, <) \
> +  T (uint16_t, min, <) \
> +  T (uint32_t, min, <) \
> +  T (uint64_t, min, <) \
> +  T (_Float16, min, <) \
> +  T (float, min, <) \
> +  T (double, min, <)
> +
> +TEST_MAXMIN (DEF_REDUC_MAXMIN)
> +
> +#define DEF_REDUC_BITWISE(TYPE, NAME, BIT_OP) \
> +TYPE __attribute__ ((noinline, noclone)) \
> +reduc_##NAME##_##TYPE (TYPE *a, int n) \
> +{ \
> +  TYPE r = 13; \
> +  for (int i = 0; i < n; ++i) \
> +    r BIT_OP a[i]; \
> +  return r; \
> +}
> +
> +#define TEST_BITWISE(T) \
> +  T (int8_t, and, &=) \
> +  T (int16_t, and, &=) \
> +  T (int32_t, and, &=) \
> +  T (int64_t, and, &=) \
> +  T (uint8_t, and, &=) \
> +  T (uint16_t, and, &=) \
> +  T (uint32_t, and, &=) \
> +  T (uint64_t, and, &=) \
> +  \
> +  T (int8_t, ior, |=) \
> +  T (int16_t, ior, |=) \
> +  T (int32_t, ior, |=) \
> +  T (int64_t, ior, |=) \
> +  T (uint8_t, ior, |=) \
> +  T (uint16_t, ior, |=) \
> +  T (uint32_t, ior, |=) \
> +  T (uint64_t, ior, |=) \
> +  \
> +  T (int8_t, xor, ^=) \
> +  T (int16_t, xor, ^=) \
> +  T (int32_t, xor, ^=) \
> +  T (int64_t, xor, ^=) \
> +  T (uint8_t, xor, ^=) \
> +  T (uint16_t, xor, ^=) \
> +  T (uint32_t, xor, ^=) \
> +  T (uint64_t, xor, ^=)
> +
> +TEST_BITWISE (DEF_REDUC_BITWISE)
> +
> +/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 8 } } */
> +/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> +/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> +/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> +/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> +/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 8 } } */
> +/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 8 } } */
> +/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 8 } } */
> +/* { dg-final { scan-assembler-times {vfredusum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
> +/* { dg-final { scan-assembler-times {vfredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
> +/* { dg-final { scan-assembler-times {vfredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-2.c
> new file mode 100644
> index 00000000000..136a8a378bf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-2.c
> @@ -0,0 +1,129 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math -fno-vect-cost-model" } */
> +
> +#include
> +
> +#define NUM_ELEMS(TYPE) (1024 / sizeof (TYPE))
> +
> +#define DEF_REDUC_PLUS(TYPE) \
> +void __attribute__ ((noinline, noclone)) \
> +reduc_plus_##TYPE (TYPE (*restrict a)[NUM_ELEMS (TYPE)], \
> +                   TYPE *restrict r, int n) \
> +{ \
> +  for (int i = 0; i < n; i++) \
> +    { \
> +      r[i] = 0; \
> +      for (int j = 0; j < NUM_ELEMS (TYPE); j++) \
> +        r[i] += a[i][j]; \
> +    } \
> +}
> +
> +#define TEST_PLUS(T) \
> +  T (int8_t) \
> +  T (int16_t) \
> +  T (int32_t) \
> +  T (int64_t) \
> +  T (uint8_t) \
> +  T (uint16_t) \
> +  T (uint32_t) \
> +  T (uint64_t) \
> +  T (_Float16) \
> +  T (float) \
> +  T (double)
> +
> +TEST_PLUS (DEF_REDUC_PLUS)
> +
> +#define DEF_REDUC_MAXMIN(TYPE, NAME, CMP_OP) \
> +void __attribute__ ((noinline, noclone)) \
> +reduc_##NAME##_##TYPE (TYPE (*restrict a)[NUM_ELEMS (TYPE)], \
> +                       TYPE *restrict r, int n) \
> +{ \
> +  for (int i = 0; i < n; i++) \
> +    { \
> +      r[i] = a[i][0]; \
> +      for (int j = 0; j < NUM_ELEMS (TYPE); j++) \
> +        r[i] = a[i][j] CMP_OP r[i] ? a[i][j] : r[i]; \
> +    } \
> +}
> +
> +#define TEST_MAXMIN(T) \
> +  T (int8_t, max, >) \
> +  T (int16_t, max, >) \
> +  T (int32_t, max, >) \
> +  T (int64_t, max, >) \
> +  T (uint8_t, max, >) \
> +  T (uint16_t, max, >) \
> +  T (uint32_t, max, >) \
> +  T (uint64_t, max, >) \
> +  T (_Float16, max, >) \
> +  T (float, max, >) \
> +  T (double, max, >) \
> +  \
> +  T (int8_t, min, <) \
> +  T (int16_t, min, <) \
> +  T (int32_t, min, <) \
> +  T (int64_t, min, <) \
> +  T (uint8_t, min, <) \
> +  T (uint16_t, min, <) \
> +  T (uint32_t, min, <) \
> +  T (uint64_t, min, <) \
> +  T (_Float16, min, <) \
> +  T (float, min, <) \
> +  T (double, min, <)
> +
> +TEST_MAXMIN (DEF_REDUC_MAXMIN)
> +
> +#define DEF_REDUC_BITWISE(TYPE,NAME,BIT_OP) \
> +void __attribute__ ((noinline, noclone)) \
> +reduc_##NAME##TYPE (TYPE (*restrict a)[NUM_ELEMS(TYPE)], \
> +                    TYPE *restrict r, int n) \
> +{ \
> +  for (int i = 0; i < n; i++) \
> +    { \
> +      r[i] = a[i][0]; \
> +      for (int j = 0; j < NUM_ELEMS(TYPE); j++) \
> +        r[i] BIT_OP a[i][j]; \
> +    } \
> +}
> +
> +#define TEST_BITWISE(T) \
> +  T (int8_t, and, &=) \
> +  T (int16_t, and, &=) \
> +  T (int32_t, and, &=) \
> +  T (int64_t, and, &=) \
> +  T (uint8_t, and, &=) \
> +  T (uint16_t, and, &=) \
> +  T (uint32_t, and, &=) \
> +  T (uint64_t, and, &=) \
> +  \
> +  T (int8_t, ior, |=) \
> +  T (int16_t, ior, |=) \
> +  T (int32_t, ior, |=) \
> +  T (int64_t, ior, |=) \
> +  T (uint8_t, ior, |=) \
> +  T (uint16_t, ior, |=) \
> +  T (uint32_t, ior, |=) \
> +  T (uint64_t, ior, |=) \
> +  \
> +  T (int8_t, xor, ^=) \
> +  T (int16_t, xor, ^=) \
> +  T (int32_t, xor, ^=) \
> +  T (int64_t, xor, ^=) \
> +  T (uint8_t, xor, ^=) \
> +  T (uint16_t, xor, ^=) \
> +  T (uint32_t, xor, ^=) \
> +  T (uint64_t, xor, ^=)
> +
> +TEST_BITWISE (DEF_REDUC_BITWISE)
> +
> +/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 8 } } */
> +/* { dg-final { scan-assembler-times {vredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> +/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> +/* { dg-final { scan-assembler-times {vredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> +/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 4 } } */
> +/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 8 } } */
> +/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 8 } } */
> +/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 8 } } */
> +/* { dg-final { scan-assembler-times {vfredusum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
> +/* { dg-final { scan-assembler-times {vfredmax\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
> +/* { dg-final { scan-assembler-times {vfredmin\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 3 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-3.c
> new file mode 100644
> index 00000000000..c3638344f80
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-3.c
> @@ -0,0 +1,65 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math -fno-vect-cost-model" } */
> +
> +#include
> +
> +unsigned short __attribute__((noipa))
> +add_loop (unsigned short *x, int n)
> +{
> +  unsigned short res = 0;
> +  for (int i = 0; i < n; ++i)
> +    res += x[i];
> +  return res;
> +}
> +
> +unsigned short __attribute__((noipa))
> +min_loop (unsigned short *x, int n)
> +{
> +  unsigned short res = ~0;
> +  for (int i = 0; i < n; ++i)
> +    res = res < x[i] ? res : x[i];
> +  return res;
> +}
> +
> +unsigned short __attribute__((noipa))
> +max_loop (unsigned short *x, int n)
> +{
> +  unsigned short res = 0;
> +  for (int i = 0; i < n; ++i)
> +    res = res > x[i] ? res : x[i];
> +  return res;
> +}
> +
> +unsigned short __attribute__((noipa))
> +and_loop (unsigned short *x, int n)
> +{
> +  unsigned short res = ~0;
> +  for (int i = 0; i < n; ++i)
> +    res &= x[i];
> +  return res;
> +}
> +
> +unsigned short __attribute__((noipa))
> +or_loop (unsigned short *x, int n)
> +{
> +  unsigned short res = 0;
> +  for (int i = 0; i < n; ++i)
> +    res |= x[i];
> +  return res;
> +}
> +
> +unsigned short __attribute__((noipa))
> +eor_loop (unsigned short *x, int n)
> +{
> +  unsigned short res = 0;
> +  for (int i = 0; i < n; ++i)
> +    res ^= x[i];
> +  return res;
> +}
> +
> +/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-4.c
> new file mode 100644
> index 00000000000..f00a12826c6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc-4.c
> @@ -0,0 +1,59 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=rv32gcv -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math -fno-vect-cost-model" } */
> +
> +#include
> +
> +unsigned short __attribute__((noipa))
> +add_loop (unsigned short *x, int n, unsigned short res)
> +{
> +  for (int i = 0; i < n; ++i)
> +    res += x[i];
> +  return res;
> +}
> +
> +unsigned short __attribute__((noipa))
> +min_loop (unsigned short *x, int n, unsigned short res)
> +{
> +  for (int i = 0; i < n; ++i)
> +    res = res < x[i] ? res : x[i];
> +  return res;
> +}
> +
> +unsigned short __attribute__((noipa))
> +max_loop (unsigned short *x, int n, unsigned short res)
> +{
> +  for (int i = 0; i < n; ++i)
> +    res = res > x[i] ? res : x[i];
> +  return res;
> +}
> +
> +unsigned short __attribute__((noipa))
> +and_loop (unsigned short *x, int n, unsigned short res)
> +{
> +  for (int i = 0; i < n; ++i)
> +    res &= x[i];
> +  return res;
> +}
> +
> +unsigned short __attribute__((noipa))
> +or_loop (unsigned short *x, int n, unsigned short res)
> +{
> +  for (int i = 0; i < n; ++i)
> +    res |= x[i];
> +  return res;
> +}
> +
> +unsigned short __attribute__((noipa))
> +eor_loop (unsigned short *x, int n, unsigned short res)
> +{
> +  for (int i = 0; i < n; ++i)
> +    res ^= x[i];
> +  return res;
> +}
> +
> +/* { dg-final { scan-assembler-times {vredsum\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vredmaxu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vredminu\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vredand\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vredor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> +/* { dg-final { scan-assembler-times {vredxor\.vs\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-1.c
> new file mode
100644 > index 00000000000..b500f857598 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-1.c > @@ -0,0 +1,56 @@ > +/* { dg-do run { target { riscv_vector } } } */ > +/* { dg-additional-options "--param=3Driscv-autovec-preference=3Dscalabl= e -ffast-math -fno-vect-cost-model" } */ > + > +#include "reduc-1.c" > + > +#define NUM_ELEMS(TYPE) (73 + sizeof (TYPE)) > + > +#define INIT_VECTOR(TYPE) \ > + TYPE a[NUM_ELEMS (TYPE) + 1]; \ > + for (int i =3D 0; i < NUM_ELEMS (TYPE) + 1; i++) \ > + { \ > + a[i] =3D ((i * 2) * (i & 1 ? 1 : -1) | 3); \ > + asm volatile ("" ::: "memory"); \ > + } > + > +#define TEST_REDUC_PLUS(TYPE) \ > + { \ > + INIT_VECTOR (TYPE); \ > + TYPE r1 =3D reduc_plus_##TYPE (a, NUM_ELEMS (TYPE)); \ > + volatile TYPE r2 =3D 0; \ > + for (int i =3D 0; i < NUM_ELEMS (TYPE); ++i) \ > + r2 +=3D a[i]; \ > + if (r1 !=3D r2) \ > + __builtin_abort (); \ > + } > + > +#define TEST_REDUC_MAXMIN(TYPE, NAME, CMP_OP) \ > + { \ > + INIT_VECTOR (TYPE); \ > + TYPE r1 =3D reduc_##NAME##_##TYPE (a, NUM_ELEMS (TYPE)); \ > + volatile TYPE r2 =3D 13; \ > + for (int i =3D 0; i < NUM_ELEMS (TYPE); ++i) \ > + r2 =3D a[i] CMP_OP r2 ? 
a[i] : r2; \ > + if (r1 !=3D r2) \ > + __builtin_abort (); \ > + } > + > +#define TEST_REDUC_BITWISE(TYPE, NAME, BIT_OP) \ > + { \ > + INIT_VECTOR (TYPE); \ > + TYPE r1 =3D reduc_##NAME##_##TYPE (a, NUM_ELEMS (TYPE)); \ > + volatile TYPE r2 =3D 13; \ > + for (int i =3D 0; i < NUM_ELEMS (TYPE); ++i) \ > + r2 BIT_OP a[i]; \ > + if (r1 !=3D r2) \ > + __builtin_abort (); \ > + } > + > +int main () > +{ > + TEST_PLUS (TEST_REDUC_PLUS) > + TEST_MAXMIN (TEST_REDUC_MAXMIN) > + TEST_BITWISE (TEST_REDUC_BITWISE) > + > + return 0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-2= .c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c > new file mode 100644 > index 00000000000..3c2f62557b1 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-2.c > @@ -0,0 +1,79 @@ > +/* { dg-do run { target { riscv_vector } } } */ > +/* { dg-additional-options "--param=3Driscv-autovec-preference=3Dscalabl= e" } */ > + > +#include "reduc-2.c" > + > +#define NROWS 53 > + > +/* -ffast-math fuzz for PLUS. */ > +#define CMP__Float16(X, Y) ((X) >=3D (Y) * 0.875 && (X) <=3D (Y) * 1.125= ) > +#define CMP_float(X, Y) ((X) =3D=3D (Y)) > +#define CMP_double(X, Y) ((X) =3D=3D (Y)) > +#define CMP_int8_t(X, Y) ((X) =3D=3D (Y)) > +#define CMP_int16_t(X, Y) ((X) =3D=3D (Y)) > +#define CMP_int32_t(X, Y) ((X) =3D=3D (Y)) > +#define CMP_int64_t(X, Y) ((X) =3D=3D (Y)) > +#define CMP_uint8_t(X, Y) ((X) =3D=3D (Y)) > +#define CMP_uint16_t(X, Y) ((X) =3D=3D (Y)) > +#define CMP_uint32_t(X, Y) ((X) =3D=3D (Y)) > +#define CMP_uint64_t(X, Y) ((X) =3D=3D (Y)) > + > +#define INIT_MATRIX(TYPE) \ > + TYPE mat[NROWS][NUM_ELEMS (TYPE)]; \ > + TYPE r[NROWS]; \ > + for (int i =3D 0; i < NROWS; i++) \ > + for (int j =3D 0; j < NUM_ELEMS (TYPE); j++) \ > + { \ > + mat[i][j] =3D i + (j * 2) * (j & 1 ? 
1 : -1); \ > + asm volatile ("" ::: "memory"); \ > + } > + > +#define TEST_REDUC_PLUS(TYPE) \ > + { \ > + INIT_MATRIX (TYPE); \ > + reduc_plus_##TYPE (mat, r, NROWS); \ > + for (int i =3D 0; i < NROWS; i++) \ > + { \ > + volatile TYPE r2 =3D 0; \ > + for (int j =3D 0; j < NUM_ELEMS (TYPE); ++j) \ > + r2 +=3D mat[i][j]; \ > + if (!CMP_##TYPE (r[i], r2)) \ > + __builtin_abort (); \ > + } \ > + } > + > +#define TEST_REDUC_MAXMIN(TYPE, NAME, CMP_OP) \ > + { \ > + INIT_MATRIX (TYPE); \ > + reduc_##NAME##_##TYPE (mat, r, NROWS); \ > + for (int i =3D 0; i < NROWS; i++) \ > + { \ > + volatile TYPE r2 =3D mat[i][0]; \ > + for (int j =3D 0; j < NUM_ELEMS (TYPE); ++j) \ > + r2 =3D mat[i][j] CMP_OP r2 ? mat[i][j] : r2; \ > + if (r[i] !=3D r2) \ > + __builtin_abort (); \ > + } \ > + } > + > +#define TEST_REDUC_BITWISE(TYPE, NAME, BIT_OP) \ > + { \ > + INIT_MATRIX (TYPE); \ > + reduc_##NAME##_##TYPE (mat, r, NROWS); \ > + for (int i =3D 0; i < NROWS; i++) \ > + { \ > + volatile TYPE r2 =3D mat[i][0]; \ > + for (int j =3D 0; j < NUM_ELEMS (TYPE); ++j) \ > + r2 BIT_OP mat[i][j]; \ > + if (r[i] !=3D r2) \ > + __builtin_abort (); \ > + } \ > + } > + > +int main () > +{ > + TEST_PLUS (TEST_REDUC_PLUS) > + TEST_MAXMIN (TEST_REDUC_MAXMIN) > + > + return 0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-3= .c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-3.c > new file mode 100644 > index 00000000000..d1b22c0d69a > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-3.c > @@ -0,0 +1,49 @@ > +/* { dg-do run { target { riscv_vector } } } */ > +/* { dg-additional-options "--param=3Driscv-autovec-preference=3Dscalabl= e -ffast-math -fno-vect-cost-model" } */ > + > +#include "reduc-3.c" > + > +#define N 0x1100 > + > +int > +main (void) > +{ > + unsigned short x[N]; > + for (int i =3D 0; i < N; ++i) > + x[i] =3D (i + 1) * (i + 2); > + > + if (add_loop (x, 0) !=3D 0 > + || add_loop (x, 11) !=3D 572 > + || add_loop (x, 
0x100) !=3D 22016 > + || add_loop (x, 0xfff) !=3D 20480 > + || max_loop (x, 0) !=3D 0 > + || max_loop (x, 11) !=3D 132 > + || max_loop (x, 0x100) !=3D 65280 > + || max_loop (x, 0xfff) !=3D 65504 > + || or_loop (x, 0) !=3D 0 > + || or_loop (x, 11) !=3D 0xfe > + || or_loop (x, 0x80) !=3D 0x7ffe > + || or_loop (x, 0xb4) !=3D 0x7ffe > + || or_loop (x, 0xb5) !=3D 0xfffe > + || eor_loop (x, 0) !=3D 0 > + || eor_loop (x, 11) !=3D 0xe8 > + || eor_loop (x, 0x100) !=3D 0xcf00 > + || eor_loop (x, 0xfff) !=3D 0xa000) > + __builtin_abort (); > + > + for (int i =3D 0; i < N; ++i) > + x[i] =3D ~x[i]; > + > + if (min_loop (x, 0) !=3D 65535 > + || min_loop (x, 11) !=3D 65403 > + || min_loop (x, 0x100) !=3D 255 > + || min_loop (x, 0xfff) !=3D 31 > + || and_loop (x, 0) !=3D 0xffff > + || and_loop (x, 11) !=3D 0xff01 > + || and_loop (x, 0x80) !=3D 0x8001 > + || and_loop (x, 0xb4) !=3D 0x8001 > + || and_loop (x, 0xb5) !=3D 1) > + __builtin_abort (); > + > + return 0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-4= .c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-4.c > new file mode 100644 > index 00000000000..c17e125a763 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/reduc/reduc_run-4.c > @@ -0,0 +1,66 @@ > +/* { dg-do run { target { riscv_vector } } } */ > +/* { dg-additional-options "--param=3Driscv-autovec-preference=3Dscalabl= e -ffast-math -fno-vect-cost-model" } */ > + > +#include "reduc-4.c" > + > +#define N 0x1100 > + > +int > +main (void) > +{ > + unsigned short x[N]; > + for (int i =3D 0; i < N; ++i) > + x[i] =3D (i + 1) * (i + 2); > + > + if (add_loop (x, 0, 10) !=3D 10 > + || add_loop (x, 11, 42) !=3D 614 > + || add_loop (x, 0x100, 84) !=3D 22100 > + || add_loop (x, 0xfff, 20) !=3D 20500 > + || max_loop (x, 0, 10) !=3D 10 > + || max_loop (x, 11, 131) !=3D 132 > + || max_loop (x, 11, 133) !=3D 133 > + || max_loop (x, 0x100, 65279) !=3D 65280 > + || max_loop (x, 0x100, 65281) !=3D 65281 > + || max_loop (x, 
0xfff, 65503) !=3D 65504 > + || max_loop (x, 0xfff, 65505) !=3D 65505 > + || or_loop (x, 0, 0x71) !=3D 0x71 > + || or_loop (x, 11, 0) !=3D 0xfe > + || or_loop (x, 11, 0xb3c) !=3D 0xbfe > + || or_loop (x, 0x80, 0) !=3D 0x7ffe > + || or_loop (x, 0x80, 1) !=3D 0x7fff > + || or_loop (x, 0xb4, 0) !=3D 0x7ffe > + || or_loop (x, 0xb4, 1) !=3D 0x7fff > + || or_loop (x, 0xb5, 0) !=3D 0xfffe > + || or_loop (x, 0xb5, 1) !=3D 0xffff > + || eor_loop (x, 0, 0x3e) !=3D 0x3e > + || eor_loop (x, 11, 0) !=3D 0xe8 > + || eor_loop (x, 11, 0x1ff) !=3D 0x117 > + || eor_loop (x, 0x100, 0) !=3D 0xcf00 > + || eor_loop (x, 0x100, 0xeee) !=3D 0xc1ee > + || eor_loop (x, 0xfff, 0) !=3D 0xa000 > + || eor_loop (x, 0xfff, 0x8888) !=3D 0x2888) > + __builtin_abort (); > + > + for (int i =3D 0; i < N; ++i) > + x[i] =3D ~x[i]; > + > + if (min_loop (x, 0, 10000) !=3D 10000 > + || min_loop (x, 11, 65404) !=3D 65403 > + || min_loop (x, 11, 65402) !=3D 65402 > + || min_loop (x, 0x100, 256) !=3D 255 > + || min_loop (x, 0x100, 254) !=3D 254 > + || min_loop (x, 0xfff, 32) !=3D 31 > + || min_loop (x, 0xfff, 30) !=3D 30 > + || and_loop (x, 0, 0x1234) !=3D 0x1234 > + || and_loop (x, 11, 0xffff) !=3D 0xff01 > + || and_loop (x, 11, 0xcdef) !=3D 0xcd01 > + || and_loop (x, 0x80, 0xffff) !=3D 0x8001 > + || and_loop (x, 0x80, 0xfffe) !=3D 0x8000 > + || and_loop (x, 0xb4, 0xffff) !=3D 0x8001 > + || and_loop (x, 0xb4, 0xfffe) !=3D 0x8000 > + || and_loop (x, 0xb5, 0xffff) !=3D 1 > + || and_loop (x, 0xb5, 0xfffe) !=3D 0) > + __builtin_abort (); > + > + return 0; > +} > diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp b/gcc/testsuite/g= cc.target/riscv/rvv/rvv.exp > index 19589fa9638..532c17c4065 100644 > --- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp > +++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp > @@ -71,6 +71,8 @@ foreach op $AUTOVEC_TEST_OPTS { > "" "$op" > dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/ternop/*.\= [cS\]]] \ > "" "$op" > + dg-runtest [lsort [glob -nocomplain 
$srcdir/$subdir/autovec/reduc/*.\[= cS\]]] \ > + "" "$op" > } > > # widening operation only test on LMUL < 8 > -- > 2.36.1 >
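A side note on the constants in reduc_run-3.c: they can be cross-checked with a plain scalar sketch like the one below. This is only an illustration, not part of the patch. The helper names (init_input, ref_add, ref_max, ref_or, ref_xor, check_reduc_run_3_constants) are made up for this note, and it assumes unsigned short is 16 bits wide, as it is on the RISC-V targets in question.

```c
#include <stdint.h>

#define N 0x1100

/* Fill x the same way the test's main () does: x[i] = (i + 1) * (i + 2),
   truncated to 16 bits on assignment.  */
static void
init_input (uint16_t *x, int n)
{
  for (int i = 0; i < n; ++i)
    x[i] = (uint16_t) ((i + 1) * (i + 2));
}

/* Scalar references for four of the reductions (hypothetical names).  */
static uint16_t
ref_add (const uint16_t *x, int n)
{
  uint16_t res = 0;
  for (int i = 0; i < n; ++i)
    res = (uint16_t) (res + x[i]);
  return res;
}

static uint16_t
ref_max (const uint16_t *x, int n)
{
  uint16_t res = 0;
  for (int i = 0; i < n; ++i)
    res = res > x[i] ? res : x[i];
  return res;
}

static uint16_t
ref_or (const uint16_t *x, int n)
{
  uint16_t res = 0;
  for (int i = 0; i < n; ++i)
    res |= x[i];
  return res;
}

static uint16_t
ref_xor (const uint16_t *x, int n)
{
  uint16_t res = 0;
  for (int i = 0; i < n; ++i)
    res ^= x[i];
  return res;
}

/* Return 1 if the scalar references reproduce a handful of the constants
   that reduc_run-3.c compares against, 0 otherwise.  */
int
check_reduc_run_3_constants (void)
{
  uint16_t x[N];
  init_input (x, N);

  return ref_add (x, 0) == 0
	 && ref_add (x, 11) == 572
	 && ref_add (x, 0x100) == 22016
	 && ref_max (x, 11) == 132
	 && ref_max (x, 0x100) == 65280
	 && ref_or (x, 11) == 0xfe
	 && ref_xor (x, 11) == 0xe8;
}
```

Calling check_reduc_run_3_constants () from a small driver and checking that it returns 1 is enough to confirm those values without any vector hardware. The sum for n = 11 also follows from the closed form n(n+1)(n+2)/3 = 11 * 12 * 13 / 3 = 572.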