From: Kito Cheng <kito.cheng@gmail.com>
To: juzhe.zhong@rivai.ai
Cc: gcc-patches@gcc.gnu.org, palmer@dabbelt.com, jeffreyalaw@gmail.com
Subject: Re: [PATCH V2] RISC-V: Optimize comparison patterns for register allocation
Date: Wed, 26 Apr 2023 12:22:50 +0800 [thread overview]
Message-ID: <CA+yXCZDrvUaxp2h8vh0aQ3wT2Cj-4trsKymVo=kNL0QPD_UiSA@mail.gmail.com> (raw)
In-Reply-To: <20230424035341.96537-1-juzhe.zhong@rivai.ai>
Committed.
On Mon, Apr 24, 2023 at 11:54 AM <juzhe.zhong@rivai.ai> wrote:
>
> From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
>
> The current RA constraints for RVV comparison instructions do not allow
> any overlap between the destination and source registers.
>
> For example:
> vmseq.vv vd, vs2, vs1
> If LMUL = 8, vs2 = v8, vs1 = v16:
>
> Under the current constraints, GCC does not allow vd to be any regno in v8 ~ v23.
> However, this is overly conservative and is not required by the RVV ISA.
>
> Since the destination EEW of a comparison is always 1, it always falls under the
> overlap rule for dest EEW < source EEW. In this case we should therefore give the
> RA the chance to allocate v8 or v16 for vd, yielding better vector register usage.
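>
> For illustration, a sketch of the overlap the relaxed constraints permit
> (register numbers here are hypothetical, chosen to match the example above;
> per the RVV spec, a dest with smaller EEW may overlap the lowest-numbered
> part of a source register group):
>
> ```asm
> vsetvli t0, a0, e32, m8, ta, ma   # LMUL = 8: v8 ~ v15 and v16 ~ v23 are register groups
> vmseq.vv v8, v8, v16              # mask dest (EEW = 1) overlaps lowest part of vs2 group: legal
> ```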
>
> gcc/ChangeLog:
>
> * config/riscv/vector.md (*pred_cmp<mode>_merge_tie_mask): New pattern.
> (*pred_ltge<mode>_merge_tie_mask): Ditto.
> (*pred_cmp<mode>_scalar_merge_tie_mask): Ditto.
> (*pred_eqne<mode>_scalar_merge_tie_mask): Ditto.
> (*pred_cmp<mode>_extended_scalar_merge_tie_mask): Ditto.
> (*pred_eqne<mode>_extended_scalar_merge_tie_mask): Ditto.
> (*pred_cmp<mode>_narrow_merge_tie_mask): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/rvv/base/binop_vv_constraint-4.c: Adapt testcase.
> * gcc.target/riscv/rvv/base/narrow_constraint-17.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-18.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-19.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-20.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-21.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-22.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-23.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-24.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-25.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-26.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-27.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-28.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-29.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-30.c: New test.
> * gcc.target/riscv/rvv/base/narrow_constraint-31.c: New test.
>
> ---
> gcc/config/riscv/vector.md | 439 ++++++++++++++----
> .../riscv/rvv/base/binop_vv_constraint-4.c | 2 +-
> .../riscv/rvv/base/narrow_constraint-17.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-18.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-19.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-20.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-21.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-22.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-23.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-24.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-25.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-26.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-27.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-28.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-29.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-30.c | 231 +++++++++
> .../riscv/rvv/base/narrow_constraint-31.c | 231 +++++++++
> 17 files changed, 3817 insertions(+), 89 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c
>
> diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
> index 959afac2283..cbfc8913aec 100644
> --- a/gcc/config/riscv/vector.md
> +++ b/gcc/config/riscv/vector.md
> @@ -3647,6 +3647,29 @@
> "TARGET_VECTOR"
> {})
>
> +(define_insn "*pred_cmp<mode>_merge_tie_mask"
> + [(set (match_operand:<VM> 0 "register_operand" "=vm")
> + (if_then_else:<VM>
> + (unspec:<VM>
> + [(match_operand:<VM> 1 "register_operand" " 0")
> + (match_operand 5 "vector_length_operand" " rK")
> + (match_operand 6 "const_int_operand" " i")
> + (match_operand 7 "const_int_operand" " i")
> + (reg:SI VL_REGNUM)
> + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> + (match_operator:<VM> 2 "comparison_except_ltge_operator"
> + [(match_operand:VI 3 "register_operand" " vr")
> + (match_operand:VI 4 "vector_arith_operand" "vrvi")])
> + (match_dup 1)))]
> + "TARGET_VECTOR"
> + "vms%B2.v%o4\t%0,%3,%v4,v0.t"
> + [(set_attr "type" "vicmp")
> + (set_attr "mode" "<MODE>")
> + (set_attr "merge_op_idx" "1")
> + (set_attr "vl_op_idx" "5")
> + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
> + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))])
> +
> ;; We don't use early-clobber for LMUL <= 1 to get better codegen.
> (define_insn "*pred_cmp<mode>"
> [(set (match_operand:<VM> 0 "register_operand" "=vr, vr, vr, vr")
> @@ -3669,19 +3692,19 @@
>
> ;; We use early-clobber for source LMUL > dest LMUL.
> (define_insn "*pred_cmp<mode>_narrow"
> - [(set (match_operand:<VM> 0 "register_operand" "=&vr, &vr, &vr, &vr")
> + [(set (match_operand:<VM> 0 "register_operand" "=vm, vr, vr, vr, vr, vr, vr, &vr, &vr")
> (if_then_else:<VM>
> (unspec:<VM>
> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1")
> - (match_operand 6 "vector_length_operand" " rK, rK, rK, rK")
> - (match_operand 7 "const_int_operand" " i, i, i, i")
> - (match_operand 8 "const_int_operand" " i, i, i, i")
> + [(match_operand:<VM> 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1")
> + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK")
> + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i")
> + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i")
> (reg:SI VL_REGNUM)
> (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> (match_operator:<VM> 3 "comparison_except_ltge_operator"
> - [(match_operand:VI 4 "register_operand" " vr, vr, vr, vr")
> - (match_operand:VI 5 "vector_arith_operand" " vr, vr, vi, vi")])
> - (match_operand:<VM> 2 "vector_merge_operand" " vu, 0, vu, 0")))]
> + [(match_operand:VI 4 "register_operand" " vr, 0, vr, 0, 0, vr, 0, vr, vr")
> + (match_operand:VI 5 "vector_arith_operand" " vrvi, vrvi, 0, 0, vrvi, 0, 0, vrvi, vrvi")])
> + (match_operand:<VM> 2 "vector_merge_operand" " vu, vu, vu, vu, 0, 0, 0, vu, 0")))]
> "TARGET_VECTOR && known_gt (GET_MODE_SIZE (<MODE>mode), BYTES_PER_RISCV_VECTOR)"
> "vms%B3.v%o5\t%0,%4,%v5%p1"
> [(set_attr "type" "vicmp")
> @@ -3704,6 +3727,29 @@
> "TARGET_VECTOR"
> {})
>
> +(define_insn "*pred_ltge<mode>_merge_tie_mask"
> + [(set (match_operand:<VM> 0 "register_operand" "=vm")
> + (if_then_else:<VM>
> + (unspec:<VM>
> + [(match_operand:<VM> 1 "register_operand" " 0")
> + (match_operand 5 "vector_length_operand" " rK")
> + (match_operand 6 "const_int_operand" " i")
> + (match_operand 7 "const_int_operand" " i")
> + (reg:SI VL_REGNUM)
> + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> + (match_operator:<VM> 2 "ltge_operator"
> + [(match_operand:VI 3 "register_operand" " vr")
> + (match_operand:VI 4 "vector_neg_arith_operand" "vrvj")])
> + (match_dup 1)))]
> + "TARGET_VECTOR"
> + "vms%B2.v%o4\t%0,%3,%v4,v0.t"
> + [(set_attr "type" "vicmp")
> + (set_attr "mode" "<MODE>")
> + (set_attr "merge_op_idx" "1")
> + (set_attr "vl_op_idx" "5")
> + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
> + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))])
> +
> ;; We don't use early-clobber for LMUL <= 1 to get better codegen.
> (define_insn "*pred_ltge<mode>"
> [(set (match_operand:<VM> 0 "register_operand" "=vr, vr, vr, vr")
> @@ -3726,19 +3772,19 @@
>
> ;; We use early-clobber for source LMUL > dest LMUL.
> (define_insn "*pred_ltge<mode>_narrow"
> - [(set (match_operand:<VM> 0 "register_operand" "=&vr, &vr, &vr, &vr")
> + [(set (match_operand:<VM> 0 "register_operand" "=vm, vr, vr, vr, vr, vr, vr, &vr, &vr")
> (if_then_else:<VM>
> (unspec:<VM>
> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1")
> - (match_operand 6 "vector_length_operand" " rK, rK, rK, rK")
> - (match_operand 7 "const_int_operand" " i, i, i, i")
> - (match_operand 8 "const_int_operand" " i, i, i, i")
> + [(match_operand:<VM> 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1")
> + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK")
> + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i")
> + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i")
> (reg:SI VL_REGNUM)
> (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> (match_operator:<VM> 3 "ltge_operator"
> - [(match_operand:VI 4 "register_operand" " vr, vr, vr, vr")
> - (match_operand:VI 5 "vector_neg_arith_operand" " vr, vr, vj, vj")])
> - (match_operand:<VM> 2 "vector_merge_operand" " vu, 0, vu, 0")))]
> + [(match_operand:VI 4 "register_operand" " vr, 0, vr, 0, 0, vr, 0, vr, vr")
> + (match_operand:VI 5 "vector_neg_arith_operand" " vrvj, vrvj, 0, 0, vrvj, 0, 0, vrvj, vrvj")])
> + (match_operand:<VM> 2 "vector_merge_operand" " vu, vu, vu, vu, 0, 0, 0, vu, 0")))]
> "TARGET_VECTOR && known_gt (GET_MODE_SIZE (<MODE>mode), BYTES_PER_RISCV_VECTOR)"
> "vms%B3.v%o5\t%0,%4,%v5%p1"
> [(set_attr "type" "vicmp")
> @@ -3762,6 +3808,30 @@
> "TARGET_VECTOR"
> {})
>
> +(define_insn "*pred_cmp<mode>_scalar_merge_tie_mask"
> + [(set (match_operand:<VM> 0 "register_operand" "=vm")
> + (if_then_else:<VM>
> + (unspec:<VM>
> + [(match_operand:<VM> 1 "register_operand" " 0")
> + (match_operand 5 "vector_length_operand" " rK")
> + (match_operand 6 "const_int_operand" " i")
> + (match_operand 7 "const_int_operand" " i")
> + (reg:SI VL_REGNUM)
> + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> + (match_operator:<VM> 2 "comparison_except_eqge_operator"
> + [(match_operand:VI_QHS 3 "register_operand" " vr")
> + (vec_duplicate:VI_QHS
> + (match_operand:<VEL> 4 "register_operand" " r"))])
> + (match_dup 1)))]
> + "TARGET_VECTOR"
> + "vms%B2.vx\t%0,%3,%4,v0.t"
> + [(set_attr "type" "vicmp")
> + (set_attr "mode" "<MODE>")
> + (set_attr "merge_op_idx" "1")
> + (set_attr "vl_op_idx" "5")
> + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
> + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))])
> +
> ;; We don't use early-clobber for LMUL <= 1 to get better codegen.
> (define_insn "*pred_cmp<mode>_scalar"
> [(set (match_operand:<VM> 0 "register_operand" "=vr, vr")
> @@ -3785,20 +3855,20 @@
>
> ;; We use early-clobber for source LMUL > dest LMUL.
> (define_insn "*pred_cmp<mode>_scalar_narrow"
> - [(set (match_operand:<VM> 0 "register_operand" "=&vr, &vr")
> + [(set (match_operand:<VM> 0 "register_operand" "=vm, vr, vr, &vr, &vr")
> (if_then_else:<VM>
> (unspec:<VM>
> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1")
> - (match_operand 6 "vector_length_operand" " rK, rK")
> - (match_operand 7 "const_int_operand" " i, i")
> - (match_operand 8 "const_int_operand" " i, i")
> + [(match_operand:<VM> 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1")
> + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK")
> + (match_operand 7 "const_int_operand" " i, i, i, i, i")
> + (match_operand 8 "const_int_operand" " i, i, i, i, i")
> (reg:SI VL_REGNUM)
> (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> (match_operator:<VM> 3 "comparison_except_eqge_operator"
> - [(match_operand:VI_QHS 4 "register_operand" " vr, vr")
> + [(match_operand:VI_QHS 4 "register_operand" " vr, 0, 0, vr, vr")
> (vec_duplicate:VI_QHS
> - (match_operand:<VEL> 5 "register_operand" " r, r"))])
> - (match_operand:<VM> 2 "vector_merge_operand" " vu, 0")))]
> + (match_operand:<VEL> 5 "register_operand" " r, r, r, r, r"))])
> + (match_operand:<VM> 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))]
> "TARGET_VECTOR && known_gt (GET_MODE_SIZE (<MODE>mode), BYTES_PER_RISCV_VECTOR)"
> "vms%B3.vx\t%0,%4,%5%p1"
> [(set_attr "type" "vicmp")
> @@ -3822,6 +3892,30 @@
> "TARGET_VECTOR"
> {})
>
> +(define_insn "*pred_eqne<mode>_scalar_merge_tie_mask"
> + [(set (match_operand:<VM> 0 "register_operand" "=vm")
> + (if_then_else:<VM>
> + (unspec:<VM>
> + [(match_operand:<VM> 1 "register_operand" " 0")
> + (match_operand 5 "vector_length_operand" " rK")
> + (match_operand 6 "const_int_operand" " i")
> + (match_operand 7 "const_int_operand" " i")
> + (reg:SI VL_REGNUM)
> + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> + (match_operator:<VM> 2 "equality_operator"
> + [(vec_duplicate:VI_QHS
> + (match_operand:<VEL> 4 "register_operand" " r"))
> + (match_operand:VI_QHS 3 "register_operand" " vr")])
> + (match_dup 1)))]
> + "TARGET_VECTOR"
> + "vms%B2.vx\t%0,%3,%4,v0.t"
> + [(set_attr "type" "vicmp")
> + (set_attr "mode" "<MODE>")
> + (set_attr "merge_op_idx" "1")
> + (set_attr "vl_op_idx" "5")
> + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
> + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))])
> +
> ;; We don't use early-clobber for LMUL <= 1 to get better codegen.
> (define_insn "*pred_eqne<mode>_scalar"
> [(set (match_operand:<VM> 0 "register_operand" "=vr, vr")
> @@ -3845,20 +3939,20 @@
>
> ;; We use early-clobber for source LMUL > dest LMUL.
> (define_insn "*pred_eqne<mode>_scalar_narrow"
> - [(set (match_operand:<VM> 0 "register_operand" "=&vr, &vr")
> + [(set (match_operand:<VM> 0 "register_operand" "=vm, vr, vr, &vr, &vr")
> (if_then_else:<VM>
> (unspec:<VM>
> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1")
> - (match_operand 6 "vector_length_operand" " rK, rK")
> - (match_operand 7 "const_int_operand" " i, i")
> - (match_operand 8 "const_int_operand" " i, i")
> + [(match_operand:<VM> 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1")
> + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK")
> + (match_operand 7 "const_int_operand" " i, i, i, i, i")
> + (match_operand 8 "const_int_operand" " i, i, i, i, i")
> (reg:SI VL_REGNUM)
> (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> (match_operator:<VM> 3 "equality_operator"
> [(vec_duplicate:VI_QHS
> - (match_operand:<VEL> 5 "register_operand" " r, r"))
> - (match_operand:VI_QHS 4 "register_operand" " vr, vr")])
> - (match_operand:<VM> 2 "vector_merge_operand" " vu, 0")))]
> + (match_operand:<VEL> 5 "register_operand" " r, r, r, r, r"))
> + (match_operand:VI_QHS 4 "register_operand" " vr, 0, 0, vr, vr")])
> + (match_operand:<VM> 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))]
> "TARGET_VECTOR && known_gt (GET_MODE_SIZE (<MODE>mode), BYTES_PER_RISCV_VECTOR)"
> "vms%B3.vx\t%0,%4,%5%p1"
> [(set_attr "type" "vicmp")
> @@ -3939,6 +4033,54 @@
> DONE;
> })
>
> +(define_insn "*pred_cmp<mode>_scalar_merge_tie_mask"
> + [(set (match_operand:<VM> 0 "register_operand" "=vm")
> + (if_then_else:<VM>
> + (unspec:<VM>
> + [(match_operand:<VM> 1 "register_operand" " 0")
> + (match_operand 5 "vector_length_operand" " rK")
> + (match_operand 6 "const_int_operand" " i")
> + (match_operand 7 "const_int_operand" " i")
> + (reg:SI VL_REGNUM)
> + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> + (match_operator:<VM> 2 "comparison_except_eqge_operator"
> + [(match_operand:VI_D 3 "register_operand" " vr")
> + (vec_duplicate:VI_D
> + (match_operand:<VEL> 4 "register_operand" " r"))])
> + (match_dup 1)))]
> + "TARGET_VECTOR"
> + "vms%B2.vx\t%0,%3,%4,v0.t"
> + [(set_attr "type" "vicmp")
> + (set_attr "mode" "<MODE>")
> + (set_attr "merge_op_idx" "1")
> + (set_attr "vl_op_idx" "5")
> + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
> + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))])
> +
> +(define_insn "*pred_eqne<mode>_scalar_merge_tie_mask"
> + [(set (match_operand:<VM> 0 "register_operand" "=vm")
> + (if_then_else:<VM>
> + (unspec:<VM>
> + [(match_operand:<VM> 1 "register_operand" " 0")
> + (match_operand 5 "vector_length_operand" " rK")
> + (match_operand 6 "const_int_operand" " i")
> + (match_operand 7 "const_int_operand" " i")
> + (reg:SI VL_REGNUM)
> + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> + (match_operator:<VM> 2 "equality_operator"
> + [(vec_duplicate:VI_D
> + (match_operand:<VEL> 4 "register_operand" " r"))
> + (match_operand:VI_D 3 "register_operand" " vr")])
> + (match_dup 1)))]
> + "TARGET_VECTOR"
> + "vms%B2.vx\t%0,%3,%4,v0.t"
> + [(set_attr "type" "vicmp")
> + (set_attr "mode" "<MODE>")
> + (set_attr "merge_op_idx" "1")
> + (set_attr "vl_op_idx" "5")
> + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
> + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))])
> +
> ;; We don't use early-clobber for LMUL <= 1 to get better codegen.
> (define_insn "*pred_cmp<mode>_scalar"
> [(set (match_operand:<VM> 0 "register_operand" "=vr, vr")
> @@ -3962,20 +4104,20 @@
>
> ;; We use early-clobber for source LMUL > dest LMUL.
> (define_insn "*pred_cmp<mode>_scalar_narrow"
> - [(set (match_operand:<VM> 0 "register_operand" "=&vr, &vr")
> + [(set (match_operand:<VM> 0 "register_operand" "=vm, vr, vr, &vr, &vr")
> (if_then_else:<VM>
> (unspec:<VM>
> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1")
> - (match_operand 6 "vector_length_operand" " rK, rK")
> - (match_operand 7 "const_int_operand" " i, i")
> - (match_operand 8 "const_int_operand" " i, i")
> + [(match_operand:<VM> 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1")
> + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK")
> + (match_operand 7 "const_int_operand" " i, i, i, i, i")
> + (match_operand 8 "const_int_operand" " i, i, i, i, i")
> (reg:SI VL_REGNUM)
> (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> (match_operator:<VM> 3 "comparison_except_eqge_operator"
> - [(match_operand:VI_D 4 "register_operand" " vr, vr")
> + [(match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr")
> (vec_duplicate:VI_D
> - (match_operand:<VEL> 5 "register_operand" " r, r"))])
> - (match_operand:<VM> 2 "vector_merge_operand" " vu, 0")))]
> + (match_operand:<VEL> 5 "register_operand" " r, r, r, r, r"))])
> + (match_operand:<VM> 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))]
> "TARGET_VECTOR && known_gt (GET_MODE_SIZE (<MODE>mode), BYTES_PER_RISCV_VECTOR)"
> "vms%B3.vx\t%0,%4,%5%p1"
> [(set_attr "type" "vicmp")
> @@ -4004,25 +4146,50 @@
>
> ;; We use early-clobber for source LMUL > dest LMUL.
> (define_insn "*pred_eqne<mode>_scalar_narrow"
> - [(set (match_operand:<VM> 0 "register_operand" "=&vr, &vr")
> + [(set (match_operand:<VM> 0 "register_operand" "=vm, vr, vr, &vr, &vr")
> (if_then_else:<VM>
> (unspec:<VM>
> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1")
> - (match_operand 6 "vector_length_operand" " rK, rK")
> - (match_operand 7 "const_int_operand" " i, i")
> - (match_operand 8 "const_int_operand" " i, i")
> + [(match_operand:<VM> 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1")
> + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK")
> + (match_operand 7 "const_int_operand" " i, i, i, i, i")
> + (match_operand 8 "const_int_operand" " i, i, i, i, i")
> (reg:SI VL_REGNUM)
> (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> (match_operator:<VM> 3 "equality_operator"
> [(vec_duplicate:VI_D
> - (match_operand:<VEL> 5 "register_operand" " r, r"))
> - (match_operand:VI_D 4 "register_operand" " vr, vr")])
> - (match_operand:<VM> 2 "vector_merge_operand" " vu, 0")))]
> + (match_operand:<VEL> 5 "register_operand" " r, r, r, r, r"))
> + (match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr")])
> + (match_operand:<VM> 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))]
> "TARGET_VECTOR && known_gt (GET_MODE_SIZE (<MODE>mode), BYTES_PER_RISCV_VECTOR)"
> "vms%B3.vx\t%0,%4,%5%p1"
> [(set_attr "type" "vicmp")
> (set_attr "mode" "<MODE>")])
>
> +(define_insn "*pred_cmp<mode>_extended_scalar_merge_tie_mask"
> + [(set (match_operand:<VM> 0 "register_operand" "=vm")
> + (if_then_else:<VM>
> + (unspec:<VM>
> + [(match_operand:<VM> 1 "register_operand" " 0")
> + (match_operand 5 "vector_length_operand" " rK")
> + (match_operand 6 "const_int_operand" " i")
> + (match_operand 7 "const_int_operand" " i")
> + (reg:SI VL_REGNUM)
> + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> + (match_operator:<VM> 2 "comparison_except_eqge_operator"
> + [(match_operand:VI_D 3 "register_operand" " vr")
> + (vec_duplicate:VI_D
> + (sign_extend:<VEL>
> + (match_operand:<VSUBEL> 4 "register_operand" " r")))])
> + (match_dup 1)))]
> + "TARGET_VECTOR"
> + "vms%B2.vx\t%0,%3,%4,v0.t"
> + [(set_attr "type" "vicmp")
> + (set_attr "mode" "<MODE>")
> + (set_attr "merge_op_idx" "1")
> + (set_attr "vl_op_idx" "5")
> + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
> + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))])
> +
> ;; We don't use early-clobber for LMUL <= 1 to get better codegen.
> (define_insn "*pred_cmp<mode>_extended_scalar"
> [(set (match_operand:<VM> 0 "register_operand" "=vr, vr")
> @@ -4046,26 +4213,51 @@
> (set_attr "mode" "<MODE>")])
>
> (define_insn "*pred_cmp<mode>_extended_scalar_narrow"
> - [(set (match_operand:<VM> 0 "register_operand" "=&vr, &vr")
> + [(set (match_operand:<VM> 0 "register_operand" "=vm, vr, vr, &vr, &vr")
> (if_then_else:<VM>
> (unspec:<VM>
> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1")
> - (match_operand 6 "vector_length_operand" " rK, rK")
> - (match_operand 7 "const_int_operand" " i, i")
> - (match_operand 8 "const_int_operand" " i, i")
> + [(match_operand:<VM> 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1")
> + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK")
> + (match_operand 7 "const_int_operand" " i, i, i, i, i")
> + (match_operand 8 "const_int_operand" " i, i, i, i, i")
> (reg:SI VL_REGNUM)
> (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> (match_operator:<VM> 3 "comparison_except_eqge_operator"
> - [(match_operand:VI_D 4 "register_operand" " vr, vr")
> + [(match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr")
> (vec_duplicate:VI_D
> (sign_extend:<VEL>
> - (match_operand:<VSUBEL> 5 "register_operand" " r, r")))])
> - (match_operand:<VM> 2 "vector_merge_operand" " vu, 0")))]
> + (match_operand:<VSUBEL> 5 "register_operand" " r, r, r, r, r")))])
> + (match_operand:<VM> 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))]
> "TARGET_VECTOR && known_gt (GET_MODE_SIZE (<MODE>mode), BYTES_PER_RISCV_VECTOR)"
> "vms%B3.vx\t%0,%4,%5%p1"
> [(set_attr "type" "vicmp")
> (set_attr "mode" "<MODE>")])
>
> +(define_insn "*pred_eqne<mode>_extended_scalar_merge_tie_mask"
> + [(set (match_operand:<VM> 0 "register_operand" "=vm")
> + (if_then_else:<VM>
> + (unspec:<VM>
> + [(match_operand:<VM> 1 "register_operand" " 0")
> + (match_operand 5 "vector_length_operand" " rK")
> + (match_operand 6 "const_int_operand" " i")
> + (match_operand 7 "const_int_operand" " i")
> + (reg:SI VL_REGNUM)
> + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> + (match_operator:<VM> 2 "equality_operator"
> + [(vec_duplicate:VI_D
> + (sign_extend:<VEL>
> + (match_operand:<VSUBEL> 4 "register_operand" " r")))
> + (match_operand:VI_D 3 "register_operand" " vr")])
> + (match_dup 1)))]
> + "TARGET_VECTOR"
> + "vms%B2.vx\t%0,%3,%4,v0.t"
> + [(set_attr "type" "vicmp")
> + (set_attr "mode" "<MODE>")
> + (set_attr "merge_op_idx" "1")
> + (set_attr "vl_op_idx" "5")
> + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
> + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))])
> +
> ;; We don't use early-clobber for LMUL <= 1 to get better codegen.
> (define_insn "*pred_eqne<mode>_extended_scalar"
> [(set (match_operand:<VM> 0 "register_operand" "=vr, vr")
> @@ -4089,21 +4281,21 @@
> (set_attr "mode" "<MODE>")])
>
> (define_insn "*pred_eqne<mode>_extended_scalar_narrow"
> - [(set (match_operand:<VM> 0 "register_operand" "=&vr, &vr")
> + [(set (match_operand:<VM> 0 "register_operand" "=vm, vr, vr, &vr, &vr")
> (if_then_else:<VM>
> (unspec:<VM>
> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1")
> - (match_operand 6 "vector_length_operand" " rK, rK")
> - (match_operand 7 "const_int_operand" " i, i")
> - (match_operand 8 "const_int_operand" " i, i")
> + [(match_operand:<VM> 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1")
> + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK")
> + (match_operand 7 "const_int_operand" " i, i, i, i, i")
> + (match_operand 8 "const_int_operand" " i, i, i, i, i")
> (reg:SI VL_REGNUM)
> (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> (match_operator:<VM> 3 "equality_operator"
> [(vec_duplicate:VI_D
> (sign_extend:<VEL>
> - (match_operand:<VSUBEL> 5 "register_operand" " r, r")))
> - (match_operand:VI_D 4 "register_operand" " vr, vr")])
> - (match_operand:<VM> 2 "vector_merge_operand" " vu, 0")))]
> + (match_operand:<VSUBEL> 5 "register_operand" " r, r, r, r, r")))
> + (match_operand:VI_D 4 "register_operand" " vr, 0, 0, vr, vr")])
> + (match_operand:<VM> 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))]
> "TARGET_VECTOR && known_gt (GET_MODE_SIZE (<MODE>mode), BYTES_PER_RISCV_VECTOR)"
> "vms%B3.vx\t%0,%4,%5%p1"
> [(set_attr "type" "vicmp")
> @@ -6346,21 +6538,44 @@
> [(set_attr "type" "vfcmp")
> (set_attr "mode" "<MODE>")])
>
> +(define_insn "*pred_cmp<mode>_narrow_merge_tie_mask"
> + [(set (match_operand:<VM> 0 "register_operand" "=vm")
> + (if_then_else:<VM>
> + (unspec:<VM>
> + [(match_operand:<VM> 1 "register_operand" " 0")
> + (match_operand 5 "vector_length_operand" " rK")
> + (match_operand 6 "const_int_operand" " i")
> + (match_operand 7 "const_int_operand" " i")
> + (reg:SI VL_REGNUM)
> + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> + (match_operator:<VM> 2 "signed_order_operator"
> + [(match_operand:VF 3 "register_operand" " vr")
> + (match_operand:VF 4 "register_operand" " vr")])
> + (match_dup 1)))]
> + "TARGET_VECTOR"
> + "vmf%B2.vv\t%0,%3,%4,v0.t"
> + [(set_attr "type" "vfcmp")
> + (set_attr "mode" "<MODE>")
> + (set_attr "merge_op_idx" "1")
> + (set_attr "vl_op_idx" "5")
> + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
> + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))])
> +
> ;; We use early-clobber for source LMUL > dest LMUL.
> (define_insn "*pred_cmp<mode>_narrow"
> - [(set (match_operand:<VM> 0 "register_operand" "=&vr, &vr")
> + [(set (match_operand:<VM> 0 "register_operand" "=vm, vr, vr, vr, vr, vr, vr, &vr, &vr")
> (if_then_else:<VM>
> (unspec:<VM>
> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1")
> - (match_operand 6 "vector_length_operand" " rK, rK")
> - (match_operand 7 "const_int_operand" " i, i")
> - (match_operand 8 "const_int_operand" " i, i")
> + [(match_operand:<VM> 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1")
> + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK, rK")
> + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i, i")
> + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i, i")
> (reg:SI VL_REGNUM)
> (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> (match_operator:<VM> 3 "signed_order_operator"
> - [(match_operand:VF 4 "register_operand" " vr, vr")
> - (match_operand:VF 5 "register_operand" " vr, vr")])
> - (match_operand:<VM> 2 "vector_merge_operand" " vu, 0")))]
> + [(match_operand:VF 4 "register_operand" " vr, 0, vr, 0, 0, vr, 0, vr, vr")
> + (match_operand:VF 5 "register_operand" " vr, vr, 0, 0, vr, 0, 0, vr, vr")])
> + (match_operand:<VM> 2 "vector_merge_operand" " vu, vu, vu, vu, 0, 0, 0, vu, 0")))]
> "TARGET_VECTOR && known_gt (GET_MODE_SIZE (<MODE>mode), BYTES_PER_RISCV_VECTOR)"
> "vmf%B3.vv\t%0,%4,%5%p1"
> [(set_attr "type" "vfcmp")
> @@ -6384,6 +6599,30 @@
> "TARGET_VECTOR"
> {})
>
> +(define_insn "*pred_cmp<mode>_scalar_merge_tie_mask"
> + [(set (match_operand:<VM> 0 "register_operand" "=vm")
> + (if_then_else:<VM>
> + (unspec:<VM>
> + [(match_operand:<VM> 1 "register_operand" " 0")
> + (match_operand 5 "vector_length_operand" " rK")
> + (match_operand 6 "const_int_operand" " i")
> + (match_operand 7 "const_int_operand" " i")
> + (reg:SI VL_REGNUM)
> + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> + (match_operator:<VM> 2 "signed_order_operator"
> + [(match_operand:VF 3 "register_operand" " vr")
> + (vec_duplicate:VF
> + (match_operand:<VEL> 4 "register_operand" " f"))])
> + (match_dup 1)))]
> + "TARGET_VECTOR"
> + "vmf%B2.vf\t%0,%3,%4,v0.t"
> + [(set_attr "type" "vfcmp")
> + (set_attr "mode" "<MODE>")
> + (set_attr "merge_op_idx" "1")
> + (set_attr "vl_op_idx" "5")
> + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
> + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))])
> +
> ;; We don't use early-clobber for LMUL <= 1 to get better codegen.
> (define_insn "*pred_cmp<mode>_scalar"
> [(set (match_operand:<VM> 0 "register_operand" "=vr, vr")
> @@ -6407,20 +6646,20 @@
>
> ;; We use early-clobber for source LMUL > dest LMUL.
> (define_insn "*pred_cmp<mode>_scalar_narrow"
> - [(set (match_operand:<VM> 0 "register_operand" "=&vr, &vr")
> + [(set (match_operand:<VM> 0 "register_operand" "=vm, vr, vr, &vr, &vr")
> (if_then_else:<VM>
> (unspec:<VM>
> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1")
> - (match_operand 6 "vector_length_operand" " rK, rK")
> - (match_operand 7 "const_int_operand" " i, i")
> - (match_operand 8 "const_int_operand" " i, i")
> + [(match_operand:<VM> 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1")
> + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK")
> + (match_operand 7 "const_int_operand" " i, i, i, i, i")
> + (match_operand 8 "const_int_operand" " i, i, i, i, i")
> (reg:SI VL_REGNUM)
> (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> (match_operator:<VM> 3 "signed_order_operator"
> - [(match_operand:VF 4 "register_operand" " vr, vr")
> + [(match_operand:VF 4 "register_operand" " vr, 0, 0, vr, vr")
> (vec_duplicate:VF
> - (match_operand:<VEL> 5 "register_operand" " f, f"))])
> - (match_operand:<VM> 2 "vector_merge_operand" " vu, 0")))]
> + (match_operand:<VEL> 5 "register_operand" " f, f, f, f, f"))])
> + (match_operand:<VM> 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))]
> "TARGET_VECTOR && known_gt (GET_MODE_SIZE (<MODE>mode), BYTES_PER_RISCV_VECTOR)"
> "vmf%B3.vf\t%0,%4,%5%p1"
> [(set_attr "type" "vfcmp")
> @@ -6444,6 +6683,30 @@
> "TARGET_VECTOR"
> {})
>
> +(define_insn "*pred_eqne<mode>_scalar_merge_tie_mask"
> + [(set (match_operand:<VM> 0 "register_operand" "=vm")
> + (if_then_else:<VM>
> + (unspec:<VM>
> + [(match_operand:<VM> 1 "register_operand" " 0")
> + (match_operand 5 "vector_length_operand" " rK")
> + (match_operand 6 "const_int_operand" " i")
> + (match_operand 7 "const_int_operand" " i")
> + (reg:SI VL_REGNUM)
> + (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> + (match_operator:<VM> 2 "equality_operator"
> + [(vec_duplicate:VF
> + (match_operand:<VEL> 4 "register_operand" " f"))
> + (match_operand:VF 3 "register_operand" " vr")])
> + (match_dup 1)))]
> + "TARGET_VECTOR"
> + "vmf%B2.vf\t%0,%3,%4,v0.t"
> + [(set_attr "type" "vfcmp")
> + (set_attr "mode" "<MODE>")
> + (set_attr "merge_op_idx" "1")
> + (set_attr "vl_op_idx" "5")
> + (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])"))
> + (set (attr "avl_type") (symbol_ref "INTVAL (operands[7])"))])
> +
> ;; We don't use early-clobber for LMUL <= 1 to get better codegen.
> (define_insn "*pred_eqne<mode>_scalar"
> [(set (match_operand:<VM> 0 "register_operand" "=vr, vr")
> @@ -6467,20 +6730,20 @@
>
> ;; We use early-clobber for source LMUL > dest LMUL.
> (define_insn "*pred_eqne<mode>_scalar_narrow"
> - [(set (match_operand:<VM> 0 "register_operand" "=&vr, &vr")
> + [(set (match_operand:<VM> 0 "register_operand" "=vm, vr, vr, &vr, &vr")
> (if_then_else:<VM>
> (unspec:<VM>
> - [(match_operand:<VM> 1 "vector_mask_operand" "vmWc1,vmWc1")
> - (match_operand 6 "vector_length_operand" " rK, rK")
> - (match_operand 7 "const_int_operand" " i, i")
> - (match_operand 8 "const_int_operand" " i, i")
> + [(match_operand:<VM> 1 "vector_mask_operand" " 0,vmWc1,vmWc1,vmWc1,vmWc1")
> + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK")
> + (match_operand 7 "const_int_operand" " i, i, i, i, i")
> + (match_operand 8 "const_int_operand" " i, i, i, i, i")
> (reg:SI VL_REGNUM)
> (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
> (match_operator:<VM> 3 "equality_operator"
> [(vec_duplicate:VF
> - (match_operand:<VEL> 5 "register_operand" " f, f"))
> - (match_operand:VF 4 "register_operand" " vr, vr")])
> - (match_operand:<VM> 2 "vector_merge_operand" " vu, 0")))]
> + (match_operand:<VEL> 5 "register_operand" " f, f, f, f, f"))
> + (match_operand:VF 4 "register_operand" " vr, 0, 0, vr, vr")])
> + (match_operand:<VM> 2 "vector_merge_operand" " vu, vu, 0, vu, 0")))]
> "TARGET_VECTOR && known_gt (GET_MODE_SIZE (<MODE>mode), BYTES_PER_RISCV_VECTOR)"
> "vmf%B3.vf\t%0,%4,%5%p1"
> [(set_attr "type" "vfcmp")
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c
> index 552c264d895..e16db932f15 100644
> --- a/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/binop_vv_constraint-4.c
> @@ -24,4 +24,4 @@ void f2 (void * in, void *out, int32_t x)
> __riscv_vsm_v_b32 (out, m4, 4);
> }
>
> -/* { dg-final { scan-assembler-times {vmv} 2 } } */
> +/* { dg-final { scan-assembler-not {vmv} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c
> new file mode 100644
> index 00000000000..97df21dd743
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-17.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_mu(m1,m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v1,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vv_u16m8_b2_mu(m1,m2,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v, 4);
> + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_mu (m3, m3, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v, 4);
> + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_m (m3, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v, 4);
> + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_m (m3, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> + vbool4_t m5 = __riscv_vmseq_vv_i32m8_b4_mu (m3, m4, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v2, 4);
> + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_mu (m3, m3, v2, v, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vv_i32m8_b4 (v, v2, 4);
> + vbool4_t m4 = __riscv_vmseq_vv_i32m8_b4_m (m3, v2, v, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n)
> +{
> + vbool4_t mask = *(vbool4_t*)base1;
> + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4);
> + mask = __riscv_vmseq_vv_i32m8_b4 (v, v2, 4);
> + for (int i = 0; i < n; i++){
> + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4);
> + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vv_i32m8_b4_m (mask, v3, v4,32);
> + mask = __riscv_vmseq_vv_i32m8_b4_mu (mask, mask, v4, v4, 32);
> + }
> + __riscv_vsm_v_b4 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n)
> +{
> + vbool32_t mask = *(vbool32_t*)base1;
> + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4);
> + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4);
> + mask = __riscv_vmseq_vv_i32m1_b32 (v, v2, 4);
> + for (int i = 0; i < n; i++){
> + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4);
> + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vv_i32m1_b32_m (mask, v3, v4,32);
> + mask = __riscv_vmseq_vv_i32m1_b32_mu (mask, mask, v4, v4, 32);
> + }
> + __riscv_vsm_v_b32 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c
> new file mode 100644
> index 00000000000..56c95d9c884
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-18.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_mu(m1,m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v1,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vuint16m8_t v2 = __riscv_vle16_v_u16m8 (base2, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vv_u16m8_b2_mu(m1,m2,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v, 4);
> + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_mu (m3, m3, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v, 4);
> + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_m (m3, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v, 4);
> + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_m (m3, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> + vbool4_t m5 = __riscv_vmslt_vv_i32m8_b4_mu (m3, m4, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v2, 4);
> + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_mu (m3, m3, v2, v, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vv_i32m8_b4 (v, v2, 4);
> + vbool4_t m4 = __riscv_vmslt_vv_i32m8_b4_m (m3, v2, v, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n)
> +{
> + vbool4_t mask = *(vbool4_t*)base1;
> + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4);
> + mask = __riscv_vmslt_vv_i32m8_b4 (v, v2, 4);
> + for (int i = 0; i < n; i++){
> + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4);
> + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vv_i32m8_b4_m (mask, v3, v4,32);
> + mask = __riscv_vmslt_vv_i32m8_b4_mu (mask, mask, v4, v4, 32);
> + }
> + __riscv_vsm_v_b4 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n)
> +{
> + vbool32_t mask = *(vbool32_t*)base1;
> + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4);
> + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4);
> + mask = __riscv_vmslt_vv_i32m1_b32 (v, v2, 4);
> + for (int i = 0; i < n; i++){
> + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4);
> + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vv_i32m1_b32_m (mask, v3, v4,32);
> + mask = __riscv_vmslt_vv_i32m1_b32_mu (mask, mask, v4, v4, 32);
> + }
> + __riscv_vsm_v_b32 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c
> new file mode 100644
> index 00000000000..d50e497d6c9
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-19.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m2,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> + vbool4_t m5 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m4, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)base1;
> + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4);
> + mask = __riscv_vmseq_vx_i32m8_b4 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4);
> + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vx_i32m8_b4_m (mask, v3, x,32);
> + mask = __riscv_vmseq_vx_i32m8_b4_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b4 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool32_t mask = *(vbool32_t*)base1;
> + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4);
> + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4);
> + mask = __riscv_vmseq_vx_i32m1_b32 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4);
> + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vx_i32m1_b32_m (mask, v3, x,32);
> + mask = __riscv_vmseq_vx_i32m1_b32_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b32 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c
> new file mode 100644
> index 00000000000..4e77c51d058
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-20.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m2,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> + vbool4_t m5 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m4, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)base1;
> + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4);
> + mask = __riscv_vmslt_vx_i32m8_b4 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4);
> + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vx_i32m8_b4_m (mask, v3, x,32);
> + mask = __riscv_vmslt_vx_i32m8_b4_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b4 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool32_t mask = *(vbool32_t*)base1;
> + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4);
> + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4);
> + mask = __riscv_vmslt_vx_i32m1_b32 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4);
> + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vx_i32m1_b32_m (mask, v3, x,32);
> + mask = __riscv_vmslt_vx_i32m1_b32_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b32 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c
> new file mode 100644
> index 00000000000..4f7efd508b1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-21.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m1,v1, -16,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_m(m1,v1, -16,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmseq_vx_u16m8_b2_mu(m1,m2,v1, -16,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4);
> + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, -16, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4);
> + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, -16, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4);
> + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, -16, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> + vbool4_t m5 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m4, v2, -16, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4);
> + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_mu (m3, m3, v2, -16, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4);
> + vbool4_t m4 = __riscv_vmseq_vx_i32m8_b4_m (m3, v2, -16, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)base1;
> + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4);
> + mask = __riscv_vmseq_vx_i32m8_b4 (v, -16, 4);
> + for (int i = 0; i < n; i++){
> + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4);
> + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vx_i32m8_b4_m (mask, v3, -16,32);
> + mask = __riscv_vmseq_vx_i32m8_b4_mu (mask, mask, v4, -16, 32);
> + }
> + __riscv_vsm_v_b4 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool32_t mask = *(vbool32_t*)base1;
> + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4);
> + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4);
> + mask = __riscv_vmseq_vx_i32m1_b32 (v, -16, 4);
> + for (int i = 0; i < n; i++){
> + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4);
> + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vx_i32m1_b32_m (mask, v3, -16,32);
> + mask = __riscv_vmseq_vx_i32m1_b32_mu (mask, mask, v4, -16, 32);
> + }
> + __riscv_vsm_v_b32 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c
> new file mode 100644
> index 00000000000..92084be99b2
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-22.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m1,v1, -15,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_m(m1,v1, -15,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint16m8_t v1 = __riscv_vle16_v_u16m8 (base1, vl);
> + vbool2_t m1 = __riscv_vlm_v_b2 (base3, vl);
> + vbool2_t m2 = __riscv_vlm_v_b2 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool2_t v = __riscv_vmsltu_vx_u16m8_b2_mu(m1,m2,v1, -15,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b2 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4);
> + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, -15,4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4);
> + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, -15,4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4);
> + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, -15,4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> + vbool4_t m5 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m4, v2, -15,4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4);
> + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_mu (m3, m3, v2, -15,4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vint32m8_t v = __riscv_vle32_v_i32m8 (in, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmslt_vx_i32m8_b4 (v, -15,4);
> + vbool4_t m4 = __riscv_vmslt_vx_i32m8_b4_m (m3, v2, -15,4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)base1;
> + vint32m8_t v = __riscv_vle32_v_i32m8 (base1, 4);
> + vint32m8_t v2 = __riscv_vle32_v_i32m8_m (mask, base1, 4);
> + mask = __riscv_vmslt_vx_i32m8_b4 (v, -15,4);
> + for (int i = 0; i < n; i++){
> + vint32m8_t v3 = __riscv_vle32_v_i32m8 (base1 + i, 4);
> + vint32m8_t v4 = __riscv_vle32_v_i32m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vx_i32m8_b4_m (mask, v3, -15,32);
> + mask = __riscv_vmslt_vx_i32m8_b4_mu (mask, mask, v4, -15,32);
> + }
> + __riscv_vsm_v_b4 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool32_t mask = *(vbool32_t*)base1;
> + vint32m1_t v = __riscv_vle32_v_i32m1 (base1, 4);
> + vint32m1_t v2 = __riscv_vle32_v_i32m1_m (mask, base1, 4);
> + mask = __riscv_vmslt_vx_i32m1_b32 (v, -15,4);
> + for (int i = 0; i < n; i++){
> + vint32m1_t v3 = __riscv_vle32_v_i32m1 (base1 + i, 4);
> + vint32m1_t v4 = __riscv_vle32_v_i32m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vx_i32m1_b32_m (mask, v3, -15,32);
> + mask = __riscv_vmslt_vx_i32m1_b32_mu (mask, mask, v4, -15,32);
> + }
> + __riscv_vsm_v_b32 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c
> new file mode 100644
> index 00000000000..f9817caca1e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-23.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m2,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> + vbool8_t m5 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m4, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, x, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)base1;
> + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4);
> + mask = __riscv_vmseq_vx_i64m8_b8 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4);
> + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vx_i64m8_b8_m (mask, v3, x,32);
> + mask = __riscv_vmseq_vx_i64m8_b8_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b8 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool64_t mask = *(vbool64_t*)base1;
> + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4);
> + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4);
> + mask = __riscv_vmseq_vx_i64m1_b64 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4);
> + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vx_i64m1_b64_m (mask, v3, x,32);
> + mask = __riscv_vmseq_vx_i64m1_b64_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b64 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c
> new file mode 100644
> index 00000000000..62d1f6dddd5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-24.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, uint16_t x)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m2,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> + vbool8_t m5 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m4, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, x, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, x, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool8_t mask = *(vbool8_t*)base1;
> + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4);
> + mask = __riscv_vmslt_vx_i64m8_b8 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4);
> + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vx_i64m8_b8_m (mask, v3, x,32);
> + mask = __riscv_vmslt_vx_i64m8_b8_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b8 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n, int32_t x)
> +{
> + vbool64_t mask = *(vbool64_t*)base1;
> + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4);
> + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4);
> + mask = __riscv_vmslt_vx_i64m1_b64 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4);
> + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vx_i64m1_b64_m (mask, v3, x,32);
> + mask = __riscv_vmslt_vx_i64m1_b64_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b64 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c
> new file mode 100644
> index 00000000000..250c3fdb89a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-25.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m1,v1,-16,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,-16,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m2,v1,-16,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, -16, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, -16, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, -16, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> + vbool8_t m5 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m4, v2, -16, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, -16, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, -16, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n)
> +{
> + vbool8_t mask = *(vbool8_t*)base1;
> + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4);
> + mask = __riscv_vmseq_vx_i64m8_b8 (v, -16, 4);
> + for (int i = 0; i < n; i++){
> + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4);
> + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vx_i64m8_b8_m (mask, v3, -16,32);
> + mask = __riscv_vmseq_vx_i64m8_b8_mu (mask, mask, v4, -16, 32);
> + }
> + __riscv_vsm_v_b8 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n)
> +{
> + vbool64_t mask = *(vbool64_t*)base1;
> + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4);
> + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4);
> + mask = __riscv_vmseq_vx_i64m1_b64 (v, -16, 4);
> + for (int i = 0; i < n; i++){
> + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4);
> + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vx_i64m1_b64_m (mask, v3, -16,32);
> + mask = __riscv_vmseq_vx_i64m1_b64_mu (mask, mask, v4, -16, 32);
> + }
> + __riscv_vsm_v_b64 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c
> new file mode 100644
> index 00000000000..72e2d210c05
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-26.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m1,v1,-15,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,-15,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m2,v1,-15,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, -15, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, -15, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, -15, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> + vbool8_t m5 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m4, v2, -15, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, -15, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, -15, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n)
> +{
> + vbool8_t mask = *(vbool8_t*)base1;
> + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4);
> + mask = __riscv_vmslt_vx_i64m8_b8 (v, -15, 4);
> + for (int i = 0; i < n; i++){
> + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4);
> + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vx_i64m8_b8_m (mask, v3, -15,32);
> + mask = __riscv_vmslt_vx_i64m8_b8_mu (mask, mask, v4, -15, 32);
> + }
> + __riscv_vsm_v_b8 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n)
> +{
> + vbool64_t mask = *(vbool64_t*)base1;
> + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4);
> + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4);
> + mask = __riscv_vmslt_vx_i64m1_b64 (v, -15, 4);
> + for (int i = 0; i < n; i++){
> + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4);
> + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vx_i64m1_b64_m (mask, v3, -15,32);
> + mask = __riscv_vmslt_vx_i64m1_b64_mu (mask, mask, v4, -15, 32);
> + }
> + __riscv_vsm_v_b64 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c
> new file mode 100644
> index 00000000000..0842700475c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-27.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_m(m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmseq_vx_u64m8_b8_mu(m1,m2,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> + vbool8_t m5 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m4, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4);
> + vbool8_t m4 = __riscv_vmseq_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n)
> +{
> + vbool8_t mask = *(vbool8_t*)base1;
> + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4);
> + mask = __riscv_vmseq_vx_i64m8_b8 (v, 0xAAAA, 4);
> + for (int i = 0; i < n; i++){
> + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4);
> + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vx_i64m8_b8_m (mask, v3, 0xAAAA,32);
> + mask = __riscv_vmseq_vx_i64m8_b8_mu (mask, mask, v4, 0xAAAA, 32);
> + }
> + __riscv_vsm_v_b8 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n)
> +{
> + vbool64_t mask = *(vbool64_t*)base1;
> + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4);
> + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4);
> + mask = __riscv_vmseq_vx_i64m1_b64 (v, 0xAAAA, 4);
> + for (int i = 0; i < n; i++){
> + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4);
> + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmseq_vx_i64m1_b64_m (mask, v3, 0xAAAA,32);
> + mask = __riscv_vmseq_vx_i64m1_b64_mu (mask, mask, v4, 0xAAAA, 32);
> + }
> + __riscv_vsm_v_b64 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c
> new file mode 100644
> index 00000000000..9c1eddfac7e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-28.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_m(m1,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl)
> +{
> + vuint64m8_t v1 = __riscv_vle64_v_u64m8 (base1, vl);
> + vbool8_t m1 = __riscv_vlm_v_b8 (base3, vl);
> + vbool8_t m2 = __riscv_vlm_v_b8 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool8_t v = __riscv_vmsltu_vx_u64m8_b8_mu(m1,m2,v1,0xAAAA,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b8 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> + vbool8_t m5 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m4, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_mu (m3, m3, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out)
> +{
> + vbool8_t mask = *(vbool8_t*)in;
> + asm volatile ("":::"memory");
> + vint64m8_t v = __riscv_vle64_v_i64m8 (in, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, in, 4);
> + vbool8_t m3 = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4);
> + vbool8_t m4 = __riscv_vmslt_vx_i64m8_b8_m (m3, v2, 0xAAAA, 4);
> + __riscv_vsm_v_b8 (out, m3, 4);
> + __riscv_vsm_v_b8 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n)
> +{
> + vbool8_t mask = *(vbool8_t*)base1;
> + vint64m8_t v = __riscv_vle64_v_i64m8 (base1, 4);
> + vint64m8_t v2 = __riscv_vle64_v_i64m8_m (mask, base1, 4);
> + mask = __riscv_vmslt_vx_i64m8_b8 (v, 0xAAAA, 4);
> + for (int i = 0; i < n; i++){
> + vint64m8_t v3 = __riscv_vle64_v_i64m8 (base1 + i, 4);
> + vint64m8_t v4 = __riscv_vle64_v_i64m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vx_i64m8_b8_m (mask, v3, 0xAAAA,32);
> + mask = __riscv_vmslt_vx_i64m8_b8_mu (mask, mask, v4, 0xAAAA, 32);
> + }
> + __riscv_vsm_v_b8 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n)
> +{
> + vbool64_t mask = *(vbool64_t*)base1;
> + vint64m1_t v = __riscv_vle64_v_i64m1 (base1, 4);
> + vint64m1_t v2 = __riscv_vle64_v_i64m1_m (mask, base1, 4);
> + mask = __riscv_vmslt_vx_i64m1_b64 (v, 0xAAAA, 4);
> + for (int i = 0; i < n; i++){
> + vint64m1_t v3 = __riscv_vle64_v_i64m1 (base1 + i, 4);
> + vint64m1_t v4 = __riscv_vle64_v_i64m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmslt_vx_i64m1_b64_m (mask, v3, 0xAAAA,32);
> + mask = __riscv_vmslt_vx_i64m1_b64_mu (mask, mask, v4, 0xAAAA, 32);
> + }
> + __riscv_vsm_v_b64 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c
> new file mode 100644
> index 00000000000..6988c24bd92
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-29.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_mu(m1,m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v1,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_m(m1,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, size_t shift)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8 (base2, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + vbool4_t m2 = __riscv_vlm_v_b4 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vv_f32m8_b4_mu(m1,m2,v1,v2,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v, 4);
> + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_mu (m3, m3, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v, 4);
> + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_m (m3, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v, 4);
> + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_m (m3, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> + vbool4_t m5 = __riscv_vmfeq_vv_f32m8_b4_mu (m3, m4, v2, v2, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v2, 4);
> + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_mu (m3, m3, v2, v, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out, int32_t x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmfeq_vv_f32m8_b4 (v, v2, 4);
> + vbool4_t m4 = __riscv_vmfeq_vv_f32m8_b4_m (m3, v2, v, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n)
> +{
> + vbool4_t mask = *(vbool4_t*)base1;
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (base1, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, base1, 4);
> + mask = __riscv_vmfeq_vv_f32m8_b4 (v, v2, 4);
> + for (int i = 0; i < n; i++){
> + vfloat32m8_t v3 = __riscv_vle32_v_f32m8 (base1 + i, 4);
> + vfloat32m8_t v4 = __riscv_vle32_v_f32m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmfeq_vv_f32m8_b4_m (mask, v3, v4,32);
> + mask = __riscv_vmfeq_vv_f32m8_b4_mu (mask, mask, v4, v4, 32);
> + }
> + __riscv_vsm_v_b4 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n)
> +{
> + vbool32_t mask = *(vbool32_t*)base1;
> + vfloat32m1_t v = __riscv_vle32_v_f32m1 (base1, 4);
> + vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, base1, 4);
> + mask = __riscv_vmfeq_vv_f32m1_b32 (v, v2, 4);
> + for (int i = 0; i < n; i++){
> + vfloat32m1_t v3 = __riscv_vle32_v_f32m1 (base1 + i, 4);
> + vfloat32m1_t v4 = __riscv_vle32_v_f32m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmfeq_vv_f32m1_b32_m (mask, v3, v4,32);
> + mask = __riscv_vmfeq_vv_f32m1_b32_mu (mask, mask, v4, v4, 32);
> + }
> + __riscv_vsm_v_b32 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c
> new file mode 100644
> index 00000000000..fe181de4d56
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-30.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_mu(m1,m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + vbool4_t m2 = __riscv_vlm_v_b4 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmfeq_vf_f32m8_b4_mu(m1,m2,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> + vbool4_t m5 = __riscv_vmfeq_vf_f32m8_b4_mu (m3, m4, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmfeq_vf_f32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)base1;
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (base1, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, base1, 4);
> + mask = __riscv_vmfeq_vf_f32m8_b4 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vfloat32m8_t v3 = __riscv_vle32_v_f32m8 (base1 + i, 4);
> + vfloat32m8_t v4 = __riscv_vle32_v_f32m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmfeq_vf_f32m8_b4_m (mask, v3, x,32);
> + mask = __riscv_vmfeq_vf_f32m8_b4_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b4 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n, float x)
> +{
> + vbool32_t mask = *(vbool32_t*)base1;
> + vfloat32m1_t v = __riscv_vle32_v_f32m1 (base1, 4);
> + vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, base1, 4);
> + mask = __riscv_vmfeq_vf_f32m1_b32 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vfloat32m1_t v3 = __riscv_vle32_v_f32m1 (base1 + i, 4);
> + vfloat32m1_t v4 = __riscv_vle32_v_f32m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmfeq_vf_f32m1_b32_m (mask, v3, x,32);
> + mask = __riscv_vmfeq_vf_f32m1_b32_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b32 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c
> new file mode 100644
> index 00000000000..ae5b4ed6913
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/rvv/base/narrow_constraint-31.c
> @@ -0,0 +1,231 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv32gcv -mabi=ilp32d -O3" } */
> +
> +#include "riscv_vector.h"
> +
> +void f0 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f1 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_mu(m1,m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f2 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f3 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f4 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f5 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v27", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_m(m1,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f6 (void *base1,void *base2,void *base3,void *base4,void *out,size_t vl, float x)
> +{
> + vfloat32m8_t v1 = __riscv_vle32_v_f32m8 (base1, vl);
> + vbool4_t m1 = __riscv_vlm_v_b4 (base3, vl);
> + vbool4_t m2 = __riscv_vlm_v_b4 (base4, vl);
> + asm volatile("#" ::
> + : "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23","v24","v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + vbool4_t v = __riscv_vmflt_vf_f32m8_b4_mu(m1,m2,v1,x,vl);
> + asm volatile("#" ::
> + : "v0", "v1", "v2", "v3", "v4", "v5", "v6", "v7", "v8", "v9",
> + "v10", "v11", "v12", "v13", "v14", "v15", "v16", "v17",
> + "v18", "v19", "v20", "v21", "v22", "v23", "v24", "v25",
> + "v26", "v28", "v29", "v30", "v31");
> +
> + __riscv_vsm_v_b4 (out,v,vl);
> +}
> +
> +void f7 (void * in, void *out, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f8 (void * in, void *out, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f9 (void * in, void *out, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> + vbool4_t m5 = __riscv_vmflt_vf_f32m8_b4_mu (m3, m4, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m5, 4);
> +}
> +
> +void f10 (void * in, void *out, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_mu (m3, m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f11 (void * in, void *out, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)in;
> + asm volatile ("":::"memory");
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (in, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, in, 4);
> + vbool4_t m3 = __riscv_vmflt_vf_f32m8_b4 (v, x, 4);
> + vbool4_t m4 = __riscv_vmflt_vf_f32m8_b4_m (m3, v2, x, 4);
> + __riscv_vsm_v_b4 (out, m3, 4);
> + __riscv_vsm_v_b4 (out, m4, 4);
> +}
> +
> +void f12 (void* base1,void* base2,void* out,int n, float x)
> +{
> + vbool4_t mask = *(vbool4_t*)base1;
> + vfloat32m8_t v = __riscv_vle32_v_f32m8 (base1, 4);
> + vfloat32m8_t v2 = __riscv_vle32_v_f32m8_m (mask, base1, 4);
> + mask = __riscv_vmflt_vf_f32m8_b4 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vfloat32m8_t v3 = __riscv_vle32_v_f32m8 (base1 + i, 4);
> + vfloat32m8_t v4 = __riscv_vle32_v_f32m8_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmflt_vf_f32m8_b4_m (mask, v3, x,32);
> + mask = __riscv_vmflt_vf_f32m8_b4_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b4 (out, mask, 32);
> +}
> +
> +void f13 (void* base1,void* base2,void* out,int n, float x)
> +{
> + vbool32_t mask = *(vbool32_t*)base1;
> + vfloat32m1_t v = __riscv_vle32_v_f32m1 (base1, 4);
> + vfloat32m1_t v2 = __riscv_vle32_v_f32m1_m (mask, base1, 4);
> + mask = __riscv_vmflt_vf_f32m1_b32 (v, x, 4);
> + for (int i = 0; i < n; i++){
> + vfloat32m1_t v3 = __riscv_vle32_v_f32m1 (base1 + i, 4);
> + vfloat32m1_t v4 = __riscv_vle32_v_f32m1_m (mask, base1 + i * 2, 4);
> + mask = __riscv_vmflt_vf_f32m1_b32_m (mask, v3, x,32);
> + mask = __riscv_vmflt_vf_f32m1_b32_mu (mask, mask, v4, x, 32);
> + }
> + __riscv_vsm_v_b32 (out, mask, 32);
> +}
> +
> +/* { dg-final { scan-assembler-not {vmv} } } */
> +/* { dg-final { scan-assembler-not {csrr} } } */
> --
> 2.36.1
>