public inbox for gcc-patches@gcc.gnu.org
* [PATCH V4] RISC-V: Support gather_load/scatter RVV auto-vectorization
@ 2023-07-07 10:48 Juzhe-Zhong
  2023-07-07 12:34 ` Robin Dapp
  0 siblings, 1 reply; 3+ messages in thread
From: Juzhe-Zhong @ 2023-07-07 10:48 UTC
  To: gcc-patches
  Cc: kito.cheng, kito.cheng, palmer, palmer, jeffreyalaw, rdapp.gcc,
	Juzhe-Zhong

This patch adds full support for gather_load/scatter_store:
1. Support single-rgroup on both RV32 and RV64.
2. Support index element widths that are the same as or smaller than Pmode.
3. Support VLA SLP with gather/scatter.
4. Fully test all gather/scatter variants with LMUL = M1/M2/M4/M8, for both VLA and VLS.
5. Fix a bug in handling (subreg:SI (const_poly_int:DI)).
6. Fix a bug in vec_perm, which is used by gather/scatter SLP.
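
As a rough illustration of item 2 (not part of the patch): when the index
element is narrower than Pmode, each offset is widened to pointer width
and then scaled before it is added to the base. The sketch below assumes
the convention of the generic gather_load optab, where an extension
operand of 1 means zero extension and 0 means sign extension. The
function name and the 8-bit offset type are hypothetical, chosen only to
keep the example small.

```c
#include <stdint.h>

/* Hypothetical scalar model of the per-element address computation.
   ZERO_EXTEND_P mirrors the extension operand (1 = zero extend,
   0 = sign extend); SCALE mirrors the scale operand.  */
static uintptr_t
model_gather_address (uintptr_t base, uint8_t raw_offset,
		      int zero_extend_p, uintptr_t scale)
{
  /* Widen the narrow offset to pointer width first.  The cast to
     int8_t relies on the usual two's-complement conversion.  */
  uintptr_t ext = zero_extend_p
		    ? (uintptr_t) raw_offset
		    : (uintptr_t) (intptr_t) (int8_t) raw_offset;

  /* Then scale and add to the base; unsigned arithmetic wraps.  */
  return base + ext * scale;
}
```

With raw_offset = 0xff and scale = 4, zero extension addresses
base + 1020, while sign extension addresses base - 4.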

All kinds of GATHER/SCATTER are normalized into LEN_MASK_*.
We fully support these 4 kinds of gather/scatter:
1. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and dummy mask (full vector).
2. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and real mask.
3. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and dummy mask.
4. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and real mask.
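
To make the four cases concrete, here is a minimal scalar reference model
(a sketch only, not code from the patch). It assumes that a "dummy
length" means len equals the number of elements in the vector and that a
"dummy mask" means an all-true mask, so that all four cases reduce to
one loop. The function names, the 32-bit element and offset types, and
the choice to leave inactive elements untouched are assumptions made for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of LEN_MASK_GATHER_LOAD.  VF is the number of
   elements in the vector; only elements I < LEN with MASK[I] set are
   loaded.  Inactive elements of DEST are left unchanged here.  */
static void
model_len_mask_gather_load (int32_t *dest, const char *base,
			    const uint32_t *offset, size_t scale,
			    size_t len, const bool *mask, size_t vf)
{
  assert (len <= vf);
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      memcpy (&dest[i], base + (size_t) offset[i] * scale,
	      sizeof (int32_t));
}

/* Hypothetical model of LEN_MASK_SCATTER_STORE, the mirror image.  */
static void
model_len_mask_scatter_store (char *base, const uint32_t *offset,
			      size_t scale, const int32_t *value,
			      size_t len, const bool *mask, size_t vf)
{
  assert (len <= vf);
  for (size_t i = 0; i < len; i++)
    if (mask[i])
      memcpy (base + (size_t) offset[i] * scale, &value[i],
	      sizeof (int32_t));
}
```

Passing len == vf together with an all-true mask corresponds to case 1;
passing a shorter len or a partially false mask gives the other cases.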

We use vluxei/vsuxei (the unordered indexed loads/stores of RVV) to generate code for gather/scatter.

We also support strided loads/stores with vlse.v/vsse.v. Consider the following case:
#define TEST_LOOP(DATA_TYPE, BITS)                                             \
  void __attribute__ ((noinline, noclone))                                     \
  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
			  INDEX##BITS stride, INDEX##BITS n)                   \
  {                                                                            \
    for (INDEX##BITS i = 0; i < n; ++i)                                        \
      dest[i] += src[i * stride];                                              \
  }

Codegen:
f_int8_t_8:
	ble	a3,zero,.L10
	li	a5,1
	mv	a4,a0
	bne	a2,a5,.L4
	li	a2,1
.L6:
	vsetvli	a5,a3,e8,m2,ta,ma
	vle8.v	v2,0(a0)
	vlse8.v	v4,0(a1),a2
	vsetvli	a6,zero,e8,m2,ta,ma
	sub	a3,a3,a5
	vadd.vv	v2,v2,v4
	vsetvli	zero,a5,e8,m2,ta,ma
	vse8.v	v2,0(a4)
	add	a0,a0,a5
	add	a1,a1,a5
	add	a4,a4,a5
	bne	a3,zero,.L6
.L10:
	ret

Here the strided load vlse8.v is used instead of the indexed load vluxei.
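
The reason a strided load is sufficient can be sketched in scalar C (an
illustration only; the patch's strided_load_store_p may detect this
differently). If the offset vector is the series 0, stride, 2 * stride,
..., an indexed load and a strided load read exactly the same elements.
All names below are hypothetical, and 8-bit elements are assumed so that
element offsets and byte offsets coincide.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: true if OFFSET[I] == I * STRIDE for all I < N.  */
static bool
offsets_form_series (const size_t *offset, size_t n, size_t stride)
{
  for (size_t i = 0; i < n; i++)
    if (offset[i] != i * stride)
      return false;
  return true;
}

/* Scalar model of an indexed (vluxei-like) load of 8-bit elements.  */
static void
model_indexed_load (int8_t *dest, const int8_t *src,
		    const size_t *offset, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dest[i] = src[offset[i]];
}

/* Scalar model of a strided (vlse8.v-like) load of 8-bit elements.  */
static void
model_strided_load (int8_t *dest, const int8_t *src,
		    size_t stride, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dest[i] = src[i * stride];
}
```

Whenever offsets_form_series holds, the two models fill dest
identically, so the offset vector never needs to be materialized.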

This patch has been tested on both RV32 and RV64.

gcc/ChangeLog:

        * config/riscv/autovec.md (len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>): New pattern.
        (len_mask_gather_load<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
        (len_mask_gather_load<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
        (len_mask_gather_load<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
        (len_mask_gather_load<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
        (len_mask_gather_load<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
        (len_mask_gather_load<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
        (len_mask_gather_load<mode><mode>): Ditto.
        (len_mask_scatter_store<VNX1_QHSD:mode><VNX1_QHSDI:mode>): Ditto.
        (len_mask_scatter_store<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
        (len_mask_scatter_store<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
        (len_mask_scatter_store<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
        (len_mask_scatter_store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
        (len_mask_scatter_store<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
        (len_mask_scatter_store<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
        (len_mask_scatter_store<mode><mode>): Ditto.
        * config/riscv/predicates.md (const_1_operand): New predicate.
        (vector_gs_offset_operand): Ditto.
        (vector_gs_scale_operand_16): Ditto.
        (vector_gs_scale_operand_32): Ditto.
        (vector_gs_scale_operand_64): Ditto.
        (vector_gs_extension_operand): Ditto.
        (vector_gs_scale_operand_16_rv32): Ditto.
        (vector_gs_scale_operand_32_rv32): Ditto.
        * config/riscv/riscv-protos.h (enum insn_type): Add gather/scatter.
        (expand_gather_scatter): New function.
        * config/riscv/riscv-v.cc (gen_const_vector_dup): Add gather/scatter.
        (emit_vlmax_masked_store_insn): New function.
        (emit_nonvlmax_masked_store_insn): Ditto.
        (modulo_sel_indices): Ditto.
        (expand_vec_perm): Fix SLP for gather/scatter.
        (prepare_gather_scatter): New function.
        (strided_load_store_p): Ditto.
        (expand_gather_scatter): Ditto.
        * config/riscv/riscv.cc (riscv_legitimize_move): Fix bug of (subreg:SI (DI CONST_POLY_INT)).
        * config/riscv/vector-iterators.md: Add gather/scatter.
        * config/riscv/vector.md (vec_duplicate<mode>): Use "@" instead.
        (@vec_duplicate<mode>): Ditto.
        (@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>): Fix name.
        (@pred_indexed_<order>store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/rvv.exp: Add gather/scatter tests.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c: New test.
        * gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c: New test.

---
 gcc/config/riscv/autovec.md                   | 256 ++++++++++++
 gcc/config/riscv/predicates.md                |  39 +-
 gcc/config/riscv/riscv-protos.h               |   3 +
 gcc/config/riscv/riscv-v.cc                   | 370 ++++++++++++++++--
 gcc/config/riscv/riscv.cc                     |  11 +-
 gcc/config/riscv/vector-iterators.md          | 118 +++++-
 gcc/config/riscv/vector.md                    |  30 +-
 .../autovec/gather-scatter/gather_load-1.c    |  38 ++
 .../autovec/gather-scatter/gather_load-10.c   |  35 ++
 .../autovec/gather-scatter/gather_load-11.c   |  32 ++
 .../autovec/gather-scatter/gather_load-12.c   | 112 ++++++
 .../autovec/gather-scatter/gather_load-2.c    |  38 ++
 .../autovec/gather-scatter/gather_load-3.c    |  35 ++
 .../autovec/gather-scatter/gather_load-4.c    |  35 ++
 .../autovec/gather-scatter/gather_load-5.c    |  35 ++
 .../autovec/gather-scatter/gather_load-6.c    |  35 ++
 .../autovec/gather-scatter/gather_load-7.c    |  35 ++
 .../autovec/gather-scatter/gather_load-8.c    |  35 ++
 .../autovec/gather-scatter/gather_load-9.c    |  35 ++
 .../gather-scatter/gather_load_run-1.c        |  41 ++
 .../gather-scatter/gather_load_run-10.c       |  41 ++
 .../gather-scatter/gather_load_run-11.c       |  39 ++
 .../gather-scatter/gather_load_run-12.c       | 124 ++++++
 .../gather-scatter/gather_load_run-2.c        |  41 ++
 .../gather-scatter/gather_load_run-3.c        |  41 ++
 .../gather-scatter/gather_load_run-4.c        |  41 ++
 .../gather-scatter/gather_load_run-5.c        |  41 ++
 .../gather-scatter/gather_load_run-6.c        |  41 ++
 .../gather-scatter/gather_load_run-7.c        |  41 ++
 .../gather-scatter/gather_load_run-8.c        |  41 ++
 .../gather-scatter/gather_load_run-9.c        |  41 ++
 .../gather-scatter/mask_gather_load-1.c       |  39 ++
 .../gather-scatter/mask_gather_load-10.c      |  36 ++
 .../gather-scatter/mask_gather_load-11.c      | 116 ++++++
 .../gather-scatter/mask_gather_load-2.c       |  39 ++
 .../gather-scatter/mask_gather_load-3.c       |  36 ++
 .../gather-scatter/mask_gather_load-4.c       |  36 ++
 .../gather-scatter/mask_gather_load-5.c       |  36 ++
 .../gather-scatter/mask_gather_load-6.c       |  36 ++
 .../gather-scatter/mask_gather_load-7.c       |  36 ++
 .../gather-scatter/mask_gather_load-8.c       |  36 ++
 .../gather-scatter/mask_gather_load-9.c       |  36 ++
 .../gather-scatter/mask_gather_load_run-1.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-10.c  |  48 +++
 .../gather-scatter/mask_gather_load_run-11.c  | 140 +++++++
 .../gather-scatter/mask_gather_load_run-2.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-3.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-4.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-5.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-6.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-7.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-8.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-9.c   |  48 +++
 .../gather-scatter/mask_scatter_store-1.c     |  39 ++
 .../gather-scatter/mask_scatter_store-10.c    |  36 ++
 .../gather-scatter/mask_scatter_store-2.c     |  39 ++
 .../gather-scatter/mask_scatter_store-3.c     |  36 ++
 .../gather-scatter/mask_scatter_store-4.c     |  36 ++
 .../gather-scatter/mask_scatter_store-5.c     |  36 ++
 .../gather-scatter/mask_scatter_store-6.c     |  36 ++
 .../gather-scatter/mask_scatter_store-7.c     |  36 ++
 .../gather-scatter/mask_scatter_store-8.c     |  36 ++
 .../gather-scatter/mask_scatter_store-9.c     |  36 ++
 .../gather-scatter/mask_scatter_store_run-1.c |  48 +++
 .../mask_scatter_store_run-10.c               |  48 +++
 .../gather-scatter/mask_scatter_store_run-2.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-3.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-4.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-5.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-6.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-7.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-8.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-9.c |  48 +++
 .../autovec/gather-scatter/scatter_store-1.c  |  38 ++
 .../autovec/gather-scatter/scatter_store-10.c |  35 ++
 .../autovec/gather-scatter/scatter_store-2.c  |  38 ++
 .../autovec/gather-scatter/scatter_store-3.c  |  35 ++
 .../autovec/gather-scatter/scatter_store-4.c  |  35 ++
 .../autovec/gather-scatter/scatter_store-5.c  |  35 ++
 .../autovec/gather-scatter/scatter_store-6.c  |  35 ++
 .../autovec/gather-scatter/scatter_store-7.c  |  35 ++
 .../autovec/gather-scatter/scatter_store-8.c  |  35 ++
 .../autovec/gather-scatter/scatter_store-9.c  |  35 ++
 .../gather-scatter/scatter_store_run-1.c      |  40 ++
 .../gather-scatter/scatter_store_run-10.c     |  40 ++
 .../gather-scatter/scatter_store_run-2.c      |  40 ++
 .../gather-scatter/scatter_store_run-3.c      |  40 ++
 .../gather-scatter/scatter_store_run-4.c      |  40 ++
 .../gather-scatter/scatter_store_run-5.c      |  40 ++
 .../gather-scatter/scatter_store_run-6.c      |  40 ++
 .../gather-scatter/scatter_store_run-7.c      |  40 ++
 .../gather-scatter/scatter_store_run-8.c      |  40 ++
 .../gather-scatter/scatter_store_run-9.c      |  40 ++
 .../autovec/gather-scatter/strided_load-1.c   |  46 +++
 .../autovec/gather-scatter/strided_load-2.c   |  46 +++
 .../gather-scatter/strided_load_run-1.c       |  84 ++++
 .../gather-scatter/strided_load_run-2.c       |  84 ++++
 .../autovec/gather-scatter/strided_store-1.c  |  46 +++
 .../autovec/gather-scatter/strided_store-2.c  |  46 +++
 .../gather-scatter/strided_store_run-1.c      |  82 ++++
 .../gather-scatter/strided_store_run-2.c      |  82 ++++
 gcc/testsuite/gcc.target/riscv/rvv/rvv.exp    |  23 ++
 102 files changed, 5082 insertions(+), 61 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 9e61b2e41d8..78b9b5a2edb 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -57,6 +57,262 @@
   }
 )
 
+;; =========================================================================
+;; == Gather Load
+;; =========================================================================
+
+(define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
+  [(match_operand:VNX1_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX1_QHSDI 2 "vector_gs_offset_operand")
+   (match_operand 3 "<VNX1_QHSD:gs_extension>")
+   (match_operand 4 "<VNX1_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX1_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX2_QHSD:mode><VNX2_QHSDI:mode>"
+  [(match_operand:VNX2_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX2_QHSDI 2 "vector_gs_offset_operand")
+   (match_operand 3 "<VNX2_QHSD:gs_extension>")
+   (match_operand 4 "<VNX2_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX2_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX4_QHSD:mode><VNX4_QHSDI:mode>"
+  [(match_operand:VNX4_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX4_QHSDI 2 "vector_gs_offset_operand")
+   (match_operand 3 "<VNX4_QHSD:gs_extension>")
+   (match_operand 4 "<VNX4_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX4_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX8_QHSD:mode><VNX8_QHSDI:mode>"
+  [(match_operand:VNX8_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX8_QHSDI 2 "vector_gs_offset_operand")
+   (match_operand 3 "<VNX8_QHSD:gs_extension>")
+   (match_operand 4 "<VNX8_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX8_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
+  [(match_operand:VNX16_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX16_QHSDI 2 "vector_gs_offset_operand")
+   (match_operand 3 "<VNX16_QHSD:gs_extension>")
+   (match_operand 4 "<VNX16_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX16_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX32_QHS:mode><VNX32_QHSI:mode>"
+  [(match_operand:VNX32_QHS 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX32_QHSI 2 "vector_gs_offset_operand")
+   (match_operand 3 "<VNX32_QHS:gs_extension>")
+   (match_operand 4 "<VNX32_QHS:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX32_QHS:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX64_QH:mode><VNX64_QHI:mode>"
+  [(match_operand:VNX64_QH 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX64_QHI 2 "vector_gs_offset_operand")
+   (match_operand 3 "<VNX64_QH:gs_extension>")
+   (match_operand 4 "<VNX64_QH:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX64_QH:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+;; When SEW = 8 and LMUL = 8, we can't find any index mode with a
+;; larger SEW.  Since RVV indexed loads/stores zero-extend offsets
+;; implicitly and do not support scaling, we should only allow
+;; operands[3] and operands[4] to be const_1_operand.
+(define_expand "len_mask_gather_load<mode><mode>"
+  [(match_operand:VNX128_Q 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX128_Q 2 "vector_gs_offset_operand")
+   (match_operand 3 "const_1_operand")
+   (match_operand 4 "const_1_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+;; =========================================================================
+;; == Scatter Store
+;; =========================================================================
+
+(define_expand "len_mask_scatter_store<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX1_QHSDI 1 "vector_gs_offset_operand")
+   (match_operand 2 "<VNX1_QHSD:gs_extension>")
+   (match_operand 3 "<VNX1_QHSD:gs_scale>")
+   (match_operand:VNX1_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX1_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX2_QHSD:mode><VNX2_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX2_QHSDI 1 "vector_gs_offset_operand")
+   (match_operand 2 "<VNX2_QHSD:gs_extension>")
+   (match_operand 3 "<VNX2_QHSD:gs_scale>")
+   (match_operand:VNX2_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX2_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX4_QHSD:mode><VNX4_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX4_QHSDI 1 "vector_gs_offset_operand")
+   (match_operand 2 "<VNX4_QHSD:gs_extension>")
+   (match_operand 3 "<VNX4_QHSD:gs_scale>")
+   (match_operand:VNX4_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX4_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX8_QHSD:mode><VNX8_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX8_QHSDI 1 "vector_gs_offset_operand")
+   (match_operand 2 "<VNX8_QHSD:gs_extension>")
+   (match_operand 3 "<VNX8_QHSD:gs_scale>")
+   (match_operand:VNX8_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX8_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX16_QHSDI 1 "vector_gs_offset_operand")
+   (match_operand 2 "<VNX16_QHSD:gs_extension>")
+   (match_operand 3 "<VNX16_QHSD:gs_scale>")
+   (match_operand:VNX16_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX16_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX32_QHS:mode><VNX32_QHSI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX32_QHSI 1 "vector_gs_offset_operand")
+   (match_operand 2 "<VNX32_QHS:gs_extension>")
+   (match_operand 3 "<VNX32_QHS:gs_scale>")
+   (match_operand:VNX32_QHS 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX32_QHS:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX64_QH:mode><VNX64_QHI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX64_QHI 1 "vector_gs_offset_operand")
+   (match_operand 2 "<VNX64_QH:gs_extension>")
+   (match_operand 3 "<VNX64_QH:gs_scale>")
+   (match_operand:VNX64_QH 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX64_QH:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+;; When SEW = 8 and LMUL = 8, we can't find any index mode with a
+;; larger SEW.  Since RVV indexed loads/stores zero-extend offsets
+;; implicitly and do not support scaling, we should only allow
+;; operands[2] and operands[3] to be const_1_operand.
+(define_expand "len_mask_scatter_store<mode><mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX128_Q 1 "vector_gs_offset_operand")
+   (match_operand 2 "const_1_operand")
+   (match_operand 3 "const_1_operand")
+   (match_operand:VNX128_Q 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
 ;; =========================================================================
 ;; == Vector creation
 ;; =========================================================================
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index eb975eaf994..5a65334e943 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -61,6 +61,10 @@
   (and (match_code "const_int,const_wide_int,const_vector")
        (match_test "op == CONST0_RTX (GET_MODE (op))")))
 
+(define_predicate "const_1_operand"
+  (and (match_code "const_int,const_wide_int,const_vector")
+       (match_test "op == CONST1_RTX (GET_MODE (op))")))
+
 (define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "const_0_operand")
        (match_operand 0 "register_operand")))
@@ -341,6 +345,39 @@
   (ior (match_operand 0 "register_operand")
        (match_code "const_vector")))
 
+(define_predicate "vector_gs_offset_operand"
+  (ior (match_operand 0 "register_operand")
+       (and (match_code "const_vector")
+            (match_test "CONST_VECTOR_NPATTERNS (op) == 1
+	                 && !CONST_VECTOR_DUPLICATE_P (op)"))))
+
+(define_predicate "vector_gs_scale_operand_16"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
+
+(define_predicate "vector_gs_scale_operand_32"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
+
+(define_predicate "vector_gs_scale_operand_64"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1 || (INTVAL (op) == 8 && Pmode == DImode)")))
+
+(define_predicate "vector_gs_extension_operand"
+  (ior (match_operand 0 "const_1_operand")
+       (and (match_operand 0 "const_0_operand")
+            (match_test "Pmode == SImode"))))
+
+(define_predicate "vector_gs_scale_operand_16_rv32"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1
+		    || (INTVAL (op) == 2 && Pmode == SImode)")))
+
+(define_predicate "vector_gs_scale_operand_32_rv32"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1
+		    || (INTVAL (op) == 4 && Pmode == SImode)")))
+
 (define_predicate "ltge_operator"
   (match_code "lt,ltu,ge,geu"))
 
@@ -376,7 +413,7 @@
 		|| rtx_equal_p (op, CONST0_RTX (GET_MODE (op))))
 		&& maybe_gt (GET_MODE_BITSIZE (GET_MODE (op)), GET_MODE_BITSIZE (Pmode)))")
     (ior (match_test "rtx_equal_p (op, CONST0_RTX (GET_MODE (op)))")
-         (ior (match_operand 0 "const_int_operand")
+         (ior (match_code "const_int,const_poly_int")
               (ior (match_operand 0 "register_operand")
                    (match_test "satisfies_constraint_Wdm (op)"))))))
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5766e3597e8..fd6caccc183 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -148,6 +148,8 @@ enum insn_type
   RVV_WIDEN_TERNOP = 4,
   RVV_SCALAR_MOV_OP = 4, /* +1 for VUNDEF according to vector.md.  */
   RVV_SLIDE_OP = 4,      /* Dest, VUNDEF, source and offset.  */
+  RVV_GATHER_M_OP = 5,
+  RVV_SCATTER_M_OP = 4,
 };
 enum vlmul_type
 {
@@ -255,6 +257,7 @@ void expand_vec_init (rtx, rtx);
 void expand_vec_perm (rtx, rtx, rtx, rtx);
 void expand_select_vl (rtx *);
 void expand_load_store (rtx *, bool);
+void expand_gather_scatter (rtx *, bool);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
 enum fixed_point_rounding_mode
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 8d5bed7ebe4..6f69f079b95 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -49,6 +49,7 @@
 #include "tm-constrs.h"
 #include "rtx-vector-builder.h"
 #include "targhooks.h"
+#include "gimple.h"
 
 using namespace riscv_vector;
 
@@ -556,15 +557,22 @@ const_vec_all_in_range_p (rtx vec, poly_int64 minval, poly_int64 maxval)
   return true;
 }
 
-/* Return a const_int vector of VAL.
-
-   This function also exists in aarch64, we may unify it in middle-end in the
-   future.  */
+/* Return a const vector of VAL. The VAL can be either const_int or
+   const_poly_int.  */
 
 static rtx
 gen_const_vector_dup (machine_mode mode, poly_int64 val)
 {
-  rtx c = gen_int_mode (val, GET_MODE_INNER (mode));
+  scalar_mode smode = GET_MODE_INNER (mode);
+  rtx c = gen_int_mode (val, smode);
+  if (!val.is_constant () && GET_MODE_SIZE (smode) > GET_MODE_SIZE (Pmode))
+    {
+      /* When VAL is a const_poly_int value, we need to explicitly broadcast
+	 it into a vector using the RVV broadcast instruction.  */
+      rtx dup = gen_reg_rtx (mode);
+      emit_insn (gen_vec_duplicate (mode, dup, c));
+      return dup;
+    }
   return gen_const_vec_duplicate (mode, c);
 }
 
@@ -901,6 +909,39 @@ emit_nonvlmax_masked_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
   e.emit_insn ((enum insn_code) icode, ops);
 }
 
+/* This function emits a VLMAX masked store instruction.  */
+static void
+emit_vlmax_masked_store_insn (unsigned icode, int op_num, rtx *ops)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+					  /*HAS_DEST_P*/ false,
+					  /*FULLY_UNMASKED_P*/ false,
+					  /*USE_REAL_MERGE_P*/ true,
+					  /*HAS_AVL_P*/ true,
+					  /*VLMAX_P*/ true, dest_mode,
+					  mask_mode);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
+/* This function emits a non-VLMAX masked store instruction.  */
+static void
+emit_nonvlmax_masked_store_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+					  /*HAS_DEST_P*/ false,
+					  /*FULLY_UNMASKED_P*/ false,
+					  /*USE_REAL_MERGE_P*/ true,
+					  /*HAS_AVL_P*/ true,
+					  /*VLMAX_P*/ false, dest_mode,
+					  mask_mode);
+  e.set_vl (avl);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
 /* This function emits a masked instruction.  */
 void
 emit_vlmax_masked_mu_insn (unsigned icode, int op_num, rtx *ops)
@@ -1137,7 +1178,6 @@ static void
 expand_const_vector (rtx target, rtx src)
 {
   machine_mode mode = GET_MODE (target);
-  scalar_mode elt_mode = GET_MODE_INNER (mode);
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
     {
       rtx elt;
@@ -1162,7 +1202,6 @@ expand_const_vector (rtx target, rtx src)
 	}
       else
 	{
-	  elt = force_reg (elt_mode, elt);
 	  rtx ops[] = {tmp, elt};
 	  emit_vlmax_insn (code_for_pred_broadcast (mode), RVV_UNOP, ops);
 	}
@@ -2431,6 +2470,25 @@ expand_vec_cmp_float (rtx target, rtx_code code, rtx op0, rtx op1,
   return false;
 }
 
+/* Modulo all SEL indices to ensure they are all in range of [0, MAX_SEL].  */
+static rtx
+modulo_sel_indices (rtx sel, poly_uint64 max_sel)
+{
+  rtx sel_mod;
+  machine_mode sel_mode = GET_MODE (sel);
+  poly_uint64 nunits = GET_MODE_NUNITS (sel_mode);
+  /* If SEL is a variable-length CONST_VECTOR, we don't need to modulo it.  */
+  if (!nunits.is_constant () && CONST_VECTOR_P (sel))
+    sel_mod = sel;
+  else
+    {
+      rtx mod = gen_const_vector_dup (sel_mode, max_sel);
+      sel_mod
+	= expand_simple_binop (sel_mode, AND, sel, mod, NULL, 0, OPTAB_DIRECT);
+    }
+  return sel_mod;
+}
+
 /* Implement vec_perm<mode>.  */
 
 void
@@ -2444,41 +2502,43 @@ expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
      index is in range of [0, nunits - 1]. A single vrgather instructions is
      enough. Since we will use vrgatherei16.vv for variable-length vector,
      it is never out of range and we don't need to modulo the index.  */
-  if (!nunits.is_constant () || const_vec_all_in_range_p (sel, 0, nunits - 1))
+  if (nunits.is_constant () && const_vec_all_in_range_p (sel, 0, nunits - 1))
     {
       emit_vlmax_gather_insn (target, op0, sel);
       return;
     }
 
+  /* Check if all the indices are the same.  */
+  rtx elt;
+  if (const_vec_duplicate_p (sel, &elt))
+    {
+      poly_uint64 value = rtx_to_poly_int64 (elt);
+      rtx op = op0;
+      if (maybe_gt (value, nunits - 1))
+	{
+	  sel = gen_const_vector_dup (sel_mode, value - nunits);
+	  op = op1;
+	}
+      emit_vlmax_gather_insn (target, op, sel);
+    }
+
+  /* Note: vec_perm indices are supposed to wrap when they go beyond the
+     size of the two value vectors, i.e. the upper bits of the indices
+     are effectively ignored.  RVV vrgather instead produces 0 for any
+     out-of-range indices, so we need to modulo all the vec_perm indices
+     to ensure they are all in range of [0, nunits - 1] when op0 == op1
+     or all in range of [0, 2 * nunits - 1] when op0 != op1.  */
+  rtx sel_mod
+    = modulo_sel_indices (sel,
+			  rtx_equal_p (op0, op1) ? nunits - 1 : 2 * nunits - 1);
   /* Check if the two values vectors are the same.  */
-  if (rtx_equal_p (op0, op1) || const_vec_duplicate_p (sel))
-    {
-      /* Note: vec_perm indices are supposed to wrap when they go beyond the
-	 size of the two value vectors, i.e. the upper bits of the indices
-	 are effectively ignored.  RVV vrgather instead produces 0 for any
-	 out-of-range indices, so we need to modulo all the vec_perm indices
-	 to ensure they are all in range of [0, nunits - 1].  */
-      rtx max_sel = gen_const_vector_dup (sel_mode, nunits - 1);
-      rtx sel_mod = expand_simple_binop (sel_mode, AND, sel, max_sel, NULL, 0,
-					 OPTAB_DIRECT);
-      emit_vlmax_gather_insn (target, op1, sel_mod);
+  if (rtx_equal_p (op0, op1))
+    {
+      emit_vlmax_gather_insn (target, op0, sel_mod);
       return;
     }
 
-  rtx sel_mod = sel;
   rtx max_sel = gen_const_vector_dup (sel_mode, 2 * nunits - 1);
-  /* We don't need to modulo indices for VLA vector.
-     Since we should gurantee they aren't out of range before.  */
-  if (nunits.is_constant ())
-    {
-      /* Note: vec_perm indices are supposed to wrap when they go beyond the
-	 size of the two value vectors, i.e. the upper bits of the indices
-	 are effectively ignored.  RVV vrgather instead produces 0 for any
-	 out-of-range indices, so we need to modulo all the vec_perm indices
-	 to ensure they are all in range of [0, 2 * nunits - 1].  */
-      sel_mod = expand_simple_binop (sel_mode, AND, sel, max_sel, NULL, 0,
-				     OPTAB_DIRECT);
-    }
 
   /* This following sequence is handling the case that:
      __builtin_shufflevector (vec1, vec2, index...), the index can be any
@@ -2812,4 +2872,250 @@ expand_load_store (rtx *ops, bool is_load)
     }
 }
 
+/* Prepare insn_code for gather_load/scatter_store according to
+   the vector mode and index mode.  */
+static insn_code
+prepare_gather_scatter (machine_mode vec_mode, machine_mode idx_mode,
+			bool is_load)
+{
+  if (!is_load)
+    return code_for_pred_indexed_store (UNSPEC_UNORDERED, vec_mode, idx_mode);
+  else
+    {
+      unsigned src_eew_bitsize = GET_MODE_BITSIZE (GET_MODE_INNER (idx_mode));
+      unsigned dst_eew_bitsize = GET_MODE_BITSIZE (GET_MODE_INNER (vec_mode));
+      if (dst_eew_bitsize == src_eew_bitsize)
+	return code_for_pred_indexed_load_same_eew (UNSPEC_UNORDERED, vec_mode);
+      else if (dst_eew_bitsize > src_eew_bitsize)
+	{
+	  unsigned factor = dst_eew_bitsize / src_eew_bitsize;
+	  switch (factor)
+	    {
+	    case 2:
+	      return code_for_pred_indexed_load_x2_greater_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    case 4:
+	      return code_for_pred_indexed_load_x4_greater_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    case 8:
+	      return code_for_pred_indexed_load_x8_greater_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+      else
+	{
+	  unsigned factor = src_eew_bitsize / dst_eew_bitsize;
+	  switch (factor)
+	    {
+	    case 2:
+	      return code_for_pred_indexed_load_x2_smaller_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    case 4:
+	      return code_for_pred_indexed_load_x4_smaller_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    case 8:
+	      return code_for_pred_indexed_load_x8_smaller_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+    }
+}
+
+/* Return true if VEC_OFFSET describes a strided load/store.  */
+static bool
+strided_load_store_p (rtx vec_offset, rtx *base, rtx *step)
+{
+  if (const_vec_series_p (vec_offset, base, step))
+    return true;
+
+  /* For a strided load/store, the vectorizer always generates
+     VEC_SERIES_EXPR for vec_offset.  */
+  tree expr = REG_EXPR (vec_offset);
+  if (!expr || TREE_CODE (expr) != SSA_NAME)
+    return false;
+
+  /* Check if it is GIMPLE like: _88 = VEC_SERIES_EXPR <0, _87>;  */
+  gimple *def_stmt = SSA_NAME_DEF_STMT (expr);
+  if (!def_stmt || !is_gimple_assign (def_stmt)
+      || gimple_assign_rhs_code (def_stmt) != VEC_SERIES_EXPR)
+    return false;
+
+  tree baset = gimple_assign_rhs1 (def_stmt);
+  tree stept = gimple_assign_rhs2 (def_stmt);
+  *base = expand_normal (baset);
+  *step = expand_normal (stept);
+
+  if (!rtx_equal_p (*base, const0_rtx))
+    return false;
+  return true;
+}
+
+/* Expand LEN_MASK_{GATHER_LOAD,SCATTER_STORE}.  */
+void
+expand_gather_scatter (rtx *ops, bool is_load)
+{
+  rtx ptr, vec_offset, vec_reg, len, mask;
+  bool zero_extend_p;
+  int scale_log2;
+  if (is_load)
+    {
+      vec_reg = ops[0];
+      ptr = ops[1];
+      vec_offset = ops[2];
+      zero_extend_p = INTVAL (ops[3]);
+      scale_log2 = exact_log2 (INTVAL (ops[4]));
+      len = ops[5];
+      mask = ops[7];
+    }
+  else
+    {
+      vec_reg = ops[4];
+      ptr = ops[0];
+      vec_offset = ops[1];
+      zero_extend_p = INTVAL (ops[2]);
+      scale_log2 = exact_log2 (INTVAL (ops[3]));
+      len = ops[5];
+      mask = ops[7];
+    }
+
+  machine_mode vec_mode = GET_MODE (vec_reg);
+  machine_mode idx_mode = GET_MODE (vec_offset);
+  scalar_mode inner_vec_mode = GET_MODE_INNER (vec_mode);
+  scalar_mode inner_idx_mode = GET_MODE_INNER (idx_mode);
+  unsigned inner_vsize = GET_MODE_BITSIZE (inner_vec_mode);
+  unsigned inner_offsize = GET_MODE_BITSIZE (inner_idx_mode);
+  poly_int64 nunits = GET_MODE_NUNITS (vec_mode);
+  poly_int64 value;
+  bool is_dummy_len = poly_int_rtx_p (len, &value) && known_eq (value, nunits);
+
+  /* We use vlse.v/vsse.v instead of indexed load/store by default
+     if it is strided load/store.
+
+     FIXME: vlse.v/vsse.v may not always be better than vluxei.v/vsuxei.v.
+     We may need a cost model to decide between them.  */
+  rtx base, step;
+  if (strided_load_store_p (vec_offset, &base, &step))
+    {
+      if (GET_MODE (step) != Pmode)
+	{
+	  if (CONSTANT_P (step))
+	    step = force_reg (Pmode, step);
+	  else
+	    {
+	      rtx extend_step = gen_reg_rtx (Pmode);
+	      emit_insn (gen_extend_insn (extend_step, step, Pmode,
+					  GET_MODE (step),
+					  zero_extend_p ? true : false));
+	      step = extend_step;
+	    }
+	}
+      if (scale_log2 != 0)
+	{
+	  rtx scale_step = gen_reg_rtx (Pmode);
+	  rtx tmp = expand_simple_binop (Pmode, ASHIFT, step,
+					 gen_int_mode (scale_log2, Pmode),
+					 NULL_RTX, false, OPTAB_DIRECT);
+	  emit_move_insn (scale_step, tmp);
+	  step = scale_step;
+	}
+
+      rtx mem = validize_mem (gen_rtx_MEM (vec_mode, ptr));
+      /* Emit vlse.v if it's load. Otherwise, emit vsse.v.  */
+      if (is_load)
+	{
+	  insn_code icode = code_for_pred_strided_load (vec_mode);
+	  rtx load_ops[] = {vec_reg, mask, RVV_VUNDEF (vec_mode), mem, step};
+	  if (is_dummy_len)
+	    emit_vlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops);
+	  else
+	    emit_nonvlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops, len);
+	}
+      else
+	{
+	  if (is_dummy_len)
+	    {
+	      rtx dummy_len = gen_reg_rtx (Pmode);
+	      emit_vlmax_vsetvl (vec_mode, dummy_len);
+	      emit_insn (gen_pred_strided_store (vec_mode, mem, mask, step,
+						 vec_reg, dummy_len,
+						 get_avl_type_rtx (VLMAX)));
+	    }
+	  else
+	    emit_insn (gen_pred_strided_store (vec_mode, mem, mask, step,
+					       vec_reg, len,
+					       get_avl_type_rtx (NONVLMAX)));
+	}
+      return;
+    }
+
+  if (inner_offsize < inner_vsize)
+    {
+      /* 7.2. Vector Load/Store Addressing Modes.
+	 If the vector offset elements are narrower than XLEN, they are
+	 zero-extended to XLEN before adding to the ptr effective address. If
+	 the vector offset elements are wider than XLEN, the least-significant
+	 XLEN bits are used in the address calculation. An implementation must
+	 raise an illegal instruction exception if the EEW is not supported for
+	 offset elements.  */
+      if (!zero_extend_p || (zero_extend_p && scale_log2 != 0))
+	{
+	  if (zero_extend_p)
+	    inner_idx_mode
+	      = int_mode_for_size (inner_offsize * 2, 0).require ();
+	  else
+	    inner_idx_mode = int_mode_for_size (BITS_PER_WORD, 0).require ();
+	  machine_mode new_idx_mode
+	    = get_vector_mode (inner_idx_mode, nunits).require ();
+	  rtx tmp = gen_reg_rtx (new_idx_mode);
+	  emit_insn (gen_extend_insn (tmp, vec_offset, new_idx_mode, idx_mode,
+				      zero_extend_p ? true : false));
+	  vec_offset = tmp;
+	  idx_mode = new_idx_mode;
+	}
+    }
+
+  if (scale_log2 != 0)
+    {
+      rtx tmp = expand_binop (idx_mode, ashl_optab, vec_offset,
+			      gen_int_mode (scale_log2, Pmode), NULL_RTX, 0,
+			      OPTAB_DIRECT);
+      vec_offset = tmp;
+    }
+
+  insn_code icode = prepare_gather_scatter (vec_mode, idx_mode, is_load);
+  if (is_dummy_len)
+    {
+      if (is_load)
+	{
+	  rtx load_ops[]
+	    = {vec_reg, mask, RVV_VUNDEF (vec_mode), ptr, vec_offset};
+	  emit_vlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops);
+	}
+      else
+	{
+	  rtx store_ops[] = {mask, ptr, vec_offset, vec_reg};
+	  emit_vlmax_masked_store_insn (icode, RVV_SCATTER_M_OP, store_ops);
+	}
+    }
+  else
+    {
+      if (is_load)
+	{
+	  rtx load_ops[]
+	    = {vec_reg, mask, RVV_VUNDEF (vec_mode), ptr, vec_offset};
+	  emit_nonvlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops, len);
+	}
+      else
+	{
+	  rtx store_ops[] = {mask, ptr, vec_offset, vec_reg};
+	  emit_nonvlmax_masked_store_insn (icode, RVV_SCATTER_M_OP, store_ops,
+					   len);
+	}
+    }
+}
+
 } // namespace riscv_vector
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 38d8eb2fcf5..8970f6da6ad 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2060,7 +2060,14 @@ riscv_legitimize_poly_move (machine_mode mode, rtx dest, rtx tmp, rtx src)
      (m, n) = base * magn + constant.
      This calculation doesn't need div operation.  */
 
-  emit_move_insn (tmp, gen_int_mode (BYTES_PER_RISCV_VECTOR, mode));
+  if (mode <= Pmode)
+    emit_move_insn (tmp, gen_int_mode (BYTES_PER_RISCV_VECTOR, mode));
+  else
+    {
+      emit_move_insn (gen_highpart (Pmode, tmp), CONST0_RTX (Pmode));
+      emit_move_insn (gen_lowpart (Pmode, tmp),
+		      gen_int_mode (BYTES_PER_RISCV_VECTOR, Pmode));
+    }
 
   if (BYTES_PER_RISCV_VECTOR.is_constant ())
     {
@@ -2167,7 +2174,7 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
 	  return false;
 	}
 
-      if (satisfies_constraint_vp (src))
+      if (satisfies_constraint_vp (src) && GET_MODE (src) == Pmode)
 	return false;
 
       if (GET_MODE_SIZE (mode).to_constant () < GET_MODE_SIZE (Pmode))
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 8afd3dcaddd..ec49544bcf6 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -115,6 +115,9 @@
 
 (define_mode_iterator VEEWEXT2 [
   (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx64HI "TARGET_MIN_VLEN >= 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16") (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
   (VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI "TARGET_VECTOR_ELEN_64")
   (VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
@@ -161,6 +164,8 @@
 (define_mode_iterator VEEWTRUNC2 [
   (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI (VNx64QI "TARGET_MIN_VLEN >= 128")
   (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN >= 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16") (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
   (VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN >= 128")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
@@ -172,6 +177,8 @@
 (define_mode_iterator VEEWTRUNC4 [
   (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI VNx4QI VNx8QI VNx16QI (VNx32QI "TARGET_MIN_VLEN >= 128")
   (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI (VNx16HI "TARGET_MIN_VLEN >= 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
 ])
 
 (define_mode_iterator VEEWTRUNC8 [
@@ -362,46 +369,67 @@
 ])
 
 (define_mode_iterator VNX1_QHSD [
-  (VNx1QI "TARGET_MIN_VLEN < 128") (VNx1HI "TARGET_MIN_VLEN < 128") (VNx1SI "TARGET_MIN_VLEN < 128")
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  (VNx1SI "TARGET_MIN_VLEN < 128")
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
   (VNx1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN < 128")
 ])
 
 (define_mode_iterator VNX2_QHSD [
-  VNx2QI VNx2HI VNx2SI
+  VNx2QI
+  VNx2HI
+  VNx2SI
   (VNx2DI "TARGET_VECTOR_ELEN_64")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx2DF "TARGET_VECTOR_ELEN_FP_64")
 ])
 
 (define_mode_iterator VNX4_QHSD [
-  VNx4QI VNx4HI VNx4SI
+  VNx4QI
+  VNx4HI
+  VNx4SI
   (VNx4DI "TARGET_VECTOR_ELEN_64")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx4DF "TARGET_VECTOR_ELEN_FP_64")
 ])
 
 (define_mode_iterator VNX8_QHSD [
-  VNx8QI VNx8HI VNx8SI
+  VNx8QI
+  VNx8HI
+  VNx8SI
   (VNx8DI "TARGET_VECTOR_ELEN_64")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx8SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx8DF "TARGET_VECTOR_ELEN_FP_64")
 ])
 
-(define_mode_iterator VNX16_QHS [
-  VNx16QI VNx16HI (VNx16SI "TARGET_MIN_VLEN > 32")
+(define_mode_iterator VNX16_QHSD [
+  VNx16QI
+  VNx16HI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
-  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128") (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
+  (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
 ])
 
 (define_mode_iterator VNX32_QHS [
-  VNx32QI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128") (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
+  VNx32QI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
+  (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
 ])
 
 (define_mode_iterator VNX64_QH [
   (VNx64QI "TARGET_MIN_VLEN > 32")
   (VNx64HI "TARGET_MIN_VLEN >= 128")
+  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
 ])
 
 (define_mode_iterator VNX128_Q [
@@ -409,35 +437,49 @@
 ])
 
 (define_mode_iterator VNX1_QHSDI [
-  (VNx1QI "TARGET_MIN_VLEN < 128") (VNx1HI "TARGET_MIN_VLEN < 128") (VNx1SI "TARGET_MIN_VLEN < 128")
-  (VNx1DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  (VNx1SI "TARGET_MIN_VLEN < 128")
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128 && TARGET_64BIT")
 ])
 
 (define_mode_iterator VNX2_QHSDI [
-  VNx2QI VNx2HI VNx2SI
-  (VNx2DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
+  VNx2QI
+  VNx2HI
+  VNx2SI
+  (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
 ])
 
 (define_mode_iterator VNX4_QHSDI [
-  VNx4QI VNx4HI VNx4SI
-  (VNx4DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
+  VNx4QI
+  VNx4HI
+  VNx4SI
+  (VNx4DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
 ])
 
 (define_mode_iterator VNX8_QHSDI [
-  VNx8QI VNx8HI VNx8SI
-  (VNx8DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
+  VNx8QI
+  VNx8HI
+  VNx8SI
+  (VNx8DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
 ])
 
 (define_mode_iterator VNX16_QHSDI [
-  VNx16QI VNx16HI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx16DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  VNx16QI
+  VNx16HI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128 && TARGET_64BIT")
 ])
 
 (define_mode_iterator VNX32_QHSI [
-  VNx32QI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
+  VNx32QI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
 ])
 
 (define_mode_iterator VNX64_QHI [
-  VNx64QI (VNx64HI "TARGET_MIN_VLEN >= 128")
+  (VNx64QI "TARGET_MIN_VLEN > 32")
+  (VNx64HI "TARGET_MIN_VLEN >= 128")
 ])
 
 (define_mode_iterator V_WHOLE [
@@ -1393,6 +1435,8 @@
 (define_mode_attr VINDEX_DOUBLE_TRUNC [
   (VNx1HI "VNx1QI") (VNx2HI "VNx2QI")  (VNx4HI "VNx4QI")  (VNx8HI "VNx8QI")
   (VNx16HI "VNx16QI") (VNx32HI "VNx32QI") (VNx64HI "VNx64QI")
+  (VNx1HF "VNx1QI") (VNx2HF "VNx2QI")  (VNx4HF "VNx4QI")  (VNx8HF "VNx8QI")
+  (VNx16HF "VNx16QI") (VNx32HF "VNx32QI") (VNx64HF "VNx64QI")
   (VNx1SI "VNx1HI") (VNx2SI "VNx2HI") (VNx4SI "VNx4HI") (VNx8SI "VNx8HI")
   (VNx16SI "VNx16HI") (VNx32SI "VNx32HI")
   (VNx1SF "VNx1HI") (VNx2SF "VNx2HI") (VNx4SF "VNx4HI") (VNx8SF "VNx8HI")
@@ -1420,6 +1464,7 @@
 (define_mode_attr VINDEX_DOUBLE_EXT [
   (VNx1QI "VNx1HI") (VNx2QI "VNx2HI") (VNx4QI "VNx4HI") (VNx8QI "VNx8HI") (VNx16QI "VNx16HI") (VNx32QI "VNx32HI") (VNx64QI "VNx64HI")
   (VNx1HI "VNx1SI") (VNx2HI "VNx2SI") (VNx4HI "VNx4SI") (VNx8HI "VNx8SI") (VNx16HI "VNx16SI") (VNx32HI "VNx32SI")
+  (VNx1HF "VNx1SI") (VNx2HF "VNx2SI") (VNx4HF "VNx4SI") (VNx8HF "VNx8SI") (VNx16HF "VNx16SI") (VNx32HF "VNx32SI")
   (VNx1SI "VNx1DI") (VNx2SI "VNx2DI") (VNx4SI "VNx4DI") (VNx8SI "VNx8DI") (VNx16SI "VNx16DI")
   (VNx1SF "VNx1DI") (VNx2SF "VNx2DI") (VNx4SF "VNx4DI") (VNx8SF "VNx8DI") (VNx16SF "VNx16DI")
 ])
@@ -1427,6 +1472,7 @@
 (define_mode_attr VINDEX_QUAD_EXT [
   (VNx1QI "VNx1SI") (VNx2QI "VNx2SI") (VNx4QI "VNx4SI") (VNx8QI "VNx8SI") (VNx16QI "VNx16SI") (VNx32QI "VNx32SI")
   (VNx1HI "VNx1DI") (VNx2HI "VNx2DI") (VNx4HI "VNx4DI") (VNx8HI "VNx8DI") (VNx16HI "VNx16DI")
+  (VNx1HF "VNx1DI") (VNx2HF "VNx2DI") (VNx4HF "VNx4DI") (VNx8HF "VNx8DI") (VNx16HF "VNx16DI")
 ])
 
 (define_mode_attr VINDEX_OCT_EXT [
@@ -1471,6 +1517,40 @@
   (VNx4DI "VNx8BI") (VNx8DI "VNx16BI") (VNx16DI "VNx32BI")
 ])
 
+(define_mode_attr gs_extension [
+  (VNx1QI "immediate_operand") (VNx2QI "immediate_operand") (VNx4QI "immediate_operand") (VNx8QI "immediate_operand") (VNx16QI "immediate_operand")
+  (VNx32QI "vector_gs_extension_operand") (VNx64QI "const_1_operand")
+  (VNx1HI "immediate_operand") (VNx2HI "immediate_operand") (VNx4HI "immediate_operand") (VNx8HI "immediate_operand") (VNx16HI "immediate_operand")
+  (VNx32HI "vector_gs_extension_operand") (VNx64HI "const_1_operand")
+  (VNx1SI "immediate_operand") (VNx2SI "immediate_operand") (VNx4SI "immediate_operand") (VNx8SI "immediate_operand") (VNx16SI "immediate_operand")
+  (VNx32SI "vector_gs_extension_operand")
+  (VNx1DI "immediate_operand") (VNx2DI "immediate_operand") (VNx4DI "immediate_operand") (VNx8DI "immediate_operand") (VNx16DI "immediate_operand")
+
+  (VNx1HF "immediate_operand") (VNx2HF "immediate_operand") (VNx4HF "immediate_operand") (VNx8HF "immediate_operand") (VNx16HF "immediate_operand")
+  (VNx32HF "vector_gs_extension_operand") (VNx64HF "const_1_operand")
+  (VNx1SF "immediate_operand") (VNx2SF "immediate_operand") (VNx4SF "immediate_operand") (VNx8SF "immediate_operand") (VNx16SF "immediate_operand")
+  (VNx32SF "vector_gs_extension_operand")
+  (VNx1DF "immediate_operand") (VNx2DF "immediate_operand") (VNx4DF "immediate_operand") (VNx8DF "immediate_operand") (VNx16DF "immediate_operand")
+])
+
+(define_mode_attr gs_scale [
+  (VNx1QI "const_1_operand") (VNx2QI "const_1_operand") (VNx4QI "const_1_operand") (VNx8QI "const_1_operand")
+  (VNx16QI "const_1_operand") (VNx32QI "const_1_operand") (VNx64QI "const_1_operand")
+  (VNx1HI "vector_gs_scale_operand_16") (VNx2HI "vector_gs_scale_operand_16") (VNx4HI "vector_gs_scale_operand_16") (VNx8HI "vector_gs_scale_operand_16")
+  (VNx16HI "vector_gs_scale_operand_16") (VNx32HI "vector_gs_scale_operand_16_rv32") (VNx64HI "const_1_operand")
+  (VNx1SI "vector_gs_scale_operand_32") (VNx2SI "vector_gs_scale_operand_32") (VNx4SI "vector_gs_scale_operand_32") (VNx8SI "vector_gs_scale_operand_32")
+  (VNx16SI "vector_gs_scale_operand_32") (VNx32SI "vector_gs_scale_operand_32_rv32")
+  (VNx1DI "vector_gs_scale_operand_64") (VNx2DI "vector_gs_scale_operand_64") (VNx4DI "vector_gs_scale_operand_64") (VNx8DI "vector_gs_scale_operand_64")
+  (VNx16DI "vector_gs_scale_operand_64")
+
+  (VNx1HF "vector_gs_scale_operand_16") (VNx2HF "vector_gs_scale_operand_16") (VNx4HF "vector_gs_scale_operand_16") (VNx8HF "vector_gs_scale_operand_16")
+  (VNx16HF "vector_gs_scale_operand_16") (VNx32HF "vector_gs_scale_operand_16_rv32") (VNx64HF "const_1_operand")
+  (VNx1SF "vector_gs_scale_operand_32") (VNx2SF "vector_gs_scale_operand_32") (VNx4SF "vector_gs_scale_operand_32") (VNx8SF "vector_gs_scale_operand_32")
+  (VNx16SF "vector_gs_scale_operand_32") (VNx32SF "vector_gs_scale_operand_32_rv32")
+  (VNx1DF "vector_gs_scale_operand_64") (VNx2DF "vector_gs_scale_operand_64") (VNx4DF "vector_gs_scale_operand_64") (VNx8DF "vector_gs_scale_operand_64")
+  (VNx16DF "vector_gs_scale_operand_64")
+])
+
 (define_int_iterator WREDUC [UNSPEC_WREDUC_SUM UNSPEC_WREDUC_USUM])
 
 (define_int_iterator ORDER [UNSPEC_ORDERED UNSPEC_UNORDERED])
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 5b7a17b9d34..19740c89132 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -818,7 +818,7 @@
 ;; This pattern only handles duplicates of non-constant inputs.
 ;; Constant vectors go through the movm pattern instead.
 ;; So "direct_broadcast_operand" can only be mem or reg, no CONSTANT.
-(define_expand "vec_duplicate<mode>"
+(define_expand "@vec_duplicate<mode>"
   [(set (match_operand:V 0 "register_operand")
 	(vec_duplicate:V
 	  (match_operand:<VEL> 1 "direct_broadcast_operand")))]
@@ -1357,8 +1357,16 @@
 	}
     }
   else if (GET_MODE_BITSIZE (<VEL>mode) > GET_MODE_BITSIZE (Pmode)
-           && immediate_operand (operands[3], Pmode))
-    operands[3] = gen_rtx_SIGN_EXTEND (<VEL>mode, force_reg (Pmode, operands[3]));
+           && (immediate_operand (operands[3], Pmode)
+	       || (CONST_POLY_INT_P (operands[3])
+		   && known_ge (rtx_to_poly_int64 (operands[3]), 0U)
+		   && known_le (rtx_to_poly_int64 (operands[3]), GET_MODE_SIZE (<MODE>mode)))))
+    {
+      rtx tmp = gen_reg_rtx (Pmode);
+      poly_int64 value = rtx_to_poly_int64 (operands[3]);
+      emit_move_insn (tmp, gen_int_mode (value, Pmode));
+      operands[3] = gen_rtx_SIGN_EXTEND (<VEL>mode, tmp);
+    }
   else
     operands[3] = force_reg (<VEL>mode, operands[3]);
 })
@@ -1387,7 +1395,8 @@
    vlse<sew>.v\t%0,%3,zero
    vmv.s.x\t%0,%3
    vmv.s.x\t%0,%3"
-  "register_operand (operands[3], <VEL>mode)
+  "(register_operand (operands[3], <VEL>mode)
+    || CONST_POLY_INT_P (operands[3]))
   && GET_MODE_BITSIZE (<VEL>mode) > GET_MODE_BITSIZE (Pmode)"
   [(set (match_dup 0)
 	(if_then_else:VI (unspec:<VM> [(match_dup 1) (match_dup 4)
@@ -1397,6 +1406,12 @@
 	  (match_dup 2)))]
   {
     gcc_assert (can_create_pseudo_p ());
+    if (CONST_POLY_INT_P (operands[3]))
+      {
+	rtx tmp = gen_reg_rtx (<VEL>mode);
+	emit_move_insn (tmp, operands[3]);
+	operands[3] = tmp;
+      }
     rtx m = assign_stack_local (<VEL>mode, GET_MODE_SIZE (<VEL>mode),
 				GET_MODE_ALIGNMENT (<VEL>mode));
     m = validize_mem (m);
@@ -1483,6 +1498,7 @@
 	     (match_operand 5 "vector_length_operand"    "   rK,    rK,    rK")
 	     (match_operand 6 "const_int_operand"        "    i,     i,     i")
 	     (match_operand 7 "const_int_operand"        "    i,     i,     i")
+	     (match_operand 8 "const_int_operand"        "    i,     i,     i")
 	     (reg:SI VL_REGNUM)
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	  (unspec:V
@@ -1738,7 +1754,7 @@
   [(set_attr "type" "vst<order>x")
    (set_attr "mode" "<VNX8_QHSD:MODE>")])
 
-(define_insn "@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>"
+(define_insn "@pred_indexed_<order>store<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
   [(set (mem:BLK (scratch))
 	(unspec:BLK
 	  [(unspec:<VM>
@@ -1749,11 +1765,11 @@
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	   (match_operand 1 "pmode_reg_or_0_operand"      "  rJ")
 	   (match_operand:VNX16_QHSDI 2 "register_operand" "  vr")
-	   (match_operand:VNX16_QHS 3 "register_operand"  "  vr")] ORDER))]
+	   (match_operand:VNX16_QHSD 3 "register_operand"  "  vr")] ORDER))]
   "TARGET_VECTOR"
   "vs<order>xei<VNX16_QHSDI:sew>.v\t%3,(%z1),%2%p0"
   [(set_attr "type" "vst<order>x")
-   (set_attr "mode" "<VNX16_QHS:MODE>")])
+   (set_attr "mode" "<VNX16_QHSD:MODE>")])
 
 (define_insn "@pred_indexed_<order>store<VNX32_QHS:mode><VNX32_QHSI:mode>"
   [(set (mem:BLK (scratch))
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c
new file mode 100644
index 00000000000..dffe13f6a8a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c
new file mode 100644
index 00000000000..a622e516f06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                               \
+  T (uint8_t, 64)                                                              \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c
new file mode 100644
index 00000000000..4692380233d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_LOOP(DATA_TYPE)                                                   \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict *src)           \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += *src[i];                                                      \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t)                                                                   \
+  T (uint8_t)                                                                  \
+  T (int16_t)                                                                  \
+  T (uint16_t)                                                                 \
+  T (_Float16)                                                                 \
+  T (int32_t)                                                                  \
+  T (uint32_t)                                                                 \
+  T (float)                                                                    \
+  T (int64_t)                                                                  \
+  T (uint64_t)                                                                 \
+  T (double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c
new file mode 100644
index 00000000000..71a3dd466fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c
@@ -0,0 +1,112 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_LOOP(DATA_TYPE, INDEX_TYPE)                                       \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x,  \
+				INDEX_TYPE *restrict index)                    \
+  {                                                                            \
+    for (int i = 0; i < 100; ++i)                                              \
+      {                                                                        \
+	y[i * 2] = x[index[i * 2]] + 1;                                        \
+	y[i * 2 + 1] = x[index[i * 2 + 1]] + 2;                                \
+      }                                                                        \
+  }
+
+TEST_LOOP (int8_t, int8_t)
+TEST_LOOP (uint8_t, int8_t)
+TEST_LOOP (int16_t, int8_t)
+TEST_LOOP (uint16_t, int8_t)
+TEST_LOOP (int32_t, int8_t)
+TEST_LOOP (uint32_t, int8_t)
+TEST_LOOP (int64_t, int8_t)
+TEST_LOOP (uint64_t, int8_t)
+TEST_LOOP (_Float16, int8_t)
+TEST_LOOP (float, int8_t)
+TEST_LOOP (double, int8_t)
+TEST_LOOP (int8_t, int16_t)
+TEST_LOOP (uint8_t, int16_t)
+TEST_LOOP (int16_t, int16_t)
+TEST_LOOP (uint16_t, int16_t)
+TEST_LOOP (int32_t, int16_t)
+TEST_LOOP (uint32_t, int16_t)
+TEST_LOOP (int64_t, int16_t)
+TEST_LOOP (uint64_t, int16_t)
+TEST_LOOP (_Float16, int16_t)
+TEST_LOOP (float, int16_t)
+TEST_LOOP (double, int16_t)
+TEST_LOOP (int8_t, int32_t)
+TEST_LOOP (uint8_t, int32_t)
+TEST_LOOP (int16_t, int32_t)
+TEST_LOOP (uint16_t, int32_t)
+TEST_LOOP (int32_t, int32_t)
+TEST_LOOP (uint32_t, int32_t)
+TEST_LOOP (int64_t, int32_t)
+TEST_LOOP (uint64_t, int32_t)
+TEST_LOOP (_Float16, int32_t)
+TEST_LOOP (float, int32_t)
+TEST_LOOP (double, int32_t)
+TEST_LOOP (int8_t, int64_t)
+TEST_LOOP (uint8_t, int64_t)
+TEST_LOOP (int16_t, int64_t)
+TEST_LOOP (uint16_t, int64_t)
+TEST_LOOP (int32_t, int64_t)
+TEST_LOOP (uint32_t, int64_t)
+TEST_LOOP (int64_t, int64_t)
+TEST_LOOP (uint64_t, int64_t)
+TEST_LOOP (_Float16, int64_t)
+TEST_LOOP (float, int64_t)
+TEST_LOOP (double, int64_t)
+TEST_LOOP (int8_t, uint8_t)
+TEST_LOOP (uint8_t, uint8_t)
+TEST_LOOP (int16_t, uint8_t)
+TEST_LOOP (uint16_t, uint8_t)
+TEST_LOOP (int32_t, uint8_t)
+TEST_LOOP (uint32_t, uint8_t)
+TEST_LOOP (int64_t, uint8_t)
+TEST_LOOP (uint64_t, uint8_t)
+TEST_LOOP (_Float16, uint8_t)
+TEST_LOOP (float, uint8_t)
+TEST_LOOP (double, uint8_t)
+TEST_LOOP (int8_t, uint16_t)
+TEST_LOOP (uint8_t, uint16_t)
+TEST_LOOP (int16_t, uint16_t)
+TEST_LOOP (uint16_t, uint16_t)
+TEST_LOOP (int32_t, uint16_t)
+TEST_LOOP (uint32_t, uint16_t)
+TEST_LOOP (int64_t, uint16_t)
+TEST_LOOP (uint64_t, uint16_t)
+TEST_LOOP (_Float16, uint16_t)
+TEST_LOOP (float, uint16_t)
+TEST_LOOP (double, uint16_t)
+TEST_LOOP (int8_t, uint32_t)
+TEST_LOOP (uint8_t, uint32_t)
+TEST_LOOP (int16_t, uint32_t)
+TEST_LOOP (uint16_t, uint32_t)
+TEST_LOOP (int32_t, uint32_t)
+TEST_LOOP (uint32_t, uint32_t)
+TEST_LOOP (int64_t, uint32_t)
+TEST_LOOP (uint64_t, uint32_t)
+TEST_LOOP (_Float16, uint32_t)
+TEST_LOOP (float, uint32_t)
+TEST_LOOP (double, uint32_t)
+TEST_LOOP (int8_t, uint64_t)
+TEST_LOOP (uint8_t, uint64_t)
+TEST_LOOP (int16_t, uint64_t)
+TEST_LOOP (uint16_t, uint64_t)
+TEST_LOOP (int32_t, uint64_t)
+TEST_LOOP (uint32_t, uint64_t)
+TEST_LOOP (int64_t, uint64_t)
+TEST_LOOP (uint64_t, uint64_t)
+TEST_LOOP (_Float16, uint64_t)
+TEST_LOOP (float, uint64_t)
+TEST_LOOP (double, uint64_t)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 88 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-assembler-not "vluxei64\.v" } } */
+/* { dg-final { scan-assembler-not "vsuxei64\.v" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c
new file mode 100644
index 00000000000..785550c4b2d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c
new file mode 100644
index 00000000000..22aeb889221
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c
new file mode 100644
index 00000000000..d74a83415d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c
new file mode 100644
index 00000000000..2b6c0a87c18
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c
new file mode 100644
index 00000000000..407cc8a5a73
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c
new file mode 100644
index 00000000000..81b31ef26aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                               \
+  T (uint16_t, 32)                                                              \
+  T (_Float16, 32)                                                              \
+  T (int32_t, 32)                                                               \
+  T (uint32_t, 32)                                                              \
+  T (float, 32)                                                                 \
+  T (int64_t, 32)                                                               \
+  T (uint64_t, 32)                                                              \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c
new file mode 100644
index 00000000000..0bfdb8f0acf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                               \
+  T (uint16_t, 32)                                                              \
+  T (_Float16, 32)                                                              \
+  T (int32_t, 32)                                                               \
+  T (uint32_t, 32)                                                              \
+  T (float, 32)                                                                 \
+  T (int64_t, 32)                                                               \
+  T (uint64_t, 32)                                                              \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c
new file mode 100644
index 00000000000..46f791105ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                               \
+  T (uint16_t, 64)                                                              \
+  T (_Float16, 64)                                                              \
+  T (int32_t, 64)                                                               \
+  T (uint32_t, 64)                                                              \
+  T (float, 64)                                                                 \
+  T (int64_t, 64)                                                               \
+  T (uint64_t, 64)                                                              \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c
new file mode 100644
index 00000000000..0d3c5b71e5b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c
new file mode 100644
index 00000000000..145df1e7797
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c
new file mode 100644
index 00000000000..d36b6f025f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-11.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE *src_##DATA_TYPE[128];                                             \
+  DATA_TYPE src2_##DATA_TYPE[128];                                             \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = src2_##DATA_TYPE + i;                               \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE);                           \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i] + src_##DATA_TYPE[i][0]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c
new file mode 100644
index 00000000000..b4e2ead8ca9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c
@@ -0,0 +1,124 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-12.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, INDEX_TYPE)                                        \
+  DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                        \
+  DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                         \
+  INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                      \
+  for (int i = 0; i < 202; i++)                                                \
+    {                                                                          \
+      src_##DATA_TYPE##_##INDEX_TYPE[i]                                        \
+	= (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1));         \
+      index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55);                    \
+    }                                                                          \
+  f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE,               \
+				src_##DATA_TYPE##_##INDEX_TYPE,                \
+				index_##DATA_TYPE##_##INDEX_TYPE);             \
+  for (int i = 0; i < 100; i++)                                                \
+    {                                                                          \
+      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2]                           \
+	      == (src_##DATA_TYPE##_##INDEX_TYPE                               \
+		    [index_##DATA_TYPE##_##INDEX_TYPE[i * 2]]                  \
+		  + 1));                                                       \
+      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]                       \
+	      == (src_##DATA_TYPE##_##INDEX_TYPE                               \
+		    [index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]]              \
+		  + 2));                                                       \
+    }
+
+  RUN_LOOP (int8_t, int8_t)
+  RUN_LOOP (uint8_t, int8_t)
+  RUN_LOOP (int16_t, int8_t)
+  RUN_LOOP (uint16_t, int8_t)
+  RUN_LOOP (int32_t, int8_t)
+  RUN_LOOP (uint32_t, int8_t)
+  RUN_LOOP (int64_t, int8_t)
+  RUN_LOOP (uint64_t, int8_t)
+  RUN_LOOP (_Float16, int8_t)
+  RUN_LOOP (float, int8_t)
+  RUN_LOOP (double, int8_t)
+  RUN_LOOP (int8_t, int16_t)
+  RUN_LOOP (uint8_t, int16_t)
+  RUN_LOOP (int16_t, int16_t)
+  RUN_LOOP (uint16_t, int16_t)
+  RUN_LOOP (int32_t, int16_t)
+  RUN_LOOP (uint32_t, int16_t)
+  RUN_LOOP (int64_t, int16_t)
+  RUN_LOOP (uint64_t, int16_t)
+  RUN_LOOP (_Float16, int16_t)
+  RUN_LOOP (float, int16_t)
+  RUN_LOOP (double, int16_t)
+  RUN_LOOP (int8_t, int32_t)
+  RUN_LOOP (uint8_t, int32_t)
+  RUN_LOOP (int16_t, int32_t)
+  RUN_LOOP (uint16_t, int32_t)
+  RUN_LOOP (int32_t, int32_t)
+  RUN_LOOP (uint32_t, int32_t)
+  RUN_LOOP (int64_t, int32_t)
+  RUN_LOOP (uint64_t, int32_t)
+  RUN_LOOP (_Float16, int32_t)
+  RUN_LOOP (float, int32_t)
+  RUN_LOOP (double, int32_t)
+  RUN_LOOP (int8_t, int64_t)
+  RUN_LOOP (uint8_t, int64_t)
+  RUN_LOOP (int16_t, int64_t)
+  RUN_LOOP (uint16_t, int64_t)
+  RUN_LOOP (int32_t, int64_t)
+  RUN_LOOP (uint32_t, int64_t)
+  RUN_LOOP (int64_t, int64_t)
+  RUN_LOOP (uint64_t, int64_t)
+  RUN_LOOP (_Float16, int64_t)
+  RUN_LOOP (float, int64_t)
+  RUN_LOOP (double, int64_t)
+  RUN_LOOP (int8_t, uint8_t)
+  RUN_LOOP (uint8_t, uint8_t)
+  RUN_LOOP (int16_t, uint8_t)
+  RUN_LOOP (uint16_t, uint8_t)
+  RUN_LOOP (int32_t, uint8_t)
+  RUN_LOOP (uint32_t, uint8_t)
+  RUN_LOOP (int64_t, uint8_t)
+  RUN_LOOP (uint64_t, uint8_t)
+  RUN_LOOP (_Float16, uint8_t)
+  RUN_LOOP (float, uint8_t)
+  RUN_LOOP (double, uint8_t)
+  RUN_LOOP (int8_t, uint16_t)
+  RUN_LOOP (uint8_t, uint16_t)
+  RUN_LOOP (int16_t, uint16_t)
+  RUN_LOOP (uint16_t, uint16_t)
+  RUN_LOOP (int32_t, uint16_t)
+  RUN_LOOP (uint32_t, uint16_t)
+  RUN_LOOP (int64_t, uint16_t)
+  RUN_LOOP (uint64_t, uint16_t)
+  RUN_LOOP (_Float16, uint16_t)
+  RUN_LOOP (float, uint16_t)
+  RUN_LOOP (double, uint16_t)
+  RUN_LOOP (int8_t, uint32_t)
+  RUN_LOOP (uint8_t, uint32_t)
+  RUN_LOOP (int16_t, uint32_t)
+  RUN_LOOP (uint16_t, uint32_t)
+  RUN_LOOP (int32_t, uint32_t)
+  RUN_LOOP (uint32_t, uint32_t)
+  RUN_LOOP (int64_t, uint32_t)
+  RUN_LOOP (uint64_t, uint32_t)
+  RUN_LOOP (_Float16, uint32_t)
+  RUN_LOOP (float, uint32_t)
+  RUN_LOOP (double, uint32_t)
+  RUN_LOOP (int8_t, uint64_t)
+  RUN_LOOP (uint8_t, uint64_t)
+  RUN_LOOP (int16_t, uint64_t)
+  RUN_LOOP (uint16_t, uint64_t)
+  RUN_LOOP (int32_t, uint64_t)
+  RUN_LOOP (uint32_t, uint64_t)
+  RUN_LOOP (int64_t, uint64_t)
+  RUN_LOOP (uint64_t, uint64_t)
+  RUN_LOOP (_Float16, uint64_t)
+  RUN_LOOP (float, uint64_t)
+  RUN_LOOP (double, uint64_t)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c
new file mode 100644
index 00000000000..76c6df32e6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c
new file mode 100644
index 00000000000..0fd64260082
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c
new file mode 100644
index 00000000000..069d232b912
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c
new file mode 100644
index 00000000000..499e555c1d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c
new file mode 100644
index 00000000000..ec6587aa4e5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c
new file mode 100644
index 00000000000..c16287955a6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c
new file mode 100644
index 00000000000..e1744f60dbb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c
new file mode 100644
index 00000000000..3ad6d33087f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c
new file mode 100644
index 00000000000..a5de0deccbe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c
new file mode 100644
index 00000000000..74a0d05b37d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                               \
+  T (uint8_t, 64)                                                              \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c
new file mode 100644
index 00000000000..98c5b4678b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c
@@ -0,0 +1,116 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_LOOP(DATA_TYPE, INDEX_TYPE)                                       \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x,  \
+				INDEX_TYPE *restrict index,                    \
+				INDEX_TYPE *restrict cond)                     \
+  {                                                                            \
+    for (int i = 0; i < 100; ++i)                                              \
+      {                                                                        \
+	if (cond[i * 2])                                                       \
+	  y[i * 2] = x[index[i * 2]] + 1;                                      \
+	if (cond[i * 2 + 1])                                                   \
+	  y[i * 2 + 1] = x[index[i * 2 + 1]] + 2;                              \
+      }                                                                        \
+  }
+
+TEST_LOOP (int8_t, int8_t)
+TEST_LOOP (uint8_t, int8_t)
+TEST_LOOP (int16_t, int8_t)
+TEST_LOOP (uint16_t, int8_t)
+TEST_LOOP (int32_t, int8_t)
+TEST_LOOP (uint32_t, int8_t)
+TEST_LOOP (int64_t, int8_t)
+TEST_LOOP (uint64_t, int8_t)
+TEST_LOOP (_Float16, int8_t)
+TEST_LOOP (float, int8_t)
+TEST_LOOP (double, int8_t)
+TEST_LOOP (int8_t, int16_t)
+TEST_LOOP (uint8_t, int16_t)
+TEST_LOOP (int16_t, int16_t)
+TEST_LOOP (uint16_t, int16_t)
+TEST_LOOP (int32_t, int16_t)
+TEST_LOOP (uint32_t, int16_t)
+TEST_LOOP (int64_t, int16_t)
+TEST_LOOP (uint64_t, int16_t)
+TEST_LOOP (_Float16, int16_t)
+TEST_LOOP (float, int16_t)
+TEST_LOOP (double, int16_t)
+TEST_LOOP (int8_t, int32_t)
+TEST_LOOP (uint8_t, int32_t)
+TEST_LOOP (int16_t, int32_t)
+TEST_LOOP (uint16_t, int32_t)
+TEST_LOOP (int32_t, int32_t)
+TEST_LOOP (uint32_t, int32_t)
+TEST_LOOP (int64_t, int32_t)
+TEST_LOOP (uint64_t, int32_t)
+TEST_LOOP (_Float16, int32_t)
+TEST_LOOP (float, int32_t)
+TEST_LOOP (double, int32_t)
+TEST_LOOP (int8_t, int64_t)
+TEST_LOOP (uint8_t, int64_t)
+TEST_LOOP (int16_t, int64_t)
+TEST_LOOP (uint16_t, int64_t)
+TEST_LOOP (int32_t, int64_t)
+TEST_LOOP (uint32_t, int64_t)
+TEST_LOOP (int64_t, int64_t)
+TEST_LOOP (uint64_t, int64_t)
+TEST_LOOP (_Float16, int64_t)
+TEST_LOOP (float, int64_t)
+TEST_LOOP (double, int64_t)
+TEST_LOOP (int8_t, uint8_t)
+TEST_LOOP (uint8_t, uint8_t)
+TEST_LOOP (int16_t, uint8_t)
+TEST_LOOP (uint16_t, uint8_t)
+TEST_LOOP (int32_t, uint8_t)
+TEST_LOOP (uint32_t, uint8_t)
+TEST_LOOP (int64_t, uint8_t)
+TEST_LOOP (uint64_t, uint8_t)
+TEST_LOOP (_Float16, uint8_t)
+TEST_LOOP (float, uint8_t)
+TEST_LOOP (double, uint8_t)
+TEST_LOOP (int8_t, uint16_t)
+TEST_LOOP (uint8_t, uint16_t)
+TEST_LOOP (int16_t, uint16_t)
+TEST_LOOP (uint16_t, uint16_t)
+TEST_LOOP (int32_t, uint16_t)
+TEST_LOOP (uint32_t, uint16_t)
+TEST_LOOP (int64_t, uint16_t)
+TEST_LOOP (uint64_t, uint16_t)
+TEST_LOOP (_Float16, uint16_t)
+TEST_LOOP (float, uint16_t)
+TEST_LOOP (double, uint16_t)
+TEST_LOOP (int8_t, uint32_t)
+TEST_LOOP (uint8_t, uint32_t)
+TEST_LOOP (int16_t, uint32_t)
+TEST_LOOP (uint16_t, uint32_t)
+TEST_LOOP (int32_t, uint32_t)
+TEST_LOOP (uint32_t, uint32_t)
+TEST_LOOP (int64_t, uint32_t)
+TEST_LOOP (uint64_t, uint32_t)
+TEST_LOOP (_Float16, uint32_t)
+TEST_LOOP (float, uint32_t)
+TEST_LOOP (double, uint32_t)
+TEST_LOOP (int8_t, uint64_t)
+TEST_LOOP (uint8_t, uint64_t)
+TEST_LOOP (int16_t, uint64_t)
+TEST_LOOP (uint16_t, uint64_t)
+TEST_LOOP (int32_t, uint64_t)
+TEST_LOOP (uint32_t, uint64_t)
+TEST_LOOP (int64_t, uint64_t)
+TEST_LOOP (uint64_t, uint64_t)
+TEST_LOOP (_Float16, uint64_t)
+TEST_LOOP (float, uint64_t)
+TEST_LOOP (double, uint64_t)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 88 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-assembler-not "vluxei64\.v" } } */
+/* { dg-final { scan-assembler-not "vsuxei64\.v" } } */
+/* { dg-final { scan-assembler-not {vlse64\.v\s+v[0-9]+,\s*0\([a-x0-9]+\),\s*zero} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c
new file mode 100644
index 00000000000..03f84ce962c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c
new file mode 100644
index 00000000000..8578001ef41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c
new file mode 100644
index 00000000000..b273caa0bfe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c
new file mode 100644
index 00000000000..5055d886d62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c
new file mode 100644
index 00000000000..2a4ae58588f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c
new file mode 100644
index 00000000000..31d9414c549
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                               \
+  T (uint8_t, 32)                                                              \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c
new file mode 100644
index 00000000000..73ed23042fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                               \
+  T (uint8_t, 32)                                                              \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c
new file mode 100644
index 00000000000..2f64e805759
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                               \
+  T (uint8_t, 64)                                                              \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c
new file mode 100644
index 00000000000..41f60bd88b9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c
new file mode 100644
index 00000000000..9840434fa41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
new file mode 100644
index 00000000000..105c706dbf9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
@@ -0,0 +1,140 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "mask_gather_load-11.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, INDEX_TYPE)                                        \
+  DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                        \
+  DATA_TYPE dest2_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                       \
+  DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                         \
+  INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                      \
+  INDEX_TYPE cond_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                       \
+  for (int i = 0; i < 202; i++)                                                \
+    {                                                                          \
+      src_##DATA_TYPE##_##INDEX_TYPE[i]                                        \
+	= (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1));         \
+      dest_##DATA_TYPE##_##INDEX_TYPE[i]                                       \
+	= (DATA_TYPE) ((i * 7 + 666) & (sizeof (DATA_TYPE) * 5 - 1));          \
+      dest2_##DATA_TYPE##_##INDEX_TYPE[i]                                      \
+	= (DATA_TYPE) ((i * 7 + 666) & (sizeof (DATA_TYPE) * 5 - 1));          \
+      index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55);                    \
+      cond_##DATA_TYPE##_##INDEX_TYPE[i] = (INDEX_TYPE) ((i & 0x3) == 3);      \
+    }                                                                          \
+  f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE,               \
+				src_##DATA_TYPE##_##INDEX_TYPE,                \
+				index_##DATA_TYPE##_##INDEX_TYPE,              \
+				cond_##DATA_TYPE##_##INDEX_TYPE);              \
+  for (int i = 0; i < 100; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##INDEX_TYPE[i * 2])                              \
+	assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2]                         \
+		== (src_##DATA_TYPE##_##INDEX_TYPE                             \
+		      [index_##DATA_TYPE##_##INDEX_TYPE[i * 2]]                \
+		    + 1));                                                     \
+      else                                                                     \
+	assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2]                         \
+		== dest2_##DATA_TYPE##_##INDEX_TYPE[i * 2]);                   \
+      if (cond_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1])                          \
+	assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]                     \
+		== (src_##DATA_TYPE##_##INDEX_TYPE                             \
+		      [index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]]            \
+		    + 2));                                                     \
+      else                                                                     \
+	assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]                     \
+		== dest2_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]);               \
+    }
+
+  RUN_LOOP (int8_t, int8_t)
+  RUN_LOOP (uint8_t, int8_t)
+  RUN_LOOP (int16_t, int8_t)
+  RUN_LOOP (uint16_t, int8_t)
+  RUN_LOOP (int32_t, int8_t)
+  RUN_LOOP (uint32_t, int8_t)
+  RUN_LOOP (int64_t, int8_t)
+  RUN_LOOP (uint64_t, int8_t)
+  RUN_LOOP (_Float16, int8_t)
+  RUN_LOOP (float, int8_t)
+  RUN_LOOP (double, int8_t)
+  RUN_LOOP (int8_t, int16_t)
+  RUN_LOOP (uint8_t, int16_t)
+  RUN_LOOP (int16_t, int16_t)
+  RUN_LOOP (uint16_t, int16_t)
+  RUN_LOOP (int32_t, int16_t)
+  RUN_LOOP (uint32_t, int16_t)
+  RUN_LOOP (int64_t, int16_t)
+  RUN_LOOP (uint64_t, int16_t)
+  RUN_LOOP (_Float16, int16_t)
+  RUN_LOOP (float, int16_t)
+  RUN_LOOP (double, int16_t)
+  RUN_LOOP (int8_t, int32_t)
+  RUN_LOOP (uint8_t, int32_t)
+  RUN_LOOP (int16_t, int32_t)
+  RUN_LOOP (uint16_t, int32_t)
+  RUN_LOOP (int32_t, int32_t)
+  RUN_LOOP (uint32_t, int32_t)
+  RUN_LOOP (int64_t, int32_t)
+  RUN_LOOP (uint64_t, int32_t)
+  RUN_LOOP (_Float16, int32_t)
+  RUN_LOOP (float, int32_t)
+  RUN_LOOP (double, int32_t)
+  RUN_LOOP (int8_t, int64_t)
+  RUN_LOOP (uint8_t, int64_t)
+  RUN_LOOP (int16_t, int64_t)
+  RUN_LOOP (uint16_t, int64_t)
+  RUN_LOOP (int32_t, int64_t)
+  RUN_LOOP (uint32_t, int64_t)
+  RUN_LOOP (int64_t, int64_t)
+  RUN_LOOP (uint64_t, int64_t)
+  RUN_LOOP (_Float16, int64_t)
+  RUN_LOOP (float, int64_t)
+  RUN_LOOP (double, int64_t)
+  RUN_LOOP (int8_t, uint8_t)
+  RUN_LOOP (uint8_t, uint8_t)
+  RUN_LOOP (int16_t, uint8_t)
+  RUN_LOOP (uint16_t, uint8_t)
+  RUN_LOOP (int32_t, uint8_t)
+  RUN_LOOP (uint32_t, uint8_t)
+  RUN_LOOP (int64_t, uint8_t)
+  RUN_LOOP (uint64_t, uint8_t)
+  RUN_LOOP (_Float16, uint8_t)
+  RUN_LOOP (float, uint8_t)
+  RUN_LOOP (double, uint8_t)
+  RUN_LOOP (int8_t, uint16_t)
+  RUN_LOOP (uint8_t, uint16_t)
+  RUN_LOOP (int16_t, uint16_t)
+  RUN_LOOP (uint16_t, uint16_t)
+  RUN_LOOP (int32_t, uint16_t)
+  RUN_LOOP (uint32_t, uint16_t)
+  RUN_LOOP (int64_t, uint16_t)
+  RUN_LOOP (uint64_t, uint16_t)
+  RUN_LOOP (_Float16, uint16_t)
+  RUN_LOOP (float, uint16_t)
+  RUN_LOOP (double, uint16_t)
+  RUN_LOOP (int8_t, uint32_t)
+  RUN_LOOP (uint8_t, uint32_t)
+  RUN_LOOP (int16_t, uint32_t)
+  RUN_LOOP (uint16_t, uint32_t)
+  RUN_LOOP (int32_t, uint32_t)
+  RUN_LOOP (uint32_t, uint32_t)
+  RUN_LOOP (int64_t, uint32_t)
+  RUN_LOOP (uint64_t, uint32_t)
+  RUN_LOOP (_Float16, uint32_t)
+  RUN_LOOP (float, uint32_t)
+  RUN_LOOP (double, uint32_t)
+  RUN_LOOP (int8_t, uint64_t)
+  RUN_LOOP (uint8_t, uint64_t)
+  RUN_LOOP (int16_t, uint64_t)
+  RUN_LOOP (uint16_t, uint64_t)
+  RUN_LOOP (int32_t, uint64_t)
+  RUN_LOOP (uint32_t, uint64_t)
+  RUN_LOOP (int64_t, uint64_t)
+  RUN_LOOP (uint64_t, uint64_t)
+  RUN_LOOP (_Float16, uint64_t)
+  RUN_LOOP (float, uint64_t)
+  RUN_LOOP (double, uint64_t)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c
new file mode 100644
index 00000000000..33ddb5d9909
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c
new file mode 100644
index 00000000000..9f06fbe4ecf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c
new file mode 100644
index 00000000000..ae578f0c7b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c
new file mode 100644
index 00000000000..741abd166e1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c
new file mode 100644
index 00000000000..a14a5c4ced1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c
new file mode 100644
index 00000000000..0ccc7dce166
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c
new file mode 100644
index 00000000000..a34688ff339
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "mask_gather_load-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c
new file mode 100644
index 00000000000..1cfdede2060
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c
new file mode 100644
index 00000000000..623de41267b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c
new file mode 100644
index 00000000000..55112b067fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                               \
+  T (uint8_t, 64)                                                              \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c
new file mode 100644
index 00000000000..32a572d0064
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c
new file mode 100644
index 00000000000..fbaaa9d8a8e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c
new file mode 100644
index 00000000000..9b08661f8e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c
new file mode 100644
index 00000000000..dd26635f2cb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c
new file mode 100644
index 00000000000..fa0206a0ec2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c
new file mode 100644
index 00000000000..325e86c26a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                               \
+  T (uint8_t, 32)                                                              \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c
new file mode 100644
index 00000000000..b4b84e9cdda
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                               \
+  T (uint8_t, 32)                                                              \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c
new file mode 100644
index 00000000000..77a9af953e0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                               \
+  T (uint8_t, 64)                                                              \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c
new file mode 100644
index 00000000000..e0d52bf6291
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c
new file mode 100644
index 00000000000..c1af0d30e62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c
new file mode 100644
index 00000000000..6b1b02eae35
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c
new file mode 100644
index 00000000000..cef0bdec1d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c
new file mode 100644
index 00000000000..88a74d5a632
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c
new file mode 100644
index 00000000000..06804ab7111
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c
new file mode 100644
index 00000000000..c6c9a676ed6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c
new file mode 100644
index 00000000000..8246e964aad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c
new file mode 100644
index 00000000000..8ee35d2e505
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c
new file mode 100644
index 00000000000..c27a673e2b0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c
new file mode 100644
index 00000000000..6a390261cfb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c
new file mode 100644
index 00000000000..feb58d7d458
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                               \
+  T (uint8_t, 64)                                                              \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c
new file mode 100644
index 00000000000..e4c587fd7bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c
new file mode 100644
index 00000000000..33ad256d3db
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c
new file mode 100644
index 00000000000..48d305623e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c
new file mode 100644
index 00000000000..83ddc44bf9c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c
new file mode 100644
index 00000000000..11eb68bdb13
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c
new file mode 100644
index 00000000000..2e323477258
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                               \
+  T (uint8_t, 32)                                                              \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c
new file mode 100644
index 00000000000..e6732fe3790
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                               \
+  T (uint8_t, 32)                                                              \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c
new file mode 100644
index 00000000000..766a52b4622
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                               \
+  T (uint8_t, 64)                                                              \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c
new file mode 100644
index 00000000000..cafa64f3527
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c
new file mode 100644
index 00000000000..79f6885831f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c
new file mode 100644
index 00000000000..376db088153
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "scatter_store-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c
new file mode 100644
index 00000000000..103b8649d38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c
new file mode 100644
index 00000000000..f5f89c0fb4f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c
new file mode 100644
index 00000000000..049251ec888
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c
new file mode 100644
index 00000000000..59c8e701dbe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c
new file mode 100644
index 00000000000..a24401181e3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c
new file mode 100644
index 00000000000..080c9b83363
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (INDEX##BITS) ((i * 3 + 67) % 128);    \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c
new file mode 100644
index 00000000000..cc9f20f0daa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "scatter_store-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c
new file mode 100644
index 00000000000..c7b990668c7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+			  INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < n; ++i)                                        \
+      dest[i] += src[i * stride];                                              \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_GATHER_LOAD" 66 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */
+/* { dg-final { scan-assembler-not "vluxei" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c
new file mode 100644
index 00000000000..37dd7291f9e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+			  INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < (BITS + 13); ++i)                              \
+      dest[i] += src[i * (BITS - 3)];                                          \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_GATHER_LOAD" 46 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */
+/* { dg-final { scan-assembler-not "vluxei" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
new file mode 100644
index 00000000000..4b03c25a907
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
@@ -0,0 +1,84 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_load-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+	= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+			  stride_##DATA_TYPE##_##BITS,                         \
+			  n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (                                                                 \
+	dest_##DATA_TYPE##_##BITS[i]                                           \
+	== (dest2_##DATA_TYPE##_##BITS[i]                                      \
+	    + src_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]));     \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c
new file mode 100644
index 00000000000..8499e4cef24
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c
@@ -0,0 +1,84 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_load-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+	= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+			  stride_##DATA_TYPE##_##BITS,                         \
+			  n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (                                                                 \
+	dest_##DATA_TYPE##_##BITS[i]                                           \
+	== (dest2_##DATA_TYPE##_##BITS[i]                                      \
+	    + src_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]));     \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c
new file mode 100644
index 00000000000..df0560c5a31
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+			  INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < n; ++i)                                        \
+      dest[i * stride] = src[i] + BITS;                                        \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_SCATTER_STORE" 66 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */
+/* { dg-final { scan-assembler-not "vsuxei" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c
new file mode 100644
index 00000000000..1419cbc91b1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c
@@ -0,0 +1,46 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+			  INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < n; ++i)                                        \
+      dest[i * (BITS - 3)] = src[i] + BITS;                                    \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_SCATTER_STORE" 44 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */
+/* { dg-final { scan-assembler-not "vsuxei" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c
new file mode 100644
index 00000000000..e9dca4672c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c
@@ -0,0 +1,82 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_store-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+	= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+			  stride_##DATA_TYPE##_##BITS,                         \
+			  n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (dest_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]       \
+	      == (src_##DATA_TYPE##_##BITS[i] + BITS));                        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c
new file mode 100644
index 00000000000..509def789e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c
@@ -0,0 +1,82 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_store-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+	= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+			  stride_##DATA_TYPE##_##BITS,                         \
+			  n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (dest_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]       \
+	      == (src_##DATA_TYPE##_##BITS[i] + BITS));                        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
index 5e69235a268..19589fa9638 100644
--- a/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
+++ b/gcc/testsuite/gcc.target/riscv/rvv/rvv.exp
@@ -90,5 +90,28 @@ foreach op $AUTOVEC_TEST_OPTS {
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/vls-vlmax/*.\[cS\]]] \
 	"-std=c99 -O3 -ftree-vectorize --param riscv-autovec-preference=fixed-vlmax" $CFLAGS
 
+# gather-scatter tests
+set AUTOVEC_TEST_OPTS [list \
+  {-ftree-vectorize -O3 --param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m1 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O3 --param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m2 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O3 --param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m4 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O3 --param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m8 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O2 --param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m1 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O2 --param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m2 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O2 --param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m4 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O2 --param riscv-autovec-preference=fixed-vlmax --param riscv-autovec-lmul=m8 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O3 --param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m1 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O3 --param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m2 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O3 --param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m4 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O3 --param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m8 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O2 --param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m1 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O2 --param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m2 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O2 --param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m4 -fno-vect-cost-model -ffast-math} \
+  {-ftree-vectorize -O2 --param riscv-autovec-preference=scalable --param riscv-autovec-lmul=m8 -fno-vect-cost-model -ffast-math} ]
+foreach op $AUTOVEC_TEST_OPTS {
+  dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/autovec/gather-scatter/*.\[cS\]]] \
+    "" "$op"
+}
+
 # All done.
 dg-finish
-- 
2.36.1



* Re: [PATCH V4] RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-07 10:48 [PATCH V4] RISC-V: Support gather_load/scatter RVV auto-vectorization Juzhe-Zhong
@ 2023-07-07 12:34 ` Robin Dapp
  2023-07-07 14:34   ` 钟居哲
  0 siblings, 1 reply; 3+ messages in thread
From: Robin Dapp @ 2023-07-07 12:34 UTC (permalink / raw)
  To: Juzhe-Zhong, gcc-patches
  Cc: rdapp.gcc, kito.cheng, kito.cheng, palmer, palmer, jeffreyalaw

Hi Juzhe,

thanks, the somewhat unified modulo is IMHO more readable.
Could probably still be improved but OK with me for now.

> +	  if (is_dummy_len)
> +	    {
> +	      rtx dummy_len = gen_reg_rtx (Pmode);

Can we call this is_vlmax_len/is_vlmax and vlmax_len or so?

> +  if (inner_offsize < inner_vsize)
> +    {
> +      /* 7.2. Vector Load/Store Addressing Modes.
> +	 If the vector offset elements are narrower than XLEN, they are
> +	 zero-extended to XLEN before adding to the ptr effective address. If
> +	 the vector offset elements are wider than XLEN, the least-significant
> +	 XLEN bits are used in the address calculation. An implementation must
> +	 raise an illegal instruction exception if the EEW is not supported for
> +	 offset elements.  */
> +      if (!zero_extend_p || (zero_extend_p && scale_log2 != 0))

Hehe I really thought we have a widening shift, well ;)

I see, the zero extension refers to this part of the GCC docs
 "multiply the extended offset by operand 4;" 

and not to the RVV spec.  You could clarify this then, saying
that the RVV spec only refers to the scale_log2 == 0 case.
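
To make the distinction concrete, here is a minimal scalar sketch of the
addressing rule quoted from the GCC docs: extend the offset first, then
multiply by the scale.  It is only an illustration, not code from the patch;
the function names and the fixed int32_t element type are assumptions made
for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical scalar model of a gather load with unsigned 8-bit offsets.
   Each offset is zero-extended to pointer width first and only then
   scaled by (1 << scale_log2), so the scaled value cannot wrap in the
   narrow type.  */
static void
gather_u8_offsets (int32_t *dest, const int32_t *base,
                   const uint8_t *offsets, unsigned scale_log2, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      /* Extend, then scale: the result is a byte offset from BASE.  */
      uintptr_t byte_off = (uintptr_t) offsets[i] << scale_log2;
      memcpy (&dest[i], (const char *) base + byte_off, sizeof (int32_t));
    }
}

/* For contrast: scaling while still in the 8-bit type truncates the
   result, which is why the shift has to happen after the extension.  */
static uint8_t
scale_in_narrow_type (uint8_t offset, unsigned scale_log2)
{
  return (uint8_t) (offset << scale_log2);
}
```

With scale_log2 == 0 the offsets are already byte offsets, so the
zero-extension described in the RVV spec is all that is needed.  With
scale_log2 == 2 an offset of 100 must become 400 bytes, which no longer fits
in 8 bits, so the extension and shift have to be done explicitly beforehand.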

The rest LGTM now, no separate revision needed for those nits.

Regards
 Robin


* Re: Re: [PATCH V4] RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-07 12:34 ` Robin Dapp
@ 2023-07-07 14:34   ` 钟居哲
  0 siblings, 0 replies; 3+ messages in thread
From: 钟居哲 @ 2023-07-07 14:34 UTC (permalink / raw)
  To: rdapp.gcc, gcc-patches
  Cc: rdapp.gcc, kito.cheng, kito.cheng, palmer, palmer, Jeff Law

Thanks. I have still sent a V5 that renames "dummy" to "vlmax"
and adds more comments.

Thanks.



juzhe.zhong@rivai.ai
 

