public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
@ 2023-07-12  9:38 juzhe.zhong
       [not found] ` <66564378DDCC57ED+202307121748494317959@rivai.ai>
  0 siblings, 1 reply; 13+ messages in thread
From: juzhe.zhong @ 2023-07-12  9:38 UTC (permalink / raw)
  To: gcc-patches; +Cc: kito.cheng, kito.cheng, jeffreyalaw, rdapp.gcc, Ju-Zhe Zhong

From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>

This patch fully support gather_load/scatter_store:
1. Support single-rgroup on both RV32/RV64.
2. Support indexed element width can be same as or smaller than Pmode.
3. Support VLA SLP with gather/scatter.
4. Fully tested all gather/scatter with LMUL = M1/M2/M4/M8 both VLA and VLS.
5. Fix bug of handling (subreg:SI (const_poly_int:DI))
6. Fix bug on vec_perm which is used by gather/scatter SLP.

All kinds of GATHER/SCATTER are normalized into LEN_MASK_*.
We fully supported these 4 kinds of gather/scatter:
1. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and dummy mask (Full vector).
2. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and real mask.
3. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and dummy mask.
4. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and real mask.

Base on the disscussions with Richards, we don't lower vlse/vsse in RISC-V backend for strided load/store.
Instead, we leave it to the middle-end to handle that.

Regression is pass ok for trunk ?

gcc/ChangeLog:

	* config/riscv/autovec.md
	(len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>): New pattern.
	(len_mask_gather_load<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
	(len_mask_gather_load<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
	(len_mask_gather_load<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
	(len_mask_gather_load<mode><mode>): Ditto.
	(len_mask_scatter_store<VNX1_QHSD:mode><VNX1_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
	(len_mask_scatter_store<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
	(len_mask_scatter_store<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
	(len_mask_scatter_store<mode><mode>): Ditto.
	* config/riscv/predicates.md (const_1_operand): New predicate.
	(vector_gs_scale_operand_16): Ditto.
	(vector_gs_scale_operand_32): Ditto.
	(vector_gs_scale_operand_64): Ditto.
	(vector_gs_extension_operand): Ditto.
	(vector_gs_scale_operand_16_rv32): Ditto.
	(vector_gs_scale_operand_32_rv32): Ditto.
	* config/riscv/riscv-protos.h (enum insn_type): Add gather/scatter.
	(expand_gather_scatter): New function.
	* config/riscv/riscv-v.cc (gen_const_vector_dup): Add gather/scatter.
	(emit_vlmax_masked_store_insn): New function.
	(emit_nonvlmax_masked_store_insn): Ditto.
	(modulo_sel_indices): Ditto.
	(expand_vec_perm): Fix SLP for gather/scatter.
	(prepare_gather_scatter): New function.
	(expand_gather_scatter): Ditto.
	* config/riscv/riscv.cc (riscv_legitimize_move): Fix bug of
	(subreg:SI (DI CONST_POLY_INT)).
	* config/riscv/vector-iterators.md: Add gather/scatter.
	* config/riscv/vector.md (vec_duplicate<mode>): Use "@" instead.
	(@vec_duplicate<mode>): Ditto.
	(@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>): Fix name.
	(@pred_indexed_<order>store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/rvv.exp: Add gather/scatter tests.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c: New test.

---
 gcc/config/riscv/autovec.md                   | 256 ++++++++++++++++
 gcc/config/riscv/predicates.md                |  33 +-
 gcc/config/riscv/riscv-protos.h               |   3 +
 gcc/config/riscv/riscv-v.cc                   | 283 ++++++++++++++++--
 gcc/config/riscv/riscv.cc                     |  11 +-
 gcc/config/riscv/vector-iterators.md          | 118 ++++++--
 gcc/config/riscv/vector.md                    |  30 +-
 .../autovec/gather-scatter/gather_load-1.c    |  38 +++
 .../autovec/gather-scatter/gather_load-10.c   |  35 +++
 .../autovec/gather-scatter/gather_load-11.c   |  32 ++
 .../autovec/gather-scatter/gather_load-12.c   | 112 +++++++
 .../autovec/gather-scatter/gather_load-2.c    |  38 +++
 .../autovec/gather-scatter/gather_load-3.c    |  35 +++
 .../autovec/gather-scatter/gather_load-4.c    |  35 +++
 .../autovec/gather-scatter/gather_load-5.c    |  35 +++
 .../autovec/gather-scatter/gather_load-6.c    |  35 +++
 .../autovec/gather-scatter/gather_load-7.c    |  35 +++
 .../autovec/gather-scatter/gather_load-8.c    |  35 +++
 .../autovec/gather-scatter/gather_load-9.c    |  35 +++
 .../gather-scatter/gather_load_run-1.c        |  41 +++
 .../gather-scatter/gather_load_run-10.c       |  41 +++
 .../gather-scatter/gather_load_run-11.c       |  39 +++
 .../gather-scatter/gather_load_run-12.c       | 124 ++++++++
 .../gather-scatter/gather_load_run-2.c        |  41 +++
 .../gather-scatter/gather_load_run-3.c        |  41 +++
 .../gather-scatter/gather_load_run-4.c        |  41 +++
 .../gather-scatter/gather_load_run-5.c        |  41 +++
 .../gather-scatter/gather_load_run-6.c        |  41 +++
 .../gather-scatter/gather_load_run-7.c        |  41 +++
 .../gather-scatter/gather_load_run-8.c        |  41 +++
 .../gather-scatter/gather_load_run-9.c        |  41 +++
 .../gather-scatter/mask_gather_load-1.c       |  39 +++
 .../gather-scatter/mask_gather_load-10.c      |  36 +++
 .../gather-scatter/mask_gather_load-11.c      | 116 +++++++
 .../gather-scatter/mask_gather_load-2.c       |  39 +++
 .../gather-scatter/mask_gather_load-3.c       |  36 +++
 .../gather-scatter/mask_gather_load-4.c       |  36 +++
 .../gather-scatter/mask_gather_load-5.c       |  36 +++
 .../gather-scatter/mask_gather_load-6.c       |  36 +++
 .../gather-scatter/mask_gather_load-7.c       |  36 +++
 .../gather-scatter/mask_gather_load-8.c       |  36 +++
 .../gather-scatter/mask_gather_load-9.c       |  36 +++
 .../gather-scatter/mask_gather_load_run-1.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-10.c  |  48 +++
 .../gather-scatter/mask_gather_load_run-11.c  | 140 +++++++++
 .../gather-scatter/mask_gather_load_run-2.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-3.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-4.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-5.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-6.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-7.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-8.c   |  48 +++
 .../gather-scatter/mask_gather_load_run-9.c   |  48 +++
 .../gather-scatter/mask_scatter_store-1.c     |  39 +++
 .../gather-scatter/mask_scatter_store-10.c    |  36 +++
 .../gather-scatter/mask_scatter_store-2.c     |  39 +++
 .../gather-scatter/mask_scatter_store-3.c     |  36 +++
 .../gather-scatter/mask_scatter_store-4.c     |  36 +++
 .../gather-scatter/mask_scatter_store-5.c     |  36 +++
 .../gather-scatter/mask_scatter_store-6.c     |  36 +++
 .../gather-scatter/mask_scatter_store-7.c     |  36 +++
 .../gather-scatter/mask_scatter_store-8.c     |  36 +++
 .../gather-scatter/mask_scatter_store-9.c     |  36 +++
 .../gather-scatter/mask_scatter_store_run-1.c |  48 +++
 .../mask_scatter_store_run-10.c               |  48 +++
 .../gather-scatter/mask_scatter_store_run-2.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-3.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-4.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-5.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-6.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-7.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-8.c |  48 +++
 .../gather-scatter/mask_scatter_store_run-9.c |  48 +++
 .../autovec/gather-scatter/scatter_store-1.c  |  38 +++
 .../autovec/gather-scatter/scatter_store-10.c |  35 +++
 .../autovec/gather-scatter/scatter_store-2.c  |  38 +++
 .../autovec/gather-scatter/scatter_store-3.c  |  35 +++
 .../autovec/gather-scatter/scatter_store-4.c  |  35 +++
 .../autovec/gather-scatter/scatter_store-5.c  |  35 +++
 .../autovec/gather-scatter/scatter_store-6.c  |  35 +++
 .../autovec/gather-scatter/scatter_store-7.c  |  35 +++
 .../autovec/gather-scatter/scatter_store-8.c  |  35 +++
 .../autovec/gather-scatter/scatter_store-9.c  |  35 +++
 .../gather-scatter/scatter_store_run-1.c      |  40 +++
 .../gather-scatter/scatter_store_run-10.c     |  40 +++
 .../gather-scatter/scatter_store_run-2.c      |  40 +++
 .../gather-scatter/scatter_store_run-3.c      |  40 +++
 .../gather-scatter/scatter_store_run-4.c      |  40 +++
 .../gather-scatter/scatter_store_run-5.c      |  40 +++
 .../gather-scatter/scatter_store_run-6.c      |  40 +++
 .../gather-scatter/scatter_store_run-7.c      |  40 +++
 .../gather-scatter/scatter_store_run-8.c      |  40 +++
 .../gather-scatter/scatter_store_run-9.c      |  40 +++
 .../autovec/gather-scatter/strided_load-1.c   |  45 +++
 .../autovec/gather-scatter/strided_load-2.c   |  45 +++
 .../gather-scatter/strided_load_run-1.c       |  84 ++++++
 .../gather-scatter/strided_load_run-2.c       |  84 ++++++
 .../autovec/gather-scatter/strided_store-1.c  |  45 +++
 .../autovec/gather-scatter/strided_store-2.c  |  45 +++
 .../gather-scatter/strided_store_run-1.c      |  82 +++++
 .../gather-scatter/strided_store_run-2.c      |  82 +++++
 101 files changed, 4962 insertions(+), 61 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index d98a63c285e..600ddcdb152 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -57,6 +57,262 @@
   }
 )
 
+;; =========================================================================
+;; == Gather Load
+;; =========================================================================
+
+(define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
+  [(match_operand:VNX1_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX1_QHSDI 2 "register_operand")
+   (match_operand 3 "<VNX1_QHSD:gs_extension>")
+   (match_operand 4 "<VNX1_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX1_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX2_QHSD:mode><VNX2_QHSDI:mode>"
+  [(match_operand:VNX2_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX2_QHSDI 2 "register_operand")
+   (match_operand 3 "<VNX2_QHSD:gs_extension>")
+   (match_operand 4 "<VNX2_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX2_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX4_QHSD:mode><VNX4_QHSDI:mode>"
+  [(match_operand:VNX4_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX4_QHSDI 2 "register_operand")
+   (match_operand 3 "<VNX4_QHSD:gs_extension>")
+   (match_operand 4 "<VNX4_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX4_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX8_QHSD:mode><VNX8_QHSDI:mode>"
+  [(match_operand:VNX8_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX8_QHSDI 2 "register_operand")
+   (match_operand 3 "<VNX8_QHSD:gs_extension>")
+   (match_operand 4 "<VNX8_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX8_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
+  [(match_operand:VNX16_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX16_QHSDI 2 "register_operand")
+   (match_operand 3 "<VNX16_QHSD:gs_extension>")
+   (match_operand 4 "<VNX16_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX16_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX32_QHS:mode><VNX32_QHSI:mode>"
+  [(match_operand:VNX32_QHS 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX32_QHSI 2 "register_operand")
+   (match_operand 3 "<VNX32_QHS:gs_extension>")
+   (match_operand 4 "<VNX32_QHS:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX32_QHS:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX64_QH:mode><VNX64_QHI:mode>"
+  [(match_operand:VNX64_QH 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX64_QHI 2 "register_operand")
+   (match_operand 3 "<VNX64_QH:gs_extension>")
+   (match_operand 4 "<VNX64_QH:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX64_QH:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+;; When SEW = 8 and LMUL = 8, we can't find any index mode with
+;; larger SEW. Since RVV indexed load/store support zero extend
+;; implicitly and not support scaling, we should only allow
+;; operands[3] and operands[4] to be const_1_operand.
+(define_expand "len_mask_gather_load<mode><mode>"
+  [(match_operand:VNX128_Q 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX128_Q 2 "register_operand")
+   (match_operand 3 "const_1_operand")
+   (match_operand 4 "const_1_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+;; =========================================================================
+;; == Scatter Store
+;; =========================================================================
+
+(define_expand "len_mask_scatter_store<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX1_QHSDI 1 "register_operand")
+   (match_operand 2 "<VNX1_QHSD:gs_extension>")
+   (match_operand 3 "<VNX1_QHSD:gs_scale>")
+   (match_operand:VNX1_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX1_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX2_QHSD:mode><VNX2_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX2_QHSDI 1 "register_operand")
+   (match_operand 2 "<VNX2_QHSD:gs_extension>")
+   (match_operand 3 "<VNX2_QHSD:gs_scale>")
+   (match_operand:VNX2_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX2_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX4_QHSD:mode><VNX4_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX4_QHSDI 1 "register_operand")
+   (match_operand 2 "<VNX4_QHSD:gs_extension>")
+   (match_operand 3 "<VNX4_QHSD:gs_scale>")
+   (match_operand:VNX4_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX4_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX8_QHSD:mode><VNX8_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX8_QHSDI 1 "register_operand")
+   (match_operand 2 "<VNX8_QHSD:gs_extension>")
+   (match_operand 3 "<VNX8_QHSD:gs_scale>")
+   (match_operand:VNX8_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX8_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX16_QHSDI 1 "register_operand")
+   (match_operand 2 "<VNX16_QHSD:gs_extension>")
+   (match_operand 3 "<VNX16_QHSD:gs_scale>")
+   (match_operand:VNX16_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX16_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX32_QHS:mode><VNX32_QHSI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX32_QHSI 1 "register_operand")
+   (match_operand 2 "<VNX32_QHS:gs_extension>")
+   (match_operand 3 "<VNX32_QHS:gs_scale>")
+   (match_operand:VNX32_QHS 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX32_QHS:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX64_QH:mode><VNX64_QHI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX64_QHI 1 "register_operand")
+   (match_operand 2 "<VNX64_QH:gs_extension>")
+   (match_operand 3 "<VNX64_QH:gs_scale>")
+   (match_operand:VNX64_QH 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX64_QH:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+;; When SEW = 8 and LMUL = 8, we can't find any index mode with
+;; larger SEW. Since RVV indexed load/store support zero extend
+;; implicitly and not support scaling, we should only allow
+;; operands[3] and operands[4] to be const_1_operand.
+(define_expand "len_mask_scatter_store<mode><mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX128_Q 1 "register_operand")
+   (match_operand 2 "const_1_operand")
+   (match_operand 3 "const_1_operand")
+   (match_operand:VNX128_Q 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
 ;; =========================================================================
 ;; == Vector creation
 ;; =========================================================================
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index eb975eaf994..5a22c77f0cd 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -61,6 +61,10 @@
   (and (match_code "const_int,const_wide_int,const_vector")
        (match_test "op == CONST0_RTX (GET_MODE (op))")))
 
+(define_predicate "const_1_operand"
+  (and (match_code "const_int,const_wide_int,const_vector")
+       (match_test "op == CONST1_RTX (GET_MODE (op))")))
+
 (define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "const_0_operand")
        (match_operand 0 "register_operand")))
@@ -341,6 +345,33 @@
   (ior (match_operand 0 "register_operand")
        (match_code "const_vector")))
 
+(define_predicate "vector_gs_scale_operand_16"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
+
+(define_predicate "vector_gs_scale_operand_32"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
+
+(define_predicate "vector_gs_scale_operand_64"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1 || (INTVAL (op) == 8 && Pmode == DImode)")))
+
+(define_predicate "vector_gs_extension_operand"
+  (ior (match_operand 0 "const_1_operand")
+       (and (match_operand 0 "const_0_operand")
+            (match_test "Pmode == SImode"))))
+
+(define_predicate "vector_gs_scale_operand_16_rv32"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1
+		    || (INTVAL (op) == 2 && Pmode == SImode)")))
+
+(define_predicate "vector_gs_scale_operand_32_rv32"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1
+		    || (INTVAL (op) == 4 && Pmode == SImode)")))
+
 (define_predicate "ltge_operator"
   (match_code "lt,ltu,ge,geu"))
 
@@ -376,7 +407,7 @@
 		|| rtx_equal_p (op, CONST0_RTX (GET_MODE (op))))
 		&& maybe_gt (GET_MODE_BITSIZE (GET_MODE (op)), GET_MODE_BITSIZE (Pmode)))")
     (ior (match_test "rtx_equal_p (op, CONST0_RTX (GET_MODE (op)))")
-         (ior (match_operand 0 "const_int_operand")
+         (ior (match_code "const_int,const_poly_int")
               (ior (match_operand 0 "register_operand")
                    (match_test "satisfies_constraint_Wdm (op)"))))))
 
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 6cd5c6639c9..4f690d41753 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -149,6 +149,8 @@ enum insn_type
   RVV_SCALAR_MOV_OP = 4, /* +1 for VUNDEF according to vector.md.  */
   RVV_SLIDE_OP = 4,      /* Dest, VUNDEF, source and offset.  */
   RVV_COMPRESS_OP = 4,
+  RVV_GATHER_M_OP = 5,
+  RVV_SCATTER_M_OP = 4,
 };
 enum vlmul_type
 {
@@ -256,6 +258,7 @@ void expand_vec_init (rtx, rtx);
 void expand_vec_perm (rtx, rtx, rtx, rtx);
 void expand_select_vl (rtx *);
 void expand_load_store (rtx *, bool);
+void expand_gather_scatter (rtx *, bool);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
 enum fixed_point_rounding_mode
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 1a61f9a18cd..d4c90800400 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -49,6 +49,7 @@
 #include "tm-constrs.h"
 #include "rtx-vector-builder.h"
 #include "targhooks.h"
+#include "gimple.h"
 
 using namespace riscv_vector;
 
@@ -556,15 +557,22 @@ const_vec_all_in_range_p (rtx vec, poly_int64 minval, poly_int64 maxval)
   return true;
 }
 
-/* Return a const_int vector of VAL.
-
-   This function also exists in aarch64, we may unify it in middle-end in the
-   future.  */
+/* Return a const vector of VAL. The VAL can be either const_int or
+   const_poly_int.  */
 
 static rtx
 gen_const_vector_dup (machine_mode mode, poly_int64 val)
 {
-  rtx c = gen_int_mode (val, GET_MODE_INNER (mode));
+  scalar_mode smode = GET_MODE_INNER (mode);
+  rtx c = gen_int_mode (val, smode);
+  if (!val.is_constant () && GET_MODE_SIZE (smode) > GET_MODE_SIZE (Pmode))
+    {
+      /* When VAL is const_poly_int value, we need to explicitly broadcast
+	 it into a vector using RVV broadcast instruction.  */
+      rtx dup = gen_reg_rtx (mode);
+      emit_insn (gen_vec_duplicate (mode, dup, c));
+      return dup;
+    }
   return gen_const_vec_duplicate (mode, c);
 }
 
@@ -901,6 +909,39 @@ emit_nonvlmax_masked_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
   e.emit_insn ((enum insn_code) icode, ops);
 }
 
+/* This function emits a VLMAX masked store instruction.  */
+static void
+emit_vlmax_masked_store_insn (unsigned icode, int op_num, rtx *ops)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+					  /*HAS_DEST_P*/ false,
+					  /*FULLY_UNMASKED_P*/ false,
+					  /*USE_REAL_MERGE_P*/ true,
+					  /*HAS_AVL_P*/ true,
+					  /*VLMAX_P*/ true, dest_mode,
+					  mask_mode);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
+/* This function emits a non-VLMAX masked store instruction.  */
+static void
+emit_nonvlmax_masked_store_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+					  /*HAS_DEST_P*/ false,
+					  /*FULLY_UNMASKED_P*/ false,
+					  /*USE_REAL_MERGE_P*/ true,
+					  /*HAS_AVL_P*/ true,
+					  /*VLMAX_P*/ false, dest_mode,
+					  mask_mode);
+  e.set_vl (avl);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
 /* This function emits a masked instruction.  */
 void
 emit_vlmax_masked_mu_insn (unsigned icode, int op_num, rtx *ops)
@@ -1155,7 +1196,6 @@ static void
 expand_const_vector (rtx target, rtx src)
 {
   machine_mode mode = GET_MODE (target);
-  scalar_mode elt_mode = GET_MODE_INNER (mode);
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
     {
       rtx elt;
@@ -1180,7 +1220,6 @@ expand_const_vector (rtx target, rtx src)
 	}
       else
 	{
-	  elt = force_reg (elt_mode, elt);
 	  rtx ops[] = {tmp, elt};
 	  emit_vlmax_insn (code_for_pred_broadcast (mode), RVV_UNOP, ops);
 	}
@@ -2449,6 +2488,25 @@ expand_vec_cmp_float (rtx target, rtx_code code, rtx op0, rtx op1,
   return false;
 }
 
+/* Modulo all SEL indices to ensure they are all in range if [0, MAX_SEL].  */
+static rtx
+modulo_sel_indices (rtx sel, poly_uint64 max_sel)
+{
+  rtx sel_mod;
+  machine_mode sel_mode = GET_MODE (sel);
+  poly_uint64 nunits = GET_MODE_NUNITS (sel_mode);
+  /* If SEL is variable-length CONST_VECTOR, we don't need to modulo it.  */
+  if (!nunits.is_constant () && CONST_VECTOR_P (sel))
+    sel_mod = sel;
+  else
+    {
+      rtx mod = gen_const_vector_dup (sel_mode, max_sel);
+      sel_mod
+	= expand_simple_binop (sel_mode, AND, sel, mod, NULL, 0, OPTAB_DIRECT);
+    }
+  return sel_mod;
+}
+
 /* Implement vec_perm<mode>.  */
 
 void
@@ -2462,41 +2520,43 @@ expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
      index is in range of [0, nunits - 1]. A single vrgather instructions is
      enough. Since we will use vrgatherei16.vv for variable-length vector,
      it is never out of range and we don't need to modulo the index.  */
-  if (!nunits.is_constant () || const_vec_all_in_range_p (sel, 0, nunits - 1))
+  if (nunits.is_constant () && const_vec_all_in_range_p (sel, 0, nunits - 1))
     {
       emit_vlmax_gather_insn (target, op0, sel);
       return;
     }
 
+  /* Check if all the indices are same.  */
+  rtx elt;
+  if (const_vec_duplicate_p (sel, &elt))
+    {
+      poly_uint64 value = rtx_to_poly_int64 (elt);
+      rtx op = op0;
+      if (maybe_gt (value, nunits - 1))
+	{
+	  sel = gen_const_vector_dup (sel_mode, value - nunits);
+	  op = op1;
+	}
+      emit_vlmax_gather_insn (target, op, sel);
+    }
+
+  /* Note: vec_perm indices are supposed to wrap when they go beyond the
+     size of the two value vectors, i.e. the upper bits of the indices
+     are effectively ignored.  RVV vrgather instead produces 0 for any
+     out-of-range indices, so we need to modulo all the vec_perm indices
+     to ensure they are all in range of [0, nunits - 1] when op0 == op1
+     or all in range of [0, 2 * nunits - 1] when op0 != op1.  */
+  rtx sel_mod
+    = modulo_sel_indices (sel,
+			  rtx_equal_p (op0, op1) ? nunits - 1 : 2 * nunits - 1);
   /* Check if the two values vectors are the same.  */
-  if (rtx_equal_p (op0, op1) || const_vec_duplicate_p (sel))
-    {
-      /* Note: vec_perm indices are supposed to wrap when they go beyond the
-	 size of the two value vectors, i.e. the upper bits of the indices
-	 are effectively ignored.  RVV vrgather instead produces 0 for any
-	 out-of-range indices, so we need to modulo all the vec_perm indices
-	 to ensure they are all in range of [0, nunits - 1].  */
-      rtx max_sel = gen_const_vector_dup (sel_mode, nunits - 1);
-      rtx sel_mod = expand_simple_binop (sel_mode, AND, sel, max_sel, NULL, 0,
-					 OPTAB_DIRECT);
-      emit_vlmax_gather_insn (target, op1, sel_mod);
+  if (rtx_equal_p (op0, op1))
+    {
+      emit_vlmax_gather_insn (target, op0, sel_mod);
       return;
     }
 
-  rtx sel_mod = sel;
   rtx max_sel = gen_const_vector_dup (sel_mode, 2 * nunits - 1);
-  /* We don't need to modulo indices for VLA vector.
-     Since we should gurantee they aren't out of range before.  */
-  if (nunits.is_constant ())
-    {
-      /* Note: vec_perm indices are supposed to wrap when they go beyond the
-	 size of the two value vectors, i.e. the upper bits of the indices
-	 are effectively ignored.  RVV vrgather instead produces 0 for any
-	 out-of-range indices, so we need to modulo all the vec_perm indices
-	 to ensure they are all in range of [0, 2 * nunits - 1].  */
-      sel_mod = expand_simple_binop (sel_mode, AND, sel, max_sel, NULL, 0,
-				     OPTAB_DIRECT);
-    }
 
   /* This following sequence is handling the case that:
      __builtin_shufflevector (vec1, vec2, index...), the index can be any
@@ -2968,4 +3028,163 @@ expand_load_store (rtx *ops, bool is_load)
     }
 }
 
+/* Prepare insn_code for gather_load/scatter_store according to
+   the vector mode and index mode.  */
+static insn_code
+prepare_gather_scatter (machine_mode vec_mode, machine_mode idx_mode,
+			bool is_load)
+{
+  if (!is_load)
+    return code_for_pred_indexed_store (UNSPEC_UNORDERED, vec_mode, idx_mode);
+  else
+    {
+      unsigned src_eew_bitsize = GET_MODE_BITSIZE (GET_MODE_INNER (idx_mode));
+      unsigned dst_eew_bitsize = GET_MODE_BITSIZE (GET_MODE_INNER (vec_mode));
+      if (dst_eew_bitsize == src_eew_bitsize)
+	return code_for_pred_indexed_load_same_eew (UNSPEC_UNORDERED, vec_mode);
+      else if (dst_eew_bitsize > src_eew_bitsize)
+	{
+	  unsigned factor = dst_eew_bitsize / src_eew_bitsize;
+	  switch (factor)
+	    {
+	    case 2:
+	      return code_for_pred_indexed_load_x2_greater_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    case 4:
+	      return code_for_pred_indexed_load_x4_greater_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    case 8:
+	      return code_for_pred_indexed_load_x8_greater_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+      else
+	{
+	  unsigned factor = src_eew_bitsize / dst_eew_bitsize;
+	  switch (factor)
+	    {
+	    case 2:
+	      return code_for_pred_indexed_load_x2_smaller_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    case 4:
+	      return code_for_pred_indexed_load_x4_smaller_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    case 8:
+	      return code_for_pred_indexed_load_x8_smaller_eew (
+		UNSPEC_UNORDERED, vec_mode);
+	    default:
+	      gcc_unreachable ();
+	    }
+	}
+    }
+}
+
+/* Expand LEN_MASK_{GATHER_LOAD,SCATTER_STORE}.  */
+void
+expand_gather_scatter (rtx *ops, bool is_load)
+{
+  rtx ptr, vec_offset, vec_reg, len, mask;
+  bool zero_extend_p;
+  int scale_log2;
+  if (is_load)
+    {
+      vec_reg = ops[0];
+      ptr = ops[1];
+      vec_offset = ops[2];
+      zero_extend_p = INTVAL (ops[3]);
+      scale_log2 = exact_log2 (INTVAL (ops[4]));
+      len = ops[5];
+      mask = ops[7];
+    }
+  else
+    {
+      vec_reg = ops[4];
+      ptr = ops[0];
+      vec_offset = ops[1];
+      zero_extend_p = INTVAL (ops[2]);
+      scale_log2 = exact_log2 (INTVAL (ops[3]));
+      len = ops[5];
+      mask = ops[7];
+    }
+
+  machine_mode vec_mode = GET_MODE (vec_reg);
+  machine_mode idx_mode = GET_MODE (vec_offset);
+  scalar_mode inner_vec_mode = GET_MODE_INNER (vec_mode);
+  scalar_mode inner_idx_mode = GET_MODE_INNER (idx_mode);
+  unsigned inner_vsize = GET_MODE_BITSIZE (inner_vec_mode);
+  unsigned inner_offsize = GET_MODE_BITSIZE (inner_idx_mode);
+  poly_int64 nunits = GET_MODE_NUNITS (vec_mode);
+  poly_int64 value;
+  bool is_vlmax = poly_int_rtx_p (len, &value) && known_eq (value, nunits);
+
+  if (inner_offsize < inner_vsize)
+    {
+      /* 7.2. Vector Load/Store Addressing Modes.
+	 If the vector offset elements are narrower than XLEN, they are
+	 zero-extended to XLEN before adding to the ptr effective address. If
+	 the vector offset elements are wider than XLEN, the least-significant
+	 XLEN bits are used in the address calculation. An implementation must
+	 raise an illegal instruction exception if the EEW is not supported for
+	 offset elements.
+
+	 RVV spec only refers to the scale_log == 0 case.  */
+      if (!zero_extend_p || (zero_extend_p && scale_log2 != 0))
+	{
+	  if (zero_extend_p)
+	    inner_idx_mode
+	      = int_mode_for_size (inner_offsize * 2, 0).require ();
+	  else
+	    inner_idx_mode = int_mode_for_size (BITS_PER_WORD, 0).require ();
+	  machine_mode new_idx_mode
+	    = get_vector_mode (inner_idx_mode, nunits).require ();
+	  rtx tmp = gen_reg_rtx (new_idx_mode);
+	  emit_insn (gen_extend_insn (tmp, vec_offset, new_idx_mode, idx_mode,
+				      zero_extend_p ? true : false));
+	  vec_offset = tmp;
+	  idx_mode = new_idx_mode;
+	}
+    }
+
+  if (scale_log2 != 0)
+    {
+      rtx tmp = expand_binop (idx_mode, ashl_optab, vec_offset,
+			      gen_int_mode (scale_log2, Pmode), NULL_RTX, 0,
+			      OPTAB_DIRECT);
+      vec_offset = tmp;
+    }
+
+  insn_code icode = prepare_gather_scatter (vec_mode, idx_mode, is_load);
+  if (is_vlmax)
+    {
+      if (is_load)
+	{
+	  rtx load_ops[]
+	    = {vec_reg, mask, RVV_VUNDEF (vec_mode), ptr, vec_offset};
+	  emit_vlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops);
+	}
+      else
+	{
+	  rtx store_ops[] = {mask, ptr, vec_offset, vec_reg};
+	  emit_vlmax_masked_store_insn (icode, RVV_SCATTER_M_OP, store_ops);
+	}
+    }
+  else
+    {
+      if (is_load)
+	{
+	  rtx load_ops[]
+	    = {vec_reg, mask, RVV_VUNDEF (vec_mode), ptr, vec_offset};
+	  emit_nonvlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops, len);
+	}
+      else
+	{
+	  rtx store_ops[] = {mask, ptr, vec_offset, vec_reg};
+	  emit_nonvlmax_masked_store_insn (icode, RVV_SCATTER_M_OP, store_ops,
+					   len);
+	}
+    }
+}
+
 } // namespace riscv_vector
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 38d8eb2fcf5..c5dbb6c34ee 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2060,7 +2060,14 @@ riscv_legitimize_poly_move (machine_mode mode, rtx dest, rtx tmp, rtx src)
      (m, n) = base * magn + constant.
      This calculation doesn't need div operation.  */
 
-  emit_move_insn (tmp, gen_int_mode (BYTES_PER_RISCV_VECTOR, mode));
+  if (known_le (GET_MODE_SIZE (mode), GET_MODE_SIZE (Pmode)))
+    emit_move_insn (tmp, gen_int_mode (BYTES_PER_RISCV_VECTOR, mode));
+  else
+    {
+      emit_move_insn (gen_highpart (Pmode, tmp), CONST0_RTX (Pmode));
+      emit_move_insn (gen_lowpart (Pmode, tmp),
+		      gen_int_mode (BYTES_PER_RISCV_VECTOR, Pmode));
+    }
 
   if (BYTES_PER_RISCV_VECTOR.is_constant ())
     {
@@ -2167,7 +2174,7 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
 	  return false;
 	}
 
-      if (satisfies_constraint_vp (src))
+      if (satisfies_constraint_vp (src) && GET_MODE (src) == Pmode)
 	return false;
 
       if (GET_MODE_SIZE (mode).to_constant () < GET_MODE_SIZE (Pmode))
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 8afd3dcaddd..ec49544bcf6 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -115,6 +115,9 @@
 
 (define_mode_iterator VEEWEXT2 [
   (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx64HI "TARGET_MIN_VLEN >= 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16") (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
   (VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI "TARGET_VECTOR_ELEN_64")
   (VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
@@ -161,6 +164,8 @@
 (define_mode_iterator VEEWTRUNC2 [
   (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI (VNx64QI "TARGET_MIN_VLEN >= 128")
   (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN >= 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16") (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
   (VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN >= 128")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
@@ -172,6 +177,8 @@
 (define_mode_iterator VEEWTRUNC4 [
   (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI VNx4QI VNx8QI VNx16QI (VNx32QI "TARGET_MIN_VLEN >= 128")
   (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI (VNx16HI "TARGET_MIN_VLEN >= 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
 ])
 
 (define_mode_iterator VEEWTRUNC8 [
@@ -362,46 +369,67 @@
 ])
 
 (define_mode_iterator VNX1_QHSD [
-  (VNx1QI "TARGET_MIN_VLEN < 128") (VNx1HI "TARGET_MIN_VLEN < 128") (VNx1SI "TARGET_MIN_VLEN < 128")
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  (VNx1SI "TARGET_MIN_VLEN < 128")
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
   (VNx1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN < 128")
 ])
 
 (define_mode_iterator VNX2_QHSD [
-  VNx2QI VNx2HI VNx2SI
+  VNx2QI
+  VNx2HI
+  VNx2SI
   (VNx2DI "TARGET_VECTOR_ELEN_64")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx2DF "TARGET_VECTOR_ELEN_FP_64")
 ])
 
 (define_mode_iterator VNX4_QHSD [
-  VNx4QI VNx4HI VNx4SI
+  VNx4QI
+  VNx4HI
+  VNx4SI
   (VNx4DI "TARGET_VECTOR_ELEN_64")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx4DF "TARGET_VECTOR_ELEN_FP_64")
 ])
 
 (define_mode_iterator VNX8_QHSD [
-  VNx8QI VNx8HI VNx8SI
+  VNx8QI
+  VNx8HI
+  VNx8SI
   (VNx8DI "TARGET_VECTOR_ELEN_64")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx8SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx8DF "TARGET_VECTOR_ELEN_FP_64")
 ])
 
-(define_mode_iterator VNX16_QHS [
-  VNx16QI VNx16HI (VNx16SI "TARGET_MIN_VLEN > 32")
+(define_mode_iterator VNX16_QHSD [
+  VNx16QI
+  VNx16HI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
-  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128") (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
+  (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
 ])
 
 (define_mode_iterator VNX32_QHS [
-  VNx32QI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128") (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
+  VNx32QI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
+  (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
 ])
 
 (define_mode_iterator VNX64_QH [
   (VNx64QI "TARGET_MIN_VLEN > 32")
   (VNx64HI "TARGET_MIN_VLEN >= 128")
+  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
 ])
 
 (define_mode_iterator VNX128_Q [
@@ -409,35 +437,49 @@
 ])
 
 (define_mode_iterator VNX1_QHSDI [
-  (VNx1QI "TARGET_MIN_VLEN < 128") (VNx1HI "TARGET_MIN_VLEN < 128") (VNx1SI "TARGET_MIN_VLEN < 128")
-  (VNx1DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  (VNx1SI "TARGET_MIN_VLEN < 128")
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128 && TARGET_64BIT")
 ])
 
 (define_mode_iterator VNX2_QHSDI [
-  VNx2QI VNx2HI VNx2SI
-  (VNx2DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
+  VNx2QI
+  VNx2HI
+  VNx2SI
+  (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
 ])
 
 (define_mode_iterator VNX4_QHSDI [
-  VNx4QI VNx4HI VNx4SI
-  (VNx4DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
+  VNx4QI
+  VNx4HI
+  VNx4SI
+  (VNx4DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
 ])
 
 (define_mode_iterator VNX8_QHSDI [
-  VNx8QI VNx8HI VNx8SI
-  (VNx8DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
+  VNx8QI
+  VNx8HI
+  VNx8SI
+  (VNx8DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
 ])
 
 (define_mode_iterator VNX16_QHSDI [
-  VNx16QI VNx16HI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx16DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  VNx16QI
+  VNx16HI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128 && TARGET_64BIT")
 ])
 
 (define_mode_iterator VNX32_QHSI [
-  VNx32QI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
+  VNx32QI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
 ])
 
 (define_mode_iterator VNX64_QHI [
-  VNx64QI (VNx64HI "TARGET_MIN_VLEN >= 128")
+  (VNx64QI "TARGET_MIN_VLEN > 32")
+  (VNx64HI "TARGET_MIN_VLEN >= 128")
 ])
 
 (define_mode_iterator V_WHOLE [
@@ -1393,6 +1435,8 @@
 (define_mode_attr VINDEX_DOUBLE_TRUNC [
   (VNx1HI "VNx1QI") (VNx2HI "VNx2QI")  (VNx4HI "VNx4QI")  (VNx8HI "VNx8QI")
   (VNx16HI "VNx16QI") (VNx32HI "VNx32QI") (VNx64HI "VNx64QI")
+  (VNx1HF "VNx1QI") (VNx2HF "VNx2QI")  (VNx4HF "VNx4QI")  (VNx8HF "VNx8QI")
+  (VNx16HF "VNx16QI") (VNx32HF "VNx32QI") (VNx64HF "VNx64QI")
   (VNx1SI "VNx1HI") (VNx2SI "VNx2HI") (VNx4SI "VNx4HI") (VNx8SI "VNx8HI")
   (VNx16SI "VNx16HI") (VNx32SI "VNx32HI")
   (VNx1SF "VNx1HI") (VNx2SF "VNx2HI") (VNx4SF "VNx4HI") (VNx8SF "VNx8HI")
@@ -1420,6 +1464,7 @@
 (define_mode_attr VINDEX_DOUBLE_EXT [
   (VNx1QI "VNx1HI") (VNx2QI "VNx2HI") (VNx4QI "VNx4HI") (VNx8QI "VNx8HI") (VNx16QI "VNx16HI") (VNx32QI "VNx32HI") (VNx64QI "VNx64HI")
   (VNx1HI "VNx1SI") (VNx2HI "VNx2SI") (VNx4HI "VNx4SI") (VNx8HI "VNx8SI") (VNx16HI "VNx16SI") (VNx32HI "VNx32SI")
+  (VNx1HF "VNx1SI") (VNx2HF "VNx2SI") (VNx4HF "VNx4SI") (VNx8HF "VNx8SI") (VNx16HF "VNx16SI") (VNx32HF "VNx32SI")
   (VNx1SI "VNx1DI") (VNx2SI "VNx2DI") (VNx4SI "VNx4DI") (VNx8SI "VNx8DI") (VNx16SI "VNx16DI")
   (VNx1SF "VNx1DI") (VNx2SF "VNx2DI") (VNx4SF "VNx4DI") (VNx8SF "VNx8DI") (VNx16SF "VNx16DI")
 ])
@@ -1427,6 +1472,7 @@
 (define_mode_attr VINDEX_QUAD_EXT [
   (VNx1QI "VNx1SI") (VNx2QI "VNx2SI") (VNx4QI "VNx4SI") (VNx8QI "VNx8SI") (VNx16QI "VNx16SI") (VNx32QI "VNx32SI")
   (VNx1HI "VNx1DI") (VNx2HI "VNx2DI") (VNx4HI "VNx4DI") (VNx8HI "VNx8DI") (VNx16HI "VNx16DI")
+  (VNx1HF "VNx1DI") (VNx2HF "VNx2DI") (VNx4HF "VNx4DI") (VNx8HF "VNx8DI") (VNx16HF "VNx16DI")
 ])
 
 (define_mode_attr VINDEX_OCT_EXT [
@@ -1471,6 +1517,40 @@
   (VNx4DI "VNx8BI") (VNx8DI "VNx16BI") (VNx16DI "VNx32BI")
 ])
 
+(define_mode_attr gs_extension [
+  (VNx1QI "immediate_operand") (VNx2QI "immediate_operand") (VNx4QI "immediate_operand") (VNx8QI "immediate_operand") (VNx16QI "immediate_operand")
+  (VNx32QI "vector_gs_extension_operand") (VNx64QI "const_1_operand")
+  (VNx1HI "immediate_operand") (VNx2HI "immediate_operand") (VNx4HI "immediate_operand") (VNx8HI "immediate_operand") (VNx16HI "immediate_operand")
+  (VNx32HI "vector_gs_extension_operand") (VNx64HI "const_1_operand")
+  (VNx1SI "immediate_operand") (VNx2SI "immediate_operand") (VNx4SI "immediate_operand") (VNx8SI "immediate_operand") (VNx16SI "immediate_operand")
+  (VNx32SI "vector_gs_extension_operand")
+  (VNx1DI "immediate_operand") (VNx2DI "immediate_operand") (VNx4DI "immediate_operand") (VNx8DI "immediate_operand") (VNx16DI "immediate_operand")
+
+  (VNx1HF "immediate_operand") (VNx2HF "immediate_operand") (VNx4HF "immediate_operand") (VNx8HF "immediate_operand") (VNx16HF "immediate_operand")
+  (VNx32HF "vector_gs_extension_operand") (VNx64HF "const_1_operand")
+  (VNx1SF "immediate_operand") (VNx2SF "immediate_operand") (VNx4SF "immediate_operand") (VNx8SF "immediate_operand") (VNx16SF "immediate_operand")
+  (VNx32SF "vector_gs_extension_operand")
+  (VNx1DF "immediate_operand") (VNx2DF "immediate_operand") (VNx4DF "immediate_operand") (VNx8DF "immediate_operand") (VNx16DF "immediate_operand")
+])
+
+(define_mode_attr gs_scale [
+  (VNx1QI "const_1_operand") (VNx2QI "const_1_operand") (VNx4QI "const_1_operand") (VNx8QI "const_1_operand")
+  (VNx16QI "const_1_operand") (VNx32QI "const_1_operand") (VNx64QI "const_1_operand")
+  (VNx1HI "vector_gs_scale_operand_16") (VNx2HI "vector_gs_scale_operand_16") (VNx4HI "vector_gs_scale_operand_16") (VNx8HI "vector_gs_scale_operand_16")
+  (VNx16HI "vector_gs_scale_operand_16") (VNx32HI "vector_gs_scale_operand_16_rv32") (VNx64HI "const_1_operand")
+  (VNx1SI "vector_gs_scale_operand_32") (VNx2SI "vector_gs_scale_operand_32") (VNx4SI "vector_gs_scale_operand_32") (VNx8SI "vector_gs_scale_operand_32")
+  (VNx16SI "vector_gs_scale_operand_32") (VNx32SI "vector_gs_scale_operand_32_rv32")
+  (VNx1DI "vector_gs_scale_operand_64") (VNx2DI "vector_gs_scale_operand_64") (VNx4DI "vector_gs_scale_operand_64") (VNx8DI "vector_gs_scale_operand_64")
+  (VNx16DI "vector_gs_scale_operand_64")
+
+  (VNx1HF "vector_gs_scale_operand_16") (VNx2HF "vector_gs_scale_operand_16") (VNx4HF "vector_gs_scale_operand_16") (VNx8HF "vector_gs_scale_operand_16")
+  (VNx16HF "vector_gs_scale_operand_16") (VNx32HF "vector_gs_scale_operand_16_rv32") (VNx64HF "const_1_operand")
+  (VNx1SF "vector_gs_scale_operand_32") (VNx2SF "vector_gs_scale_operand_32") (VNx4SF "vector_gs_scale_operand_32") (VNx8SF "vector_gs_scale_operand_32")
+  (VNx16SF "vector_gs_scale_operand_32") (VNx32SF "vector_gs_scale_operand_32_rv32")
+  (VNx1DF "vector_gs_scale_operand_64") (VNx2DF "vector_gs_scale_operand_64") (VNx4DF "vector_gs_scale_operand_64") (VNx8DF "vector_gs_scale_operand_64")
+  (VNx16DF "vector_gs_scale_operand_64")
+])
+
 (define_int_iterator WREDUC [UNSPEC_WREDUC_SUM UNSPEC_WREDUC_USUM])
 
 (define_int_iterator ORDER [UNSPEC_ORDERED UNSPEC_UNORDERED])
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 5b7a17b9d34..de944637570 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -818,7 +818,7 @@
 ;; This pattern only handles duplicates of non-constant inputs.
 ;; Constant vectors go through the movm pattern instead.
 ;; So "direct_broadcast_operand" can only be mem or reg, no CONSTANT.
-(define_expand "vec_duplicate<mode>"
+(define_expand "@vec_duplicate<mode>"
   [(set (match_operand:V 0 "register_operand")
 	(vec_duplicate:V
 	  (match_operand:<VEL> 1 "direct_broadcast_operand")))]
@@ -1357,8 +1357,16 @@
 	}
     }
   else if (GET_MODE_BITSIZE (<VEL>mode) > GET_MODE_BITSIZE (Pmode)
-           && immediate_operand (operands[3], Pmode))
-    operands[3] = gen_rtx_SIGN_EXTEND (<VEL>mode, force_reg (Pmode, operands[3]));
+           && (immediate_operand (operands[3], Pmode)
+	       || (CONST_POLY_INT_P (operands[3])
+	           && known_ge (rtx_to_poly_int64 (operands[3]), 0U)
+		   && known_le (rtx_to_poly_int64 (operands[3]), GET_MODE_SIZE (<MODE>mode)))))
+    {
+      rtx tmp = gen_reg_rtx (Pmode);
+      poly_int64 value = rtx_to_poly_int64 (operands[3]);
+      emit_move_insn (tmp, gen_int_mode (value, Pmode));
+      operands[3] = gen_rtx_SIGN_EXTEND (<VEL>mode, tmp);
+    }
   else
     operands[3] = force_reg (<VEL>mode, operands[3]);
 })
@@ -1387,7 +1395,8 @@
    vlse<sew>.v\t%0,%3,zero
    vmv.s.x\t%0,%3
    vmv.s.x\t%0,%3"
-  "register_operand (operands[3], <VEL>mode)
+  "(register_operand (operands[3], <VEL>mode)
+  || CONST_POLY_INT_P (operands[3]))
   && GET_MODE_BITSIZE (<VEL>mode) > GET_MODE_BITSIZE (Pmode)"
   [(set (match_dup 0)
 	(if_then_else:VI (unspec:<VM> [(match_dup 1) (match_dup 4)
@@ -1397,6 +1406,12 @@
 	  (match_dup 2)))]
   {
     gcc_assert (can_create_pseudo_p ());
+    if (CONST_POLY_INT_P (operands[3]))
+      {
+	rtx tmp = gen_reg_rtx (<VEL>mode);
+	emit_move_insn (tmp, operands[3]);
+	operands[3] = tmp;
+      }
     rtx m = assign_stack_local (<VEL>mode, GET_MODE_SIZE (<VEL>mode),
 				GET_MODE_ALIGNMENT (<VEL>mode));
     m = validize_mem (m);
@@ -1483,6 +1498,7 @@
 	     (match_operand 5 "vector_length_operand"    "   rK,    rK,    rK")
 	     (match_operand 6 "const_int_operand"        "    i,     i,     i")
 	     (match_operand 7 "const_int_operand"        "    i,     i,     i")
+	     (match_operand 8 "const_int_operand"        "    i,     i,     i")
 	     (reg:SI VL_REGNUM)
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	  (unspec:V
@@ -1738,7 +1754,7 @@
   [(set_attr "type" "vst<order>x")
    (set_attr "mode" "<VNX8_QHSD:MODE>")])
 
-(define_insn "@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>"
+(define_insn "@pred_indexed_<order>store<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
   [(set (mem:BLK (scratch))
 	(unspec:BLK
 	  [(unspec:<VM>
@@ -1749,11 +1765,11 @@
 	     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
 	   (match_operand 1 "pmode_reg_or_0_operand"      "  rJ")
 	   (match_operand:VNX16_QHSDI 2 "register_operand" "  vr")
-	   (match_operand:VNX16_QHS 3 "register_operand"  "  vr")] ORDER))]
+	   (match_operand:VNX16_QHSD 3 "register_operand"  "  vr")] ORDER))]
   "TARGET_VECTOR"
   "vs<order>xei<VNX16_QHSDI:sew>.v\t%3,(%z1),%2%p0"
   [(set_attr "type" "vst<order>x")
-   (set_attr "mode" "<VNX16_QHS:MODE>")])
+   (set_attr "mode" "<VNX16_QHSD:MODE>")])
 
 (define_insn "@pred_indexed_<order>store<VNX32_QHS:mode><VNX32_QHSI:mode>"
   [(set (mem:BLK (scratch))
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c
new file mode 100644
index 00000000000..dffe13f6a8a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c
new file mode 100644
index 00000000000..a622e516f06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                               \
+  T (uint16_t, 64)                                                              \
+  T (_Float16, 64)                                                              \
+  T (int32_t, 64)                                                               \
+  T (uint32_t, 64)                                                              \
+  T (float, 64)                                                                 \
+  T (int64_t, 64)                                                               \
+  T (uint64_t, 64)                                                              \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c
new file mode 100644
index 00000000000..4692380233d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_LOOP(DATA_TYPE)                                                   \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict *src)           \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += *src[i];                                                      \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t)                                                                   \
+  T (uint8_t)                                                                  \
+  T (int16_t)                                                                  \
+  T (uint16_t)                                                                 \
+  T (_Float16)                                                                 \
+  T (int32_t)                                                                  \
+  T (uint32_t)                                                                 \
+  T (float)                                                                    \
+  T (int64_t)                                                                  \
+  T (uint64_t)                                                                 \
+  T (double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c
new file mode 100644
index 00000000000..71a3dd466fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c
@@ -0,0 +1,112 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_LOOP(DATA_TYPE, INDEX_TYPE)                                       \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x,  \
+				INDEX_TYPE *restrict index)                    \
+  {                                                                            \
+    for (int i = 0; i < 100; ++i)                                              \
+      {                                                                        \
+	y[i * 2] = x[index[i * 2]] + 1;                                        \
+	y[i * 2 + 1] = x[index[i * 2 + 1]] + 2;                                \
+      }                                                                        \
+  }
+
+TEST_LOOP (int8_t, int8_t)
+TEST_LOOP (uint8_t, int8_t)
+TEST_LOOP (int16_t, int8_t)
+TEST_LOOP (uint16_t, int8_t)
+TEST_LOOP (int32_t, int8_t)
+TEST_LOOP (uint32_t, int8_t)
+TEST_LOOP (int64_t, int8_t)
+TEST_LOOP (uint64_t, int8_t)
+TEST_LOOP (_Float16, int8_t)
+TEST_LOOP (float, int8_t)
+TEST_LOOP (double, int8_t)
+TEST_LOOP (int8_t, int16_t)
+TEST_LOOP (uint8_t, int16_t)
+TEST_LOOP (int16_t, int16_t)
+TEST_LOOP (uint16_t, int16_t)
+TEST_LOOP (int32_t, int16_t)
+TEST_LOOP (uint32_t, int16_t)
+TEST_LOOP (int64_t, int16_t)
+TEST_LOOP (uint64_t, int16_t)
+TEST_LOOP (_Float16, int16_t)
+TEST_LOOP (float, int16_t)
+TEST_LOOP (double, int16_t)
+TEST_LOOP (int8_t, int32_t)
+TEST_LOOP (uint8_t, int32_t)
+TEST_LOOP (int16_t, int32_t)
+TEST_LOOP (uint16_t, int32_t)
+TEST_LOOP (int32_t, int32_t)
+TEST_LOOP (uint32_t, int32_t)
+TEST_LOOP (int64_t, int32_t)
+TEST_LOOP (uint64_t, int32_t)
+TEST_LOOP (_Float16, int32_t)
+TEST_LOOP (float, int32_t)
+TEST_LOOP (double, int32_t)
+TEST_LOOP (int8_t, int64_t)
+TEST_LOOP (uint8_t, int64_t)
+TEST_LOOP (int16_t, int64_t)
+TEST_LOOP (uint16_t, int64_t)
+TEST_LOOP (int32_t, int64_t)
+TEST_LOOP (uint32_t, int64_t)
+TEST_LOOP (int64_t, int64_t)
+TEST_LOOP (uint64_t, int64_t)
+TEST_LOOP (_Float16, int64_t)
+TEST_LOOP (float, int64_t)
+TEST_LOOP (double, int64_t)
+TEST_LOOP (int8_t, uint8_t)
+TEST_LOOP (uint8_t, uint8_t)
+TEST_LOOP (int16_t, uint8_t)
+TEST_LOOP (uint16_t, uint8_t)
+TEST_LOOP (int32_t, uint8_t)
+TEST_LOOP (uint32_t, uint8_t)
+TEST_LOOP (int64_t, uint8_t)
+TEST_LOOP (uint64_t, uint8_t)
+TEST_LOOP (_Float16, uint8_t)
+TEST_LOOP (float, uint8_t)
+TEST_LOOP (double, uint8_t)
+TEST_LOOP (int8_t, uint16_t)
+TEST_LOOP (uint8_t, uint16_t)
+TEST_LOOP (int16_t, uint16_t)
+TEST_LOOP (uint16_t, uint16_t)
+TEST_LOOP (int32_t, uint16_t)
+TEST_LOOP (uint32_t, uint16_t)
+TEST_LOOP (int64_t, uint16_t)
+TEST_LOOP (uint64_t, uint16_t)
+TEST_LOOP (_Float16, uint16_t)
+TEST_LOOP (float, uint16_t)
+TEST_LOOP (double, uint16_t)
+TEST_LOOP (int8_t, uint32_t)
+TEST_LOOP (uint8_t, uint32_t)
+TEST_LOOP (int16_t, uint32_t)
+TEST_LOOP (uint16_t, uint32_t)
+TEST_LOOP (int32_t, uint32_t)
+TEST_LOOP (uint32_t, uint32_t)
+TEST_LOOP (int64_t, uint32_t)
+TEST_LOOP (uint64_t, uint32_t)
+TEST_LOOP (_Float16, uint32_t)
+TEST_LOOP (float, uint32_t)
+TEST_LOOP (double, uint32_t)
+TEST_LOOP (int8_t, uint64_t)
+TEST_LOOP (uint8_t, uint64_t)
+TEST_LOOP (int16_t, uint64_t)
+TEST_LOOP (uint16_t, uint64_t)
+TEST_LOOP (int32_t, uint64_t)
+TEST_LOOP (uint32_t, uint64_t)
+TEST_LOOP (int64_t, uint64_t)
+TEST_LOOP (uint64_t, uint64_t)
+TEST_LOOP (_Float16, uint64_t)
+TEST_LOOP (float, uint64_t)
+TEST_LOOP (double, uint64_t)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 88 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-assembler-not "vluxei64\.v" } } */
+/* { dg-final { scan-assembler-not "vsuxei64\.v" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c
new file mode 100644
index 00000000000..785550c4b2d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c
new file mode 100644
index 00000000000..22aeb889221
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c
new file mode 100644
index 00000000000..d74a83415d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c
new file mode 100644
index 00000000000..2b6c0a87c18
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                               \
+  T (uint16_t, 16)                                                              \
+  T (_Float16, 16)                                                              \
+  T (int32_t, 16)                                                               \
+  T (uint32_t, 16)                                                              \
+  T (float, 16)                                                                 \
+  T (int64_t, 16)                                                               \
+  T (uint64_t, 16)                                                              \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c
new file mode 100644
index 00000000000..407cc8a5a73
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                               \
+  T (uint16_t, 16)                                                              \
+  T (_Float16, 16)                                                              \
+  T (int32_t, 16)                                                               \
+  T (uint32_t, 16)                                                              \
+  T (float, 16)                                                                 \
+  T (int64_t, 16)                                                               \
+  T (uint64_t, 16)                                                              \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c
new file mode 100644
index 00000000000..81b31ef26aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                               \
+  T (uint16_t, 32)                                                              \
+  T (_Float16, 32)                                                              \
+  T (int32_t, 32)                                                               \
+  T (uint32_t, 32)                                                              \
+  T (float, 32)                                                                 \
+  T (int64_t, 32)                                                               \
+  T (uint64_t, 32)                                                              \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c
new file mode 100644
index 00000000000..0bfdb8f0acf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                               \
+  T (uint16_t, 32)                                                              \
+  T (_Float16, 32)                                                              \
+  T (int32_t, 32)                                                               \
+  T (uint32_t, 32)                                                              \
+  T (float, 32)                                                                 \
+  T (int64_t, 32)                                                               \
+  T (uint64_t, 32)                                                              \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c
new file mode 100644
index 00000000000..46f791105ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                               \
+  T (uint16_t, 64)                                                              \
+  T (_Float16, 64)                                                              \
+  T (int32_t, 64)                                                               \
+  T (uint32_t, 64)                                                              \
+  T (float, 64)                                                                 \
+  T (int64_t, 64)                                                               \
+  T (uint64_t, 64)                                                              \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c
new file mode 100644
index 00000000000..0d3c5b71e5b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c
new file mode 100644
index 00000000000..145df1e7797
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c
new file mode 100644
index 00000000000..d36b6f025f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-11.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE *src_##DATA_TYPE[128];                                             \
+  DATA_TYPE src2_##DATA_TYPE[128];                                             \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = src2_##DATA_TYPE + i;                               \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE);                           \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i] + src_##DATA_TYPE[i][0]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c
new file mode 100644
index 00000000000..b4e2ead8ca9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c
@@ -0,0 +1,124 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-12.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, INDEX_TYPE)                                        \
+  DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                        \
+  DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                         \
+  INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                      \
+  for (int i = 0; i < 202; i++)                                                \
+    {                                                                          \
+      src_##DATA_TYPE##_##INDEX_TYPE[i]                                        \
+	= (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1));         \
+      index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55);                    \
+    }                                                                          \
+  f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE,               \
+				src_##DATA_TYPE##_##INDEX_TYPE,                \
+				index_##DATA_TYPE##_##INDEX_TYPE);             \
+  for (int i = 0; i < 100; i++)                                                \
+    {                                                                          \
+      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2]                           \
+	      == (src_##DATA_TYPE##_##INDEX_TYPE                               \
+		    [index_##DATA_TYPE##_##INDEX_TYPE[i * 2]]                  \
+		  + 1));                                                       \
+      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]                       \
+	      == (src_##DATA_TYPE##_##INDEX_TYPE                               \
+		    [index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]]              \
+		  + 2));                                                       \
+    }
+
+  RUN_LOOP (int8_t, int8_t)
+  RUN_LOOP (uint8_t, int8_t)
+  RUN_LOOP (int16_t, int8_t)
+  RUN_LOOP (uint16_t, int8_t)
+  RUN_LOOP (int32_t, int8_t)
+  RUN_LOOP (uint32_t, int8_t)
+  RUN_LOOP (int64_t, int8_t)
+  RUN_LOOP (uint64_t, int8_t)
+  RUN_LOOP (_Float16, int8_t)
+  RUN_LOOP (float, int8_t)
+  RUN_LOOP (double, int8_t)
+  RUN_LOOP (int8_t, int16_t)
+  RUN_LOOP (uint8_t, int16_t)
+  RUN_LOOP (int16_t, int16_t)
+  RUN_LOOP (uint16_t, int16_t)
+  RUN_LOOP (int32_t, int16_t)
+  RUN_LOOP (uint32_t, int16_t)
+  RUN_LOOP (int64_t, int16_t)
+  RUN_LOOP (uint64_t, int16_t)
+  RUN_LOOP (_Float16, int16_t)
+  RUN_LOOP (float, int16_t)
+  RUN_LOOP (double, int16_t)
+  RUN_LOOP (int8_t, int32_t)
+  RUN_LOOP (uint8_t, int32_t)
+  RUN_LOOP (int16_t, int32_t)
+  RUN_LOOP (uint16_t, int32_t)
+  RUN_LOOP (int32_t, int32_t)
+  RUN_LOOP (uint32_t, int32_t)
+  RUN_LOOP (int64_t, int32_t)
+  RUN_LOOP (uint64_t, int32_t)
+  RUN_LOOP (_Float16, int32_t)
+  RUN_LOOP (float, int32_t)
+  RUN_LOOP (double, int32_t)
+  RUN_LOOP (int8_t, int64_t)
+  RUN_LOOP (uint8_t, int64_t)
+  RUN_LOOP (int16_t, int64_t)
+  RUN_LOOP (uint16_t, int64_t)
+  RUN_LOOP (int32_t, int64_t)
+  RUN_LOOP (uint32_t, int64_t)
+  RUN_LOOP (int64_t, int64_t)
+  RUN_LOOP (uint64_t, int64_t)
+  RUN_LOOP (_Float16, int64_t)
+  RUN_LOOP (float, int64_t)
+  RUN_LOOP (double, int64_t)
+  RUN_LOOP (int8_t, uint8_t)
+  RUN_LOOP (uint8_t, uint8_t)
+  RUN_LOOP (int16_t, uint8_t)
+  RUN_LOOP (uint16_t, uint8_t)
+  RUN_LOOP (int32_t, uint8_t)
+  RUN_LOOP (uint32_t, uint8_t)
+  RUN_LOOP (int64_t, uint8_t)
+  RUN_LOOP (uint64_t, uint8_t)
+  RUN_LOOP (_Float16, uint8_t)
+  RUN_LOOP (float, uint8_t)
+  RUN_LOOP (double, uint8_t)
+  RUN_LOOP (int8_t, uint16_t)
+  RUN_LOOP (uint8_t, uint16_t)
+  RUN_LOOP (int16_t, uint16_t)
+  RUN_LOOP (uint16_t, uint16_t)
+  RUN_LOOP (int32_t, uint16_t)
+  RUN_LOOP (uint32_t, uint16_t)
+  RUN_LOOP (int64_t, uint16_t)
+  RUN_LOOP (uint64_t, uint16_t)
+  RUN_LOOP (_Float16, uint16_t)
+  RUN_LOOP (float, uint16_t)
+  RUN_LOOP (double, uint16_t)
+  RUN_LOOP (int8_t, uint32_t)
+  RUN_LOOP (uint8_t, uint32_t)
+  RUN_LOOP (int16_t, uint32_t)
+  RUN_LOOP (uint16_t, uint32_t)
+  RUN_LOOP (int32_t, uint32_t)
+  RUN_LOOP (uint32_t, uint32_t)
+  RUN_LOOP (int64_t, uint32_t)
+  RUN_LOOP (uint64_t, uint32_t)
+  RUN_LOOP (_Float16, uint32_t)
+  RUN_LOOP (float, uint32_t)
+  RUN_LOOP (double, uint32_t)
+  RUN_LOOP (int8_t, uint64_t)
+  RUN_LOOP (uint8_t, uint64_t)
+  RUN_LOOP (int16_t, uint64_t)
+  RUN_LOOP (uint16_t, uint64_t)
+  RUN_LOOP (int32_t, uint64_t)
+  RUN_LOOP (uint32_t, uint64_t)
+  RUN_LOOP (int64_t, uint64_t)
+  RUN_LOOP (uint64_t, uint64_t)
+  RUN_LOOP (_Float16, uint64_t)
+  RUN_LOOP (float, uint64_t)
+  RUN_LOOP (double, uint64_t)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c
new file mode 100644
index 00000000000..76c6df32e6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c
new file mode 100644
index 00000000000..0fd64260082
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c
new file mode 100644
index 00000000000..069d232b912
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c
new file mode 100644
index 00000000000..499e555c1d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c
new file mode 100644
index 00000000000..ec6587aa4e5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c
new file mode 100644
index 00000000000..c16287955a6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c
new file mode 100644
index 00000000000..e1744f60dbb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c
new file mode 100644
index 00000000000..3ad6d33087f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+	    == (dest2_##DATA_TYPE[i]                                           \
+		+ src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c
new file mode 100644
index 00000000000..a5de0deccbe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c
new file mode 100644
index 00000000000..74a0d05b37d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                               \
+  T (uint8_t, 64)                                                              \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c
new file mode 100644
index 00000000000..98c5b4678b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c
@@ -0,0 +1,116 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_LOOP(DATA_TYPE, INDEX_TYPE)                                       \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x,  \
+				INDEX_TYPE *restrict index,                    \
+				INDEX_TYPE *restrict cond)                     \
+  {                                                                            \
+    for (int i = 0; i < 100; ++i)                                              \
+      {                                                                        \
+	if (cond[i * 2])                                                       \
+	  y[i * 2] = x[index[i * 2]] + 1;                                      \
+	if (cond[i * 2 + 1])                                                   \
+	  y[i * 2 + 1] = x[index[i * 2 + 1]] + 2;                              \
+      }                                                                        \
+  }
+
+TEST_LOOP (int8_t, int8_t)
+TEST_LOOP (uint8_t, int8_t)
+TEST_LOOP (int16_t, int8_t)
+TEST_LOOP (uint16_t, int8_t)
+TEST_LOOP (int32_t, int8_t)
+TEST_LOOP (uint32_t, int8_t)
+TEST_LOOP (int64_t, int8_t)
+TEST_LOOP (uint64_t, int8_t)
+TEST_LOOP (_Float16, int8_t)
+TEST_LOOP (float, int8_t)
+TEST_LOOP (double, int8_t)
+TEST_LOOP (int8_t, int16_t)
+TEST_LOOP (uint8_t, int16_t)
+TEST_LOOP (int16_t, int16_t)
+TEST_LOOP (uint16_t, int16_t)
+TEST_LOOP (int32_t, int16_t)
+TEST_LOOP (uint32_t, int16_t)
+TEST_LOOP (int64_t, int16_t)
+TEST_LOOP (uint64_t, int16_t)
+TEST_LOOP (_Float16, int16_t)
+TEST_LOOP (float, int16_t)
+TEST_LOOP (double, int16_t)
+TEST_LOOP (int8_t, int32_t)
+TEST_LOOP (uint8_t, int32_t)
+TEST_LOOP (int16_t, int32_t)
+TEST_LOOP (uint16_t, int32_t)
+TEST_LOOP (int32_t, int32_t)
+TEST_LOOP (uint32_t, int32_t)
+TEST_LOOP (int64_t, int32_t)
+TEST_LOOP (uint64_t, int32_t)
+TEST_LOOP (_Float16, int32_t)
+TEST_LOOP (float, int32_t)
+TEST_LOOP (double, int32_t)
+TEST_LOOP (int8_t, int64_t)
+TEST_LOOP (uint8_t, int64_t)
+TEST_LOOP (int16_t, int64_t)
+TEST_LOOP (uint16_t, int64_t)
+TEST_LOOP (int32_t, int64_t)
+TEST_LOOP (uint32_t, int64_t)
+TEST_LOOP (int64_t, int64_t)
+TEST_LOOP (uint64_t, int64_t)
+TEST_LOOP (_Float16, int64_t)
+TEST_LOOP (float, int64_t)
+TEST_LOOP (double, int64_t)
+TEST_LOOP (int8_t, uint8_t)
+TEST_LOOP (uint8_t, uint8_t)
+TEST_LOOP (int16_t, uint8_t)
+TEST_LOOP (uint16_t, uint8_t)
+TEST_LOOP (int32_t, uint8_t)
+TEST_LOOP (uint32_t, uint8_t)
+TEST_LOOP (int64_t, uint8_t)
+TEST_LOOP (uint64_t, uint8_t)
+TEST_LOOP (_Float16, uint8_t)
+TEST_LOOP (float, uint8_t)
+TEST_LOOP (double, uint8_t)
+TEST_LOOP (int8_t, uint16_t)
+TEST_LOOP (uint8_t, uint16_t)
+TEST_LOOP (int16_t, uint16_t)
+TEST_LOOP (uint16_t, uint16_t)
+TEST_LOOP (int32_t, uint16_t)
+TEST_LOOP (uint32_t, uint16_t)
+TEST_LOOP (int64_t, uint16_t)
+TEST_LOOP (uint64_t, uint16_t)
+TEST_LOOP (_Float16, uint16_t)
+TEST_LOOP (float, uint16_t)
+TEST_LOOP (double, uint16_t)
+TEST_LOOP (int8_t, uint32_t)
+TEST_LOOP (uint8_t, uint32_t)
+TEST_LOOP (int16_t, uint32_t)
+TEST_LOOP (uint16_t, uint32_t)
+TEST_LOOP (int32_t, uint32_t)
+TEST_LOOP (uint32_t, uint32_t)
+TEST_LOOP (int64_t, uint32_t)
+TEST_LOOP (uint64_t, uint32_t)
+TEST_LOOP (_Float16, uint32_t)
+TEST_LOOP (float, uint32_t)
+TEST_LOOP (double, uint32_t)
+TEST_LOOP (int8_t, uint64_t)
+TEST_LOOP (uint8_t, uint64_t)
+TEST_LOOP (int16_t, uint64_t)
+TEST_LOOP (uint16_t, uint64_t)
+TEST_LOOP (int32_t, uint64_t)
+TEST_LOOP (uint32_t, uint64_t)
+TEST_LOOP (int64_t, uint64_t)
+TEST_LOOP (uint64_t, uint64_t)
+TEST_LOOP (_Float16, uint64_t)
+TEST_LOOP (float, uint64_t)
+TEST_LOOP (double, uint64_t)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 88 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-assembler-not "vluxei64\.v" } } */
+/* { dg-final { scan-assembler-not "vsuxei64\.v" } } */
+/* { dg-final { scan-assembler-not {vlse64\.v\s+v[0-9]+,\s*0\([a-x0-9]+\),\s*zero} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c
new file mode 100644
index 00000000000..03f84ce962c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c
new file mode 100644
index 00000000000..8578001ef41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c
new file mode 100644
index 00000000000..b273caa0bfe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c
new file mode 100644
index 00000000000..5055d886d62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c
new file mode 100644
index 00000000000..2a4ae58588f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c
new file mode 100644
index 00000000000..31d9414c549
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                               \
+  T (uint8_t, 32)                                                              \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c
new file mode 100644
index 00000000000..73ed23042fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                               \
+  T (uint8_t, 32)                                                              \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c
new file mode 100644
index 00000000000..2f64e805759
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                               \
+  T (uint8_t, 64)                                                              \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c
new file mode 100644
index 00000000000..41f60bd88b9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c
new file mode 100644
index 00000000000..9840434fa41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
new file mode 100644
index 00000000000..105c706dbf9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
@@ -0,0 +1,140 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "mask_gather_load-11.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, INDEX_TYPE)                                        \
+  DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                        \
+  DATA_TYPE dest2_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                       \
+  DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                         \
+  INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                      \
+  INDEX_TYPE cond_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                       \
+  for (int i = 0; i < 202; i++)                                                \
+    {                                                                          \
+      src_##DATA_TYPE##_##INDEX_TYPE[i]                                        \
+	= (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1));         \
+      dest_##DATA_TYPE##_##INDEX_TYPE[i]                                       \
+	= (DATA_TYPE) ((i * 7 + 666) & (sizeof (DATA_TYPE) * 5 - 1));          \
+      dest2_##DATA_TYPE##_##INDEX_TYPE[i]                                      \
+	= (DATA_TYPE) ((i * 7 + 666) & (sizeof (DATA_TYPE) * 5 - 1));          \
+      index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55);                    \
+      cond_##DATA_TYPE##_##INDEX_TYPE[i] = (INDEX_TYPE) ((i & 0x3) == 3);      \
+    }                                                                          \
+  f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE,               \
+				src_##DATA_TYPE##_##INDEX_TYPE,                \
+				index_##DATA_TYPE##_##INDEX_TYPE,              \
+				cond_##DATA_TYPE##_##INDEX_TYPE);              \
+  for (int i = 0; i < 100; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##INDEX_TYPE[i * 2])                              \
+	assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2]                         \
+		== (src_##DATA_TYPE##_##INDEX_TYPE                             \
+		      [index_##DATA_TYPE##_##INDEX_TYPE[i * 2]]                \
+		    + 1));                                                     \
+      else                                                                     \
+	assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2]                         \
+		== dest2_##DATA_TYPE##_##INDEX_TYPE[i * 2]);                   \
+      if (cond_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1])                          \
+	assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]                     \
+		== (src_##DATA_TYPE##_##INDEX_TYPE                             \
+		      [index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]]            \
+		    + 2));                                                     \
+      else                                                                     \
+	assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]                     \
+		== dest2_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]);               \
+    }
+
+  RUN_LOOP (int8_t, int8_t)
+  RUN_LOOP (uint8_t, int8_t)
+  RUN_LOOP (int16_t, int8_t)
+  RUN_LOOP (uint16_t, int8_t)
+  RUN_LOOP (int32_t, int8_t)
+  RUN_LOOP (uint32_t, int8_t)
+  RUN_LOOP (int64_t, int8_t)
+  RUN_LOOP (uint64_t, int8_t)
+  RUN_LOOP (_Float16, int8_t)
+  RUN_LOOP (float, int8_t)
+  RUN_LOOP (double, int8_t)
+  RUN_LOOP (int8_t, int16_t)
+  RUN_LOOP (uint8_t, int16_t)
+  RUN_LOOP (int16_t, int16_t)
+  RUN_LOOP (uint16_t, int16_t)
+  RUN_LOOP (int32_t, int16_t)
+  RUN_LOOP (uint32_t, int16_t)
+  RUN_LOOP (int64_t, int16_t)
+  RUN_LOOP (uint64_t, int16_t)
+  RUN_LOOP (_Float16, int16_t)
+  RUN_LOOP (float, int16_t)
+  RUN_LOOP (double, int16_t)
+  RUN_LOOP (int8_t, int32_t)
+  RUN_LOOP (uint8_t, int32_t)
+  RUN_LOOP (int16_t, int32_t)
+  RUN_LOOP (uint16_t, int32_t)
+  RUN_LOOP (int32_t, int32_t)
+  RUN_LOOP (uint32_t, int32_t)
+  RUN_LOOP (int64_t, int32_t)
+  RUN_LOOP (uint64_t, int32_t)
+  RUN_LOOP (_Float16, int32_t)
+  RUN_LOOP (float, int32_t)
+  RUN_LOOP (double, int32_t)
+  RUN_LOOP (int8_t, int64_t)
+  RUN_LOOP (uint8_t, int64_t)
+  RUN_LOOP (int16_t, int64_t)
+  RUN_LOOP (uint16_t, int64_t)
+  RUN_LOOP (int32_t, int64_t)
+  RUN_LOOP (uint32_t, int64_t)
+  RUN_LOOP (int64_t, int64_t)
+  RUN_LOOP (uint64_t, int64_t)
+  RUN_LOOP (_Float16, int64_t)
+  RUN_LOOP (float, int64_t)
+  RUN_LOOP (double, int64_t)
+  RUN_LOOP (int8_t, uint8_t)
+  RUN_LOOP (uint8_t, uint8_t)
+  RUN_LOOP (int16_t, uint8_t)
+  RUN_LOOP (uint16_t, uint8_t)
+  RUN_LOOP (int32_t, uint8_t)
+  RUN_LOOP (uint32_t, uint8_t)
+  RUN_LOOP (int64_t, uint8_t)
+  RUN_LOOP (uint64_t, uint8_t)
+  RUN_LOOP (_Float16, uint8_t)
+  RUN_LOOP (float, uint8_t)
+  RUN_LOOP (double, uint8_t)
+  RUN_LOOP (int8_t, uint16_t)
+  RUN_LOOP (uint8_t, uint16_t)
+  RUN_LOOP (int16_t, uint16_t)
+  RUN_LOOP (uint16_t, uint16_t)
+  RUN_LOOP (int32_t, uint16_t)
+  RUN_LOOP (uint32_t, uint16_t)
+  RUN_LOOP (int64_t, uint16_t)
+  RUN_LOOP (uint64_t, uint16_t)
+  RUN_LOOP (_Float16, uint16_t)
+  RUN_LOOP (float, uint16_t)
+  RUN_LOOP (double, uint16_t)
+  RUN_LOOP (int8_t, uint32_t)
+  RUN_LOOP (uint8_t, uint32_t)
+  RUN_LOOP (int16_t, uint32_t)
+  RUN_LOOP (uint16_t, uint32_t)
+  RUN_LOOP (int32_t, uint32_t)
+  RUN_LOOP (uint32_t, uint32_t)
+  RUN_LOOP (int64_t, uint32_t)
+  RUN_LOOP (uint64_t, uint32_t)
+  RUN_LOOP (_Float16, uint32_t)
+  RUN_LOOP (float, uint32_t)
+  RUN_LOOP (double, uint32_t)
+  RUN_LOOP (int8_t, uint64_t)
+  RUN_LOOP (uint8_t, uint64_t)
+  RUN_LOOP (int16_t, uint64_t)
+  RUN_LOOP (uint16_t, uint64_t)
+  RUN_LOOP (int32_t, uint64_t)
+  RUN_LOOP (uint32_t, uint64_t)
+  RUN_LOOP (int64_t, uint64_t)
+  RUN_LOOP (uint64_t, uint64_t)
+  RUN_LOOP (_Float16, uint64_t)
+  RUN_LOOP (float, uint64_t)
+  RUN_LOOP (double, uint64_t)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c
new file mode 100644
index 00000000000..33ddb5d9909
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c
new file mode 100644
index 00000000000..9f06fbe4ecf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c
new file mode 100644
index 00000000000..ae578f0c7b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c
new file mode 100644
index 00000000000..741abd166e1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c
new file mode 100644
index 00000000000..a14a5c4ced1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c
new file mode 100644
index 00000000000..5f6ec81c93b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "mask_gather_load-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c
new file mode 100644
index 00000000000..a34688ff339
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "mask_gather_load-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c
new file mode 100644
index 00000000000..1cfdede2060
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[i]                                            \
+		== (dest2_##DATA_TYPE[i]                                       \
+		    + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c
new file mode 100644
index 00000000000..623de41267b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c
new file mode 100644
index 00000000000..55112b067fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c
new file mode 100644
index 00000000000..32a572d0064
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c
new file mode 100644
index 00000000000..fbaaa9d8a8e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                              \
+  T (uint16_t, 8)                                                             \
+  T (_Float16, 8)                                                             \
+  T (int32_t, 8)                                                              \
+  T (uint32_t, 8)                                                             \
+  T (float, 8)                                                                \
+  T (int64_t, 8)                                                              \
+  T (uint64_t, 8)                                                             \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c
new file mode 100644
index 00000000000..9b08661f8e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                              \
+  T (uint16_t, 8)                                                             \
+  T (_Float16, 8)                                                             \
+  T (int32_t, 8)                                                              \
+  T (uint32_t, 8)                                                             \
+  T (float, 8)                                                                \
+  T (int64_t, 8)                                                              \
+  T (uint64_t, 8)                                                             \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c
new file mode 100644
index 00000000000..dd26635f2cb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c
new file mode 100644
index 00000000000..fa0206a0ec2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c
new file mode 100644
index 00000000000..325e86c26a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c
new file mode 100644
index 00000000000..b4b84e9cdda
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c
new file mode 100644
index 00000000000..77a9af953e0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+	dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c
new file mode 100644
index 00000000000..e0d52bf6291
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c
new file mode 100644
index 00000000000..c1af0d30e62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c
new file mode 100644
index 00000000000..6b1b02eae35
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c
new file mode 100644
index 00000000000..cef0bdec1d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c
new file mode 100644
index 00000000000..88a74d5a632
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c
new file mode 100644
index 00000000000..06804ab7111
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c
new file mode 100644
index 00000000000..c6c9a676ed6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c
new file mode 100644
index 00000000000..06a37f92907
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "mask_scatter_store-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c
new file mode 100644
index 00000000000..8ee35d2e505
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c
new file mode 100644
index 00000000000..c27a673e2b0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+	assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+		== dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c
new file mode 100644
index 00000000000..6a390261cfb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c
new file mode 100644
index 00000000000..feb58d7d458
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                               \
+  T (uint16_t, 64)                                                              \
+  T (_Float16, 64)                                                              \
+  T (int32_t, 64)                                                               \
+  T (uint32_t, 64)                                                              \
+  T (float, 64)                                                                 \
+  T (int64_t, 64)                                                               \
+  T (uint64_t, 64)                                                              \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c
new file mode 100644
index 00000000000..e4c587fd7bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c
new file mode 100644
index 00000000000..33ad256d3db
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c
new file mode 100644
index 00000000000..48d305623e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c
new file mode 100644
index 00000000000..83ddc44bf9c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                               \
+  T (uint16_t, 16)                                                              \
+  T (_Float16, 16)                                                              \
+  T (int32_t, 16)                                                               \
+  T (uint32_t, 16)                                                              \
+  T (float, 16)                                                                 \
+  T (int64_t, 16)                                                               \
+  T (uint64_t, 16)                                                              \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c
new file mode 100644
index 00000000000..11eb68bdb13
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                               \
+  T (uint16_t, 16)                                                              \
+  T (_Float16, 16)                                                              \
+  T (int32_t, 16)                                                               \
+  T (uint32_t, 16)                                                              \
+  T (float, 16)                                                                 \
+  T (int64_t, 16)                                                               \
+  T (uint64_t, 16)                                                              \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c
new file mode 100644
index 00000000000..2e323477258
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                               \
+  T (uint16_t, 32)                                                              \
+  T (_Float16, 32)                                                              \
+  T (int32_t, 32)                                                               \
+  T (uint32_t, 32)                                                              \
+  T (float, 32)                                                                 \
+  T (int64_t, 32)                                                               \
+  T (uint64_t, 32)                                                              \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c
new file mode 100644
index 00000000000..e6732fe3790
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                               \
+  T (uint16_t, 32)                                                              \
+  T (_Float16, 32)                                                              \
+  T (int32_t, 32)                                                               \
+  T (uint32_t, 32)                                                              \
+  T (float, 32)                                                                 \
+  T (int64_t, 32)                                                               \
+  T (uint64_t, 32)                                                              \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c
new file mode 100644
index 00000000000..766a52b4622
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+		 INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                               \
+  T (uint16_t, 64)                                                              \
+  T (_Float16, 64)                                                              \
+  T (int32_t, 64)                                                               \
+  T (uint32_t, 64)                                                              \
+  T (float, 64)                                                                 \
+  T (int64_t, 64)                                                               \
+  T (uint64_t, 64)                                                              \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c
new file mode 100644
index 00000000000..cafa64f3527
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c
new file mode 100644
index 00000000000..79f6885831f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c
new file mode 100644
index 00000000000..376db088153
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "scatter_store-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c
new file mode 100644
index 00000000000..103b8649d38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c
new file mode 100644
index 00000000000..f5f89c0fb4f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c
new file mode 100644
index 00000000000..049251ec888
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c
new file mode 100644
index 00000000000..59c8e701dbe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c
new file mode 100644
index 00000000000..a24401181e3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c
new file mode 100644
index 00000000000..080c9b83363
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c
new file mode 100644
index 00000000000..cc9f20f0daa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "scatter_store-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+		 indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+	    == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c
new file mode 100644
index 00000000000..0c64f3e1a00
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+			  INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < n; ++i)                                        \
+      dest[i] += src[i * stride];                                              \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_GATHER_LOAD" 66 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c
new file mode 100644
index 00000000000..0f234209325
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+			  INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < (BITS + 13); ++i)                              \
+      dest[i] += src[i * (BITS - 3)];                                          \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_GATHER_LOAD" 46 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
new file mode 100644
index 00000000000..4b03c25a907
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
@@ -0,0 +1,84 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_load-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+	= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+			  stride_##DATA_TYPE##_##BITS,                         \
+			  n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (                                                                 \
+	dest_##DATA_TYPE##_##BITS[i]                                           \
+	== (dest2_##DATA_TYPE##_##BITS[i]                                      \
+	    + src_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]));     \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c
new file mode 100644
index 00000000000..8499e4cef24
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c
@@ -0,0 +1,84 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_load-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+	= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+			  stride_##DATA_TYPE##_##BITS,                         \
+			  n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (                                                                 \
+	dest_##DATA_TYPE##_##BITS[i]                                           \
+	== (dest2_##DATA_TYPE##_##BITS[i]                                      \
+	    + src_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]));     \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c
new file mode 100644
index 00000000000..2418a0d74eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+			  INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < n; ++i)                                        \
+      dest[i * stride] = src[i] + BITS;                                        \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_SCATTER_STORE" 66 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c
new file mode 100644
index 00000000000..83e101d5688
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+			  INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < n; ++i)                                        \
+      dest[i * (BITS - 3)] = src[i] + BITS;                                    \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_SCATTER_STORE" 44 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c
new file mode 100644
index 00000000000..e9dca4672c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c
@@ -0,0 +1,82 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_store-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+	= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+			  stride_##DATA_TYPE##_##BITS,                         \
+			  n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (dest_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]       \
+	      == (src_##DATA_TYPE##_##BITS[i] + BITS));                        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c
new file mode 100644
index 00000000000..509def789e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c
@@ -0,0 +1,82 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_store-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+	= (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+	= (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+			  stride_##DATA_TYPE##_##BITS,                         \
+			  n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (dest_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]       \
+	      == (src_##DATA_TYPE##_##BITS[i] + BITS));                        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
-- 
2.36.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* 回复: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
       [not found] ` <66564378DDCC57ED+202307121748494317959@rivai.ai>
@ 2023-07-12 21:22   ` 钟居哲
  2023-07-12 21:48     ` Jeff Law
  0 siblings, 1 reply; 13+ messages in thread
From: 钟居哲 @ 2023-07-12 21:22 UTC (permalink / raw)
  To: gcc-patches; +Cc: kito.cheng, kito.cheng, Jeff Law, rdapp.gcc

[-- Attachment #1: Type: text/plain, Size: 315698 bytes --]

I have removed strided load/store, instead, I will support strided load/store in vectorizer later.

Ok for trunk?



juzhe.zhong@rivai.ai
 
发件人: juzhe.zhong@rivai.ai
发送时间: 2023-07-12 17:48
收件人: 钟居哲
抄送: kito.cheng; Kito.cheng; jeffreyalaw; Robin Dapp
主题: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
Just received email failed to send patches to you.
CC,...... again.



juzhe.zhong@rivai.ai
 
From: juzhe.zhong
Date: 2023-07-12 17:38
To: gcc-patches
CC: kito.cheng; kito.cheng; jeffreyalaw; rdapp.gcc; Ju-Zhe Zhong
Subject: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
From: Ju-Zhe Zhong <juzhe.zhong@rivai.ai>
 
This patch fully support gather_load/scatter_store:
1. Support single-rgroup on both RV32/RV64.
2. Support indexed element width can be same as or smaller than Pmode.
3. Support VLA SLP with gather/scatter.
4. Fully tested all gather/scatter with LMUL = M1/M2/M4/M8 both VLA and VLS.
5. Fix bug of handling (subreg:SI (const_poly_int:DI))
6. Fix bug on vec_perm which is used by gather/scatter SLP.
 
All kinds of GATHER/SCATTER are normalized into LEN_MASK_*.
We fully supported these 4 kinds of gather/scatter:
1. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and dummy mask (Full vector).
2. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with dummy length and real mask.
3. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and dummy mask.
4. LEN_MASK_GATHER_LOAD/LEN_MASK_SCATTER_STORE with real length and real mask.
 
Base on the disscussions with Richards, we don't lower vlse/vsse in RISC-V backend for strided load/store.
Instead, we leave it to the middle-end to handle that.
 
Regression is pass ok for trunk ?
 
gcc/ChangeLog:
 
* config/riscv/autovec.md
(len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>): New pattern.
(len_mask_gather_load<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
(len_mask_gather_load<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
(len_mask_gather_load<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
(len_mask_gather_load<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
(len_mask_gather_load<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
(len_mask_gather_load<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
(len_mask_gather_load<mode><mode>): Ditto.
(len_mask_scatter_store<VNX1_QHSD:mode><VNX1_QHSDI:mode>): Ditto.
(len_mask_scatter_store<VNX2_QHSD:mode><VNX2_QHSDI:mode>): Ditto.
(len_mask_scatter_store<VNX4_QHSD:mode><VNX4_QHSDI:mode>): Ditto.
(len_mask_scatter_store<VNX8_QHSD:mode><VNX8_QHSDI:mode>): Ditto.
(len_mask_scatter_store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
(len_mask_scatter_store<VNX32_QHS:mode><VNX32_QHSI:mode>): Ditto.
(len_mask_scatter_store<VNX64_QH:mode><VNX64_QHI:mode>): Ditto.
(len_mask_scatter_store<mode><mode>): Ditto.
* config/riscv/predicates.md (const_1_operand): New predicate.
(vector_gs_scale_operand_16): Ditto.
(vector_gs_scale_operand_32): Ditto.
(vector_gs_scale_operand_64): Ditto.
(vector_gs_extension_operand): Ditto.
(vector_gs_scale_operand_16_rv32): Ditto.
(vector_gs_scale_operand_32_rv32): Ditto.
* config/riscv/riscv-protos.h (enum insn_type): Add gather/scatter.
(expand_gather_scatter): New function.
* config/riscv/riscv-v.cc (gen_const_vector_dup): Add gather/scatter.
(emit_vlmax_masked_store_insn): New function.
(emit_nonvlmax_masked_store_insn): Ditto.
(modulo_sel_indices): Ditto.
(expand_vec_perm): Fix SLP for gather/scatter.
(prepare_gather_scatter): New function.
(expand_gather_scatter): Ditto.
* config/riscv/riscv.cc (riscv_legitimize_move): Fix bug of
(subreg:SI (DI CONST_POLY_INT)).
* config/riscv/vector-iterators.md: Add gather/scatter.
* config/riscv/vector.md (vec_duplicate<mode>): Use "@" instead.
(@vec_duplicate<mode>): Ditto.
(@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>): Fix name.
(@pred_indexed_<order>store<VNX16_QHSD:mode><VNX16_QHSDI:mode>): Ditto.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/rvv.exp: Add gather/scatter tests.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c: New test.
* gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c: New test.
 
---
gcc/config/riscv/autovec.md                   | 256 ++++++++++++++++
gcc/config/riscv/predicates.md                |  33 +-
gcc/config/riscv/riscv-protos.h               |   3 +
gcc/config/riscv/riscv-v.cc                   | 283 ++++++++++++++++--
gcc/config/riscv/riscv.cc                     |  11 +-
gcc/config/riscv/vector-iterators.md          | 118 ++++++--
gcc/config/riscv/vector.md                    |  30 +-
.../autovec/gather-scatter/gather_load-1.c    |  38 +++
.../autovec/gather-scatter/gather_load-10.c   |  35 +++
.../autovec/gather-scatter/gather_load-11.c   |  32 ++
.../autovec/gather-scatter/gather_load-12.c   | 112 +++++++
.../autovec/gather-scatter/gather_load-2.c    |  38 +++
.../autovec/gather-scatter/gather_load-3.c    |  35 +++
.../autovec/gather-scatter/gather_load-4.c    |  35 +++
.../autovec/gather-scatter/gather_load-5.c    |  35 +++
.../autovec/gather-scatter/gather_load-6.c    |  35 +++
.../autovec/gather-scatter/gather_load-7.c    |  35 +++
.../autovec/gather-scatter/gather_load-8.c    |  35 +++
.../autovec/gather-scatter/gather_load-9.c    |  35 +++
.../gather-scatter/gather_load_run-1.c        |  41 +++
.../gather-scatter/gather_load_run-10.c       |  41 +++
.../gather-scatter/gather_load_run-11.c       |  39 +++
.../gather-scatter/gather_load_run-12.c       | 124 ++++++++
.../gather-scatter/gather_load_run-2.c        |  41 +++
.../gather-scatter/gather_load_run-3.c        |  41 +++
.../gather-scatter/gather_load_run-4.c        |  41 +++
.../gather-scatter/gather_load_run-5.c        |  41 +++
.../gather-scatter/gather_load_run-6.c        |  41 +++
.../gather-scatter/gather_load_run-7.c        |  41 +++
.../gather-scatter/gather_load_run-8.c        |  41 +++
.../gather-scatter/gather_load_run-9.c        |  41 +++
.../gather-scatter/mask_gather_load-1.c       |  39 +++
.../gather-scatter/mask_gather_load-10.c      |  36 +++
.../gather-scatter/mask_gather_load-11.c      | 116 +++++++
.../gather-scatter/mask_gather_load-2.c       |  39 +++
.../gather-scatter/mask_gather_load-3.c       |  36 +++
.../gather-scatter/mask_gather_load-4.c       |  36 +++
.../gather-scatter/mask_gather_load-5.c       |  36 +++
.../gather-scatter/mask_gather_load-6.c       |  36 +++
.../gather-scatter/mask_gather_load-7.c       |  36 +++
.../gather-scatter/mask_gather_load-8.c       |  36 +++
.../gather-scatter/mask_gather_load-9.c       |  36 +++
.../gather-scatter/mask_gather_load_run-1.c   |  48 +++
.../gather-scatter/mask_gather_load_run-10.c  |  48 +++
.../gather-scatter/mask_gather_load_run-11.c  | 140 +++++++++
.../gather-scatter/mask_gather_load_run-2.c   |  48 +++
.../gather-scatter/mask_gather_load_run-3.c   |  48 +++
.../gather-scatter/mask_gather_load_run-4.c   |  48 +++
.../gather-scatter/mask_gather_load_run-5.c   |  48 +++
.../gather-scatter/mask_gather_load_run-6.c   |  48 +++
.../gather-scatter/mask_gather_load_run-7.c   |  48 +++
.../gather-scatter/mask_gather_load_run-8.c   |  48 +++
.../gather-scatter/mask_gather_load_run-9.c   |  48 +++
.../gather-scatter/mask_scatter_store-1.c     |  39 +++
.../gather-scatter/mask_scatter_store-10.c    |  36 +++
.../gather-scatter/mask_scatter_store-2.c     |  39 +++
.../gather-scatter/mask_scatter_store-3.c     |  36 +++
.../gather-scatter/mask_scatter_store-4.c     |  36 +++
.../gather-scatter/mask_scatter_store-5.c     |  36 +++
.../gather-scatter/mask_scatter_store-6.c     |  36 +++
.../gather-scatter/mask_scatter_store-7.c     |  36 +++
.../gather-scatter/mask_scatter_store-8.c     |  36 +++
.../gather-scatter/mask_scatter_store-9.c     |  36 +++
.../gather-scatter/mask_scatter_store_run-1.c |  48 +++
.../mask_scatter_store_run-10.c               |  48 +++
.../gather-scatter/mask_scatter_store_run-2.c |  48 +++
.../gather-scatter/mask_scatter_store_run-3.c |  48 +++
.../gather-scatter/mask_scatter_store_run-4.c |  48 +++
.../gather-scatter/mask_scatter_store_run-5.c |  48 +++
.../gather-scatter/mask_scatter_store_run-6.c |  48 +++
.../gather-scatter/mask_scatter_store_run-7.c |  48 +++
.../gather-scatter/mask_scatter_store_run-8.c |  48 +++
.../gather-scatter/mask_scatter_store_run-9.c |  48 +++
.../autovec/gather-scatter/scatter_store-1.c  |  38 +++
.../autovec/gather-scatter/scatter_store-10.c |  35 +++
.../autovec/gather-scatter/scatter_store-2.c  |  38 +++
.../autovec/gather-scatter/scatter_store-3.c  |  35 +++
.../autovec/gather-scatter/scatter_store-4.c  |  35 +++
.../autovec/gather-scatter/scatter_store-5.c  |  35 +++
.../autovec/gather-scatter/scatter_store-6.c  |  35 +++
.../autovec/gather-scatter/scatter_store-7.c  |  35 +++
.../autovec/gather-scatter/scatter_store-8.c  |  35 +++
.../autovec/gather-scatter/scatter_store-9.c  |  35 +++
.../gather-scatter/scatter_store_run-1.c      |  40 +++
.../gather-scatter/scatter_store_run-10.c     |  40 +++
.../gather-scatter/scatter_store_run-2.c      |  40 +++
.../gather-scatter/scatter_store_run-3.c      |  40 +++
.../gather-scatter/scatter_store_run-4.c      |  40 +++
.../gather-scatter/scatter_store_run-5.c      |  40 +++
.../gather-scatter/scatter_store_run-6.c      |  40 +++
.../gather-scatter/scatter_store_run-7.c      |  40 +++
.../gather-scatter/scatter_store_run-8.c      |  40 +++
.../gather-scatter/scatter_store_run-9.c      |  40 +++
.../autovec/gather-scatter/strided_load-1.c   |  45 +++
.../autovec/gather-scatter/strided_load-2.c   |  45 +++
.../gather-scatter/strided_load_run-1.c       |  84 ++++++
.../gather-scatter/strided_load_run-2.c       |  84 ++++++
.../autovec/gather-scatter/strided_store-1.c  |  45 +++
.../autovec/gather-scatter/strided_store-2.c  |  45 +++
.../gather-scatter/strided_store_run-1.c      |  82 +++++
.../gather-scatter/strided_store_run-2.c      |  82 +++++
101 files changed, 4962 insertions(+), 61 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c
 
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index d98a63c285e..600ddcdb152 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -57,6 +57,262 @@
   }
)
+;; =========================================================================
+;; == Gather Load
+;; =========================================================================
+
+(define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
+  [(match_operand:VNX1_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX1_QHSDI 2 "register_operand")
+   (match_operand 3 "<VNX1_QHSD:gs_extension>")
+   (match_operand 4 "<VNX1_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX1_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX2_QHSD:mode><VNX2_QHSDI:mode>"
+  [(match_operand:VNX2_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX2_QHSDI 2 "register_operand")
+   (match_operand 3 "<VNX2_QHSD:gs_extension>")
+   (match_operand 4 "<VNX2_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX2_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX4_QHSD:mode><VNX4_QHSDI:mode>"
+  [(match_operand:VNX4_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX4_QHSDI 2 "register_operand")
+   (match_operand 3 "<VNX4_QHSD:gs_extension>")
+   (match_operand 4 "<VNX4_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX4_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX8_QHSD:mode><VNX8_QHSDI:mode>"
+  [(match_operand:VNX8_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX8_QHSDI 2 "register_operand")
+   (match_operand 3 "<VNX8_QHSD:gs_extension>")
+   (match_operand 4 "<VNX8_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX8_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
+  [(match_operand:VNX16_QHSD 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX16_QHSDI 2 "register_operand")
+   (match_operand 3 "<VNX16_QHSD:gs_extension>")
+   (match_operand 4 "<VNX16_QHSD:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX16_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX32_QHS:mode><VNX32_QHSI:mode>"
+  [(match_operand:VNX32_QHS 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX32_QHSI 2 "register_operand")
+   (match_operand 3 "<VNX32_QHS:gs_extension>")
+   (match_operand 4 "<VNX32_QHS:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX32_QHS:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+(define_expand "len_mask_gather_load<VNX64_QH:mode><VNX64_QHI:mode>"
+  [(match_operand:VNX64_QH 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX64_QHI 2 "register_operand")
+   (match_operand 3 "<VNX64_QH:gs_extension>")
+   (match_operand 4 "<VNX64_QH:gs_scale>")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX64_QH:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+;; When SEW = 8 and LMUL = 8, we can't find any index mode with
+;; larger SEW. Since RVV indexed load/store support zero extend
+;; implicitly and not support scaling, we should only allow
+;; operands[3] and operands[4] to be const_1_operand.
+(define_expand "len_mask_gather_load<mode><mode>"
+  [(match_operand:VNX128_Q 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:VNX128_Q 2 "register_operand")
+   (match_operand 3 "const_1_operand")
+   (match_operand 4 "const_1_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, true);
+  DONE;
+})
+
+;; =========================================================================
+;; == Scatter Store
+;; =========================================================================
+
+(define_expand "len_mask_scatter_store<VNX1_QHSD:mode><VNX1_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX1_QHSDI 1 "register_operand")
+   (match_operand 2 "<VNX1_QHSD:gs_extension>")
+   (match_operand 3 "<VNX1_QHSD:gs_scale>")
+   (match_operand:VNX1_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX1_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX2_QHSD:mode><VNX2_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX2_QHSDI 1 "register_operand")
+   (match_operand 2 "<VNX2_QHSD:gs_extension>")
+   (match_operand 3 "<VNX2_QHSD:gs_scale>")
+   (match_operand:VNX2_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX2_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX4_QHSD:mode><VNX4_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX4_QHSDI 1 "register_operand")
+   (match_operand 2 "<VNX4_QHSD:gs_extension>")
+   (match_operand 3 "<VNX4_QHSD:gs_scale>")
+   (match_operand:VNX4_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX4_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX8_QHSD:mode><VNX8_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX8_QHSDI 1 "register_operand")
+   (match_operand 2 "<VNX8_QHSD:gs_extension>")
+   (match_operand 3 "<VNX8_QHSD:gs_scale>")
+   (match_operand:VNX8_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX8_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX16_QHSDI 1 "register_operand")
+   (match_operand 2 "<VNX16_QHSD:gs_extension>")
+   (match_operand 3 "<VNX16_QHSD:gs_scale>")
+   (match_operand:VNX16_QHSD 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX16_QHSD:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX32_QHS:mode><VNX32_QHSI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX32_QHSI 1 "register_operand")
+   (match_operand 2 "<VNX32_QHS:gs_extension>")
+   (match_operand 3 "<VNX32_QHS:gs_scale>")
+   (match_operand:VNX32_QHS 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX32_QHS:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+(define_expand "len_mask_scatter_store<VNX64_QH:mode><VNX64_QHI:mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX64_QHI 1 "register_operand")
+   (match_operand 2 "<VNX64_QH:gs_extension>")
+   (match_operand 3 "<VNX64_QH:gs_scale>")
+   (match_operand:VNX64_QH 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VNX64_QH:VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
+;; When SEW = 8 and LMUL = 8, we can't find any index mode with
+;; larger SEW. Since RVV indexed load/store support zero extend
+;; implicitly and not support scaling, we should only allow
+;; operands[3] and operands[4] to be const_1_operand.
+(define_expand "len_mask_scatter_store<mode><mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand:VNX128_Q 1 "register_operand")
+   (match_operand 2 "const_1_operand")
+   (match_operand 3 "const_1_operand")
+   (match_operand:VNX128_Q 4 "register_operand")
+   (match_operand 5 "autovec_length_operand")
+   (match_operand 6 "const_0_operand")
+   (match_operand:<VM> 7 "vector_mask_operand")]
+  "TARGET_VECTOR"
+{
+  riscv_vector::expand_gather_scatter (operands, false);
+  DONE;
+})
+
;; =========================================================================
;; == Vector creation
;; =========================================================================
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index eb975eaf994..5a22c77f0cd 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -61,6 +61,10 @@
   (and (match_code "const_int,const_wide_int,const_vector")
        (match_test "op == CONST0_RTX (GET_MODE (op))")))
+(define_predicate "const_1_operand"
+  (and (match_code "const_int,const_wide_int,const_vector")
+       (match_test "op == CONST1_RTX (GET_MODE (op))")))
+
(define_predicate "reg_or_0_operand"
   (ior (match_operand 0 "const_0_operand")
        (match_operand 0 "register_operand")))
@@ -341,6 +345,33 @@
   (ior (match_operand 0 "register_operand")
        (match_code "const_vector")))
+(define_predicate "vector_gs_scale_operand_16"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1 || INTVAL (op) == 2")))
+
+(define_predicate "vector_gs_scale_operand_32"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1 || INTVAL (op) == 4")))
+
+(define_predicate "vector_gs_scale_operand_64"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1 || (INTVAL (op) == 8 && Pmode == DImode)")))
+
+(define_predicate "vector_gs_extension_operand"
+  (ior (match_operand 0 "const_1_operand")
+       (and (match_operand 0 "const_0_operand")
+            (match_test "Pmode == SImode"))))
+
+(define_predicate "vector_gs_scale_operand_16_rv32"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1
+     || (INTVAL (op) == 2 && Pmode == SImode)")))
+
+(define_predicate "vector_gs_scale_operand_32_rv32"
+  (and (match_code "const_int")
+       (match_test "INTVAL (op) == 1
+     || (INTVAL (op) == 4 && Pmode == SImode)")))
+
(define_predicate "ltge_operator"
   (match_code "lt,ltu,ge,geu"))
@@ -376,7 +407,7 @@
|| rtx_equal_p (op, CONST0_RTX (GET_MODE (op))))
&& maybe_gt (GET_MODE_BITSIZE (GET_MODE (op)), GET_MODE_BITSIZE (Pmode)))")
     (ior (match_test "rtx_equal_p (op, CONST0_RTX (GET_MODE (op)))")
-         (ior (match_operand 0 "const_int_operand")
+         (ior (match_code "const_int,const_poly_int")
               (ior (match_operand 0 "register_operand")
                    (match_test "satisfies_constraint_Wdm (op)"))))))
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 6cd5c6639c9..4f690d41753 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -149,6 +149,8 @@ enum insn_type
   RVV_SCALAR_MOV_OP = 4, /* +1 for VUNDEF according to vector.md.  */
   RVV_SLIDE_OP = 4,      /* Dest, VUNDEF, source and offset.  */
   RVV_COMPRESS_OP = 4,
+  RVV_GATHER_M_OP = 5,
+  RVV_SCATTER_M_OP = 4,
};
enum vlmul_type
{
@@ -256,6 +258,7 @@ void expand_vec_init (rtx, rtx);
void expand_vec_perm (rtx, rtx, rtx, rtx);
void expand_select_vl (rtx *);
void expand_load_store (rtx *, bool);
+void expand_gather_scatter (rtx *, bool);
/* Rounding mode bitfield for fixed point VXRM.  */
enum fixed_point_rounding_mode
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 1a61f9a18cd..d4c90800400 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -49,6 +49,7 @@
#include "tm-constrs.h"
#include "rtx-vector-builder.h"
#include "targhooks.h"
+#include "gimple.h"
using namespace riscv_vector;
@@ -556,15 +557,22 @@ const_vec_all_in_range_p (rtx vec, poly_int64 minval, poly_int64 maxval)
   return true;
}
-/* Return a const_int vector of VAL.
-
-   This function also exists in aarch64, we may unify it in middle-end in the
-   future.  */
+/* Return a const vector of VAL. The VAL can be either const_int or
+   const_poly_int.  */
static rtx
gen_const_vector_dup (machine_mode mode, poly_int64 val)
{
-  rtx c = gen_int_mode (val, GET_MODE_INNER (mode));
+  scalar_mode smode = GET_MODE_INNER (mode);
+  rtx c = gen_int_mode (val, smode);
+  if (!val.is_constant () && GET_MODE_SIZE (smode) > GET_MODE_SIZE (Pmode))
+    {
+      /* When VAL is const_poly_int value, we need to explicitly broadcast
+ it into a vector using RVV broadcast instruction.  */
+      rtx dup = gen_reg_rtx (mode);
+      emit_insn (gen_vec_duplicate (mode, dup, c));
+      return dup;
+    }
   return gen_const_vec_duplicate (mode, c);
}
@@ -901,6 +909,39 @@ emit_nonvlmax_masked_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
   e.emit_insn ((enum insn_code) icode, ops);
}
+/* This function emits a VLMAX masked store instruction.  */
+static void
+emit_vlmax_masked_store_insn (unsigned icode, int op_num, rtx *ops)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+   /*HAS_DEST_P*/ false,
+   /*FULLY_UNMASKED_P*/ false,
+   /*USE_REAL_MERGE_P*/ true,
+   /*HAS_AVL_P*/ true,
+   /*VLMAX_P*/ true, dest_mode,
+   mask_mode);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
+/* This function emits a non-VLMAX masked store instruction.  */
+static void
+emit_nonvlmax_masked_store_insn (unsigned icode, int op_num, rtx *ops, rtx avl)
+{
+  machine_mode dest_mode = GET_MODE (ops[0]);
+  machine_mode mask_mode = get_mask_mode (dest_mode).require ();
+  insn_expander<RVV_INSN_OPERANDS_MAX> e (/*OP_NUM*/ op_num,
+   /*HAS_DEST_P*/ false,
+   /*FULLY_UNMASKED_P*/ false,
+   /*USE_REAL_MERGE_P*/ true,
+   /*HAS_AVL_P*/ true,
+   /*VLMAX_P*/ false, dest_mode,
+   mask_mode);
+  e.set_vl (avl);
+  e.emit_insn ((enum insn_code) icode, ops);
+}
+
/* This function emits a masked instruction.  */
void
emit_vlmax_masked_mu_insn (unsigned icode, int op_num, rtx *ops)
@@ -1155,7 +1196,6 @@ static void
expand_const_vector (rtx target, rtx src)
{
   machine_mode mode = GET_MODE (target);
-  scalar_mode elt_mode = GET_MODE_INNER (mode);
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL)
     {
       rtx elt;
@@ -1180,7 +1220,6 @@ expand_const_vector (rtx target, rtx src)
}
       else
{
-   elt = force_reg (elt_mode, elt);
  rtx ops[] = {tmp, elt};
  emit_vlmax_insn (code_for_pred_broadcast (mode), RVV_UNOP, ops);
}
@@ -2449,6 +2488,25 @@ expand_vec_cmp_float (rtx target, rtx_code code, rtx op0, rtx op1,
   return false;
}
+/* Modulo all SEL indices to ensure they are all in range if [0, MAX_SEL].  */
+static rtx
+modulo_sel_indices (rtx sel, poly_uint64 max_sel)
+{
+  rtx sel_mod;
+  machine_mode sel_mode = GET_MODE (sel);
+  poly_uint64 nunits = GET_MODE_NUNITS (sel_mode);
+  /* If SEL is variable-length CONST_VECTOR, we don't need to modulo it.  */
+  if (!nunits.is_constant () && CONST_VECTOR_P (sel))
+    sel_mod = sel;
+  else
+    {
+      rtx mod = gen_const_vector_dup (sel_mode, max_sel);
+      sel_mod
+ = expand_simple_binop (sel_mode, AND, sel, mod, NULL, 0, OPTAB_DIRECT);
+    }
+  return sel_mod;
+}
+
/* Implement vec_perm<mode>.  */
void
@@ -2462,41 +2520,43 @@ expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
      index is in range of [0, nunits - 1]. A single vrgather instructions is
      enough. Since we will use vrgatherei16.vv for variable-length vector,
      it is never out of range and we don't need to modulo the index.  */
-  if (!nunits.is_constant () || const_vec_all_in_range_p (sel, 0, nunits - 1))
+  if (nunits.is_constant () && const_vec_all_in_range_p (sel, 0, nunits - 1))
     {
       emit_vlmax_gather_insn (target, op0, sel);
       return;
     }
+  /* Check if all the indices are same.  */
+  rtx elt;
+  if (const_vec_duplicate_p (sel, &elt))
+    {
+      poly_uint64 value = rtx_to_poly_int64 (elt);
+      rtx op = op0;
+      if (maybe_gt (value, nunits - 1))
+ {
+   sel = gen_const_vector_dup (sel_mode, value - nunits);
+   op = op1;
+ }
+      emit_vlmax_gather_insn (target, op, sel);
+    }
+
+  /* Note: vec_perm indices are supposed to wrap when they go beyond the
+     size of the two value vectors, i.e. the upper bits of the indices
+     are effectively ignored.  RVV vrgather instead produces 0 for any
+     out-of-range indices, so we need to modulo all the vec_perm indices
+     to ensure they are all in range of [0, nunits - 1] when op0 == op1
+     or all in range of [0, 2 * nunits - 1] when op0 != op1.  */
+  rtx sel_mod
+    = modulo_sel_indices (sel,
+   rtx_equal_p (op0, op1) ? nunits - 1 : 2 * nunits - 1);
   /* Check if the two values vectors are the same.  */
-  if (rtx_equal_p (op0, op1) || const_vec_duplicate_p (sel))
-    {
-      /* Note: vec_perm indices are supposed to wrap when they go beyond the
- size of the two value vectors, i.e. the upper bits of the indices
- are effectively ignored.  RVV vrgather instead produces 0 for any
- out-of-range indices, so we need to modulo all the vec_perm indices
- to ensure they are all in range of [0, nunits - 1].  */
-      rtx max_sel = gen_const_vector_dup (sel_mode, nunits - 1);
-      rtx sel_mod = expand_simple_binop (sel_mode, AND, sel, max_sel, NULL, 0,
- OPTAB_DIRECT);
-      emit_vlmax_gather_insn (target, op1, sel_mod);
+  if (rtx_equal_p (op0, op1))
+    {
+      emit_vlmax_gather_insn (target, op0, sel_mod);
       return;
     }
-  rtx sel_mod = sel;
   rtx max_sel = gen_const_vector_dup (sel_mode, 2 * nunits - 1);
-  /* We don't need to modulo indices for VLA vector.
-     Since we should gurantee they aren't out of range before.  */
-  if (nunits.is_constant ())
-    {
-      /* Note: vec_perm indices are supposed to wrap when they go beyond the
- size of the two value vectors, i.e. the upper bits of the indices
- are effectively ignored.  RVV vrgather instead produces 0 for any
- out-of-range indices, so we need to modulo all the vec_perm indices
- to ensure they are all in range of [0, 2 * nunits - 1].  */
-      sel_mod = expand_simple_binop (sel_mode, AND, sel, max_sel, NULL, 0,
-      OPTAB_DIRECT);
-    }
   /* This following sequence is handling the case that:
      __builtin_shufflevector (vec1, vec2, index...), the index can be any
@@ -2968,4 +3028,163 @@ expand_load_store (rtx *ops, bool is_load)
     }
}
+/* Prepare insn_code for gather_load/scatter_store according to
+   the vector mode and index mode.  */
+static insn_code
+prepare_gather_scatter (machine_mode vec_mode, machine_mode idx_mode,
+ bool is_load)
+{
+  if (!is_load)
+    return code_for_pred_indexed_store (UNSPEC_UNORDERED, vec_mode, idx_mode);
+  else
+    {
+      unsigned src_eew_bitsize = GET_MODE_BITSIZE (GET_MODE_INNER (idx_mode));
+      unsigned dst_eew_bitsize = GET_MODE_BITSIZE (GET_MODE_INNER (vec_mode));
+      if (dst_eew_bitsize == src_eew_bitsize)
+ return code_for_pred_indexed_load_same_eew (UNSPEC_UNORDERED, vec_mode);
+      else if (dst_eew_bitsize > src_eew_bitsize)
+ {
+   unsigned factor = dst_eew_bitsize / src_eew_bitsize;
+   switch (factor)
+     {
+     case 2:
+       return code_for_pred_indexed_load_x2_greater_eew (
+ UNSPEC_UNORDERED, vec_mode);
+     case 4:
+       return code_for_pred_indexed_load_x4_greater_eew (
+ UNSPEC_UNORDERED, vec_mode);
+     case 8:
+       return code_for_pred_indexed_load_x8_greater_eew (
+ UNSPEC_UNORDERED, vec_mode);
+     default:
+       gcc_unreachable ();
+     }
+ }
+      else
+ {
+   unsigned factor = src_eew_bitsize / dst_eew_bitsize;
+   switch (factor)
+     {
+     case 2:
+       return code_for_pred_indexed_load_x2_smaller_eew (
+ UNSPEC_UNORDERED, vec_mode);
+     case 4:
+       return code_for_pred_indexed_load_x4_smaller_eew (
+ UNSPEC_UNORDERED, vec_mode);
+     case 8:
+       return code_for_pred_indexed_load_x8_smaller_eew (
+ UNSPEC_UNORDERED, vec_mode);
+     default:
+       gcc_unreachable ();
+     }
+ }
+    }
+}
+
+/* Expand LEN_MASK_{GATHER_LOAD,SCATTER_STORE}.  */
+void
+expand_gather_scatter (rtx *ops, bool is_load)
+{
+  rtx ptr, vec_offset, vec_reg, len, mask;
+  bool zero_extend_p;
+  int scale_log2;
+  if (is_load)
+    {
+      vec_reg = ops[0];
+      ptr = ops[1];
+      vec_offset = ops[2];
+      zero_extend_p = INTVAL (ops[3]);
+      scale_log2 = exact_log2 (INTVAL (ops[4]));
+      len = ops[5];
+      mask = ops[7];
+    }
+  else
+    {
+      vec_reg = ops[4];
+      ptr = ops[0];
+      vec_offset = ops[1];
+      zero_extend_p = INTVAL (ops[2]);
+      scale_log2 = exact_log2 (INTVAL (ops[3]));
+      len = ops[5];
+      mask = ops[7];
+    }
+
+  machine_mode vec_mode = GET_MODE (vec_reg);
+  machine_mode idx_mode = GET_MODE (vec_offset);
+  scalar_mode inner_vec_mode = GET_MODE_INNER (vec_mode);
+  scalar_mode inner_idx_mode = GET_MODE_INNER (idx_mode);
+  unsigned inner_vsize = GET_MODE_BITSIZE (inner_vec_mode);
+  unsigned inner_offsize = GET_MODE_BITSIZE (inner_idx_mode);
+  poly_int64 nunits = GET_MODE_NUNITS (vec_mode);
+  poly_int64 value;
+  bool is_vlmax = poly_int_rtx_p (len, &value) && known_eq (value, nunits);
+
+  if (inner_offsize < inner_vsize)
+    {
+      /* 7.2. Vector Load/Store Addressing Modes.
+ If the vector offset elements are narrower than XLEN, they are
+ zero-extended to XLEN before adding to the ptr effective address. If
+ the vector offset elements are wider than XLEN, the least-significant
+ XLEN bits are used in the address calculation. An implementation must
+ raise an illegal instruction exception if the EEW is not supported for
+ offset elements.
+
+ RVV spec only refers to the scale_log == 0 case.  */
+      if (!zero_extend_p || (zero_extend_p && scale_log2 != 0))
+ {
+   if (zero_extend_p)
+     inner_idx_mode
+       = int_mode_for_size (inner_offsize * 2, 0).require ();
+   else
+     inner_idx_mode = int_mode_for_size (BITS_PER_WORD, 0).require ();
+   machine_mode new_idx_mode
+     = get_vector_mode (inner_idx_mode, nunits).require ();
+   rtx tmp = gen_reg_rtx (new_idx_mode);
+   emit_insn (gen_extend_insn (tmp, vec_offset, new_idx_mode, idx_mode,
+       zero_extend_p ? true : false));
+   vec_offset = tmp;
+   idx_mode = new_idx_mode;
+ }
+    }
+
+  if (scale_log2 != 0)
+    {
+      rtx tmp = expand_binop (idx_mode, ashl_optab, vec_offset,
+       gen_int_mode (scale_log2, Pmode), NULL_RTX, 0,
+       OPTAB_DIRECT);
+      vec_offset = tmp;
+    }
+
+  insn_code icode = prepare_gather_scatter (vec_mode, idx_mode, is_load);
+  if (is_vlmax)
+    {
+      if (is_load)
+ {
+   rtx load_ops[]
+     = {vec_reg, mask, RVV_VUNDEF (vec_mode), ptr, vec_offset};
+   emit_vlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops);
+ }
+      else
+ {
+   rtx store_ops[] = {mask, ptr, vec_offset, vec_reg};
+   emit_vlmax_masked_store_insn (icode, RVV_SCATTER_M_OP, store_ops);
+ }
+    }
+  else
+    {
+      if (is_load)
+ {
+   rtx load_ops[]
+     = {vec_reg, mask, RVV_VUNDEF (vec_mode), ptr, vec_offset};
+   emit_nonvlmax_masked_insn (icode, RVV_GATHER_M_OP, load_ops, len);
+ }
+      else
+ {
+   rtx store_ops[] = {mask, ptr, vec_offset, vec_reg};
+   emit_nonvlmax_masked_store_insn (icode, RVV_SCATTER_M_OP, store_ops,
+    len);
+ }
+    }
+}
+
} // namespace riscv_vector
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 38d8eb2fcf5..c5dbb6c34ee 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -2060,7 +2060,14 @@ riscv_legitimize_poly_move (machine_mode mode, rtx dest, rtx tmp, rtx src)
      (m, n) = base * magn + constant.
      This calculation doesn't need div operation.  */
-  emit_move_insn (tmp, gen_int_mode (BYTES_PER_RISCV_VECTOR, mode));
+  if (known_le (GET_MODE_SIZE (mode), GET_MODE_SIZE (Pmode)))
+    emit_move_insn (tmp, gen_int_mode (BYTES_PER_RISCV_VECTOR, mode));
+  else
+    {
+      emit_move_insn (gen_highpart (Pmode, tmp), CONST0_RTX (Pmode));
+      emit_move_insn (gen_lowpart (Pmode, tmp),
+       gen_int_mode (BYTES_PER_RISCV_VECTOR, Pmode));
+    }
   if (BYTES_PER_RISCV_VECTOR.is_constant ())
     {
@@ -2167,7 +2174,7 @@ riscv_legitimize_move (machine_mode mode, rtx dest, rtx src)
  return false;
}
-      if (satisfies_constraint_vp (src))
+      if (satisfies_constraint_vp (src) && GET_MODE (src) == Pmode)
return false;
       if (GET_MODE_SIZE (mode).to_constant () < GET_MODE_SIZE (Pmode))
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 8afd3dcaddd..ec49544bcf6 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -115,6 +115,9 @@
(define_mode_iterator VEEWEXT2 [
   (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx64HI "TARGET_MIN_VLEN >= 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16") (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
   (VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128") (VNx2DI "TARGET_VECTOR_ELEN_64")
   (VNx4DI "TARGET_VECTOR_ELEN_64") (VNx8DI "TARGET_VECTOR_ELEN_64") (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
@@ -161,6 +164,8 @@
(define_mode_iterator VEEWTRUNC2 [
   (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI VNx4QI VNx8QI VNx16QI VNx32QI (VNx64QI "TARGET_MIN_VLEN >= 128")
   (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI VNx16HI (VNx32HI "TARGET_MIN_VLEN >= 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16") (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
   (VNx1SI "TARGET_MIN_VLEN < 128") VNx2SI VNx4SI VNx8SI (VNx16SI "TARGET_MIN_VLEN >= 128")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
@@ -172,6 +177,8 @@
(define_mode_iterator VEEWTRUNC4 [
   (VNx1QI "TARGET_MIN_VLEN < 128") VNx2QI VNx4QI VNx8QI VNx16QI (VNx32QI "TARGET_MIN_VLEN >= 128")
   (VNx1HI "TARGET_MIN_VLEN < 128") VNx2HI VNx4HI VNx8HI (VNx16HI "TARGET_MIN_VLEN >= 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128") (VNx2HF "TARGET_VECTOR_ELEN_FP_16") (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16") (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
])
(define_mode_iterator VEEWTRUNC8 [
@@ -362,46 +369,67 @@
])
(define_mode_iterator VNX1_QHSD [
-  (VNx1QI "TARGET_MIN_VLEN < 128") (VNx1HI "TARGET_MIN_VLEN < 128") (VNx1SI "TARGET_MIN_VLEN < 128")
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  (VNx1SI "TARGET_MIN_VLEN < 128")
   (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx1HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN < 128")
   (VNx1SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN < 128")
   (VNx1DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN < 128")
])
(define_mode_iterator VNX2_QHSD [
-  VNx2QI VNx2HI VNx2SI
+  VNx2QI
+  VNx2HI
+  VNx2SI
   (VNx2DI "TARGET_VECTOR_ELEN_64")
+  (VNx2HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx2SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx2DF "TARGET_VECTOR_ELEN_FP_64")
])
(define_mode_iterator VNX4_QHSD [
-  VNx4QI VNx4HI VNx4SI
+  VNx4QI
+  VNx4HI
+  VNx4SI
   (VNx4DI "TARGET_VECTOR_ELEN_64")
+  (VNx4HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx4SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx4DF "TARGET_VECTOR_ELEN_FP_64")
])
(define_mode_iterator VNX8_QHSD [
-  VNx8QI VNx8HI VNx8SI
+  VNx8QI
+  VNx8HI
+  VNx8SI
   (VNx8DI "TARGET_VECTOR_ELEN_64")
+  (VNx8HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx8SF "TARGET_VECTOR_ELEN_FP_32")
   (VNx8DF "TARGET_VECTOR_ELEN_FP_64")
])
-(define_mode_iterator VNX16_QHS [
-  VNx16QI VNx16HI (VNx16SI "TARGET_MIN_VLEN > 32")
+(define_mode_iterator VNX16_QHSD [
+  VNx16QI
+  VNx16HI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  (VNx16HF "TARGET_VECTOR_ELEN_FP_16")
   (VNx16SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
-  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128") (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
+  (VNx16DF "TARGET_VECTOR_ELEN_FP_64 && TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator VNX32_QHS [
-  VNx32QI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128") (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
+  VNx32QI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
+  (VNx32HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32")
+  (VNx32SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator VNX64_QH [
   (VNx64QI "TARGET_MIN_VLEN > 32")
   (VNx64HI "TARGET_MIN_VLEN >= 128")
+  (VNx64HF "TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator VNX128_Q [
@@ -409,35 +437,49 @@
])
(define_mode_iterator VNX1_QHSDI [
-  (VNx1QI "TARGET_MIN_VLEN < 128") (VNx1HI "TARGET_MIN_VLEN < 128") (VNx1SI "TARGET_MIN_VLEN < 128")
-  (VNx1DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128")
+  (VNx1QI "TARGET_MIN_VLEN < 128")
+  (VNx1HI "TARGET_MIN_VLEN < 128")
+  (VNx1SI "TARGET_MIN_VLEN < 128")
+  (VNx1DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN < 128 && TARGET_64BIT")
])
(define_mode_iterator VNX2_QHSDI [
-  VNx2QI VNx2HI VNx2SI
-  (VNx2DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
+  VNx2QI
+  VNx2HI
+  VNx2SI
+  (VNx2DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
])
(define_mode_iterator VNX4_QHSDI [
-  VNx4QI VNx4HI VNx4SI
-  (VNx4DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
+  VNx4QI
+  VNx4HI
+  VNx4SI
+  (VNx4DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
])
(define_mode_iterator VNX8_QHSDI [
-  VNx8QI VNx8HI VNx8SI
-  (VNx8DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64")
+  VNx8QI
+  VNx8HI
+  VNx8SI
+  (VNx8DI "TARGET_VECTOR_ELEN_64 && TARGET_64BIT")
])
(define_mode_iterator VNX16_QHSDI [
-  VNx16QI VNx16HI (VNx16SI "TARGET_MIN_VLEN > 32") (VNx16DI "TARGET_64BIT && TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128")
+  VNx16QI
+  VNx16HI
+  (VNx16SI "TARGET_MIN_VLEN > 32")
+  (VNx16DI "TARGET_VECTOR_ELEN_64 && TARGET_MIN_VLEN >= 128 && TARGET_64BIT")
])
(define_mode_iterator VNX32_QHSI [
-  VNx32QI (VNx32HI "TARGET_MIN_VLEN > 32") (VNx32SI "TARGET_MIN_VLEN >= 128")
+  VNx32QI
+  (VNx32HI "TARGET_MIN_VLEN > 32")
+  (VNx32SI "TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator VNX64_QHI [
-  VNx64QI (VNx64HI "TARGET_MIN_VLEN >= 128")
+  (VNx64QI "TARGET_MIN_VLEN > 32")
+  (VNx64HI "TARGET_MIN_VLEN >= 128")
])
(define_mode_iterator V_WHOLE [
@@ -1393,6 +1435,8 @@
(define_mode_attr VINDEX_DOUBLE_TRUNC [
   (VNx1HI "VNx1QI") (VNx2HI "VNx2QI")  (VNx4HI "VNx4QI")  (VNx8HI "VNx8QI")
   (VNx16HI "VNx16QI") (VNx32HI "VNx32QI") (VNx64HI "VNx64QI")
+  (VNx1HF "VNx1QI") (VNx2HF "VNx2QI")  (VNx4HF "VNx4QI")  (VNx8HF "VNx8QI")
+  (VNx16HF "VNx16QI") (VNx32HF "VNx32QI") (VNx64HF "VNx64QI")
   (VNx1SI "VNx1HI") (VNx2SI "VNx2HI") (VNx4SI "VNx4HI") (VNx8SI "VNx8HI")
   (VNx16SI "VNx16HI") (VNx32SI "VNx32HI")
   (VNx1SF "VNx1HI") (VNx2SF "VNx2HI") (VNx4SF "VNx4HI") (VNx8SF "VNx8HI")
@@ -1420,6 +1464,7 @@
(define_mode_attr VINDEX_DOUBLE_EXT [
   (VNx1QI "VNx1HI") (VNx2QI "VNx2HI") (VNx4QI "VNx4HI") (VNx8QI "VNx8HI") (VNx16QI "VNx16HI") (VNx32QI "VNx32HI") (VNx64QI "VNx64HI")
   (VNx1HI "VNx1SI") (VNx2HI "VNx2SI") (VNx4HI "VNx4SI") (VNx8HI "VNx8SI") (VNx16HI "VNx16SI") (VNx32HI "VNx32SI")
+  (VNx1HF "VNx1SI") (VNx2HF "VNx2SI") (VNx4HF "VNx4SI") (VNx8HF "VNx8SI") (VNx16HF "VNx16SI") (VNx32HF "VNx32SI")
   (VNx1SI "VNx1DI") (VNx2SI "VNx2DI") (VNx4SI "VNx4DI") (VNx8SI "VNx8DI") (VNx16SI "VNx16DI")
   (VNx1SF "VNx1DI") (VNx2SF "VNx2DI") (VNx4SF "VNx4DI") (VNx8SF "VNx8DI") (VNx16SF "VNx16DI")
])
@@ -1427,6 +1472,7 @@
(define_mode_attr VINDEX_QUAD_EXT [
   (VNx1QI "VNx1SI") (VNx2QI "VNx2SI") (VNx4QI "VNx4SI") (VNx8QI "VNx8SI") (VNx16QI "VNx16SI") (VNx32QI "VNx32SI")
   (VNx1HI "VNx1DI") (VNx2HI "VNx2DI") (VNx4HI "VNx4DI") (VNx8HI "VNx8DI") (VNx16HI "VNx16DI")
+  (VNx1HF "VNx1DI") (VNx2HF "VNx2DI") (VNx4HF "VNx4DI") (VNx8HF "VNx8DI") (VNx16HF "VNx16DI")
])
(define_mode_attr VINDEX_OCT_EXT [
@@ -1471,6 +1517,40 @@
   (VNx4DI "VNx8BI") (VNx8DI "VNx16BI") (VNx16DI "VNx32BI")
])
+(define_mode_attr gs_extension [
+  (VNx1QI "immediate_operand") (VNx2QI "immediate_operand") (VNx4QI "immediate_operand") (VNx8QI "immediate_operand") (VNx16QI "immediate_operand")
+  (VNx32QI "vector_gs_extension_operand") (VNx64QI "const_1_operand")
+  (VNx1HI "immediate_operand") (VNx2HI "immediate_operand") (VNx4HI "immediate_operand") (VNx8HI "immediate_operand") (VNx16HI "immediate_operand")
+  (VNx32HI "vector_gs_extension_operand") (VNx64HI "const_1_operand")
+  (VNx1SI "immediate_operand") (VNx2SI "immediate_operand") (VNx4SI "immediate_operand") (VNx8SI "immediate_operand") (VNx16SI "immediate_operand")
+  (VNx32SI "vector_gs_extension_operand")
+  (VNx1DI "immediate_operand") (VNx2DI "immediate_operand") (VNx4DI "immediate_operand") (VNx8DI "immediate_operand") (VNx16DI "immediate_operand")
+
+  (VNx1HF "immediate_operand") (VNx2HF "immediate_operand") (VNx4HF "immediate_operand") (VNx8HF "immediate_operand") (VNx16HF "immediate_operand")
+  (VNx32HF "vector_gs_extension_operand") (VNx64HF "const_1_operand")
+  (VNx1SF "immediate_operand") (VNx2SF "immediate_operand") (VNx4SF "immediate_operand") (VNx8SF "immediate_operand") (VNx16SF "immediate_operand")
+  (VNx32SF "vector_gs_extension_operand")
+  (VNx1DF "immediate_operand") (VNx2DF "immediate_operand") (VNx4DF "immediate_operand") (VNx8DF "immediate_operand") (VNx16DF "immediate_operand")
+])
+
+(define_mode_attr gs_scale [
+  (VNx1QI "const_1_operand") (VNx2QI "const_1_operand") (VNx4QI "const_1_operand") (VNx8QI "const_1_operand")
+  (VNx16QI "const_1_operand") (VNx32QI "const_1_operand") (VNx64QI "const_1_operand")
+  (VNx1HI "vector_gs_scale_operand_16") (VNx2HI "vector_gs_scale_operand_16") (VNx4HI "vector_gs_scale_operand_16") (VNx8HI "vector_gs_scale_operand_16")
+  (VNx16HI "vector_gs_scale_operand_16") (VNx32HI "vector_gs_scale_operand_16_rv32") (VNx64HI "const_1_operand")
+  (VNx1SI "vector_gs_scale_operand_32") (VNx2SI "vector_gs_scale_operand_32") (VNx4SI "vector_gs_scale_operand_32") (VNx8SI "vector_gs_scale_operand_32")
+  (VNx16SI "vector_gs_scale_operand_32") (VNx32SI "vector_gs_scale_operand_32_rv32")
+  (VNx1DI "vector_gs_scale_operand_64") (VNx2DI "vector_gs_scale_operand_64") (VNx4DI "vector_gs_scale_operand_64") (VNx8DI "vector_gs_scale_operand_64")
+  (VNx16DI "vector_gs_scale_operand_64")
+
+  (VNx1HF "vector_gs_scale_operand_16") (VNx2HF "vector_gs_scale_operand_16") (VNx4HF "vector_gs_scale_operand_16") (VNx8HF "vector_gs_scale_operand_16")
+  (VNx16HF "vector_gs_scale_operand_16") (VNx32HF "vector_gs_scale_operand_16_rv32") (VNx64HF "const_1_operand")
+  (VNx1SF "vector_gs_scale_operand_32") (VNx2SF "vector_gs_scale_operand_32") (VNx4SF "vector_gs_scale_operand_32") (VNx8SF "vector_gs_scale_operand_32")
+  (VNx16SF "vector_gs_scale_operand_32") (VNx32SF "vector_gs_scale_operand_32_rv32")
+  (VNx1DF "vector_gs_scale_operand_64") (VNx2DF "vector_gs_scale_operand_64") (VNx4DF "vector_gs_scale_operand_64") (VNx8DF "vector_gs_scale_operand_64")
+  (VNx16DF "vector_gs_scale_operand_64")
+])
+
(define_int_iterator WREDUC [UNSPEC_WREDUC_SUM UNSPEC_WREDUC_USUM])
(define_int_iterator ORDER [UNSPEC_ORDERED UNSPEC_UNORDERED])
diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 5b7a17b9d34..de944637570 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -818,7 +818,7 @@
;; This pattern only handles duplicates of non-constant inputs.
;; Constant vectors go through the movm pattern instead.
;; So "direct_broadcast_operand" can only be mem or reg, no CONSTANT.
-(define_expand "vec_duplicate<mode>"
+(define_expand "@vec_duplicate<mode>"
   [(set (match_operand:V 0 "register_operand")
(vec_duplicate:V
  (match_operand:<VEL> 1 "direct_broadcast_operand")))]
@@ -1357,8 +1357,16 @@
}
     }
   else if (GET_MODE_BITSIZE (<VEL>mode) > GET_MODE_BITSIZE (Pmode)
-           && immediate_operand (operands[3], Pmode))
-    operands[3] = gen_rtx_SIGN_EXTEND (<VEL>mode, force_reg (Pmode, operands[3]));
+           && (immediate_operand (operands[3], Pmode)
+        || (CONST_POLY_INT_P (operands[3])
+            && known_ge (rtx_to_poly_int64 (operands[3]), 0U)
+    && known_le (rtx_to_poly_int64 (operands[3]), GET_MODE_SIZE (<MODE>mode)))))
+    {
+      rtx tmp = gen_reg_rtx (Pmode);
+      poly_int64 value = rtx_to_poly_int64 (operands[3]);
+      emit_move_insn (tmp, gen_int_mode (value, Pmode));
+      operands[3] = gen_rtx_SIGN_EXTEND (<VEL>mode, tmp);
+    }
   else
     operands[3] = force_reg (<VEL>mode, operands[3]);
})
@@ -1387,7 +1395,8 @@
    vlse<sew>.v\t%0,%3,zero
    vmv.s.x\t%0,%3
    vmv.s.x\t%0,%3"
-  "register_operand (operands[3], <VEL>mode)
+  "(register_operand (operands[3], <VEL>mode)
+  || CONST_POLY_INT_P (operands[3]))
   && GET_MODE_BITSIZE (<VEL>mode) > GET_MODE_BITSIZE (Pmode)"
   [(set (match_dup 0)
(if_then_else:VI (unspec:<VM> [(match_dup 1) (match_dup 4)
@@ -1397,6 +1406,12 @@
  (match_dup 2)))]
   {
     gcc_assert (can_create_pseudo_p ());
+    if (CONST_POLY_INT_P (operands[3]))
+      {
+ rtx tmp = gen_reg_rtx (<VEL>mode);
+ emit_move_insn (tmp, operands[3]);
+ operands[3] = tmp;
+      }
     rtx m = assign_stack_local (<VEL>mode, GET_MODE_SIZE (<VEL>mode),
GET_MODE_ALIGNMENT (<VEL>mode));
     m = validize_mem (m);
@@ -1483,6 +1498,7 @@
     (match_operand 5 "vector_length_operand"    "   rK,    rK,    rK")
     (match_operand 6 "const_int_operand"        "    i,     i,     i")
     (match_operand 7 "const_int_operand"        "    i,     i,     i")
+      (match_operand 8 "const_int_operand"        "    i,     i,     i")
     (reg:SI VL_REGNUM)
     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
  (unspec:V
@@ -1738,7 +1754,7 @@
   [(set_attr "type" "vst<order>x")
    (set_attr "mode" "<VNX8_QHSD:MODE>")])
-(define_insn "@pred_indexed_<order>store<VNX16_QHS:mode><VNX16_QHSDI:mode>"
+(define_insn "@pred_indexed_<order>store<VNX16_QHSD:mode><VNX16_QHSDI:mode>"
   [(set (mem:BLK (scratch))
(unspec:BLK
  [(unspec:<VM>
@@ -1749,11 +1765,11 @@
     (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
   (match_operand 1 "pmode_reg_or_0_operand"      "  rJ")
   (match_operand:VNX16_QHSDI 2 "register_operand" "  vr")
-    (match_operand:VNX16_QHS 3 "register_operand"  "  vr")] ORDER))]
+    (match_operand:VNX16_QHSD 3 "register_operand"  "  vr")] ORDER))]
   "TARGET_VECTOR"
   "vs<order>xei<VNX16_QHSDI:sew>.v\t%3,(%z1),%2%p0"
   [(set_attr "type" "vst<order>x")
-   (set_attr "mode" "<VNX16_QHS:MODE>")])
+   (set_attr "mode" "<VNX16_QHSD:MODE>")])
(define_insn "@pred_indexed_<order>store<VNX32_QHS:mode><VNX32_QHSI:mode>"
   [(set (mem:BLK (scratch))
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c
new file mode 100644
index 00000000000..dffe13f6a8a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c
new file mode 100644
index 00000000000..a622e516f06
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-10.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                               \
+  T (uint16_t, 64)                                                              \
+  T (_Float16, 64)                                                              \
+  T (int32_t, 64)                                                               \
+  T (uint32_t, 64)                                                              \
+  T (float, 64)                                                                 \
+  T (int64_t, 64)                                                               \
+  T (uint64_t, 64)                                                              \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c
new file mode 100644
index 00000000000..4692380233d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-11.c
@@ -0,0 +1,32 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_LOOP(DATA_TYPE)                                                   \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict *src)           \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += *src[i];                                                      \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t)                                                                   \
+  T (uint8_t)                                                                  \
+  T (int16_t)                                                                  \
+  T (uint16_t)                                                                 \
+  T (_Float16)                                                                 \
+  T (int32_t)                                                                  \
+  T (uint32_t)                                                                 \
+  T (float)                                                                    \
+  T (int64_t)                                                                  \
+  T (uint64_t)                                                                 \
+  T (double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c
new file mode 100644
index 00000000000..71a3dd466fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-12.c
@@ -0,0 +1,112 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_LOOP(DATA_TYPE, INDEX_TYPE)                                       \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x,  \
+ INDEX_TYPE *restrict index)                    \
+  {                                                                            \
+    for (int i = 0; i < 100; ++i)                                              \
+      {                                                                        \
+ y[i * 2] = x[index[i * 2]] + 1;                                        \
+ y[i * 2 + 1] = x[index[i * 2 + 1]] + 2;                                \
+      }                                                                        \
+  }
+
+TEST_LOOP (int8_t, int8_t)
+TEST_LOOP (uint8_t, int8_t)
+TEST_LOOP (int16_t, int8_t)
+TEST_LOOP (uint16_t, int8_t)
+TEST_LOOP (int32_t, int8_t)
+TEST_LOOP (uint32_t, int8_t)
+TEST_LOOP (int64_t, int8_t)
+TEST_LOOP (uint64_t, int8_t)
+TEST_LOOP (_Float16, int8_t)
+TEST_LOOP (float, int8_t)
+TEST_LOOP (double, int8_t)
+TEST_LOOP (int8_t, int16_t)
+TEST_LOOP (uint8_t, int16_t)
+TEST_LOOP (int16_t, int16_t)
+TEST_LOOP (uint16_t, int16_t)
+TEST_LOOP (int32_t, int16_t)
+TEST_LOOP (uint32_t, int16_t)
+TEST_LOOP (int64_t, int16_t)
+TEST_LOOP (uint64_t, int16_t)
+TEST_LOOP (_Float16, int16_t)
+TEST_LOOP (float, int16_t)
+TEST_LOOP (double, int16_t)
+TEST_LOOP (int8_t, int32_t)
+TEST_LOOP (uint8_t, int32_t)
+TEST_LOOP (int16_t, int32_t)
+TEST_LOOP (uint16_t, int32_t)
+TEST_LOOP (int32_t, int32_t)
+TEST_LOOP (uint32_t, int32_t)
+TEST_LOOP (int64_t, int32_t)
+TEST_LOOP (uint64_t, int32_t)
+TEST_LOOP (_Float16, int32_t)
+TEST_LOOP (float, int32_t)
+TEST_LOOP (double, int32_t)
+TEST_LOOP (int8_t, int64_t)
+TEST_LOOP (uint8_t, int64_t)
+TEST_LOOP (int16_t, int64_t)
+TEST_LOOP (uint16_t, int64_t)
+TEST_LOOP (int32_t, int64_t)
+TEST_LOOP (uint32_t, int64_t)
+TEST_LOOP (int64_t, int64_t)
+TEST_LOOP (uint64_t, int64_t)
+TEST_LOOP (_Float16, int64_t)
+TEST_LOOP (float, int64_t)
+TEST_LOOP (double, int64_t)
+TEST_LOOP (int8_t, uint8_t)
+TEST_LOOP (uint8_t, uint8_t)
+TEST_LOOP (int16_t, uint8_t)
+TEST_LOOP (uint16_t, uint8_t)
+TEST_LOOP (int32_t, uint8_t)
+TEST_LOOP (uint32_t, uint8_t)
+TEST_LOOP (int64_t, uint8_t)
+TEST_LOOP (uint64_t, uint8_t)
+TEST_LOOP (_Float16, uint8_t)
+TEST_LOOP (float, uint8_t)
+TEST_LOOP (double, uint8_t)
+TEST_LOOP (int8_t, uint16_t)
+TEST_LOOP (uint8_t, uint16_t)
+TEST_LOOP (int16_t, uint16_t)
+TEST_LOOP (uint16_t, uint16_t)
+TEST_LOOP (int32_t, uint16_t)
+TEST_LOOP (uint32_t, uint16_t)
+TEST_LOOP (int64_t, uint16_t)
+TEST_LOOP (uint64_t, uint16_t)
+TEST_LOOP (_Float16, uint16_t)
+TEST_LOOP (float, uint16_t)
+TEST_LOOP (double, uint16_t)
+TEST_LOOP (int8_t, uint32_t)
+TEST_LOOP (uint8_t, uint32_t)
+TEST_LOOP (int16_t, uint32_t)
+TEST_LOOP (uint16_t, uint32_t)
+TEST_LOOP (int32_t, uint32_t)
+TEST_LOOP (uint32_t, uint32_t)
+TEST_LOOP (int64_t, uint32_t)
+TEST_LOOP (uint64_t, uint32_t)
+TEST_LOOP (_Float16, uint32_t)
+TEST_LOOP (float, uint32_t)
+TEST_LOOP (double, uint32_t)
+TEST_LOOP (int8_t, uint64_t)
+TEST_LOOP (uint8_t, uint64_t)
+TEST_LOOP (int16_t, uint64_t)
+TEST_LOOP (uint16_t, uint64_t)
+TEST_LOOP (int32_t, uint64_t)
+TEST_LOOP (uint32_t, uint64_t)
+TEST_LOOP (int64_t, uint64_t)
+TEST_LOOP (uint64_t, uint64_t)
+TEST_LOOP (_Float16, uint64_t)
+TEST_LOOP (float, uint64_t)
+TEST_LOOP (double, uint64_t)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 88 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-assembler-not "vluxei64\.v" } } */
+/* { dg-final { scan-assembler-not "vsuxei64\.v" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c
new file mode 100644
index 00000000000..785550c4b2d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-2.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c
new file mode 100644
index 00000000000..22aeb889221
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-3.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c
new file mode 100644
index 00000000000..d74a83415d7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-4.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c
new file mode 100644
index 00000000000..2b6c0a87c18
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-5.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                               \
+  T (uint16_t, 16)                                                              \
+  T (_Float16, 16)                                                              \
+  T (int32_t, 16)                                                               \
+  T (uint32_t, 16)                                                              \
+  T (float, 16)                                                                 \
+  T (int64_t, 16)                                                               \
+  T (uint64_t, 16)                                                              \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c
new file mode 100644
index 00000000000..407cc8a5a73
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-6.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                               \
+  T (uint16_t, 16)                                                              \
+  T (_Float16, 16)                                                              \
+  T (int32_t, 16)                                                               \
+  T (uint32_t, 16)                                                              \
+  T (float, 16)                                                                 \
+  T (int64_t, 16)                                                               \
+  T (uint64_t, 16)                                                              \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c
new file mode 100644
index 00000000000..81b31ef26aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-7.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                               \
+  T (uint16_t, 32)                                                              \
+  T (_Float16, 32)                                                              \
+  T (int32_t, 32)                                                               \
+  T (uint32_t, 32)                                                              \
+  T (float, 32)                                                                 \
+  T (int64_t, 32)                                                               \
+  T (uint64_t, 32)                                                              \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c
new file mode 100644
index 00000000000..0bfdb8f0acf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-8.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                               \
+  T (uint16_t, 32)                                                              \
+  T (_Float16, 32)                                                              \
+  T (int32_t, 32)                                                               \
+  T (uint32_t, 32)                                                              \
+  T (float, 32)                                                                 \
+  T (int64_t, 32)                                                               \
+  T (uint64_t, 32)                                                              \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c
new file mode 100644
index 00000000000..46f791105ed
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load-9.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[i] += src[indices[i]];                                              \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                               \
+  T (uint16_t, 64)                                                              \
+  T (_Float16, 64)                                                              \
+  T (int32_t, 64)                                                               \
+  T (uint32_t, 64)                                                              \
+  T (float, 64)                                                                 \
+  T (int64_t, 64)                                                               \
+  T (uint64_t, 64)                                                              \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c
new file mode 100644
index 00000000000..0d3c5b71e5b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-1.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+     == (dest2_##DATA_TYPE[i]                                           \
+ + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c
new file mode 100644
index 00000000000..145df1e7797
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-10.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+     == (dest2_##DATA_TYPE[i]                                           \
+ + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c
new file mode 100644
index 00000000000..d36b6f025f5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-11.c
@@ -0,0 +1,39 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-11.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE *src_##DATA_TYPE[128];                                             \
+  DATA_TYPE src2_##DATA_TYPE[128];                                             \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = src2_##DATA_TYPE + i;                               \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE);                           \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+     == (dest2_##DATA_TYPE[i] + src_##DATA_TYPE[i][0]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c
new file mode 100644
index 00000000000..b4e2ead8ca9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-12.c
@@ -0,0 +1,124 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-12.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, INDEX_TYPE)                                        \
+  DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                        \
+  DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                         \
+  INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                      \
+  for (int i = 0; i < 202; i++)                                                \
+    {                                                                          \
+      src_##DATA_TYPE##_##INDEX_TYPE[i]                                        \
+ = (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1));         \
+      index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55);                    \
+    }                                                                          \
+  f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE,               \
+ src_##DATA_TYPE##_##INDEX_TYPE,                \
+ index_##DATA_TYPE##_##INDEX_TYPE);             \
+  for (int i = 0; i < 100; i++)                                                \
+    {                                                                          \
+      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2]                           \
+       == (src_##DATA_TYPE##_##INDEX_TYPE                               \
+     [index_##DATA_TYPE##_##INDEX_TYPE[i * 2]]                  \
+   + 1));                                                       \
+      assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]                       \
+       == (src_##DATA_TYPE##_##INDEX_TYPE                               \
+     [index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]]              \
+   + 2));                                                       \
+    }
+
+  RUN_LOOP (int8_t, int8_t)
+  RUN_LOOP (uint8_t, int8_t)
+  RUN_LOOP (int16_t, int8_t)
+  RUN_LOOP (uint16_t, int8_t)
+  RUN_LOOP (int32_t, int8_t)
+  RUN_LOOP (uint32_t, int8_t)
+  RUN_LOOP (int64_t, int8_t)
+  RUN_LOOP (uint64_t, int8_t)
+  RUN_LOOP (_Float16, int8_t)
+  RUN_LOOP (float, int8_t)
+  RUN_LOOP (double, int8_t)
+  RUN_LOOP (int8_t, int16_t)
+  RUN_LOOP (uint8_t, int16_t)
+  RUN_LOOP (int16_t, int16_t)
+  RUN_LOOP (uint16_t, int16_t)
+  RUN_LOOP (int32_t, int16_t)
+  RUN_LOOP (uint32_t, int16_t)
+  RUN_LOOP (int64_t, int16_t)
+  RUN_LOOP (uint64_t, int16_t)
+  RUN_LOOP (_Float16, int16_t)
+  RUN_LOOP (float, int16_t)
+  RUN_LOOP (double, int16_t)
+  RUN_LOOP (int8_t, int32_t)
+  RUN_LOOP (uint8_t, int32_t)
+  RUN_LOOP (int16_t, int32_t)
+  RUN_LOOP (uint16_t, int32_t)
+  RUN_LOOP (int32_t, int32_t)
+  RUN_LOOP (uint32_t, int32_t)
+  RUN_LOOP (int64_t, int32_t)
+  RUN_LOOP (uint64_t, int32_t)
+  RUN_LOOP (_Float16, int32_t)
+  RUN_LOOP (float, int32_t)
+  RUN_LOOP (double, int32_t)
+  RUN_LOOP (int8_t, int64_t)
+  RUN_LOOP (uint8_t, int64_t)
+  RUN_LOOP (int16_t, int64_t)
+  RUN_LOOP (uint16_t, int64_t)
+  RUN_LOOP (int32_t, int64_t)
+  RUN_LOOP (uint32_t, int64_t)
+  RUN_LOOP (int64_t, int64_t)
+  RUN_LOOP (uint64_t, int64_t)
+  RUN_LOOP (_Float16, int64_t)
+  RUN_LOOP (float, int64_t)
+  RUN_LOOP (double, int64_t)
+  RUN_LOOP (int8_t, uint8_t)
+  RUN_LOOP (uint8_t, uint8_t)
+  RUN_LOOP (int16_t, uint8_t)
+  RUN_LOOP (uint16_t, uint8_t)
+  RUN_LOOP (int32_t, uint8_t)
+  RUN_LOOP (uint32_t, uint8_t)
+  RUN_LOOP (int64_t, uint8_t)
+  RUN_LOOP (uint64_t, uint8_t)
+  RUN_LOOP (_Float16, uint8_t)
+  RUN_LOOP (float, uint8_t)
+  RUN_LOOP (double, uint8_t)
+  RUN_LOOP (int8_t, uint16_t)
+  RUN_LOOP (uint8_t, uint16_t)
+  RUN_LOOP (int16_t, uint16_t)
+  RUN_LOOP (uint16_t, uint16_t)
+  RUN_LOOP (int32_t, uint16_t)
+  RUN_LOOP (uint32_t, uint16_t)
+  RUN_LOOP (int64_t, uint16_t)
+  RUN_LOOP (uint64_t, uint16_t)
+  RUN_LOOP (_Float16, uint16_t)
+  RUN_LOOP (float, uint16_t)
+  RUN_LOOP (double, uint16_t)
+  RUN_LOOP (int8_t, uint32_t)
+  RUN_LOOP (uint8_t, uint32_t)
+  RUN_LOOP (int16_t, uint32_t)
+  RUN_LOOP (uint16_t, uint32_t)
+  RUN_LOOP (int32_t, uint32_t)
+  RUN_LOOP (uint32_t, uint32_t)
+  RUN_LOOP (int64_t, uint32_t)
+  RUN_LOOP (uint64_t, uint32_t)
+  RUN_LOOP (_Float16, uint32_t)
+  RUN_LOOP (float, uint32_t)
+  RUN_LOOP (double, uint32_t)
+  RUN_LOOP (int8_t, uint64_t)
+  RUN_LOOP (uint8_t, uint64_t)
+  RUN_LOOP (int16_t, uint64_t)
+  RUN_LOOP (uint16_t, uint64_t)
+  RUN_LOOP (int32_t, uint64_t)
+  RUN_LOOP (uint32_t, uint64_t)
+  RUN_LOOP (int64_t, uint64_t)
+  RUN_LOOP (uint64_t, uint64_t)
+  RUN_LOOP (_Float16, uint64_t)
+  RUN_LOOP (float, uint64_t)
+  RUN_LOOP (double, uint64_t)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c
new file mode 100644
index 00000000000..76c6df32e6c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-2.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+     == (dest2_##DATA_TYPE[i]                                           \
+ + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c
new file mode 100644
index 00000000000..0fd64260082
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-3.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+     == (dest2_##DATA_TYPE[i]                                           \
+ + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c
new file mode 100644
index 00000000000..069d232b912
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-4.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+     == (dest2_##DATA_TYPE[i]                                           \
+ + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c
new file mode 100644
index 00000000000..499e555c1d0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-5.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+     == (dest2_##DATA_TYPE[i]                                           \
+ + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c
new file mode 100644
index 00000000000..ec6587aa4e5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-6.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+     == (dest2_##DATA_TYPE[i]                                           \
+ + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c
new file mode 100644
index 00000000000..c16287955a6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-7.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+     == (dest2_##DATA_TYPE[i]                                           \
+ + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c
new file mode 100644
index 00000000000..e1744f60dbb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-8.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+     == (dest2_##DATA_TYPE[i]                                           \
+ + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c
new file mode 100644
index 00000000000..3ad6d33087f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/gather_load_run-9.c
@@ -0,0 +1,41 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "gather_load-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[i]                                                \
+     == (dest2_##DATA_TYPE[i]                                           \
+ + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c
new file mode 100644
index 00000000000..a5de0deccbe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-1.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c
new file mode 100644
index 00000000000..74a0d05b37d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-10.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                               \
+  T (uint8_t, 64)                                                              \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c
new file mode 100644
index 00000000000..98c5b4678b7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-11.c
@@ -0,0 +1,116 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-vect-cost-model -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_LOOP(DATA_TYPE, INDEX_TYPE)                                       \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##INDEX_TYPE (DATA_TYPE *restrict y, DATA_TYPE *restrict x,  \
+ INDEX_TYPE *restrict index,                    \
+ INDEX_TYPE *restrict cond)                     \
+  {                                                                            \
+    for (int i = 0; i < 100; ++i)                                              \
+      {                                                                        \
+ if (cond[i * 2])                                                       \
+   y[i * 2] = x[index[i * 2]] + 1;                                      \
+ if (cond[i * 2 + 1])                                                   \
+   y[i * 2 + 1] = x[index[i * 2 + 1]] + 2;                              \
+      }                                                                        \
+  }
+
+TEST_LOOP (int8_t, int8_t)
+TEST_LOOP (uint8_t, int8_t)
+TEST_LOOP (int16_t, int8_t)
+TEST_LOOP (uint16_t, int8_t)
+TEST_LOOP (int32_t, int8_t)
+TEST_LOOP (uint32_t, int8_t)
+TEST_LOOP (int64_t, int8_t)
+TEST_LOOP (uint64_t, int8_t)
+TEST_LOOP (_Float16, int8_t)
+TEST_LOOP (float, int8_t)
+TEST_LOOP (double, int8_t)
+TEST_LOOP (int8_t, int16_t)
+TEST_LOOP (uint8_t, int16_t)
+TEST_LOOP (int16_t, int16_t)
+TEST_LOOP (uint16_t, int16_t)
+TEST_LOOP (int32_t, int16_t)
+TEST_LOOP (uint32_t, int16_t)
+TEST_LOOP (int64_t, int16_t)
+TEST_LOOP (uint64_t, int16_t)
+TEST_LOOP (_Float16, int16_t)
+TEST_LOOP (float, int16_t)
+TEST_LOOP (double, int16_t)
+TEST_LOOP (int8_t, int32_t)
+TEST_LOOP (uint8_t, int32_t)
+TEST_LOOP (int16_t, int32_t)
+TEST_LOOP (uint16_t, int32_t)
+TEST_LOOP (int32_t, int32_t)
+TEST_LOOP (uint32_t, int32_t)
+TEST_LOOP (int64_t, int32_t)
+TEST_LOOP (uint64_t, int32_t)
+TEST_LOOP (_Float16, int32_t)
+TEST_LOOP (float, int32_t)
+TEST_LOOP (double, int32_t)
+TEST_LOOP (int8_t, int64_t)
+TEST_LOOP (uint8_t, int64_t)
+TEST_LOOP (int16_t, int64_t)
+TEST_LOOP (uint16_t, int64_t)
+TEST_LOOP (int32_t, int64_t)
+TEST_LOOP (uint32_t, int64_t)
+TEST_LOOP (int64_t, int64_t)
+TEST_LOOP (uint64_t, int64_t)
+TEST_LOOP (_Float16, int64_t)
+TEST_LOOP (float, int64_t)
+TEST_LOOP (double, int64_t)
+TEST_LOOP (int8_t, uint8_t)
+TEST_LOOP (uint8_t, uint8_t)
+TEST_LOOP (int16_t, uint8_t)
+TEST_LOOP (uint16_t, uint8_t)
+TEST_LOOP (int32_t, uint8_t)
+TEST_LOOP (uint32_t, uint8_t)
+TEST_LOOP (int64_t, uint8_t)
+TEST_LOOP (uint64_t, uint8_t)
+TEST_LOOP (_Float16, uint8_t)
+TEST_LOOP (float, uint8_t)
+TEST_LOOP (double, uint8_t)
+TEST_LOOP (int8_t, uint16_t)
+TEST_LOOP (uint8_t, uint16_t)
+TEST_LOOP (int16_t, uint16_t)
+TEST_LOOP (uint16_t, uint16_t)
+TEST_LOOP (int32_t, uint16_t)
+TEST_LOOP (uint32_t, uint16_t)
+TEST_LOOP (int64_t, uint16_t)
+TEST_LOOP (uint64_t, uint16_t)
+TEST_LOOP (_Float16, uint16_t)
+TEST_LOOP (float, uint16_t)
+TEST_LOOP (double, uint16_t)
+TEST_LOOP (int8_t, uint32_t)
+TEST_LOOP (uint8_t, uint32_t)
+TEST_LOOP (int16_t, uint32_t)
+TEST_LOOP (uint16_t, uint32_t)
+TEST_LOOP (int32_t, uint32_t)
+TEST_LOOP (uint32_t, uint32_t)
+TEST_LOOP (int64_t, uint32_t)
+TEST_LOOP (uint64_t, uint32_t)
+TEST_LOOP (_Float16, uint32_t)
+TEST_LOOP (float, uint32_t)
+TEST_LOOP (double, uint32_t)
+TEST_LOOP (int8_t, uint64_t)
+TEST_LOOP (uint8_t, uint64_t)
+TEST_LOOP (int16_t, uint64_t)
+TEST_LOOP (uint16_t, uint64_t)
+TEST_LOOP (int32_t, uint64_t)
+TEST_LOOP (uint32_t, uint64_t)
+TEST_LOOP (int64_t, uint64_t)
+TEST_LOOP (uint64_t, uint64_t)
+TEST_LOOP (_Float16, uint64_t)
+TEST_LOOP (float, uint64_t)
+TEST_LOOP (double, uint64_t)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 88 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-assembler-not "vluxei64\.v" } } */
+/* { dg-final { scan-assembler-not "vsuxei64\.v" } } */
+/* { dg-final { scan-assembler-not {vlse64\.v\s+v[0-9]+,\s*0\([a-x0-9]+\),\s*zero} } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c
new file mode 100644
index 00000000000..03f84ce962c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-2.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c
new file mode 100644
index 00000000000..8578001ef41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-3.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c
new file mode 100644
index 00000000000..b273caa0bfe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-4.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c
new file mode 100644
index 00000000000..5055d886d62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-5.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c
new file mode 100644
index 00000000000..2a4ae58588f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-6.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                               \
+  T (uint8_t, 16)                                                              \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c
new file mode 100644
index 00000000000..31d9414c549
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-7.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                               \
+  T (uint8_t, 32)                                                              \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c
new file mode 100644
index 00000000000..73ed23042fb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-8.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                               \
+  T (uint8_t, 32)                                                              \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c
new file mode 100644
index 00000000000..2f64e805759
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load-9.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fno-schedule-insns -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[i] += src[indices[i]];                                            \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                               \
+  T (uint8_t, 64)                                                              \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c
new file mode 100644
index 00000000000..41f60bd88b9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-1.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[i]                                            \
+ == (dest2_##DATA_TYPE[i]                                       \
+     + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c
new file mode 100644
index 00000000000..9840434fa41
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-10.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[i]                                            \
+ == (dest2_##DATA_TYPE[i]                                       \
+     + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
new file mode 100644
index 00000000000..105c706dbf9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-11.c
@@ -0,0 +1,140 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "mask_gather_load-11.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, INDEX_TYPE)                                        \
+  DATA_TYPE dest_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                        \
+  DATA_TYPE dest2_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                       \
+  DATA_TYPE src_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                         \
+  INDEX_TYPE index_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                      \
+  INDEX_TYPE cond_##DATA_TYPE##_##INDEX_TYPE[202] = {0};                       \
+  for (int i = 0; i < 202; i++)                                                \
+    {                                                                          \
+      src_##DATA_TYPE##_##INDEX_TYPE[i]                                        \
+ = (DATA_TYPE) ((i * 19 + 735) & (sizeof (DATA_TYPE) * 7 - 1));         \
+      dest_##DATA_TYPE##_##INDEX_TYPE[i]                                       \
+ = (DATA_TYPE) ((i * 7 + 666) & (sizeof (DATA_TYPE) * 5 - 1));          \
+      dest2_##DATA_TYPE##_##INDEX_TYPE[i]                                      \
+ = (DATA_TYPE) ((i * 7 + 666) & (sizeof (DATA_TYPE) * 5 - 1));          \
+      index_##DATA_TYPE##_##INDEX_TYPE[i] = (i * 7) % (55);                    \
+      cond_##DATA_TYPE##_##INDEX_TYPE[i] = (INDEX_TYPE) ((i & 0x3) == 3);      \
+    }                                                                          \
+  f_##DATA_TYPE##_##INDEX_TYPE (dest_##DATA_TYPE##_##INDEX_TYPE,               \
+ src_##DATA_TYPE##_##INDEX_TYPE,                \
+ index_##DATA_TYPE##_##INDEX_TYPE,              \
+ cond_##DATA_TYPE##_##INDEX_TYPE);              \
+  for (int i = 0; i < 100; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##INDEX_TYPE[i * 2])                              \
+ assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2]                         \
+ == (src_##DATA_TYPE##_##INDEX_TYPE                             \
+       [index_##DATA_TYPE##_##INDEX_TYPE[i * 2]]                \
+     + 1));                                                     \
+      else                                                                     \
+ assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2]                         \
+ == dest2_##DATA_TYPE##_##INDEX_TYPE[i * 2]);                   \
+      if (cond_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1])                          \
+ assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]                     \
+ == (src_##DATA_TYPE##_##INDEX_TYPE                             \
+       [index_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]]            \
+     + 2));                                                     \
+      else                                                                     \
+ assert (dest_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]                     \
+ == dest2_##DATA_TYPE##_##INDEX_TYPE[i * 2 + 1]);               \
+    }
+
+  RUN_LOOP (int8_t, int8_t)
+  RUN_LOOP (uint8_t, int8_t)
+  RUN_LOOP (int16_t, int8_t)
+  RUN_LOOP (uint16_t, int8_t)
+  RUN_LOOP (int32_t, int8_t)
+  RUN_LOOP (uint32_t, int8_t)
+  RUN_LOOP (int64_t, int8_t)
+  RUN_LOOP (uint64_t, int8_t)
+  RUN_LOOP (_Float16, int8_t)
+  RUN_LOOP (float, int8_t)
+  RUN_LOOP (double, int8_t)
+  RUN_LOOP (int8_t, int16_t)
+  RUN_LOOP (uint8_t, int16_t)
+  RUN_LOOP (int16_t, int16_t)
+  RUN_LOOP (uint16_t, int16_t)
+  RUN_LOOP (int32_t, int16_t)
+  RUN_LOOP (uint32_t, int16_t)
+  RUN_LOOP (int64_t, int16_t)
+  RUN_LOOP (uint64_t, int16_t)
+  RUN_LOOP (_Float16, int16_t)
+  RUN_LOOP (float, int16_t)
+  RUN_LOOP (double, int16_t)
+  RUN_LOOP (int8_t, int32_t)
+  RUN_LOOP (uint8_t, int32_t)
+  RUN_LOOP (int16_t, int32_t)
+  RUN_LOOP (uint16_t, int32_t)
+  RUN_LOOP (int32_t, int32_t)
+  RUN_LOOP (uint32_t, int32_t)
+  RUN_LOOP (int64_t, int32_t)
+  RUN_LOOP (uint64_t, int32_t)
+  RUN_LOOP (_Float16, int32_t)
+  RUN_LOOP (float, int32_t)
+  RUN_LOOP (double, int32_t)
+  RUN_LOOP (int8_t, int64_t)
+  RUN_LOOP (uint8_t, int64_t)
+  RUN_LOOP (int16_t, int64_t)
+  RUN_LOOP (uint16_t, int64_t)
+  RUN_LOOP (int32_t, int64_t)
+  RUN_LOOP (uint32_t, int64_t)
+  RUN_LOOP (int64_t, int64_t)
+  RUN_LOOP (uint64_t, int64_t)
+  RUN_LOOP (_Float16, int64_t)
+  RUN_LOOP (float, int64_t)
+  RUN_LOOP (double, int64_t)
+  RUN_LOOP (int8_t, uint8_t)
+  RUN_LOOP (uint8_t, uint8_t)
+  RUN_LOOP (int16_t, uint8_t)
+  RUN_LOOP (uint16_t, uint8_t)
+  RUN_LOOP (int32_t, uint8_t)
+  RUN_LOOP (uint32_t, uint8_t)
+  RUN_LOOP (int64_t, uint8_t)
+  RUN_LOOP (uint64_t, uint8_t)
+  RUN_LOOP (_Float16, uint8_t)
+  RUN_LOOP (float, uint8_t)
+  RUN_LOOP (double, uint8_t)
+  RUN_LOOP (int8_t, uint16_t)
+  RUN_LOOP (uint8_t, uint16_t)
+  RUN_LOOP (int16_t, uint16_t)
+  RUN_LOOP (uint16_t, uint16_t)
+  RUN_LOOP (int32_t, uint16_t)
+  RUN_LOOP (uint32_t, uint16_t)
+  RUN_LOOP (int64_t, uint16_t)
+  RUN_LOOP (uint64_t, uint16_t)
+  RUN_LOOP (_Float16, uint16_t)
+  RUN_LOOP (float, uint16_t)
+  RUN_LOOP (double, uint16_t)
+  RUN_LOOP (int8_t, uint32_t)
+  RUN_LOOP (uint8_t, uint32_t)
+  RUN_LOOP (int16_t, uint32_t)
+  RUN_LOOP (uint16_t, uint32_t)
+  RUN_LOOP (int32_t, uint32_t)
+  RUN_LOOP (uint32_t, uint32_t)
+  RUN_LOOP (int64_t, uint32_t)
+  RUN_LOOP (uint64_t, uint32_t)
+  RUN_LOOP (_Float16, uint32_t)
+  RUN_LOOP (float, uint32_t)
+  RUN_LOOP (double, uint32_t)
+  RUN_LOOP (int8_t, uint64_t)
+  RUN_LOOP (uint8_t, uint64_t)
+  RUN_LOOP (int16_t, uint64_t)
+  RUN_LOOP (uint16_t, uint64_t)
+  RUN_LOOP (int32_t, uint64_t)
+  RUN_LOOP (uint32_t, uint64_t)
+  RUN_LOOP (int64_t, uint64_t)
+  RUN_LOOP (uint64_t, uint64_t)
+  RUN_LOOP (_Float16, uint64_t)
+  RUN_LOOP (float, uint64_t)
+  RUN_LOOP (double, uint64_t)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c
new file mode 100644
index 00000000000..33ddb5d9909
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-2.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[i]                                            \
+ == (dest2_##DATA_TYPE[i]                                       \
+     + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c
new file mode 100644
index 00000000000..9f06fbe4ecf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-3.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[i]                                            \
+ == (dest2_##DATA_TYPE[i]                                       \
+     + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c
new file mode 100644
index 00000000000..ae578f0c7b6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-4.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[i]                                            \
+ == (dest2_##DATA_TYPE[i]                                       \
+     + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c
new file mode 100644
index 00000000000..741abd166e1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-5.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[i]                                            \
+ == (dest2_##DATA_TYPE[i]                                       \
+     + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c
new file mode 100644
index 00000000000..a14a5c4ced1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-6.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[i]                                            \
+ == (dest2_##DATA_TYPE[i]                                       \
+     + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c
new file mode 100644
index 00000000000..5f6ec81c93b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-7.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "mask_gather_load-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[i]                                            \
+ == (dest2_##DATA_TYPE[i]                                       \
+     + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c
new file mode 100644
index 00000000000..a34688ff339
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-8.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "mask_gather_load-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[i]                                            \
+ == (dest2_##DATA_TYPE[i]                                       \
+     + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c
new file mode 100644
index 00000000000..1cfdede2060
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_gather_load_run-9.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_gather_load-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128] = {0};                                       \
+  DATA_TYPE dest2_##DATA_TYPE[128] = {0};                                      \
+  DATA_TYPE src_##DATA_TYPE[128] = {0};                                        \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128] = {0};                         \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[i]                                            \
+ == (dest2_##DATA_TYPE[i]                                       \
+     + src_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]));      \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[i] == dest2_##DATA_TYPE[i]);                  \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c
new file mode 100644
index 00000000000..623de41267b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-1.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c
new file mode 100644
index 00000000000..55112b067fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-10.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c
new file mode 100644
index 00000000000..32a572d0064
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-2.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c
new file mode 100644
index 00000000000..fbaaa9d8a8e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-3.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                              \
+  T (uint16_t, 8)                                                             \
+  T (_Float16, 8)                                                             \
+  T (int32_t, 8)                                                              \
+  T (uint32_t, 8)                                                             \
+  T (float, 8)                                                                \
+  T (int64_t, 8)                                                              \
+  T (uint64_t, 8)                                                             \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c
new file mode 100644
index 00000000000..9b08661f8e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-4.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                              \
+  T (uint16_t, 8)                                                             \
+  T (_Float16, 8)                                                             \
+  T (int32_t, 8)                                                              \
+  T (uint32_t, 8)                                                             \
+  T (float, 8)                                                                \
+  T (int64_t, 8)                                                              \
+  T (uint64_t, 8)                                                             \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c
new file mode 100644
index 00000000000..dd26635f2cb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-5.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c
new file mode 100644
index 00000000000..fa0206a0ec2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-6.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 16)                                                              \
+  T (uint32_t, 16)                                                             \
+  T (float, 16)                                                                \
+  T (int64_t, 16)                                                              \
+  T (uint64_t, 16)                                                             \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c
new file mode 100644
index 00000000000..325e86c26a8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-7.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c
new file mode 100644
index 00000000000..b4b84e9cdda
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-8.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                              \
+  T (uint16_t, 32)                                                             \
+  T (_Float16, 32)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 32)                                                              \
+  T (uint64_t, 32)                                                             \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c
new file mode 100644
index 00000000000..77a9af953e0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store-9.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices, INDEX##BITS *restrict cond)    \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      if (cond[i])                                                             \
+ dest[indices[i]] = src[i] + 1;                                         \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                              \
+  T (uint16_t, 64)                                                             \
+  T (_Float16, 64)                                                             \
+  T (int32_t, 64)                                                              \
+  T (uint32_t, 64)                                                             \
+  T (float, 64)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c
new file mode 100644
index 00000000000..e0d52bf6291
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-1.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c
new file mode 100644
index 00000000000..c1af0d30e62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-10.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c
new file mode 100644
index 00000000000..6b1b02eae35
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-2.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c
new file mode 100644
index 00000000000..cef0bdec1d1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-3.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c
new file mode 100644
index 00000000000..88a74d5a632
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-4.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c
new file mode 100644
index 00000000000..06804ab7111
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-5.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c
new file mode 100644
index 00000000000..c6c9a676ed6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-6.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c
new file mode 100644
index 00000000000..06a37f92907
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-7.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "mask_scatter_store-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c
new file mode 100644
index 00000000000..8ee35d2e505
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-8.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c
new file mode 100644
index 00000000000..c27a673e2b0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/mask_scatter_store_run-9.c
@@ -0,0 +1,48 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "mask_scatter_store-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  INDEX##BITS cond_##DATA_TYPE##_##BITS[128] = {0};                            \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+      cond_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i & 0x3) == 3);             \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS, cond_##DATA_TYPE##_##BITS);     \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      if (cond_##DATA_TYPE##_##BITS[i])                                        \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == (src_##DATA_TYPE[i] + 1));                                  \
+      else                                                                     \
+ assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]              \
+ == dest2_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]);        \
+    }
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c
new file mode 100644
index 00000000000..6a390261cfb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+#define INDEX16 uint16_t
+#define INDEX32 uint32_t
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c
new file mode 100644
index 00000000000..feb58d7d458
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-10.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                               \
+  T (uint16_t, 64)                                                              \
+  T (_Float16, 64)                                                              \
+  T (int32_t, 64)                                                               \
+  T (uint32_t, 64)                                                              \
+  T (float, 64)                                                                 \
+  T (int64_t, 64)                                                               \
+  T (uint64_t, 64)                                                              \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c
new file mode 100644
index 00000000000..e4c587fd7bb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-2.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 16)                                                              \
+  T (uint16_t, 16)                                                             \
+  T (_Float16, 16)                                                             \
+  T (int32_t, 32)                                                              \
+  T (uint32_t, 32)                                                             \
+  T (float, 32)                                                                \
+  T (int64_t, 64)                                                              \
+  T (uint64_t, 64)                                                             \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c
new file mode 100644
index 00000000000..33ad256d3db
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-3.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 uint8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c
new file mode 100644
index 00000000000..48d305623e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-4.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX8 int8_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 8)                                                                \
+  T (uint8_t, 8)                                                               \
+  T (int16_t, 8)                                                               \
+  T (uint16_t, 8)                                                              \
+  T (_Float16, 8)                                                              \
+  T (int32_t, 8)                                                               \
+  T (uint32_t, 8)                                                              \
+  T (float, 8)                                                                 \
+  T (int64_t, 8)                                                               \
+  T (uint64_t, 8)                                                              \
+  T (double, 8)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c
new file mode 100644
index 00000000000..83ddc44bf9c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-5.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 uint16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                               \
+  T (uint16_t, 16)                                                              \
+  T (_Float16, 16)                                                              \
+  T (int32_t, 16)                                                               \
+  T (uint32_t, 16)                                                              \
+  T (float, 16)                                                                 \
+  T (int64_t, 16)                                                               \
+  T (uint64_t, 16)                                                              \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c
new file mode 100644
index 00000000000..11eb68bdb13
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-6.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX16 int16_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 16)                                                                \
+  T (uint8_t, 16)                                                               \
+  T (int16_t, 16)                                                               \
+  T (uint16_t, 16)                                                              \
+  T (_Float16, 16)                                                              \
+  T (int32_t, 16)                                                               \
+  T (uint32_t, 16)                                                              \
+  T (float, 16)                                                                 \
+  T (int64_t, 16)                                                               \
+  T (uint64_t, 16)                                                              \
+  T (double, 16)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c
new file mode 100644
index 00000000000..2e323477258
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-7.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 uint32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                               \
+  T (uint16_t, 32)                                                              \
+  T (_Float16, 32)                                                              \
+  T (int32_t, 32)                                                               \
+  T (uint32_t, 32)                                                              \
+  T (float, 32)                                                                 \
+  T (int64_t, 32)                                                               \
+  T (uint64_t, 32)                                                              \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c
new file mode 100644
index 00000000000..e6732fe3790
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-8.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX32 int32_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 32)                                                                \
+  T (uint8_t, 32)                                                               \
+  T (int16_t, 32)                                                               \
+  T (uint16_t, 32)                                                              \
+  T (_Float16, 32)                                                              \
+  T (int32_t, 32)                                                               \
+  T (uint32_t, 32)                                                              \
+  T (float, 32)                                                                 \
+  T (int64_t, 32)                                                               \
+  T (uint64_t, 32)                                                              \
+  T (double, 32)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c
new file mode 100644
index 00000000000..766a52b4622
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store-9.c
@@ -0,0 +1,35 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d  -fdump-tree-vect-details" } */
+
+#include <stdint-gcc.h>
+
+#define INDEX64 uint64_t
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,            \
+ INDEX##BITS *restrict indices)                                \
+  {                                                                            \
+    for (int i = 0; i < 128; ++i)                                              \
+      dest[indices[i]] = src[i] + 1;                                           \
+  }
+
+#define TEST_ALL(T)                                                            \
+  T (int8_t, 64)                                                                \
+  T (uint8_t, 64)                                                               \
+  T (int16_t, 64)                                                               \
+  T (uint16_t, 64)                                                              \
+  T (_Float16, 64)                                                              \
+  T (int32_t, 64)                                                               \
+  T (uint32_t, 64)                                                              \
+  T (float, 64)                                                                 \
+  T (int64_t, 64)                                                               \
+  T (uint64_t, 64)                                                              \
+  T (double, 64)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops in function" 11 "vect" } } */
+/* { dg-final { scan-tree-dump " \.LEN_MASK_SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "vect" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "vect" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c
new file mode 100644
index 00000000000..cafa64f3527
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-1.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+     == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c
new file mode 100644
index 00000000000..79f6885831f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-10.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-10.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+     == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c
new file mode 100644
index 00000000000..376db088153
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-2.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "scatter_store-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+     == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c
new file mode 100644
index 00000000000..103b8649d38
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-3.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-3.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+     == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c
new file mode 100644
index 00000000000..f5f89c0fb4f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-4.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-4.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+     == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c
new file mode 100644
index 00000000000..049251ec888
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-5.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-5.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+     == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c
new file mode 100644
index 00000000000..59c8e701dbe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-6.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-6.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+     == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c
new file mode 100644
index 00000000000..a24401181e3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-7.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-7.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+     == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c
new file mode 100644
index 00000000000..080c9b83363
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-8.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+
+#include "scatter_store-8.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+     == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c
new file mode 100644
index 00000000000..cc9f20f0daa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/scatter_store_run-9.c
@@ -0,0 +1,40 @@
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "-mcmodel=medany" } */
+
+#include "scatter_store-9.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE[128];                                             \
+  DATA_TYPE dest2_##DATA_TYPE[128];                                            \
+  DATA_TYPE src_##DATA_TYPE[128];                                              \
+  INDEX##BITS indices_##DATA_TYPE##_##BITS[128];                               \
+  for (int i = 0; i < 128; i++)                                                \
+    {                                                                          \
+      dest_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));         \
+      dest2_##DATA_TYPE[i] = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));        \
+      src_##DATA_TYPE[i] = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));         \
+      indices_##DATA_TYPE##_##BITS[i] = (DATA_TYPE) ((i * 3 + 67) % 128);      \
+    }                                                                          \
+  f_##DATA_TYPE (dest_##DATA_TYPE, src_##DATA_TYPE,                            \
+ indices_##DATA_TYPE##_##BITS);                                \
+  for (int i = 0; i < 128; i++)                                                \
+    assert (dest_##DATA_TYPE[indices_##DATA_TYPE##_##BITS[i]]                  \
+     == (src_##DATA_TYPE[i] + 1));
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c
new file mode 100644
index 00000000000..0c64f3e1a00
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-1.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+   INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < n; ++i)                                        \
+      dest[i] += src[i * stride];                                              \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_GATHER_LOAD" 66 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c
new file mode 100644
index 00000000000..0f234209325
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load-2.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+   INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < (BITS + 13); ++i)                              \
+      dest[i] += src[i * (BITS - 3)];                                          \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_GATHER_LOAD" 46 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.GATHER_LOAD" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_GATHER_LOAD" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
new file mode 100644
index 00000000000..4b03c25a907
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-1.c
@@ -0,0 +1,84 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_load-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+ = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+ = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+ = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+   stride_##DATA_TYPE##_##BITS,                         \
+   n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (                                                                 \
+ dest_##DATA_TYPE##_##BITS[i]                                           \
+ == (dest2_##DATA_TYPE##_##BITS[i]                                      \
+     + src_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]));     \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c
new file mode 100644
index 00000000000..8499e4cef24
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_load_run-2.c
@@ -0,0 +1,84 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_load-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+ = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+ = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+ = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+   stride_##DATA_TYPE##_##BITS,                         \
+   n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (                                                                 \
+ dest_##DATA_TYPE##_##BITS[i]                                           \
+ == (dest2_##DATA_TYPE##_##BITS[i]                                      \
+     + src_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]));     \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c
new file mode 100644
index 00000000000..2418a0d74eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-1.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+   INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < n; ++i)                                        \
+      dest[i * stride] = src[i] + BITS;                                        \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_SCATTER_STORE" 66 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c
new file mode 100644
index 00000000000..83e101d5688
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store-2.c
@@ -0,0 +1,45 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvfh -mabi=lp64d -O3 --param riscv-autovec-preference=scalable -fno-vect-cost-model -ffast-math -fdump-tree-optimized-details" } */
+
+#include <stdint-gcc.h>
+
+#ifndef INDEX8
+#define INDEX8 int8_t
+#define INDEX16 int16_t
+#define INDEX32 int32_t
+#define INDEX64 int64_t
+#endif
+
+#define TEST_LOOP(DATA_TYPE, BITS)                                             \
+  void __attribute__ ((noinline, noclone))                                     \
+  f_##DATA_TYPE##_##BITS (DATA_TYPE *restrict dest, DATA_TYPE *restrict src,   \
+   INDEX##BITS stride, INDEX##BITS n)                   \
+  {                                                                            \
+    for (INDEX##BITS i = 0; i < n; ++i)                                        \
+      dest[i * (BITS - 3)] = src[i] + BITS;                                    \
+  }
+
+#define TEST_TYPE(T, DATA_TYPE)                                                \
+  T (DATA_TYPE, 8)                                                             \
+  T (DATA_TYPE, 16)                                                            \
+  T (DATA_TYPE, 32)                                                            \
+  T (DATA_TYPE, 64)
+
+#define TEST_ALL(T)                                                            \
+  TEST_TYPE (T, int8_t)                                                        \
+  TEST_TYPE (T, uint8_t)                                                       \
+  TEST_TYPE (T, int16_t)                                                       \
+  TEST_TYPE (T, uint16_t)                                                      \
+  TEST_TYPE (T, _Float16)                                                      \
+  TEST_TYPE (T, int32_t)                                                       \
+  TEST_TYPE (T, uint32_t)                                                      \
+  TEST_TYPE (T, float)                                                         \
+  TEST_TYPE (T, int64_t)                                                       \
+  TEST_TYPE (T, uint64_t)                                                      \
+  TEST_TYPE (T, double)
+
+TEST_ALL (TEST_LOOP)
+
+/* { dg-final { scan-tree-dump-times " \.LEN_MASK_SCATTER_STORE" 44 "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.SCATTER_STORE" "optimized" } } */
+/* { dg-final { scan-tree-dump-not " \.MASK_SCATTER_STORE" "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c
new file mode 100644
index 00000000000..e9dca4672c6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-1.c
@@ -0,0 +1,82 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_store-1.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+ = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+ = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+ = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+   stride_##DATA_TYPE##_##BITS,                         \
+   n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (dest_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]       \
+       == (src_##DATA_TYPE##_##BITS[i] + BITS));                        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c
new file mode 100644
index 00000000000..509def789e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/gather-scatter/strided_store_run-2.c
@@ -0,0 +1,82 @@
+/* { dg-do run { target { riscv_vector } } } */
+
+#include "strided_store-2.c"
+#include <assert.h>
+
+int
+main (void)
+{
+#define RUN_LOOP(DATA_TYPE, BITS)                                              \
+  DATA_TYPE dest_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];               \
+  DATA_TYPE dest2_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];              \
+  DATA_TYPE src_##DATA_TYPE##_##BITS[(BITS - 3) * (BITS + 13)];                \
+  INDEX##BITS stride_##DATA_TYPE##_##BITS = (BITS - 3);                        \
+  INDEX##BITS n_##DATA_TYPE##_##BITS = (BITS + 13);                            \
+  for (INDEX##BITS i = 0;                                                      \
+       i < stride_##DATA_TYPE##_##BITS * n_##DATA_TYPE##_##BITS; i++)          \
+    {                                                                          \
+      dest_##DATA_TYPE##_##BITS[i]                                             \
+ = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      dest2_##DATA_TYPE##_##BITS[i]                                            \
+ = (DATA_TYPE) ((i * 81 + 735) & (BITS - 1));                           \
+      src_##DATA_TYPE##_##BITS[i]                                              \
+ = (DATA_TYPE) ((i * 13 + 9107) & (BITS - 1));                          \
+    }                                                                          \
+  f_##DATA_TYPE##_##BITS (dest_##DATA_TYPE##_##BITS, src_##DATA_TYPE##_##BITS, \
+   stride_##DATA_TYPE##_##BITS,                         \
+   n_##DATA_TYPE##_##BITS);                             \
+  for (int i = 0; i < n_##DATA_TYPE##_##BITS; i++)                             \
+    {                                                                          \
+      assert (dest_##DATA_TYPE##_##BITS[i * stride_##DATA_TYPE##_##BITS]       \
+       == (src_##DATA_TYPE##_##BITS[i] + BITS));                        \
+    }
+
+  RUN_LOOP (int8_t, 8)
+  RUN_LOOP (uint8_t, 8)
+  RUN_LOOP (int16_t, 8)
+  RUN_LOOP (uint16_t, 8)
+  RUN_LOOP (_Float16, 8)
+  RUN_LOOP (int32_t, 8)
+  RUN_LOOP (uint32_t, 8)
+  RUN_LOOP (float, 8)
+  RUN_LOOP (int64_t, 8)
+  RUN_LOOP (uint64_t, 8)
+  RUN_LOOP (double, 8)
+
+  RUN_LOOP (int8_t, 16)
+  RUN_LOOP (uint8_t, 16)
+  RUN_LOOP (int16_t, 16)
+  RUN_LOOP (uint16_t, 16)
+  RUN_LOOP (_Float16, 16)
+  RUN_LOOP (int32_t, 16)
+  RUN_LOOP (uint32_t, 16)
+  RUN_LOOP (float, 16)
+  RUN_LOOP (int64_t, 16)
+  RUN_LOOP (uint64_t, 16)
+  RUN_LOOP (double, 16)
+
+  RUN_LOOP (int8_t, 32)
+  RUN_LOOP (uint8_t, 32)
+  RUN_LOOP (int16_t, 32)
+  RUN_LOOP (uint16_t, 32)
+  RUN_LOOP (_Float16, 32)
+  RUN_LOOP (int32_t, 32)
+  RUN_LOOP (uint32_t, 32)
+  RUN_LOOP (float, 32)
+  RUN_LOOP (int64_t, 32)
+  RUN_LOOP (uint64_t, 32)
+  RUN_LOOP (double, 32)
+
+  RUN_LOOP (int8_t, 64)
+  RUN_LOOP (uint8_t, 64)
+  RUN_LOOP (int16_t, 64)
+  RUN_LOOP (uint16_t, 64)
+  RUN_LOOP (_Float16, 64)
+  RUN_LOOP (int32_t, 64)
+  RUN_LOOP (uint32_t, 64)
+  RUN_LOOP (float, 64)
+  RUN_LOOP (int64_t, 64)
+  RUN_LOOP (uint64_t, 64)
+  RUN_LOOP (double, 64)
+  return 0;
+}
-- 
2.36.1
 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: 回复: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-12 21:22   ` 回复: " 钟居哲
@ 2023-07-12 21:48     ` Jeff Law
  2023-07-12 22:17       ` 钟居哲
  2023-07-13  1:12       ` 回复: " Li, Pan2
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff Law @ 2023-07-12 21:48 UTC (permalink / raw)
  To: 钟居哲, gcc-patches; +Cc: kito.cheng, kito.cheng, rdapp.gcc



On 7/12/23 15:22, 钟居哲 wrote:
> I have removed strided load/store, instead, I will support strided 
> load/store in vectorizer later.
> 
> Ok for trunk?
Assuming this removes the strided loads/stores while we figure out the 
best way to support them, OK for the trunk.  The formatting is so messed 
up that it's nearly impossible to read.



Note for the future, if you hit the message size limit, go ahead and 
gzip the patch.  That's better than forwarding from a failed message as 
the latter mucks up indention so bad that the result is unreadable.

Jeff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-12 21:48     ` Jeff Law
@ 2023-07-12 22:17       ` 钟居哲
  2023-07-12 22:25         ` Jeff Law
  2023-07-13  1:12       ` 回复: " Li, Pan2
  1 sibling, 1 reply; 13+ messages in thread
From: 钟居哲 @ 2023-07-12 22:17 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: kito.cheng, kito.cheng, rdapp.gcc

[-- Attachment #1: Type: text/plain, Size: 1073 bytes --]

Thanks Jeff.
Will commit with formating the codes.

I am gonna first support COND_FMA and reduction.... first (which I think is higher priority).
Then come back support strided_load/store.

Thanks.



juzhe.zhong@rivai.ai
 
发件人: Jeff Law
发送时间: 2023-07-13 05:48
收件人: 钟居哲; gcc-patches
抄送: kito.cheng; kito.cheng; rdapp.gcc
主题: Re: 回复: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
 
 
On 7/12/23 15:22, 钟居哲 wrote:
> I have removed strided load/store, instead, I will support strided 
> load/store in vectorizer later.
> 
> Ok for trunk?
Assuming this removes the strided loads/stores while we figure out the 
best way to support them, OK for the trunk.  The formatting is so messed 
up that it's nearly impossible to read.
 
 
 
Note for the future, if you hit the message size limit, go ahead and 
gzip the patch.  That's better than forwarding from a failed message as 
the latter mucks up indention so bad that the result is unreadable.
 
Jeff
 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-12 22:17       ` 钟居哲
@ 2023-07-12 22:25         ` Jeff Law
  2023-07-12 23:30           ` 钟居哲
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff Law @ 2023-07-12 22:25 UTC (permalink / raw)
  To: 钟居哲, gcc-patches; +Cc: kito.cheng, kito.cheng, rdapp.gcc



On 7/12/23 16:17, 钟居哲 wrote:
> Thanks Jeff.
> Will commit with formating the codes.
> 
> I am gonna first support COND_FMA and reduction.... first (which I think 
> is higher priority).
> Then come back support strided_load/store.
Sure.    One thing to note with strided loads, they can significantly 
help x264's sad/satd loops.  So hopefully you're testing with those :-)



jeff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-12 22:25         ` Jeff Law
@ 2023-07-12 23:30           ` 钟居哲
  2023-07-13  7:47             ` Richard Biener
  2023-07-13 13:58             ` Jeff Law
  0 siblings, 2 replies; 13+ messages in thread
From: 钟居哲 @ 2023-07-12 23:30 UTC (permalink / raw)
  To: Jeff Law, gcc-patches; +Cc: kito.cheng, kito.cheng, rdapp.gcc

[-- Attachment #1: Type: text/plain, Size: 1040 bytes --]

I notice vectorizable_call in Loop Vectorizer.
It's vectorizing CALL function for example like fmax/fmin.
From my understanding, we dont have RVV instruction for fmax/fmin?

So for now, I don't need to support builtin call function vectorization for RVV.
Am I right?

I am wondering whether we do have some kind of builtin function call vectorization by using RVV instructions.


Thanks.


juzhe.zhong@rivai.ai
 
From: Jeff Law
Date: 2023-07-13 06:25
To: 钟居哲; gcc-patches
CC: kito.cheng; kito.cheng; rdapp.gcc
Subject: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
 
 
On 7/12/23 16:17, 钟居哲 wrote:
> Thanks Jeff.
> Will commit with formating the codes.
> 
> I am gonna first support COND_FMA and reduction.... first (which I think 
> is higher priority).
> Then come back support strided_load/store.
Sure.    One thing to note with strided loads, they can significantly 
help x264's sad/satd loops.  So hopefully you're testing with those :-)
 
 
 
jeff
 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: 回复: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-12 21:48     ` Jeff Law
  2023-07-12 22:17       ` 钟居哲
@ 2023-07-13  1:12       ` Li, Pan2
  1 sibling, 0 replies; 13+ messages in thread
From: Li, Pan2 @ 2023-07-13  1:12 UTC (permalink / raw)
  To: Jeff Law, 钟居哲, gcc-patches
  Cc: kito.cheng, kito.cheng, rdapp.gcc

Committed, thanks Jeff.

Pan

-----Original Message-----
From: Gcc-patches <gcc-patches-bounces+pan2.li=intel.com@gcc.gnu.org> On Behalf Of Jeff Law via Gcc-patches
Sent: Thursday, July 13, 2023 5:49 AM
To: 钟居哲 <juzhe.zhong@rivai.ai>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: kito.cheng <kito.cheng@gmail.com>; kito.cheng <kito.cheng@sifive.com>; rdapp.gcc <rdapp.gcc@gmail.com>
Subject: Re: 回复: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization



On 7/12/23 15:22, 钟居哲 wrote:
> I have removed strided load/store, instead, I will support strided 
> load/store in vectorizer later.
> 
> Ok for trunk?
Assuming this removes the strided loads/stores while we figure out the 
best way to support them, OK for the trunk.  The formatting is so messed 
up that it's nearly impossible to read.



Note for the future, if you hit the message size limit, go ahead and 
gzip the patch.  That's better than forwarding from a failed message as 
the latter mucks up indention so bad that the result is unreadable.

Jeff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-12 23:30           ` 钟居哲
@ 2023-07-13  7:47             ` Richard Biener
  2023-07-13 14:01               ` Jeff Law
  2023-07-13 13:58             ` Jeff Law
  1 sibling, 1 reply; 13+ messages in thread
From: Richard Biener @ 2023-07-13  7:47 UTC (permalink / raw)
  To: 钟居哲
  Cc: Jeff Law, gcc-patches, kito.cheng, kito.cheng, rdapp.gcc

On Thu, Jul 13, 2023 at 1:30 AM 钟居哲 <juzhe.zhong@rivai.ai> wrote:
>
> I notice vectorizable_call in Loop Vectorizer.
> It's vectorizing CALL function for example like fmax/fmin.
> From my understanding, we dont have RVV instruction for fmax/fmin?

There's things like .POPCOUNT which we can vectorize, but sure, it
depends on the ISA
if there's anything.

> So for now, I don't need to support builtin call function vectorization for RVV.
> Am I right?
>
> I am wondering whether we do have some kind of builtin function call vectorization by using RVV instructions.
>
>
> Thanks.
>
>
> juzhe.zhong@rivai.ai
>
> From: Jeff Law
> Date: 2023-07-13 06:25
> To: 钟居哲; gcc-patches
> CC: kito.cheng; kito.cheng; rdapp.gcc
> Subject: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
>
>
> On 7/12/23 16:17, 钟居哲 wrote:
> > Thanks Jeff.
> > Will commit with formating the codes.
> >
> > I am gonna first support COND_FMA and reduction.... first (which I think
> > is higher priority).
> > Then come back support strided_load/store.
> Sure.    One thing to note with strided loads, they can significantly
> help x264's sad/satd loops.  So hopefully you're testing with those :-)
>
>
>
> jeff
>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-12 23:30           ` 钟居哲
  2023-07-13  7:47             ` Richard Biener
@ 2023-07-13 13:58             ` Jeff Law
  1 sibling, 0 replies; 13+ messages in thread
From: Jeff Law @ 2023-07-13 13:58 UTC (permalink / raw)
  To: 钟居哲, gcc-patches; +Cc: kito.cheng, kito.cheng, rdapp.gcc



On 7/12/23 17:30, 钟居哲 wrote:
> I notice vectorizable_call in Loop Vectorizer.
> It's vectorizing CALL function for example like fmax/fmin.
>  From my understanding, we dont have RVV instruction for fmax/fmin?
> 
> So for now, I don't need to support builtin call function vectorization 
> for RVV.
> Am I right?
Yes, you are correct.

> 
> I am wondering whether we do have some kind of builtin function call 
> vectorization by using RVV instructions.
It can be advantageous, even if the call doesn't collapse down to a 
single vector instruction.  Consider libmvec which is an API to provide 
things like sin, cos, exp, log, etc in vector form.

Once the library routines are written, those can then be exposed to the 
compiler in turn allowing vectorization of loops with a subset of calls 
such as sin, cos, pow, log, etc.

jeff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-13  7:47             ` Richard Biener
@ 2023-07-13 14:01               ` Jeff Law
  2023-07-13 14:05                 ` Palmer Dabbelt
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff Law @ 2023-07-13 14:01 UTC (permalink / raw)
  To: Richard Biener, 钟居哲
  Cc: gcc-patches, kito.cheng, kito.cheng, rdapp.gcc



On 7/13/23 01:47, Richard Biener wrote:
> On Thu, Jul 13, 2023 at 1:30 AM 钟居哲 <juzhe.zhong@rivai.ai> wrote:
>>
>> I notice vectorizable_call in Loop Vectorizer.
>> It's vectorizing CALL function for example like fmax/fmin.
>>  From my understanding, we dont have RVV instruction for fmax/fmin?
> 
> There's things like .POPCOUNT which we can vectorize, but sure, it
> depends on the ISA if there's anything.
Right.  And RV has some of these -- vcpop, vfirst...  Supporting them 
obviously isn't a requirement for a vector implementation, but they're 
nice to have :-)

Jeff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-13 14:01               ` Jeff Law
@ 2023-07-13 14:05                 ` Palmer Dabbelt
  2023-07-13 14:32                   ` Robin Dapp
  0 siblings, 1 reply; 13+ messages in thread
From: Palmer Dabbelt @ 2023-07-13 14:05 UTC (permalink / raw)
  To: gcc-patches
  Cc: richard.guenther, juzhe.zhong, gcc-patches, Kito Cheng,
	kito.cheng, rdapp.gcc

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain, Size: 1427 bytes --]

On Thu, 13 Jul 2023 07:01:26 PDT (-0700), gcc-patches@gcc.gnu.org wrote:
>
>
> On 7/13/23 01:47, Richard Biener wrote:
>> On Thu, Jul 13, 2023 at 1:30 AM 钟居哲 <juzhe.zhong@rivai.ai> wrote:
>>>
>>> I notice vectorizable_call in Loop Vectorizer.
>>> It's vectorizing CALL function for example like fmax/fmin.
>>>  From my understanding, we dont have RVV instruction for fmax/fmin?

Unless I'm misunderstanding, we do.  The ISA manual says

    === Vector Floating-Point MIN/MAX Instructions
    
    The vector floating-point `vfmin` and `vfmax` instructions have the
    same behavior as the corresponding scalar floating-point instructions
    in version 2.2 of the RISC-V F/D/Q extension: they perform the `minimumNumber`
    or `maximumNumber` operation on active elements.
    
    ----
        # Floating-point minimum
        vfmin.vv vd, vs2, vs1, vm   # Vector-vector
        vfmin.vf vd, vs2, rs1, vm   # vector-scalar
    
        # Floating-point maximum
        vfmax.vv vd, vs2, vs1, vm   # Vector-vector
        vfmax.vf vd, vs2, rs1, vm   # vector-scalar
    ----

so we should be able to match at least some loops.

>>
>> There's things like .POPCOUNT which we can vectorize, but sure, it
>> depends on the ISA if there's anything.
> Right.  And RV has some of these -- vcpop, vfirst...  Supporting them
> obviously isn't a requirement for a vector implementation, but they're
> nice to have :-)
>
> Jeff

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-13 14:05                 ` Palmer Dabbelt
@ 2023-07-13 14:32                   ` Robin Dapp
  2023-07-13 14:42                     ` 钟居哲
  0 siblings, 1 reply; 13+ messages in thread
From: Robin Dapp @ 2023-07-13 14:32 UTC (permalink / raw)
  To: Palmer Dabbelt, gcc-patches
  Cc: rdapp.gcc, richard.guenther, juzhe.zhong, Kito Cheng, kito.cheng

>>>>  From my understanding, we dont have RVV instruction for fmax/fmin?
> 
> Unless I'm misunderstanding, we do.  The ISA manual says
> 
>     === Vector Floating-Point MIN/MAX Instructions
>     
>     The vector floating-point `vfmin` and `vfmax` instructions have the
>     same behavior as the corresponding scalar floating-point instructions
>     in version 2.2 of the RISC-V F/D/Q extension: they perform the `minimumNumber`
>     or `maximumNumber` operation on active elements.
>     
>     ----
>         # Floating-point minimum
>         vfmin.vv vd, vs2, vs1, vm   # Vector-vector
>         vfmin.vf vd, vs2, rs1, vm   # vector-scalar
>     
>         # Floating-point maximum
>         vfmax.vv vd, vs2, vs1, vm   # Vector-vector
>         vfmax.vf vd, vs2, rs1, vm   # vector-scalar
>     ----
> 
> so we should be able to match at least some loops.

We're already emitting those (e.g. for a[i] = a[i] > b[i] ? a[i] : b[i])
but for fmin/fmax they are not wired up yet (as opposed to the scalar variants).
Juzhe are you referring to something else?  I mean it's always a bit tricky
for backends to verify if the fmin/fmax behavior exactly matches the instruction
regards signaling nans, rounding etc but if the scalar variant is fine
I don't see why the vector variant would be worse. 

Regards
 Robin


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
  2023-07-13 14:32                   ` Robin Dapp
@ 2023-07-13 14:42                     ` 钟居哲
  0 siblings, 0 replies; 13+ messages in thread
From: 钟居哲 @ 2023-07-13 14:42 UTC (permalink / raw)
  To: rdapp.gcc, palmer, gcc-patches
  Cc: rdapp.gcc, richard.guenther, kito.cheng, kito.cheng

[-- Attachment #1: Type: text/plain, Size: 1985 bytes --]

No, I am just want to whether we have some CALL vectorization need len or mask predication.

For example, Current GCC vectorization  CALL onyl FMAX/FMIN/FMA/FNMA/FMS/FNMS these CALL function
need length or mask predicate. I don't care about sin/cos/popcount...etc. We just use full vector autovectorization
is fine, no need to support RVV into middle-end.



juzhe.zhong@rivai.ai
 
From: Robin Dapp
Date: 2023-07-13 22:32
To: Palmer Dabbelt; gcc-patches
CC: rdapp.gcc; richard.guenther; juzhe.zhong; Kito Cheng; kito.cheng
Subject: Re: [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization
>>>>  From my understanding, we dont have RVV instruction for fmax/fmin?
> 
> Unless I'm misunderstanding, we do.  The ISA manual says
> 
>     === Vector Floating-Point MIN/MAX Instructions
>     
>     The vector floating-point `vfmin` and `vfmax` instructions have the
>     same behavior as the corresponding scalar floating-point instructions
>     in version 2.2 of the RISC-V F/D/Q extension: they perform the `minimumNumber`
>     or `maximumNumber` operation on active elements.
>     
>     ----
>         # Floating-point minimum
>         vfmin.vv vd, vs2, vs1, vm   # Vector-vector
>         vfmin.vf vd, vs2, rs1, vm   # vector-scalar
>     
>         # Floating-point maximum
>         vfmax.vv vd, vs2, vs1, vm   # Vector-vector
>         vfmax.vf vd, vs2, rs1, vm   # vector-scalar
>     ----
> 
> so we should be able to match at least some loops.
 
We're already emitting those (e.g. for a[i] = a[i] > b[i] ? a[i] : b[i])
but for fmin/fmax they are not wired up yet (as opposed to the scalar variants).
Juzhe are you referring to something else?  I mean it's always a bit tricky
for backends to verify if the fmin/fmax behavior exactly matches the instruction
regards signaling nans, rounding etc but if the scalar variant is fine
I don't see why the vector variant would be worse. 
 
Regards
Robin
 
 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2023-07-13 14:42 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-07-12  9:38 [PATCH V7] RISC-V: RISC-V: Support gather_load/scatter RVV auto-vectorization juzhe.zhong
     [not found] ` <66564378DDCC57ED+202307121748494317959@rivai.ai>
2023-07-12 21:22   ` 回复: " 钟居哲
2023-07-12 21:48     ` Jeff Law
2023-07-12 22:17       ` 钟居哲
2023-07-12 22:25         ` Jeff Law
2023-07-12 23:30           ` 钟居哲
2023-07-13  7:47             ` Richard Biener
2023-07-13 14:01               ` Jeff Law
2023-07-13 14:05                 ` Palmer Dabbelt
2023-07-13 14:32                   ` Robin Dapp
2023-07-13 14:42                     ` 钟居哲
2023-07-13 13:58             ` Jeff Law
2023-07-13  1:12       ` 回复: " Li, Pan2

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).