public inbox for gcc-patches@gcc.gnu.org
 help / color / mirror / Atom feed
* [PATCH v2][ARM][GCC][12x]: MVE ACLE intrinsics to set and get vector lane.
@ 2020-03-23 17:42 Srinath Parvathaneni
  2020-03-23 18:13 ` Kyrylo Tkachov
  0 siblings, 1 reply; 2+ messages in thread
From: Srinath Parvathaneni @ 2020-03-23 17:42 UTC (permalink / raw)
  To: gcc-patches

[-- Attachment #1: Type: text/plain, Size: 43486 bytes --]

Hello Kyrill,

Following patch is the rebased version of v1.
(version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534346.html

####

Hello,

This patch supports following MVE ACLE intrinsics to get and set vector lane.

vsetq_lane_f16, vsetq_lane_f32, vsetq_lane_s16, vsetq_lane_s32, vsetq_lane_s8,
vsetq_lane_s64, vsetq_lane_u8, vsetq_lane_u16, vsetq_lane_u32, vsetq_lane_u64,
vgetq_lane_f16, vgetq_lane_f32, vgetq_lane_s16, vgetq_lane_s32, vgetq_lane_s8,
vgetq_lane_s64, vgetq_lane_u8, vgetq_lane_u16, vgetq_lane_u32, vgetq_lane_u64.

Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more details.
[1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics

Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?

Thanks,
Srinath.

gcc/ChangeLog:

2019-11-08  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>
            Andre Vieira  <andre.simoesdiasvieira@arm.com>
            Mihail Ionescu  <mihail.ionescu@arm.com>

	* config/arm/arm_mve.h (vsetq_lane_f16): Define macro.
	(vsetq_lane_f32): Likewise.
	(vsetq_lane_s16): Likewise.
	(vsetq_lane_s32): Likewise.
	(vsetq_lane_s8): Likewise.
	(vsetq_lane_s64): Likewise.
	(vsetq_lane_u8): Likewise.
	(vsetq_lane_u16): Likewise.
	(vsetq_lane_u32): Likewise.
	(vsetq_lane_u64): Likewise.
	(vgetq_lane_f16): Likewise.
	(vgetq_lane_f32): Likewise.
	(vgetq_lane_s16): Likewise.
	(vgetq_lane_s32): Likewise.
	(vgetq_lane_s8): Likewise.
	(vgetq_lane_s64): Likewise.
	(vgetq_lane_u8): Likewise.
	(vgetq_lane_u16): Likewise.
	(vgetq_lane_u32): Likewise.
	(vgetq_lane_u64): Likewise.
	(__ARM_NUM_LANES): Likewise.
	(__ARM_LANEQ): Likewise.
	(__ARM_CHECK_LANEQ): Likewise.
	(__arm_vsetq_lane_s16): Define intrinsic.
	(__arm_vsetq_lane_s32): Likewise.
	(__arm_vsetq_lane_s8): Likewise.
	(__arm_vsetq_lane_s64): Likewise.
	(__arm_vsetq_lane_u8): Likewise.
	(__arm_vsetq_lane_u16): Likewise.
	(__arm_vsetq_lane_u32): Likewise.
	(__arm_vsetq_lane_u64): Likewise.
	(__arm_vgetq_lane_s16): Likewise.
	(__arm_vgetq_lane_s32): Likewise.
	(__arm_vgetq_lane_s8): Likewise.
	(__arm_vgetq_lane_s64): Likewise.
	(__arm_vgetq_lane_u8): Likewise.
	(__arm_vgetq_lane_u16): Likewise.
	(__arm_vgetq_lane_u32): Likewise.
	(__arm_vgetq_lane_u64): Likewise.
	(__arm_vsetq_lane_f16): Likewise.
	(__arm_vsetq_lane_f32): Likewise.
	(__arm_vgetq_lane_f16): Likewise.
	(__arm_vgetq_lane_f32): Likewise.
	(vgetq_lane): Define polymorphic variant.
	(vsetq_lane): Likewise.
	* config/arm/mve.md (mve_vec_extract<mode><V_elem_l>): Define RTL
	pattern.
	(mve_vec_extractv2didi): Likewise.
	(mve_vec_extract_sext_internal<mode>): Likewise.
	(mve_vec_extract_zext_internal<mode>): Likewise.
	(mve_vec_set<mode>_internal): Likewise.
	(mve_vec_setv2di_internal): Likewise.
	* config/arm/neon.md (vec_set<mode>): Move RTL pattern to vec-common.md
	file.
	(vec_extract<mode><V_elem_l>): Rename to
	"neon_vec_extract<mode><V_elem_l>".
	(vec_extractv2didi): Rename to "neon_vec_extractv2didi".
	* config/arm/vec-common.md (vec_extract<mode><V_elem_l>): Define RTL
	pattern common for MVE and NEON.
	(vec_set<mode>): Move RTL pattern from neon.md and modify to accept both
	MVE and NEON.

gcc/testsuite/ChangeLog:

2019-11-08  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>
            Andre Vieira  <andre.simoesdiasvieira@arm.com>
            Mihail Ionescu  <mihail.ionescu@arm.com>

	* gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c: New test.
	* gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c: Likewise.
	* gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c: Likewise.


###############     Attachment also inlined for ease of reply    ###############


diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index f6810ddf4b735e1cd782a67c2d48bab8ddb75814..43520ee78e19f074912a6d965731465f1226986d 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -2506,8 +2506,40 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t;
 #define vld1q_z_f32(__base, __p) __arm_vld1q_z_f32(__base, __p)
 #define vst2q_f32(__addr, __value) __arm_vst2q_f32(__addr, __value)
 #define vst1q_p_f32(__addr, __value, __p) __arm_vst1q_p_f32(__addr, __value, __p)
+#define vsetq_lane_f16(__a, __b,  __idx) __arm_vsetq_lane_f16(__a, __b,  __idx)
+#define vsetq_lane_f32(__a, __b,  __idx) __arm_vsetq_lane_f32(__a, __b,  __idx)
+#define vsetq_lane_s16(__a, __b,  __idx) __arm_vsetq_lane_s16(__a, __b,  __idx)
+#define vsetq_lane_s32(__a, __b,  __idx) __arm_vsetq_lane_s32(__a, __b,  __idx)
+#define vsetq_lane_s8(__a, __b,  __idx) __arm_vsetq_lane_s8(__a, __b,  __idx)
+#define vsetq_lane_s64(__a, __b,  __idx) __arm_vsetq_lane_s64(__a, __b,  __idx)
+#define vsetq_lane_u8(__a, __b,  __idx) __arm_vsetq_lane_u8(__a, __b,  __idx)
+#define vsetq_lane_u16(__a, __b,  __idx) __arm_vsetq_lane_u16(__a, __b,  __idx)
+#define vsetq_lane_u32(__a, __b,  __idx) __arm_vsetq_lane_u32(__a, __b,  __idx)
+#define vsetq_lane_u64(__a, __b,  __idx) __arm_vsetq_lane_u64(__a, __b,  __idx)
+#define vgetq_lane_f16(__a,  __idx) __arm_vgetq_lane_f16(__a,  __idx)
+#define vgetq_lane_f32(__a,  __idx) __arm_vgetq_lane_f32(__a,  __idx)
+#define vgetq_lane_s16(__a,  __idx) __arm_vgetq_lane_s16(__a,  __idx)
+#define vgetq_lane_s32(__a,  __idx) __arm_vgetq_lane_s32(__a,  __idx)
+#define vgetq_lane_s8(__a,  __idx) __arm_vgetq_lane_s8(__a,  __idx)
+#define vgetq_lane_s64(__a,  __idx) __arm_vgetq_lane_s64(__a,  __idx)
+#define vgetq_lane_u8(__a,  __idx) __arm_vgetq_lane_u8(__a,  __idx)
+#define vgetq_lane_u16(__a,  __idx) __arm_vgetq_lane_u16(__a,  __idx)
+#define vgetq_lane_u32(__a,  __idx) __arm_vgetq_lane_u32(__a,  __idx)
+#define vgetq_lane_u64(__a,  __idx) __arm_vgetq_lane_u64(__a,  __idx)
 #endif
 
+/* For big-endian, GCC's vector indices are reversed within each 64 bits
+   compared to the architectural lane indices used by MVE intrinsics.  */
+#define __ARM_NUM_LANES(__v) (sizeof (__v) / sizeof (__v[0]))
+#ifdef __ARM_BIG_ENDIAN
+#define __ARM_LANEQ(__vec, __idx) (__idx ^ (__ARM_NUM_LANES(__vec)/2 - 1))
+#else
+#define __ARM_LANEQ(__vec, __idx) __idx
+#endif
+#define __ARM_CHECK_LANEQ(__vec, __idx)		 \
+  __builtin_arm_lane_check (__ARM_NUM_LANES(__vec),     \
+			    __ARM_LANEQ(__vec, __idx))
+
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vst4q_s8 (int8_t * __addr, int8x16x4_t __value)
@@ -16371,6 +16403,142 @@ __arm_vld4q_u32 (uint32_t const * __addr)
   return __rv.__i;
 }
 
+__extension__ extern __inline int16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_s16 (int16_t __a, int16x8_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline int32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_s32 (int32_t __a, int32x4_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline int8x16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_s8 (int8_t __a, int8x16_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline int64x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_s64 (int64_t __a, int64x2_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline uint8x16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_u8 (uint8_t __a, uint8x16_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline uint16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_u16 (uint16_t __a, uint16x8_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline uint32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_u32 (uint32_t __a, uint32x4_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline uint64x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_u64 (uint64_t __a, uint64x2_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline int16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_s16 (int16x8_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline int32_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_s32 (int32x4_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline int8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_s8 (int8x16_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline int64_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_s64 (int64x2_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline uint8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_u8 (uint8x16_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline uint16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_u16 (uint16x8_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline uint32_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_u32 (uint32x4_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline uint64_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_u64 (uint64x2_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
 #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point.  */
 
 __extension__ extern __inline void
@@ -19804,6 +19972,39 @@ __arm_vst1q_p_f32 (float32_t * __addr, float32x4_t __value, mve_pred16_t __p)
   return vstrwq_p_f32 (__addr, __value, __p);
 }
 
+__extension__ extern __inline float16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_f16 (float16_t __a, float16x8_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline float32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_f32 (float32_t __a, float32x4_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline float16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_f16 (float16x8_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline float32_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_f32 (float32x4_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
 #endif
 
 enum {
@@ -23090,6 +23291,35 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_rot90_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \
   int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_rot90_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));})
 
+#define vgetq_lane(p0,p1) __arm_vgetq_lane(p0,p1)
+#define __arm_vgetq_lane(p0,p1) ({ __typeof(p0) __p0 = (p0); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
+  int (*)[__ARM_mve_type_int8x16_t]: __arm_vgetq_lane_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1), \
+  int (*)[__ARM_mve_type_int16x8_t]: __arm_vgetq_lane_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1), \
+  int (*)[__ARM_mve_type_int32x4_t]: __arm_vgetq_lane_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1), \
+  int (*)[__ARM_mve_type_int64x2_t]: __arm_vgetq_lane_s64 (__ARM_mve_coerce(__p0, int64x2_t), p1), \
+  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vgetq_lane_u8 (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
+  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vgetq_lane_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
+  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vgetq_lane_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1), \
+  int (*)[__ARM_mve_type_uint64x2_t]: __arm_vgetq_lane_u64 (__ARM_mve_coerce(__p0, uint64x2_t), p1), \
+  int (*)[__ARM_mve_type_float16x8_t]: __arm_vgetq_lane_f16 (__ARM_mve_coerce(__p0, float16x8_t), p1), \
+  int (*)[__ARM_mve_type_float32x4_t]: __arm_vgetq_lane_f32 (__ARM_mve_coerce(__p0, float32x4_t), p1));})
+
+#define vsetq_lane(p0,p1,p2) __arm_vsetq_lane(p0,p1,p2)
+#define __arm_vsetq_lane(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
+  __typeof(p1) __p1 = (p1); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
+  int (*)[__ARM_mve_type_int8_t][__ARM_mve_type_int8x16_t]: __arm_vsetq_lane_s8 (__ARM_mve_coerce(__p0, int8_t), __ARM_mve_coerce(__p1, int8x16_t), p2), \
+  int (*)[__ARM_mve_type_int16_t][__ARM_mve_type_int16x8_t]: __arm_vsetq_lane_s16 (__ARM_mve_coerce(__p0, int16_t), __ARM_mve_coerce(__p1, int16x8_t), p2), \
+  int (*)[__ARM_mve_type_int32_t][__ARM_mve_type_int32x4_t]: __arm_vsetq_lane_s32 (__ARM_mve_coerce(__p0, int32_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \
+  int (*)[__ARM_mve_type_int64_t][__ARM_mve_type_int64x2_t]: __arm_vsetq_lane_s64 (__ARM_mve_coerce(__p0, int64_t), __ARM_mve_coerce(__p1, int64x2_t), p2), \
+  int (*)[__ARM_mve_type_uint8_t][__ARM_mve_type_uint8x16_t]: __arm_vsetq_lane_u8 (__ARM_mve_coerce(__p0, uint8_t), __ARM_mve_coerce(__p1, uint8x16_t), p2), \
+  int (*)[__ARM_mve_type_uint16_t][__ARM_mve_type_uint16x8_t]: __arm_vsetq_lane_u16 (__ARM_mve_coerce(__p0, uint16_t), __ARM_mve_coerce(__p1, uint16x8_t), p2), \
+  int (*)[__ARM_mve_type_uint32_t][__ARM_mve_type_uint32x4_t]: __arm_vsetq_lane_u32 (__ARM_mve_coerce(__p0, uint32_t), __ARM_mve_coerce(__p1, uint32x4_t), p2), \
+  int (*)[__ARM_mve_type_uint64_t][__ARM_mve_type_uint64x2_t]: __arm_vsetq_lane_u64 (__ARM_mve_coerce(__p0, uint64_t), __ARM_mve_coerce(__p1, uint64x2_t), p2), \
+  int (*)[__ARM_mve_type_float16_t][__ARM_mve_type_float16x8_t]: __arm_vsetq_lane_f16 (__ARM_mve_coerce(__p0, float16_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \
+  int (*)[__ARM_mve_type_float32_t][__ARM_mve_type_float32x4_t]: __arm_vsetq_lane_f32 (__ARM_mve_coerce(__p0, float32_t), __ARM_mve_coerce(__p1, float32x4_t), p2));})
+
 #else /* MVE Integer.  */
 
 #define vstrwq_scatter_base_wb(p0,p1,p2) __arm_vstrwq_scatter_base_wb(p0,p1,p2)
@@ -25885,6 +26115,31 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_uint16_t_const_ptr]: __arm_vld4q_u16 (__ARM_mve_coerce(__p0, uint16_t const *)), \
   int (*)[__ARM_mve_type_uint32_t_const_ptr]: __arm_vld4q_u32 (__ARM_mve_coerce(__p0, uint32_t const *)));})
 
+#define vgetq_lane(p0,p1) __arm_vgetq_lane(p0,p1)
+#define __arm_vgetq_lane(p0,p1) ({ __typeof(p0) __p0 = (p0); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
+  int (*)[__ARM_mve_type_int8x16_t]: __arm_vgetq_lane_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1), \
+  int (*)[__ARM_mve_type_int16x8_t]: __arm_vgetq_lane_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1), \
+  int (*)[__ARM_mve_type_int32x4_t]: __arm_vgetq_lane_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1), \
+  int (*)[__ARM_mve_type_int64x2_t]: __arm_vgetq_lane_s64 (__ARM_mve_coerce(__p0, int64x2_t), p1), \
+  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vgetq_lane_u8 (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
+  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vgetq_lane_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
+  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vgetq_lane_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1), \
+  int (*)[__ARM_mve_type_uint64x2_t]: __arm_vgetq_lane_u64 (__ARM_mve_coerce(__p0, uint64x2_t), p1));})
+
+#define vsetq_lane(p0,p1,p2) __arm_vsetq_lane(p0,p1,p2)
+#define __arm_vsetq_lane(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
+  __typeof(p1) __p1 = (p1); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
+  int (*)[__ARM_mve_type_int8_t][__ARM_mve_type_int8x16_t]: __arm_vsetq_lane_s8 (__ARM_mve_coerce(__p0, int8_t), __ARM_mve_coerce(__p1, int8x16_t), p2), \
+  int (*)[__ARM_mve_type_int16_t][__ARM_mve_type_int16x8_t]: __arm_vsetq_lane_s16 (__ARM_mve_coerce(__p0, int16_t), __ARM_mve_coerce(__p1, int16x8_t), p2), \
+  int (*)[__ARM_mve_type_int32_t][__ARM_mve_type_int32x4_t]: __arm_vsetq_lane_s32 (__ARM_mve_coerce(__p0, int32_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \
+  int (*)[__ARM_mve_type_int64_t][__ARM_mve_type_int64x2_t]: __arm_vsetq_lane_s64 (__ARM_mve_coerce(__p0, int64_t), __ARM_mve_coerce(__p1, int64x2_t), p2), \
+  int (*)[__ARM_mve_type_uint8_t][__ARM_mve_type_uint8x16_t]: __arm_vsetq_lane_u8 (__ARM_mve_coerce(__p0, uint8_t), __ARM_mve_coerce(__p1, uint8x16_t), p2), \
+  int (*)[__ARM_mve_type_uint16_t][__ARM_mve_type_uint16x8_t]: __arm_vsetq_lane_u16 (__ARM_mve_coerce(__p0, uint16_t), __ARM_mve_coerce(__p1, uint16x8_t), p2), \
+  int (*)[__ARM_mve_type_uint32_t][__ARM_mve_type_uint32x4_t]: __arm_vsetq_lane_u32 (__ARM_mve_coerce(__p0, uint32_t), __ARM_mve_coerce(__p1, uint32x4_t), p2), \
+  int (*)[__ARM_mve_type_uint64_t][__ARM_mve_type_uint64x2_t]: __arm_vsetq_lane_u64 (__ARM_mve_coerce(__p0, uint64_t), __ARM_mve_coerce(__p1, uint64x2_t), p2));})
+
 #endif /* MVE Integer.  */
 
 #define vmvnq_x(p1,p2) __arm_vmvnq_x(p1,p2)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index f3cbc0d03564ef8866226f836a27ed6051353f5d..e6b66eef3728122c87bd6ea68b8a643dd4552b00 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -129,6 +129,9 @@
 ;; Quad-width vector modes plus 64-bit elements.
 (define_mode_iterator VQX [V16QI V8HI V8HF V8BF V4SI V4SF V2DI])
 
+;; Quad-width vector modes plus 64-bit elements.
+(define_mode_iterator VQX_NOBF [V16QI V8HI V8HF V4SI V4SF V2DI])
+
 ;; Quad-width vector modes plus 64-bit elements and V8BF.
 (define_mode_iterator VQXBF [V16QI V8HI V8HF (V8BF "TARGET_BF16_SIMD") V4SI V4SF V2DI])
 
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 2e28d9d8408127dd52b9d16c772e7f27a47d390a..2b59d5a58171cddea1155610ddbb3c7f96105d24 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -411,6 +411,8 @@
 (define_mode_attr MVE_H_ELEM [ (V8HI "V8HI") (V4SI "V4HI")])
 (define_mode_attr V_sz_elem1 [(V16QI "b") (V8HI  "h") (V4SI "w") (V8HF "h")
 			      (V4SF "w")])
+(define_mode_attr V_extr_elem [(V16QI "u8") (V8HI "u16") (V4SI "32")
+			       (V8HF "u16") (V4SF "32")])
 
 (define_int_iterator VCVTQ_TO_F [VCVTQ_TO_F_S VCVTQ_TO_F_U])
 (define_int_iterator VMVNQ_N [VMVNQ_N_U VMVNQ_N_S])
@@ -10885,3 +10887,121 @@
    return "";
 }
   [(set_attr "length" "16")])
+;;
+;; [vgetq_lane_u, vgetq_lane_s, vgetq_lane_f])
+;;
+(define_insn "mve_vec_extract<mode><V_elem_l>"
+ [(set (match_operand:<V_elem> 0 "s_register_operand" "=r")
+   (vec_select:<V_elem>
+    (match_operand:MVE_VLD_ST 1 "s_register_operand" "w")
+    (parallel [(match_operand:SI 2 "immediate_operand" "i")])))]
+  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
+   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      int elt = INTVAL (operands[2]);
+      elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
+      operands[2] = GEN_INT (elt);
+    }
+  return "vmov.<V_extr_elem>\t%0, %q1[%c2]";
+}
+ [(set_attr "type" "mve_move")])
+
+(define_insn "mve_vec_extractv2didi"
+ [(set (match_operand:DI 0 "s_register_operand" "=r")
+   (vec_select:DI
+    (match_operand:V2DI 1 "s_register_operand" "w")
+    (parallel [(match_operand:SI 2 "immediate_operand" "i")])))]
+  "TARGET_HAVE_MVE"
+{
+  int elt = INTVAL (operands[2]);
+  if (BYTES_BIG_ENDIAN)
+    elt = 1 - elt;
+
+  if (elt == 0)
+   return "vmov\t%Q0, %R0, %e1";
+  else
+   return "vmov\t%J0, %K0, %f1";
+}
+ [(set_attr "type" "mve_move")])
+
+(define_insn "*mve_vec_extract_sext_internal<mode>"
+ [(set (match_operand:SI 0 "s_register_operand" "=r")
+   (sign_extend:SI
+    (vec_select:<V_elem>
+     (match_operand:MVE_2 1 "s_register_operand" "w")
+     (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
+  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
+   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      int elt = INTVAL (operands[2]);
+      elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
+      operands[2] = GEN_INT (elt);
+    }
+  return "vmov.s<V_sz_elem>\t%0, %q1[%c2]";
+}
+ [(set_attr "type" "mve_move")])
+
+(define_insn "*mve_vec_extract_zext_internal<mode>"
+ [(set (match_operand:SI 0 "s_register_operand" "=r")
+   (zero_extend:SI
+    (vec_select:<V_elem>
+     (match_operand:MVE_2 1 "s_register_operand" "w")
+     (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
+  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
+   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      int elt = INTVAL (operands[2]);
+      elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
+      operands[2] = GEN_INT (elt);
+    }
+  return "vmov.u<V_sz_elem>\t%0, %q1[%c2]";
+}
+ [(set_attr "type" "mve_move")])
+
+;;
+;; [vsetq_lane_u, vsetq_lane_s, vsetq_lane_f])
+;;
+(define_insn "mve_vec_set<mode>_internal"
+ [(set (match_operand:VQ2 0 "s_register_operand" "=w")
+       (vec_merge:VQ2
+	(vec_duplicate:VQ2
+	  (match_operand:<V_elem> 1 "nonimmediate_operand" "r"))
+	(match_operand:VQ2 3 "s_register_operand" "0")
+	(match_operand:SI 2 "immediate_operand" "i")))]
+  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
+   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))"
+{
+  int elt = ffs ((int) INTVAL (operands[2])) - 1;
+  if (BYTES_BIG_ENDIAN)
+    elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
+  operands[2] = GEN_INT (elt);
+
+  return "vmov.<V_sz_elem>\t%q0[%c2], %1";
+}
+ [(set_attr "type" "mve_move")])
+
+(define_insn "mve_vec_setv2di_internal"
+ [(set (match_operand:V2DI 0 "s_register_operand" "=w")
+       (vec_merge:V2DI
+	(vec_duplicate:V2DI
+	  (match_operand:DI 1 "nonimmediate_operand" "r"))
+	(match_operand:V2DI 3 "s_register_operand" "0")
+	(match_operand:SI 2 "immediate_operand" "i")))]
+ "TARGET_HAVE_MVE"
+{
+  int elt = ffs ((int) INTVAL (operands[2])) - 1;
+  if (BYTES_BIG_ENDIAN)
+    elt = 1 - elt;
+
+  if (elt == 0)
+   return "vmov\t%e0, %Q1, %R1";
+  else
+   return "vmov\t%f0, %J1, %K1";
+}
+ [(set_attr "type" "mve_move")])
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 272e6c1e7cfc4c42065d1d50131ef49d89052d91..3e7b51d8ab60007901392df0ca1cb09fead4d0e9 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -411,18 +411,6 @@
   [(set_attr "type" "neon_load1_all_lanes_q,neon_from_gp_q")]
 )
 
-(define_expand "vec_set<mode>"
-  [(match_operand:VDQ 0 "s_register_operand")
-   (match_operand:<V_elem> 1 "s_register_operand")
-   (match_operand:SI 2 "immediate_operand")]
-  "TARGET_NEON"
-{
-  HOST_WIDE_INT elem = HOST_WIDE_INT_1 << INTVAL (operands[2]);
-  emit_insn (gen_vec_set<mode>_internal (operands[0], operands[1],
-					 GEN_INT (elem), operands[0]));
-  DONE;
-})
-
 (define_insn "vec_extract<mode><V_elem_l>"
   [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
         (vec_select:<V_elem>
@@ -445,7 +433,10 @@
   [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
 )
 
-(define_insn "vec_extract<mode><V_elem_l>"
+;; This pattern is renamed from "vec_extract<mode><V_elem_l>" to
+;; "neon_vec_extract<mode><V_elem_l>" and this pattern is called
+;; by define_expand in vec-common.md file.
+(define_insn "neon_vec_extract<mode><V_elem_l>"
   [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
 	(vec_select:<V_elem>
           (match_operand:VQ2 1 "s_register_operand" "w,w")
@@ -471,7 +462,9 @@
   [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
 )
 
-(define_insn "vec_extractv2didi"
+;; This pattern is renamed from "vec_extractv2didi" to "neon_vec_extractv2didi"
+;; and this pattern is called by define_expand in vec-common.md file.
+(define_insn "neon_vec_extractv2didi"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=Um,r")
 	(vec_select:DI
           (match_operand:V2DI 1 "s_register_operand" "w,w")
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index 786daa628510a5def50530c5b459bece45a0007c..b7e3619caf461063876654c54393d305147f7bf7 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -190,3 +190,36 @@
   arm_expand_vec_perm (operands[0], operands[1], operands[2], operands[3]);
   DONE;
 })
+
+(define_expand "vec_extract<mode><V_elem_l>"
+ [(match_operand:<V_elem> 0 "nonimmediate_operand")
+  (match_operand:VQX_NOBF 1 "s_register_operand")
+  (match_operand:SI 2 "immediate_operand")]
+ "TARGET_NEON || TARGET_HAVE_MVE"
+{
+  if (TARGET_NEON)
+    emit_insn (gen_neon_vec_extract<mode><V_elem_l> (operands[0], operands[1],
+						     operands[2]));
+  else if (TARGET_HAVE_MVE)
+    emit_insn (gen_mve_vec_extract<mode><V_elem_l> (operands[0], operands[1],
+						     operands[2]));
+  else
+    gcc_unreachable ();
+  DONE;
+})
+
+(define_expand "vec_set<mode>"
+  [(match_operand:VQX_NOBF 0 "s_register_operand" "")
+   (match_operand:<V_elem> 1 "s_register_operand" "")
+   (match_operand:SI 2 "immediate_operand" "")]
+  "TARGET_NEON || TARGET_HAVE_MVE"
+{
+  HOST_WIDE_INT elem = HOST_WIDE_INT_1 << INTVAL (operands[2]);
+  if (TARGET_NEON)
+    emit_insn (gen_vec_set<mode>_internal (operands[0], operands[1],
+					   GEN_INT (elem), operands[0]));
+  else
+    emit_insn (gen_mve_vec_set<mode>_internal (operands[0], operands[1],
+					       GEN_INT (elem), operands[0]));
+  DONE;
+})
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c
new file mode 100644
index 0000000000000000000000000000000000000000..2a5aa63f4572a666e50d7825c8820d49eb9cd70e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+float16_t
+foo (float16x8_t a)
+{
+  return vgetq_lane_f16 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u16"  }  } */
+
+float16_t
+foo1 (float16x8_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u16"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c
new file mode 100644
index 0000000000000000000000000000000000000000..f1839cccffe1c34478f2372cd20b47761357b142
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+float32_t
+foo (float32x4_t a)
+{
+  return vgetq_lane_f32 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
+float32_t
+foo1 (float32x4_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c
new file mode 100644
index 0000000000000000000000000000000000000000..ed1c2178839568dcc3eea3342606ba8eff57ea72
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int16_t
+foo (int16x8_t a)
+{
+  return vgetq_lane_s16 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.s16"  }  } */
+
+int16_t
+foo1 (int16x8_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.s16"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c
new file mode 100644
index 0000000000000000000000000000000000000000..c87ed93e70def5bbf6b1055d99656f7386f97ea8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int32_t
+foo (int32x4_t a)
+{
+  return vgetq_lane_s32 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
+int32_t
+foo1 (int32x4_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c
new file mode 100644
index 0000000000000000000000000000000000000000..a7457f86320b6277aba26236715a69bd05b60d89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int64_t
+foo (int64x2_t a)
+{
+  return vgetq_lane_s64 (a, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
+
+int64_t
+foo1 (int64x2_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c
new file mode 100644
index 0000000000000000000000000000000000000000..11242ff3bc090a11bf7f8f163f0348824158bed7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int8_t
+foo (int8x16_t a)
+{
+  return vgetq_lane_s8 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.s8"  }  } */
+
+int8_t
+foo1 (int8x16_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.s8"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c
new file mode 100644
index 0000000000000000000000000000000000000000..2788b585535c46a3271be65849b1ba058df1adcf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint16_t
+foo (uint16x8_t a)
+{
+  return vgetq_lane_u16 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u16"  }  } */
+
+uint16_t
+foo1 (uint16x8_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u16"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c
new file mode 100644
index 0000000000000000000000000000000000000000..721c5a5ffd77cd1ad038d44f32fa197fe2687311
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint32_t
+foo (uint32x4_t a)
+{
+  return vgetq_lane_u32 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
+uint32_t
+foo1 (uint32x4_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c
new file mode 100644
index 0000000000000000000000000000000000000000..3cbbef520aee0731277883ae2449e9d2968c8683
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint64_t
+foo (uint64x2_t a)
+{
+  return vgetq_lane_u64 (a, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
+
+uint64_t
+foo1 (uint64x2_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c
new file mode 100644
index 0000000000000000000000000000000000000000..2bcaeac3fe1f5775f448d7f702ea139726fadcc3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint8_t
+foo (uint8x16_t a)
+{
+  return vgetq_lane_u8 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u8"  }  } */
+
+uint8_t
+foo1 (uint8x16_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u8"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
new file mode 100644
index 0000000000000000000000000000000000000000..e03e9620528b02d4e59d6365f0484c2478d70883
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+float16x8_t
+foo (float16_t a, float16x8_t b)
+{
+    return vsetq_lane_f16 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.16"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c
new file mode 100644
index 0000000000000000000000000000000000000000..2b9f1a7e6272629ef6310704a4769c478c7695fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+float32x4_t
+foo (float32_t a, float32x4_t b)
+{
+    return vsetq_lane_f32 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c
new file mode 100644
index 0000000000000000000000000000000000000000..92ad0dd16a85d7b80645d9f54341dafbc760740b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int16x8_t
+foo (int16_t a, int16x8_t b)
+{
+    return vsetq_lane_s16 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.16"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c
new file mode 100644
index 0000000000000000000000000000000000000000..e60c8f26700be36d299e2a2fd44a6224c39f02a0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int32x4_t
+foo (int32_t a, int32x4_t b)
+{
+    return vsetq_lane_s32 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c
new file mode 100644
index 0000000000000000000000000000000000000000..e487b73d417a2af5a35560fda19f0c40d05a4315
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int64x2_t
+foo (int64_t a, int64x2_t b)
+{
+    return vsetq_lane_s64 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\td0, r[1-9]*[0-9], r[1-9]*[0-9]}  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c
new file mode 100644
index 0000000000000000000000000000000000000000..d8ccbb524fd0bc2ffb6bd2fde3c27583fd0f4542
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo (int8_t a, int8x16_t b)
+{
+    return vsetq_lane_s8 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.8"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c
new file mode 100644
index 0000000000000000000000000000000000000000..156a5d1de1b51332b30cd818cabae6f89011cc12
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint16x8_t
+foo (uint16_t a, uint16x8_t b)
+{
+    return vsetq_lane_u16 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.16"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c
new file mode 100644
index 0000000000000000000000000000000000000000..e9575483cc9b278268aa87238f27a8d743bb6398
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint32x4_t
+foo (uint32_t a, uint32x4_t b)
+{
+    return vsetq_lane_u32 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c
new file mode 100644
index 0000000000000000000000000000000000000000..ae57b9c947c3e7ff878c9d6c36880dd42ebbe88d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint64x2_t
+foo (uint64_t a, uint64x2_t b)
+{
+    return vsetq_lane_u64 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\td0, r[1-9]*[0-9], r[1-9]*[0-9]}  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c
new file mode 100644
index 0000000000000000000000000000000000000000..668b3fea953f8144f797895376e3bb8a7a3e64d4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint8x16_t
+foo (uint8_t a, uint8x16_t b)
+{
+    return vsetq_lane_u8 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.8"  }  } */
+


[-- Attachment #2: rb12713.patch --]
[-- Type: text/plain, Size: 38795 bytes --]

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index f6810ddf4b735e1cd782a67c2d48bab8ddb75814..43520ee78e19f074912a6d965731465f1226986d 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -2506,8 +2506,40 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t;
 #define vld1q_z_f32(__base, __p) __arm_vld1q_z_f32(__base, __p)
 #define vst2q_f32(__addr, __value) __arm_vst2q_f32(__addr, __value)
 #define vst1q_p_f32(__addr, __value, __p) __arm_vst1q_p_f32(__addr, __value, __p)
+#define vsetq_lane_f16(__a, __b,  __idx) __arm_vsetq_lane_f16(__a, __b,  __idx)
+#define vsetq_lane_f32(__a, __b,  __idx) __arm_vsetq_lane_f32(__a, __b,  __idx)
+#define vsetq_lane_s16(__a, __b,  __idx) __arm_vsetq_lane_s16(__a, __b,  __idx)
+#define vsetq_lane_s32(__a, __b,  __idx) __arm_vsetq_lane_s32(__a, __b,  __idx)
+#define vsetq_lane_s8(__a, __b,  __idx) __arm_vsetq_lane_s8(__a, __b,  __idx)
+#define vsetq_lane_s64(__a, __b,  __idx) __arm_vsetq_lane_s64(__a, __b,  __idx)
+#define vsetq_lane_u8(__a, __b,  __idx) __arm_vsetq_lane_u8(__a, __b,  __idx)
+#define vsetq_lane_u16(__a, __b,  __idx) __arm_vsetq_lane_u16(__a, __b,  __idx)
+#define vsetq_lane_u32(__a, __b,  __idx) __arm_vsetq_lane_u32(__a, __b,  __idx)
+#define vsetq_lane_u64(__a, __b,  __idx) __arm_vsetq_lane_u64(__a, __b,  __idx)
+#define vgetq_lane_f16(__a,  __idx) __arm_vgetq_lane_f16(__a,  __idx)
+#define vgetq_lane_f32(__a,  __idx) __arm_vgetq_lane_f32(__a,  __idx)
+#define vgetq_lane_s16(__a,  __idx) __arm_vgetq_lane_s16(__a,  __idx)
+#define vgetq_lane_s32(__a,  __idx) __arm_vgetq_lane_s32(__a,  __idx)
+#define vgetq_lane_s8(__a,  __idx) __arm_vgetq_lane_s8(__a,  __idx)
+#define vgetq_lane_s64(__a,  __idx) __arm_vgetq_lane_s64(__a,  __idx)
+#define vgetq_lane_u8(__a,  __idx) __arm_vgetq_lane_u8(__a,  __idx)
+#define vgetq_lane_u16(__a,  __idx) __arm_vgetq_lane_u16(__a,  __idx)
+#define vgetq_lane_u32(__a,  __idx) __arm_vgetq_lane_u32(__a,  __idx)
+#define vgetq_lane_u64(__a,  __idx) __arm_vgetq_lane_u64(__a,  __idx)
 #endif
 
+/* For big-endian, GCC's vector indices are reversed within each 64 bits
+   compared to the architectural lane indices used by MVE intrinsics.  */
+#define __ARM_NUM_LANES(__v) (sizeof (__v) / sizeof (__v[0]))
+#ifdef __ARM_BIG_ENDIAN
+#define __ARM_LANEQ(__vec, __idx) (__idx ^ (__ARM_NUM_LANES(__vec)/2 - 1))
+#else
+#define __ARM_LANEQ(__vec, __idx) __idx
+#endif
+#define __ARM_CHECK_LANEQ(__vec, __idx)		 \
+  __builtin_arm_lane_check (__ARM_NUM_LANES(__vec),     \
+			    __ARM_LANEQ(__vec, __idx))
+
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vst4q_s8 (int8_t * __addr, int8x16x4_t __value)
@@ -16371,6 +16403,142 @@ __arm_vld4q_u32 (uint32_t const * __addr)
   return __rv.__i;
 }
 
+__extension__ extern __inline int16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_s16 (int16_t __a, int16x8_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline int32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_s32 (int32_t __a, int32x4_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline int8x16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_s8 (int8_t __a, int8x16_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline int64x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_s64 (int64_t __a, int64x2_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline uint8x16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_u8 (uint8_t __a, uint8x16_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline uint16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_u16 (uint16_t __a, uint16x8_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline uint32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_u32 (uint32_t __a, uint32x4_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline uint64x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_u64 (uint64_t __a, uint64x2_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline int16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_s16 (int16x8_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline int32_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_s32 (int32x4_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline int8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_s8 (int8x16_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline int64_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_s64 (int64x2_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline uint8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_u8 (uint8x16_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline uint16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_u16 (uint16x8_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline uint32_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_u32 (uint32x4_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline uint64_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_u64 (uint64x2_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
 #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point.  */
 
 __extension__ extern __inline void
@@ -19804,6 +19972,39 @@ __arm_vst1q_p_f32 (float32_t * __addr, float32x4_t __value, mve_pred16_t __p)
   return vstrwq_p_f32 (__addr, __value, __p);
 }
 
+__extension__ extern __inline float16x8_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_f16 (float16_t __a, float16x8_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline float32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vsetq_lane_f32 (float32_t __a, float32x4_t __b, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__b, __idx);
+  __b[__ARM_LANEQ(__b,__idx)] = __a;
+  return __b;
+}
+
+__extension__ extern __inline float16_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_f16 (float16x8_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
+
+__extension__ extern __inline float32_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+__arm_vgetq_lane_f32 (float32x4_t __a, const int __idx)
+{
+  __ARM_CHECK_LANEQ (__a, __idx);
+  return __a[__ARM_LANEQ(__a,__idx)];
+}
 #endif
 
 enum {
@@ -23090,6 +23291,35 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: __arm_vcmulq_rot90_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, float16x8_t), p3), \
   int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: __arm_vcmulq_rot90_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, float32x4_t), p3));})
 
+#define vgetq_lane(p0,p1) __arm_vgetq_lane(p0,p1)
+#define __arm_vgetq_lane(p0,p1) ({ __typeof(p0) __p0 = (p0); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
+  int (*)[__ARM_mve_type_int8x16_t]: __arm_vgetq_lane_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1), \
+  int (*)[__ARM_mve_type_int16x8_t]: __arm_vgetq_lane_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1), \
+  int (*)[__ARM_mve_type_int32x4_t]: __arm_vgetq_lane_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1), \
+  int (*)[__ARM_mve_type_int64x2_t]: __arm_vgetq_lane_s64 (__ARM_mve_coerce(__p0, int64x2_t), p1), \
+  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vgetq_lane_u8 (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
+  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vgetq_lane_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
+  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vgetq_lane_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1), \
+  int (*)[__ARM_mve_type_uint64x2_t]: __arm_vgetq_lane_u64 (__ARM_mve_coerce(__p0, uint64x2_t), p1), \
+  int (*)[__ARM_mve_type_float16x8_t]: __arm_vgetq_lane_f16 (__ARM_mve_coerce(__p0, float16x8_t), p1), \
+  int (*)[__ARM_mve_type_float32x4_t]: __arm_vgetq_lane_f32 (__ARM_mve_coerce(__p0, float32x4_t), p1));})
+
+#define vsetq_lane(p0,p1,p2) __arm_vsetq_lane(p0,p1,p2)
+#define __arm_vsetq_lane(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
+  __typeof(p1) __p1 = (p1); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
+  int (*)[__ARM_mve_type_int8_t][__ARM_mve_type_int8x16_t]: __arm_vsetq_lane_s8 (__ARM_mve_coerce(__p0, int8_t), __ARM_mve_coerce(__p1, int8x16_t), p2), \
+  int (*)[__ARM_mve_type_int16_t][__ARM_mve_type_int16x8_t]: __arm_vsetq_lane_s16 (__ARM_mve_coerce(__p0, int16_t), __ARM_mve_coerce(__p1, int16x8_t), p2), \
+  int (*)[__ARM_mve_type_int32_t][__ARM_mve_type_int32x4_t]: __arm_vsetq_lane_s32 (__ARM_mve_coerce(__p0, int32_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \
+  int (*)[__ARM_mve_type_int64_t][__ARM_mve_type_int64x2_t]: __arm_vsetq_lane_s64 (__ARM_mve_coerce(__p0, int64_t), __ARM_mve_coerce(__p1, int64x2_t), p2), \
+  int (*)[__ARM_mve_type_uint8_t][__ARM_mve_type_uint8x16_t]: __arm_vsetq_lane_u8 (__ARM_mve_coerce(__p0, uint8_t), __ARM_mve_coerce(__p1, uint8x16_t), p2), \
+  int (*)[__ARM_mve_type_uint16_t][__ARM_mve_type_uint16x8_t]: __arm_vsetq_lane_u16 (__ARM_mve_coerce(__p0, uint16_t), __ARM_mve_coerce(__p1, uint16x8_t), p2), \
+  int (*)[__ARM_mve_type_uint32_t][__ARM_mve_type_uint32x4_t]: __arm_vsetq_lane_u32 (__ARM_mve_coerce(__p0, uint32_t), __ARM_mve_coerce(__p1, uint32x4_t), p2), \
+  int (*)[__ARM_mve_type_uint64_t][__ARM_mve_type_uint64x2_t]: __arm_vsetq_lane_u64 (__ARM_mve_coerce(__p0, uint64_t), __ARM_mve_coerce(__p1, uint64x2_t), p2), \
+  int (*)[__ARM_mve_type_float16_t][__ARM_mve_type_float16x8_t]: __arm_vsetq_lane_f16 (__ARM_mve_coerce(__p0, float16_t), __ARM_mve_coerce(__p1, float16x8_t), p2), \
+  int (*)[__ARM_mve_type_float32_t][__ARM_mve_type_float32x4_t]: __arm_vsetq_lane_f32 (__ARM_mve_coerce(__p0, float32_t), __ARM_mve_coerce(__p1, float32x4_t), p2));})
+
 #else /* MVE Integer.  */
 
 #define vstrwq_scatter_base_wb(p0,p1,p2) __arm_vstrwq_scatter_base_wb(p0,p1,p2)
@@ -25885,6 +26115,31 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_uint16_t_const_ptr]: __arm_vld4q_u16 (__ARM_mve_coerce(__p0, uint16_t const *)), \
   int (*)[__ARM_mve_type_uint32_t_const_ptr]: __arm_vld4q_u32 (__ARM_mve_coerce(__p0, uint32_t const *)));})
 
+#define vgetq_lane(p0,p1) __arm_vgetq_lane(p0,p1)
+#define __arm_vgetq_lane(p0,p1) ({ __typeof(p0) __p0 = (p0); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
+  int (*)[__ARM_mve_type_int8x16_t]: __arm_vgetq_lane_s8 (__ARM_mve_coerce(__p0, int8x16_t), p1), \
+  int (*)[__ARM_mve_type_int16x8_t]: __arm_vgetq_lane_s16 (__ARM_mve_coerce(__p0, int16x8_t), p1), \
+  int (*)[__ARM_mve_type_int32x4_t]: __arm_vgetq_lane_s32 (__ARM_mve_coerce(__p0, int32x4_t), p1), \
+  int (*)[__ARM_mve_type_int64x2_t]: __arm_vgetq_lane_s64 (__ARM_mve_coerce(__p0, int64x2_t), p1), \
+  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vgetq_lane_u8 (__ARM_mve_coerce(__p0, uint8x16_t), p1), \
+  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vgetq_lane_u16 (__ARM_mve_coerce(__p0, uint16x8_t), p1), \
+  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vgetq_lane_u32 (__ARM_mve_coerce(__p0, uint32x4_t), p1), \
+  int (*)[__ARM_mve_type_uint64x2_t]: __arm_vgetq_lane_u64 (__ARM_mve_coerce(__p0, uint64x2_t), p1));})
+
+#define vsetq_lane(p0,p1,p2) __arm_vsetq_lane(p0,p1,p2)
+#define __arm_vsetq_lane(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
+  __typeof(p1) __p1 = (p1); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
+  int (*)[__ARM_mve_type_int8_t][__ARM_mve_type_int8x16_t]: __arm_vsetq_lane_s8 (__ARM_mve_coerce(__p0, int8_t), __ARM_mve_coerce(__p1, int8x16_t), p2), \
+  int (*)[__ARM_mve_type_int16_t][__ARM_mve_type_int16x8_t]: __arm_vsetq_lane_s16 (__ARM_mve_coerce(__p0, int16_t), __ARM_mve_coerce(__p1, int16x8_t), p2), \
+  int (*)[__ARM_mve_type_int32_t][__ARM_mve_type_int32x4_t]: __arm_vsetq_lane_s32 (__ARM_mve_coerce(__p0, int32_t), __ARM_mve_coerce(__p1, int32x4_t), p2), \
+  int (*)[__ARM_mve_type_int64_t][__ARM_mve_type_int64x2_t]: __arm_vsetq_lane_s64 (__ARM_mve_coerce(__p0, int64_t), __ARM_mve_coerce(__p1, int64x2_t), p2), \
+  int (*)[__ARM_mve_type_uint8_t][__ARM_mve_type_uint8x16_t]: __arm_vsetq_lane_u8 (__ARM_mve_coerce(__p0, uint8_t), __ARM_mve_coerce(__p1, uint8x16_t), p2), \
+  int (*)[__ARM_mve_type_uint16_t][__ARM_mve_type_uint16x8_t]: __arm_vsetq_lane_u16 (__ARM_mve_coerce(__p0, uint16_t), __ARM_mve_coerce(__p1, uint16x8_t), p2), \
+  int (*)[__ARM_mve_type_uint32_t][__ARM_mve_type_uint32x4_t]: __arm_vsetq_lane_u32 (__ARM_mve_coerce(__p0, uint32_t), __ARM_mve_coerce(__p1, uint32x4_t), p2), \
+  int (*)[__ARM_mve_type_uint64_t][__ARM_mve_type_uint64x2_t]: __arm_vsetq_lane_u64 (__ARM_mve_coerce(__p0, uint64_t), __ARM_mve_coerce(__p1, uint64x2_t), p2));})
+
 #endif /* MVE Integer.  */
 
 #define vmvnq_x(p1,p2) __arm_vmvnq_x(p1,p2)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index f3cbc0d03564ef8866226f836a27ed6051353f5d..e6b66eef3728122c87bd6ea68b8a643dd4552b00 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -129,6 +129,9 @@
 ;; Quad-width vector modes plus 64-bit elements.
 (define_mode_iterator VQX [V16QI V8HI V8HF V8BF V4SI V4SF V2DI])
 
+;; Quad-width vector modes plus 64-bit elements.
+(define_mode_iterator VQX_NOBF [V16QI V8HI V8HF V4SI V4SF V2DI])
+
 ;; Quad-width vector modes plus 64-bit elements and V8BF.
 (define_mode_iterator VQXBF [V16QI V8HI V8HF (V8BF "TARGET_BF16_SIMD") V4SI V4SF V2DI])
 
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 2e28d9d8408127dd52b9d16c772e7f27a47d390a..2b59d5a58171cddea1155610ddbb3c7f96105d24 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -411,6 +411,8 @@
 (define_mode_attr MVE_H_ELEM [ (V8HI "V8HI") (V4SI "V4HI")])
 (define_mode_attr V_sz_elem1 [(V16QI "b") (V8HI  "h") (V4SI "w") (V8HF "h")
 			      (V4SF "w")])
+(define_mode_attr V_extr_elem [(V16QI "u8") (V8HI "u16") (V4SI "32")
+			       (V8HF "u16") (V4SF "32")])
 
 (define_int_iterator VCVTQ_TO_F [VCVTQ_TO_F_S VCVTQ_TO_F_U])
 (define_int_iterator VMVNQ_N [VMVNQ_N_U VMVNQ_N_S])
@@ -10885,3 +10887,121 @@
    return "";
 }
   [(set_attr "length" "16")])
+;;
+;; [vgetq_lane_u, vgetq_lane_s, vgetq_lane_f])
+;;
+(define_insn "mve_vec_extract<mode><V_elem_l>"
+ [(set (match_operand:<V_elem> 0 "s_register_operand" "=r")
+   (vec_select:<V_elem>
+    (match_operand:MVE_VLD_ST 1 "s_register_operand" "w")
+    (parallel [(match_operand:SI 2 "immediate_operand" "i")])))]
+  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
+   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      int elt = INTVAL (operands[2]);
+      elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
+      operands[2] = GEN_INT (elt);
+    }
+  return "vmov.<V_extr_elem>\t%0, %q1[%c2]";
+}
+ [(set_attr "type" "mve_move")])
+
+(define_insn "mve_vec_extractv2didi"
+ [(set (match_operand:DI 0 "s_register_operand" "=r")
+   (vec_select:DI
+    (match_operand:V2DI 1 "s_register_operand" "w")
+    (parallel [(match_operand:SI 2 "immediate_operand" "i")])))]
+  "TARGET_HAVE_MVE"
+{
+  int elt = INTVAL (operands[2]);
+  if (BYTES_BIG_ENDIAN)
+    elt = 1 - elt;
+
+  if (elt == 0)
+   return "vmov\t%Q0, %R0, %e1";
+  else
+   return "vmov\t%J0, %K0, %f1";
+}
+ [(set_attr "type" "mve_move")])
+
+(define_insn "*mve_vec_extract_sext_internal<mode>"
+ [(set (match_operand:SI 0 "s_register_operand" "=r")
+   (sign_extend:SI
+    (vec_select:<V_elem>
+     (match_operand:MVE_2 1 "s_register_operand" "w")
+     (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
+  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
+   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      int elt = INTVAL (operands[2]);
+      elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
+      operands[2] = GEN_INT (elt);
+    }
+  return "vmov.s<V_sz_elem>\t%0, %q1[%c2]";
+}
+ [(set_attr "type" "mve_move")])
+
+(define_insn "*mve_vec_extract_zext_internal<mode>"
+ [(set (match_operand:SI 0 "s_register_operand" "=r")
+   (zero_extend:SI
+    (vec_select:<V_elem>
+     (match_operand:MVE_2 1 "s_register_operand" "w")
+     (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
+  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
+   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))"
+{
+  if (BYTES_BIG_ENDIAN)
+    {
+      int elt = INTVAL (operands[2]);
+      elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
+      operands[2] = GEN_INT (elt);
+    }
+  return "vmov.u<V_sz_elem>\t%0, %q1[%c2]";
+}
+ [(set_attr "type" "mve_move")])
+
+;;
+;; [vsetq_lane_u, vsetq_lane_s, vsetq_lane_f])
+;;
+(define_insn "mve_vec_set<mode>_internal"
+ [(set (match_operand:VQ2 0 "s_register_operand" "=w")
+       (vec_merge:VQ2
+	(vec_duplicate:VQ2
+	  (match_operand:<V_elem> 1 "nonimmediate_operand" "r"))
+	(match_operand:VQ2 3 "s_register_operand" "0")
+	(match_operand:SI 2 "immediate_operand" "i")))]
+  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
+   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (<MODE>mode))"
+{
+  int elt = ffs ((int) INTVAL (operands[2])) - 1;
+  if (BYTES_BIG_ENDIAN)
+    elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
+  operands[2] = GEN_INT (elt);
+
+  return "vmov.<V_sz_elem>\t%q0[%c2], %1";
+}
+ [(set_attr "type" "mve_move")])
+
+(define_insn "mve_vec_setv2di_internal"
+ [(set (match_operand:V2DI 0 "s_register_operand" "=w")
+       (vec_merge:V2DI
+	(vec_duplicate:V2DI
+	  (match_operand:DI 1 "nonimmediate_operand" "r"))
+	(match_operand:V2DI 3 "s_register_operand" "0")
+	(match_operand:SI 2 "immediate_operand" "i")))]
+ "TARGET_HAVE_MVE"
+{
+  int elt = ffs ((int) INTVAL (operands[2])) - 1;
+  if (BYTES_BIG_ENDIAN)
+    elt = 1 - elt;
+
+  if (elt == 0)
+   return "vmov\t%e0, %Q1, %R1";
+  else
+   return "vmov\t%f0, %J1, %K1";
+}
+ [(set_attr "type" "mve_move")])
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 272e6c1e7cfc4c42065d1d50131ef49d89052d91..3e7b51d8ab60007901392df0ca1cb09fead4d0e9 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -411,18 +411,6 @@
   [(set_attr "type" "neon_load1_all_lanes_q,neon_from_gp_q")]
 )
 
-(define_expand "vec_set<mode>"
-  [(match_operand:VDQ 0 "s_register_operand")
-   (match_operand:<V_elem> 1 "s_register_operand")
-   (match_operand:SI 2 "immediate_operand")]
-  "TARGET_NEON"
-{
-  HOST_WIDE_INT elem = HOST_WIDE_INT_1 << INTVAL (operands[2]);
-  emit_insn (gen_vec_set<mode>_internal (operands[0], operands[1],
-					 GEN_INT (elem), operands[0]));
-  DONE;
-})
-
 (define_insn "vec_extract<mode><V_elem_l>"
   [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
         (vec_select:<V_elem>
@@ -445,7 +433,10 @@
   [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
 )
 
-(define_insn "vec_extract<mode><V_elem_l>"
+;; This pattern is renamed from "vec_extract<mode><V_elem_l>" to
+;; "neon_vec_extract<mode><V_elem_l>" and this pattern is called
+;; by define_expand in vec-common.md file.
+(define_insn "neon_vec_extract<mode><V_elem_l>"
   [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
 	(vec_select:<V_elem>
           (match_operand:VQ2 1 "s_register_operand" "w,w")
@@ -471,7 +462,9 @@
   [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
 )
 
-(define_insn "vec_extractv2didi"
+;; This pattern is renamed from "vec_extractv2didi" to "neon_vec_extractv2didi"
+;; and this pattern is called by define_expand in vec-common.md file.
+(define_insn "neon_vec_extractv2didi"
   [(set (match_operand:DI 0 "nonimmediate_operand" "=Um,r")
 	(vec_select:DI
           (match_operand:V2DI 1 "s_register_operand" "w,w")
diff --git a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md
index 786daa628510a5def50530c5b459bece45a0007c..b7e3619caf461063876654c54393d305147f7bf7 100644
--- a/gcc/config/arm/vec-common.md
+++ b/gcc/config/arm/vec-common.md
@@ -190,3 +190,36 @@
   arm_expand_vec_perm (operands[0], operands[1], operands[2], operands[3]);
   DONE;
 })
+
+(define_expand "vec_extract<mode><V_elem_l>"
+ [(match_operand:<V_elem> 0 "nonimmediate_operand")
+  (match_operand:VQX_NOBF 1 "s_register_operand")
+  (match_operand:SI 2 "immediate_operand")]
+ "TARGET_NEON || TARGET_HAVE_MVE"
+{
+  if (TARGET_NEON)
+    emit_insn (gen_neon_vec_extract<mode><V_elem_l> (operands[0], operands[1],
+						     operands[2]));
+  else if (TARGET_HAVE_MVE)
+    emit_insn (gen_mve_vec_extract<mode><V_elem_l> (operands[0], operands[1],
+						     operands[2]));
+  else
+    gcc_unreachable ();
+  DONE;
+})
+
+(define_expand "vec_set<mode>"
+  [(match_operand:VQX_NOBF 0 "s_register_operand" "")
+   (match_operand:<V_elem> 1 "s_register_operand" "")
+   (match_operand:SI 2 "immediate_operand" "")]
+  "TARGET_NEON || TARGET_HAVE_MVE"
+{
+  HOST_WIDE_INT elem = HOST_WIDE_INT_1 << INTVAL (operands[2]);
+  if (TARGET_NEON)
+    emit_insn (gen_vec_set<mode>_internal (operands[0], operands[1],
+					   GEN_INT (elem), operands[0]));
+  else
+    emit_insn (gen_mve_vec_set<mode>_internal (operands[0], operands[1],
+					       GEN_INT (elem), operands[0]));
+  DONE;
+})
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c
new file mode 100644
index 0000000000000000000000000000000000000000..2a5aa63f4572a666e50d7825c8820d49eb9cd70e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+float16_t
+foo (float16x8_t a)
+{
+  return vgetq_lane_f16 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u16"  }  } */
+
+float16_t
+foo1 (float16x8_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u16"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c
new file mode 100644
index 0000000000000000000000000000000000000000..f1839cccffe1c34478f2372cd20b47761357b142
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+float32_t
+foo (float32x4_t a)
+{
+  return vgetq_lane_f32 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
+float32_t
+foo1 (float32x4_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c
new file mode 100644
index 0000000000000000000000000000000000000000..ed1c2178839568dcc3eea3342606ba8eff57ea72
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int16_t
+foo (int16x8_t a)
+{
+  return vgetq_lane_s16 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.s16"  }  } */
+
+int16_t
+foo1 (int16x8_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.s16"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c
new file mode 100644
index 0000000000000000000000000000000000000000..c87ed93e70def5bbf6b1055d99656f7386f97ea8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int32_t
+foo (int32x4_t a)
+{
+  return vgetq_lane_s32 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
+int32_t
+foo1 (int32x4_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c
new file mode 100644
index 0000000000000000000000000000000000000000..a7457f86320b6277aba26236715a69bd05b60d89
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int64_t
+foo (int64x2_t a)
+{
+  return vgetq_lane_s64 (a, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
+
+int64_t
+foo1 (int64x2_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c
new file mode 100644
index 0000000000000000000000000000000000000000..11242ff3bc090a11bf7f8f163f0348824158bed7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int8_t
+foo (int8x16_t a)
+{
+  return vgetq_lane_s8 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.s8"  }  } */
+
+int8_t
+foo1 (int8x16_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.s8"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c
new file mode 100644
index 0000000000000000000000000000000000000000..2788b585535c46a3271be65849b1ba058df1adcf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint16_t
+foo (uint16x8_t a)
+{
+  return vgetq_lane_u16 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u16"  }  } */
+
+uint16_t
+foo1 (uint16x8_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u16"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c
new file mode 100644
index 0000000000000000000000000000000000000000..721c5a5ffd77cd1ad038d44f32fa197fe2687311
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint32_t
+foo (uint32x4_t a)
+{
+  return vgetq_lane_u32 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
+uint32_t
+foo1 (uint32x4_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c
new file mode 100644
index 0000000000000000000000000000000000000000..3cbbef520aee0731277883ae2449e9d2968c8683
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint64_t
+foo (uint64x2_t a)
+{
+  return vgetq_lane_u64 (a, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
+
+uint64_t
+foo1 (uint64x2_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c
new file mode 100644
index 0000000000000000000000000000000000000000..2bcaeac3fe1f5775f448d7f702ea139726fadcc3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c
@@ -0,0 +1,22 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint8_t
+foo (uint8x16_t a)
+{
+  return vgetq_lane_u8 (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u8"  }  } */
+
+uint8_t
+foo1 (uint8x16_t a)
+{
+  return vgetq_lane (a, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.u8"  }  } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
new file mode 100644
index 0000000000000000000000000000000000000000..e03e9620528b02d4e59d6365f0484c2478d70883
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+float16x8_t
+foo (float16_t a, float16x8_t b)
+{
+    return vsetq_lane_f16 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.16"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c
new file mode 100644
index 0000000000000000000000000000000000000000..2b9f1a7e6272629ef6310704a4769c478c7695fa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-add-options arm_v8_1m_mve_fp } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+float32x4_t
+foo (float32_t a, float32x4_t b)
+{
+    return vsetq_lane_f32 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c
new file mode 100644
index 0000000000000000000000000000000000000000..92ad0dd16a85d7b80645d9f54341dafbc760740b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int16x8_t
+foo (int16_t a, int16x8_t b)
+{
+    return vsetq_lane_s16 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.16"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c
new file mode 100644
index 0000000000000000000000000000000000000000..e60c8f26700be36d299e2a2fd44a6224c39f02a0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int32x4_t
+foo (int32_t a, int32x4_t b)
+{
+    return vsetq_lane_s32 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c
new file mode 100644
index 0000000000000000000000000000000000000000..e487b73d417a2af5a35560fda19f0c40d05a4315
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int64x2_t
+foo (int64_t a, int64x2_t b)
+{
+    return vsetq_lane_s64 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\td0, r[1-9]*[0-9], r[1-9]*[0-9]}  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c
new file mode 100644
index 0000000000000000000000000000000000000000..d8ccbb524fd0bc2ffb6bd2fde3c27583fd0f4542
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo (int8_t a, int8x16_t b)
+{
+    return vsetq_lane_s8 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.8"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c
new file mode 100644
index 0000000000000000000000000000000000000000..156a5d1de1b51332b30cd818cabae6f89011cc12
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint16x8_t
+foo (uint16_t a, uint16x8_t b)
+{
+    return vsetq_lane_u16 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.16"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c
new file mode 100644
index 0000000000000000000000000000000000000000..e9575483cc9b278268aa87238f27a8d743bb6398
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint32x4_t
+foo (uint32_t a, uint32x4_t b)
+{
+    return vsetq_lane_u32 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.32"  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c
new file mode 100644
index 0000000000000000000000000000000000000000..ae57b9c947c3e7ff878c9d6c36880dd42ebbe88d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint64x2_t
+foo (uint64_t a, uint64x2_t b)
+{
+    return vsetq_lane_u64 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler {vmov\td0, r[1-9]*[0-9], r[1-9]*[0-9]}  }  } */
+
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c
new file mode 100644
index 0000000000000000000000000000000000000000..668b3fea953f8144f797895376e3bb8a7a3e64d4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c
@@ -0,0 +1,15 @@
+/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+
+uint8x16_t
+foo (uint8_t a, uint8x16_t b)
+{
+    return vsetq_lane_u8 (a, b, 0);
+}
+
+/* { dg-final { scan-assembler "vmov.8"  }  } */
+


^ permalink raw reply	[flat|nested] 2+ messages in thread

* RE: [PATCH v2][ARM][GCC][12x]: MVE ACLE intrinsics to set and get vector lane.
  2020-03-23 17:42 [PATCH v2][ARM][GCC][12x]: MVE ACLE intrinsics to set and get vector lane Srinath Parvathaneni
@ 2020-03-23 18:13 ` Kyrylo Tkachov
  0 siblings, 0 replies; 2+ messages in thread
From: Kyrylo Tkachov @ 2020-03-23 18:13 UTC (permalink / raw)
  To: Srinath Parvathaneni, gcc-patches

Hi Srinath,

> -----Original Message-----
> From: Srinath Parvathaneni <Srinath.Parvathaneni@arm.com>
> Sent: 23 March 2020 17:43
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
> Subject: [PATCH v2][ARM][GCC][12x]: MVE ACLE intrinsics to set and get
> vector lane.
> 
> Hello Kyrill,
> 
> Following patch is the rebased version of v1.
> (version v1) https://gcc.gnu.org/pipermail/gcc-patches/2019-
> November/534346.html
> 
> ####
> 
> Hello,
> 
> This patch supports following MVE ACLE intrinsics to get and set vector lane.
> 
> vsetq_lane_f16, vsetq_lane_f32, vsetq_lane_s16, vsetq_lane_s32,
> vsetq_lane_s8, vsetq_lane_s64, vsetq_lane_u8, vsetq_lane_u16,
> vsetq_lane_u32, vsetq_lane_u64, vgetq_lane_f16, vgetq_lane_f32,
> vgetq_lane_s16, vgetq_lane_s32, vgetq_lane_s8, vgetq_lane_s64,
> vgetq_lane_u8, vgetq_lane_u16, vgetq_lane_u32, vgetq_lane_u64.
> 
> Please refer to M-profile Vector Extension (MVE) intrinsics [1]  for more
> details.
> [1] https://developer.arm.com/architectures/instruction-sets/simd-
> isas/helium/mve-intrinsics
> 
> Regression tested on arm-none-eabi and found no regressions.
> 
> Ok for trunk?

Thanks, I've pushed this patch to master.
Kyrill

> 
> Thanks,
> Srinath.
> 
> gcc/ChangeLog:
> 
> 2019-11-08  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>
>             Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Mihail Ionescu  <mihail.ionescu@arm.com>
> 
> 	* config/arm/arm_mve.h (vsetq_lane_f16): Define macro.
> 	(vsetq_lane_f32): Likewise.
> 	(vsetq_lane_s16): Likewise.
> 	(vsetq_lane_s32): Likewise.
> 	(vsetq_lane_s8): Likewise.
> 	(vsetq_lane_s64): Likewise.
> 	(vsetq_lane_u8): Likewise.
> 	(vsetq_lane_u16): Likewise.
> 	(vsetq_lane_u32): Likewise.
> 	(vsetq_lane_u64): Likewise.
> 	(vgetq_lane_f16): Likewise.
> 	(vgetq_lane_f32): Likewise.
> 	(vgetq_lane_s16): Likewise.
> 	(vgetq_lane_s32): Likewise.
> 	(vgetq_lane_s8): Likewise.
> 	(vgetq_lane_s64): Likewise.
> 	(vgetq_lane_u8): Likewise.
> 	(vgetq_lane_u16): Likewise.
> 	(vgetq_lane_u32): Likewise.
> 	(vgetq_lane_u64): Likewise.
> 	(__ARM_NUM_LANES): Likewise.
> 	(__ARM_LANEQ): Likewise.
> 	(__ARM_CHECK_LANEQ): Likewise.
> 	(__arm_vsetq_lane_s16): Define intrinsic.
> 	(__arm_vsetq_lane_s32): Likewise.
> 	(__arm_vsetq_lane_s8): Likewise.
> 	(__arm_vsetq_lane_s64): Likewise.
> 	(__arm_vsetq_lane_u8): Likewise.
> 	(__arm_vsetq_lane_u16): Likewise.
> 	(__arm_vsetq_lane_u32): Likewise.
> 	(__arm_vsetq_lane_u64): Likewise.
> 	(__arm_vgetq_lane_s16): Likewise.
> 	(__arm_vgetq_lane_s32): Likewise.
> 	(__arm_vgetq_lane_s8): Likewise.
> 	(__arm_vgetq_lane_s64): Likewise.
> 	(__arm_vgetq_lane_u8): Likewise.
> 	(__arm_vgetq_lane_u16): Likewise.
> 	(__arm_vgetq_lane_u32): Likewise.
> 	(__arm_vgetq_lane_u64): Likewise.
> 	(__arm_vsetq_lane_f16): Likewise.
> 	(__arm_vsetq_lane_f32): Likewise.
> 	(__arm_vgetq_lane_f16): Likewise.
> 	(__arm_vgetq_lane_f32): Likewise.
> 	(vgetq_lane): Define polymorphic variant.
> 	(vsetq_lane): Likewise.
> 	* config/arm/mve.md (mve_vec_extract<mode><V_elem_l>): Define
> RTL
> 	pattern.
> 	(mve_vec_extractv2didi): Likewise.
> 	(mve_vec_extract_sext_internal<mode>): Likewise.
> 	(mve_vec_extract_zext_internal<mode>): Likewise.
> 	(mve_vec_set<mode>_internal): Likewise.
> 	(mve_vec_setv2di_internal): Likewise.
> 	* config/arm/neon.md (vec_set<mode>): Move RTL pattern to vec-
> common.md
> 	file.
> 	(vec_extract<mode><V_elem_l>): Rename to
> 	"neon_vec_extract<mode><V_elem_l>".
> 	(vec_extractv2didi): Rename to "neon_vec_extractv2didi".
> 	* config/arm/vec-common.md (vec_extract<mode><V_elem_l>):
> Define RTL
> 	pattern common for MVE and NEON.
> 	(vec_set<mode>): Move RTL pattern from neon.md and modify to
> accept both
> 	MVE and NEON.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-11-08  Srinath Parvathaneni  <srinath.parvathaneni@arm.com>
>             Andre Vieira  <andre.simoesdiasvieira@arm.com>
>             Mihail Ionescu  <mihail.ionescu@arm.com>
> 
> 	* gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c: New test.
> 	* gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c: Likewise.
> 	* gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c: Likewise.
> 
> 
> ###############     Attachment also inlined for ease of reply
> ###############
> 
> 
> diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h index
> f6810ddf4b735e1cd782a67c2d48bab8ddb75814..43520ee78e19f074912a6d9
> 65731465f1226986d 100644
> --- a/gcc/config/arm/arm_mve.h
> +++ b/gcc/config/arm/arm_mve.h
> @@ -2506,8 +2506,40 @@ typedef struct { uint8x16_t val[4]; } uint8x16x4_t;
> #define vld1q_z_f32(__base, __p) __arm_vld1q_z_f32(__base, __p)  #define
> vst2q_f32(__addr, __value) __arm_vst2q_f32(__addr, __value)  #define
> vst1q_p_f32(__addr, __value, __p) __arm_vst1q_p_f32(__addr, __value, __p)
> +#define vsetq_lane_f16(__a, __b,  __idx) __arm_vsetq_lane_f16(__a, __b,
> +__idx) #define vsetq_lane_f32(__a, __b,  __idx)
> +__arm_vsetq_lane_f32(__a, __b,  __idx) #define vsetq_lane_s16(__a, __b,
> +__idx) __arm_vsetq_lane_s16(__a, __b,  __idx) #define
> +vsetq_lane_s32(__a, __b,  __idx) __arm_vsetq_lane_s32(__a, __b,  __idx)
> +#define vsetq_lane_s8(__a, __b,  __idx) __arm_vsetq_lane_s8(__a, __b,
> +__idx) #define vsetq_lane_s64(__a, __b,  __idx)
> +__arm_vsetq_lane_s64(__a, __b,  __idx) #define vsetq_lane_u8(__a, __b,
> +__idx) __arm_vsetq_lane_u8(__a, __b,  __idx) #define
> +vsetq_lane_u16(__a, __b,  __idx) __arm_vsetq_lane_u16(__a, __b,  __idx)
> +#define vsetq_lane_u32(__a, __b,  __idx) __arm_vsetq_lane_u32(__a, __b,
> +__idx) #define vsetq_lane_u64(__a, __b,  __idx)
> +__arm_vsetq_lane_u64(__a, __b,  __idx) #define vgetq_lane_f16(__a,
> +__idx) __arm_vgetq_lane_f16(__a,  __idx) #define vgetq_lane_f32(__a,
> +__idx) __arm_vgetq_lane_f32(__a,  __idx) #define vgetq_lane_s16(__a,
> +__idx) __arm_vgetq_lane_s16(__a,  __idx) #define vgetq_lane_s32(__a,
> +__idx) __arm_vgetq_lane_s32(__a,  __idx) #define vgetq_lane_s8(__a,
> +__idx) __arm_vgetq_lane_s8(__a,  __idx) #define vgetq_lane_s64(__a,
> +__idx) __arm_vgetq_lane_s64(__a,  __idx) #define vgetq_lane_u8(__a,
> +__idx) __arm_vgetq_lane_u8(__a,  __idx) #define vgetq_lane_u16(__a,
> +__idx) __arm_vgetq_lane_u16(__a,  __idx) #define vgetq_lane_u32(__a,
> +__idx) __arm_vgetq_lane_u32(__a,  __idx) #define vgetq_lane_u64(__a,
> +__idx) __arm_vgetq_lane_u64(__a,  __idx)
>  #endif
> 
> +/* For big-endian, GCC's vector indices are reversed within each 64 bits
> +   compared to the architectural lane indices used by MVE intrinsics.
> +*/ #define __ARM_NUM_LANES(__v) (sizeof (__v) / sizeof (__v[0])) #ifdef
> +__ARM_BIG_ENDIAN #define __ARM_LANEQ(__vec, __idx) (__idx ^
> +(__ARM_NUM_LANES(__vec)/2 - 1)) #else #define __ARM_LANEQ(__vec,
> __idx)
> +__idx #endif
> +#define __ARM_CHECK_LANEQ(__vec, __idx)		 \
> +  __builtin_arm_lane_check (__ARM_NUM_LANES(__vec),     \
> +			    __ARM_LANEQ(__vec, __idx))
> +
>  __extension__ extern __inline void
>  __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
>  __arm_vst4q_s8 (int8_t * __addr, int8x16x4_t __value) @@ -16371,6
> +16403,142 @@ __arm_vld4q_u32 (uint32_t const * __addr)
>    return __rv.__i;
>  }
> 
> +__extension__ extern __inline int16x8_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vsetq_lane_s16 (int16_t __a, int16x8_t __b, const int __idx) {
> +  __ARM_CHECK_LANEQ (__b, __idx);
> +  __b[__ARM_LANEQ(__b,__idx)] = __a;
> +  return __b;
> +}
> +
> +__extension__ extern __inline int32x4_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vsetq_lane_s32 (int32_t __a, int32x4_t __b, const int __idx) {
> +  __ARM_CHECK_LANEQ (__b, __idx);
> +  __b[__ARM_LANEQ(__b,__idx)] = __a;
> +  return __b;
> +}
> +
> +__extension__ extern __inline int8x16_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vsetq_lane_s8 (int8_t __a, int8x16_t __b, const int __idx) {
> +  __ARM_CHECK_LANEQ (__b, __idx);
> +  __b[__ARM_LANEQ(__b,__idx)] = __a;
> +  return __b;
> +}
> +
> +__extension__ extern __inline int64x2_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vsetq_lane_s64 (int64_t __a, int64x2_t __b, const int __idx) {
> +  __ARM_CHECK_LANEQ (__b, __idx);
> +  __b[__ARM_LANEQ(__b,__idx)] = __a;
> +  return __b;
> +}
> +
> +__extension__ extern __inline uint8x16_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vsetq_lane_u8 (uint8_t __a, uint8x16_t __b, const int __idx) {
> +  __ARM_CHECK_LANEQ (__b, __idx);
> +  __b[__ARM_LANEQ(__b,__idx)] = __a;
> +  return __b;
> +}
> +
> +__extension__ extern __inline uint16x8_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vsetq_lane_u16 (uint16_t __a, uint16x8_t __b, const int __idx) {
> +  __ARM_CHECK_LANEQ (__b, __idx);
> +  __b[__ARM_LANEQ(__b,__idx)] = __a;
> +  return __b;
> +}
> +
> +__extension__ extern __inline uint32x4_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vsetq_lane_u32 (uint32_t __a, uint32x4_t __b, const int __idx) {
> +  __ARM_CHECK_LANEQ (__b, __idx);
> +  __b[__ARM_LANEQ(__b,__idx)] = __a;
> +  return __b;
> +}
> +
> +__extension__ extern __inline uint64x2_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vsetq_lane_u64 (uint64_t __a, uint64x2_t __b, const int __idx) {
> +  __ARM_CHECK_LANEQ (__b, __idx);
> +  __b[__ARM_LANEQ(__b,__idx)] = __a;
> +  return __b;
> +}
> +
> +__extension__ extern __inline int16_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vgetq_lane_s16 (int16x8_t __a, const int __idx) {
> +  __ARM_CHECK_LANEQ (__a, __idx);
> +  return __a[__ARM_LANEQ(__a,__idx)];
> +}
> +
> +__extension__ extern __inline int32_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vgetq_lane_s32 (int32x4_t __a, const int __idx) {
> +  __ARM_CHECK_LANEQ (__a, __idx);
> +  return __a[__ARM_LANEQ(__a,__idx)];
> +}
> +
> +__extension__ extern __inline int8_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vgetq_lane_s8 (int8x16_t __a, const int __idx) {
> +  __ARM_CHECK_LANEQ (__a, __idx);
> +  return __a[__ARM_LANEQ(__a,__idx)];
> +}
> +
> +__extension__ extern __inline int64_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vgetq_lane_s64 (int64x2_t __a, const int __idx) {
> +  __ARM_CHECK_LANEQ (__a, __idx);
> +  return __a[__ARM_LANEQ(__a,__idx)];
> +}
> +
> +__extension__ extern __inline uint8_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vgetq_lane_u8 (uint8x16_t __a, const int __idx) {
> +  __ARM_CHECK_LANEQ (__a, __idx);
> +  return __a[__ARM_LANEQ(__a,__idx)];
> +}
> +
> +__extension__ extern __inline uint16_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vgetq_lane_u16 (uint16x8_t __a, const int __idx) {
> +  __ARM_CHECK_LANEQ (__a, __idx);
> +  return __a[__ARM_LANEQ(__a,__idx)];
> +}
> +
> +__extension__ extern __inline uint32_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vgetq_lane_u32 (uint32x4_t __a, const int __idx) {
> +  __ARM_CHECK_LANEQ (__a, __idx);
> +  return __a[__ARM_LANEQ(__a,__idx)];
> +}
> +
> +__extension__ extern __inline uint64_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vgetq_lane_u64 (uint64x2_t __a, const int __idx) {
> +  __ARM_CHECK_LANEQ (__a, __idx);
> +  return __a[__ARM_LANEQ(__a,__idx)];
> +}
> +
>  #if (__ARM_FEATURE_MVE & 2) /* MVE Floating point.  */
> 
>  __extension__ extern __inline void
> @@ -19804,6 +19972,39 @@ __arm_vst1q_p_f32 (float32_t * __addr,
> float32x4_t __value, mve_pred16_t __p)
>    return vstrwq_p_f32 (__addr, __value, __p);  }
> 
> +__extension__ extern __inline float16x8_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vsetq_lane_f16 (float16_t __a, float16x8_t __b, const int __idx)
> +{
> +  __ARM_CHECK_LANEQ (__b, __idx);
> +  __b[__ARM_LANEQ(__b,__idx)] = __a;
> +  return __b;
> +}
> +
> +__extension__ extern __inline float32x4_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vsetq_lane_f32 (float32_t __a, float32x4_t __b, const int __idx)
> +{
> +  __ARM_CHECK_LANEQ (__b, __idx);
> +  __b[__ARM_LANEQ(__b,__idx)] = __a;
> +  return __b;
> +}
> +
> +__extension__ extern __inline float16_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vgetq_lane_f16 (float16x8_t __a, const int __idx) {
> +  __ARM_CHECK_LANEQ (__a, __idx);
> +  return __a[__ARM_LANEQ(__a,__idx)];
> +}
> +
> +__extension__ extern __inline float32_t __attribute__
> +((__always_inline__, __gnu_inline__, __artificial__))
> +__arm_vgetq_lane_f32 (float32x4_t __a, const int __idx) {
> +  __ARM_CHECK_LANEQ (__a, __idx);
> +  return __a[__ARM_LANEQ(__a,__idx)];
> +}
>  #endif
> 
>  enum {
> @@ -23090,6 +23291,35 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]:
> __arm_vcmulq_rot90_x_f16 (__ARM_mve_coerce(__p1, float16x8_t),
> __ARM_mve_coerce(__p2, float16x8_t), p3), \
>    int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]:
> __arm_vcmulq_rot90_x_f32 (__ARM_mve_coerce(__p1, float32x4_t),
> __ARM_mve_coerce(__p2, float32x4_t), p3));})
> 
> +#define vgetq_lane(p0,p1) __arm_vgetq_lane(p0,p1) #define
> +__arm_vgetq_lane(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> +  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> +  int (*)[__ARM_mve_type_int8x16_t]: __arm_vgetq_lane_s8
> +(__ARM_mve_coerce(__p0, int8x16_t), p1), \
> +  int (*)[__ARM_mve_type_int16x8_t]: __arm_vgetq_lane_s16
> +(__ARM_mve_coerce(__p0, int16x8_t), p1), \
> +  int (*)[__ARM_mve_type_int32x4_t]: __arm_vgetq_lane_s32
> +(__ARM_mve_coerce(__p0, int32x4_t), p1), \
> +  int (*)[__ARM_mve_type_int64x2_t]: __arm_vgetq_lane_s64
> +(__ARM_mve_coerce(__p0, int64x2_t), p1), \
> +  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vgetq_lane_u8
> +(__ARM_mve_coerce(__p0, uint8x16_t), p1), \
> +  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vgetq_lane_u16
> +(__ARM_mve_coerce(__p0, uint16x8_t), p1), \
> +  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vgetq_lane_u32
> +(__ARM_mve_coerce(__p0, uint32x4_t), p1), \
> +  int (*)[__ARM_mve_type_uint64x2_t]: __arm_vgetq_lane_u64
> +(__ARM_mve_coerce(__p0, uint64x2_t), p1), \
> +  int (*)[__ARM_mve_type_float16x8_t]: __arm_vgetq_lane_f16
> +(__ARM_mve_coerce(__p0, float16x8_t), p1), \
> +  int (*)[__ARM_mve_type_float32x4_t]: __arm_vgetq_lane_f32
> +(__ARM_mve_coerce(__p0, float32x4_t), p1));})
> +
> +#define vsetq_lane(p0,p1,p2) __arm_vsetq_lane(p0,p1,p2) #define
> +__arm_vsetq_lane(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
> +  __typeof(p1) __p1 = (p1); \
> +  _Generic( (int
> (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> +\
> +  int (*)[__ARM_mve_type_int8_t][__ARM_mve_type_int8x16_t]:
> +__arm_vsetq_lane_s8 (__ARM_mve_coerce(__p0, int8_t),
> +__ARM_mve_coerce(__p1, int8x16_t), p2), \
> +  int (*)[__ARM_mve_type_int16_t][__ARM_mve_type_int16x8_t]:
> +__arm_vsetq_lane_s16 (__ARM_mve_coerce(__p0, int16_t),
> +__ARM_mve_coerce(__p1, int16x8_t), p2), \
> +  int (*)[__ARM_mve_type_int32_t][__ARM_mve_type_int32x4_t]:
> +__arm_vsetq_lane_s32 (__ARM_mve_coerce(__p0, int32_t),
> +__ARM_mve_coerce(__p1, int32x4_t), p2), \
> +  int (*)[__ARM_mve_type_int64_t][__ARM_mve_type_int64x2_t]:
> +__arm_vsetq_lane_s64 (__ARM_mve_coerce(__p0, int64_t),
> +__ARM_mve_coerce(__p1, int64x2_t), p2), \
> +  int (*)[__ARM_mve_type_uint8_t][__ARM_mve_type_uint8x16_t]:
> +__arm_vsetq_lane_u8 (__ARM_mve_coerce(__p0, uint8_t),
> +__ARM_mve_coerce(__p1, uint8x16_t), p2), \
> +  int (*)[__ARM_mve_type_uint16_t][__ARM_mve_type_uint16x8_t]:
> +__arm_vsetq_lane_u16 (__ARM_mve_coerce(__p0, uint16_t),
> +__ARM_mve_coerce(__p1, uint16x8_t), p2), \
> +  int (*)[__ARM_mve_type_uint32_t][__ARM_mve_type_uint32x4_t]:
> +__arm_vsetq_lane_u32 (__ARM_mve_coerce(__p0, uint32_t),
> +__ARM_mve_coerce(__p1, uint32x4_t), p2), \
> +  int (*)[__ARM_mve_type_uint64_t][__ARM_mve_type_uint64x2_t]:
> +__arm_vsetq_lane_u64 (__ARM_mve_coerce(__p0, uint64_t),
> +__ARM_mve_coerce(__p1, uint64x2_t), p2), \
> +  int (*)[__ARM_mve_type_float16_t][__ARM_mve_type_float16x8_t]:
> +__arm_vsetq_lane_f16 (__ARM_mve_coerce(__p0, float16_t),
> +__ARM_mve_coerce(__p1, float16x8_t), p2), \
> +  int (*)[__ARM_mve_type_float32_t][__ARM_mve_type_float32x4_t]:
> +__arm_vsetq_lane_f32 (__ARM_mve_coerce(__p0, float32_t),
> +__ARM_mve_coerce(__p1, float32x4_t), p2));})
> +
>  #else /* MVE Integer.  */
> 
>  #define vstrwq_scatter_base_wb(p0,p1,p2)
> __arm_vstrwq_scatter_base_wb(p0,p1,p2)
> @@ -25885,6 +26115,31 @@ extern void *__ARM_undef;
>    int (*)[__ARM_mve_type_uint16_t_const_ptr]: __arm_vld4q_u16
> (__ARM_mve_coerce(__p0, uint16_t const *)), \
>    int (*)[__ARM_mve_type_uint32_t_const_ptr]: __arm_vld4q_u32
> (__ARM_mve_coerce(__p0, uint32_t const *)));})
> 
> +#define vgetq_lane(p0,p1) __arm_vgetq_lane(p0,p1) #define
> +__arm_vgetq_lane(p0,p1) ({ __typeof(p0) __p0 = (p0); \
> +  _Generic( (int (*)[__ARM_mve_typeid(__p0)])0, \
> +  int (*)[__ARM_mve_type_int8x16_t]: __arm_vgetq_lane_s8
> +(__ARM_mve_coerce(__p0, int8x16_t), p1), \
> +  int (*)[__ARM_mve_type_int16x8_t]: __arm_vgetq_lane_s16
> +(__ARM_mve_coerce(__p0, int16x8_t), p1), \
> +  int (*)[__ARM_mve_type_int32x4_t]: __arm_vgetq_lane_s32
> +(__ARM_mve_coerce(__p0, int32x4_t), p1), \
> +  int (*)[__ARM_mve_type_int64x2_t]: __arm_vgetq_lane_s64
> +(__ARM_mve_coerce(__p0, int64x2_t), p1), \
> +  int (*)[__ARM_mve_type_uint8x16_t]: __arm_vgetq_lane_u8
> +(__ARM_mve_coerce(__p0, uint8x16_t), p1), \
> +  int (*)[__ARM_mve_type_uint16x8_t]: __arm_vgetq_lane_u16
> +(__ARM_mve_coerce(__p0, uint16x8_t), p1), \
> +  int (*)[__ARM_mve_type_uint32x4_t]: __arm_vgetq_lane_u32
> +(__ARM_mve_coerce(__p0, uint32x4_t), p1), \
> +  int (*)[__ARM_mve_type_uint64x2_t]: __arm_vgetq_lane_u64
> +(__ARM_mve_coerce(__p0, uint64x2_t), p1));})
> +
> +#define vsetq_lane(p0,p1,p2) __arm_vsetq_lane(p0,p1,p2) #define
> +__arm_vsetq_lane(p0,p1,p2) ({ __typeof(p0) __p0 = (p0); \
> +  __typeof(p1) __p1 = (p1); \
> +  _Generic( (int
> (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0,
> +\
> +  int (*)[__ARM_mve_type_int8_t][__ARM_mve_type_int8x16_t]:
> +__arm_vsetq_lane_s8 (__ARM_mve_coerce(__p0, int8_t),
> +__ARM_mve_coerce(__p1, int8x16_t), p2), \
> +  int (*)[__ARM_mve_type_int16_t][__ARM_mve_type_int16x8_t]:
> +__arm_vsetq_lane_s16 (__ARM_mve_coerce(__p0, int16_t),
> +__ARM_mve_coerce(__p1, int16x8_t), p2), \
> +  int (*)[__ARM_mve_type_int32_t][__ARM_mve_type_int32x4_t]:
> +__arm_vsetq_lane_s32 (__ARM_mve_coerce(__p0, int32_t),
> +__ARM_mve_coerce(__p1, int32x4_t), p2), \
> +  int (*)[__ARM_mve_type_int64_t][__ARM_mve_type_int64x2_t]:
> +__arm_vsetq_lane_s64 (__ARM_mve_coerce(__p0, int64_t),
> +__ARM_mve_coerce(__p1, int64x2_t), p2), \
> +  int (*)[__ARM_mve_type_uint8_t][__ARM_mve_type_uint8x16_t]:
> +__arm_vsetq_lane_u8 (__ARM_mve_coerce(__p0, uint8_t),
> +__ARM_mve_coerce(__p1, uint8x16_t), p2), \
> +  int (*)[__ARM_mve_type_uint16_t][__ARM_mve_type_uint16x8_t]:
> +__arm_vsetq_lane_u16 (__ARM_mve_coerce(__p0, uint16_t),
> +__ARM_mve_coerce(__p1, uint16x8_t), p2), \
> +  int (*)[__ARM_mve_type_uint32_t][__ARM_mve_type_uint32x4_t]:
> +__arm_vsetq_lane_u32 (__ARM_mve_coerce(__p0, uint32_t),
> +__ARM_mve_coerce(__p1, uint32x4_t), p2), \
> +  int (*)[__ARM_mve_type_uint64_t][__ARM_mve_type_uint64x2_t]:
> +__arm_vsetq_lane_u64 (__ARM_mve_coerce(__p0, uint64_t),
> +__ARM_mve_coerce(__p1, uint64x2_t), p2));})
> +
>  #endif /* MVE Integer.  */
> 
>  #define vmvnq_x(p1,p2) __arm_vmvnq_x(p1,p2) diff --git
> a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md index
> f3cbc0d03564ef8866226f836a27ed6051353f5d..e6b66eef3728122c87bd6ea6
> 8b8a643dd4552b00 100644
> --- a/gcc/config/arm/iterators.md
> +++ b/gcc/config/arm/iterators.md
> @@ -129,6 +129,9 @@
>  ;; Quad-width vector modes plus 64-bit elements.
>  (define_mode_iterator VQX [V16QI V8HI V8HF V8BF V4SI V4SF V2DI])
> 
> +;; Quad-width vector modes plus 64-bit elements.
> +(define_mode_iterator VQX_NOBF [V16QI V8HI V8HF V4SI V4SF V2DI])
> +
>  ;; Quad-width vector modes plus 64-bit elements and V8BF.
>  (define_mode_iterator VQXBF [V16QI V8HI V8HF (V8BF
> "TARGET_BF16_SIMD") V4SI V4SF V2DI])
> 
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md index
> 2e28d9d8408127dd52b9d16c772e7f27a47d390a..2b59d5a58171cddea11556
> 10ddbb3c7f96105d24 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -411,6 +411,8 @@
>  (define_mode_attr MVE_H_ELEM [ (V8HI "V8HI") (V4SI "V4HI")])
> (define_mode_attr V_sz_elem1 [(V16QI "b") (V8HI  "h") (V4SI "w") (V8HF "h")
>  			      (V4SF "w")])
> +(define_mode_attr V_extr_elem [(V16QI "u8") (V8HI "u16") (V4SI "32")
> +			       (V8HF "u16") (V4SF "32")])
> 
>  (define_int_iterator VCVTQ_TO_F [VCVTQ_TO_F_S VCVTQ_TO_F_U])
> (define_int_iterator VMVNQ_N [VMVNQ_N_U VMVNQ_N_S]) @@ -10885,3
> +10887,121 @@
>     return "";
>  }
>    [(set_attr "length" "16")])
> +;;
> +;; [vgetq_lane_u, vgetq_lane_s, vgetq_lane_f]) ;; (define_insn
> +"mve_vec_extract<mode><V_elem_l>"
> + [(set (match_operand:<V_elem> 0 "s_register_operand" "=r")
> +   (vec_select:<V_elem>
> +    (match_operand:MVE_VLD_ST 1 "s_register_operand" "w")
> +    (parallel [(match_operand:SI 2 "immediate_operand" "i")])))]
> +  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
> +   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE
> (<MODE>mode))"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +      int elt = INTVAL (operands[2]);
> +      elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
> +      operands[2] = GEN_INT (elt);
> +    }
> +  return "vmov.<V_extr_elem>\t%0, %q1[%c2]"; }  [(set_attr "type"
> +"mve_move")])
> +
> +(define_insn "mve_vec_extractv2didi"
> + [(set (match_operand:DI 0 "s_register_operand" "=r")
> +   (vec_select:DI
> +    (match_operand:V2DI 1 "s_register_operand" "w")
> +    (parallel [(match_operand:SI 2 "immediate_operand" "i")])))]
> +  "TARGET_HAVE_MVE"
> +{
> +  int elt = INTVAL (operands[2]);
> +  if (BYTES_BIG_ENDIAN)
> +    elt = 1 - elt;
> +
> +  if (elt == 0)
> +   return "vmov\t%Q0, %R0, %e1";
> +  else
> +   return "vmov\t%J0, %K0, %f1";
> +}
> + [(set_attr "type" "mve_move")])
> +
> +(define_insn "*mve_vec_extract_sext_internal<mode>"
> + [(set (match_operand:SI 0 "s_register_operand" "=r")
> +   (sign_extend:SI
> +    (vec_select:<V_elem>
> +     (match_operand:MVE_2 1 "s_register_operand" "w")
> +     (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
> +  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
> +   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE
> (<MODE>mode))"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +      int elt = INTVAL (operands[2]);
> +      elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
> +      operands[2] = GEN_INT (elt);
> +    }
> +  return "vmov.s<V_sz_elem>\t%0, %q1[%c2]"; }  [(set_attr "type"
> +"mve_move")])
> +
> +(define_insn "*mve_vec_extract_zext_internal<mode>"
> + [(set (match_operand:SI 0 "s_register_operand" "=r")
> +   (zero_extend:SI
> +    (vec_select:<V_elem>
> +     (match_operand:MVE_2 1 "s_register_operand" "w")
> +     (parallel [(match_operand:SI 2 "immediate_operand" "i")]))))]
> +  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
> +   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE
> (<MODE>mode))"
> +{
> +  if (BYTES_BIG_ENDIAN)
> +    {
> +      int elt = INTVAL (operands[2]);
> +      elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
> +      operands[2] = GEN_INT (elt);
> +    }
> +  return "vmov.u<V_sz_elem>\t%0, %q1[%c2]"; }  [(set_attr "type"
> +"mve_move")])
> +
> +;;
> +;; [vsetq_lane_u, vsetq_lane_s, vsetq_lane_f]) ;; (define_insn
> +"mve_vec_set<mode>_internal"
> + [(set (match_operand:VQ2 0 "s_register_operand" "=w")
> +       (vec_merge:VQ2
> +	(vec_duplicate:VQ2
> +	  (match_operand:<V_elem> 1 "nonimmediate_operand" "r"))
> +	(match_operand:VQ2 3 "s_register_operand" "0")
> +	(match_operand:SI 2 "immediate_operand" "i")))]
> +  "(TARGET_HAVE_MVE && VALID_MVE_SI_MODE (<MODE>mode))
> +   || (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE
> (<MODE>mode))"
> +{
> +  int elt = ffs ((int) INTVAL (operands[2])) - 1;
> +  if (BYTES_BIG_ENDIAN)
> +    elt = GET_MODE_NUNITS (<MODE>mode) - 1 - elt;
> +  operands[2] = GEN_INT (elt);
> +
> +  return "vmov.<V_sz_elem>\t%q0[%c2], %1"; }  [(set_attr "type"
> +"mve_move")])
> +
> +(define_insn "mve_vec_setv2di_internal"
> + [(set (match_operand:V2DI 0 "s_register_operand" "=w")
> +       (vec_merge:V2DI
> +	(vec_duplicate:V2DI
> +	  (match_operand:DI 1 "nonimmediate_operand" "r"))
> +	(match_operand:V2DI 3 "s_register_operand" "0")
> +	(match_operand:SI 2 "immediate_operand" "i")))]
> "TARGET_HAVE_MVE"
> +{
> +  int elt = ffs ((int) INTVAL (operands[2])) - 1;
> +  if (BYTES_BIG_ENDIAN)
> +    elt = 1 - elt;
> +
> +  if (elt == 0)
> +   return "vmov\t%e0, %Q1, %R1";
> +  else
> +   return "vmov\t%f0, %J1, %K1";
> +}
> + [(set_attr "type" "mve_move")])
> diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md index
> 272e6c1e7cfc4c42065d1d50131ef49d89052d91..3e7b51d8ab60007901392df0
> ca1cb09fead4d0e9 100644
> --- a/gcc/config/arm/neon.md
> +++ b/gcc/config/arm/neon.md
> @@ -411,18 +411,6 @@
>    [(set_attr "type" "neon_load1_all_lanes_q,neon_from_gp_q")]
>  )
> 
> -(define_expand "vec_set<mode>"
> -  [(match_operand:VDQ 0 "s_register_operand")
> -   (match_operand:<V_elem> 1 "s_register_operand")
> -   (match_operand:SI 2 "immediate_operand")]
> -  "TARGET_NEON"
> -{
> -  HOST_WIDE_INT elem = HOST_WIDE_INT_1 << INTVAL (operands[2]);
> -  emit_insn (gen_vec_set<mode>_internal (operands[0], operands[1],
> -					 GEN_INT (elem), operands[0]));
> -  DONE;
> -})
> -
>  (define_insn "vec_extract<mode><V_elem_l>"
>    [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
>          (vec_select:<V_elem>
> @@ -445,7 +433,10 @@
>    [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
>  )
> 
> -(define_insn "vec_extract<mode><V_elem_l>"
> +;; This pattern is renamed from "vec_extract<mode><V_elem_l>" to ;;
> +"neon_vec_extract<mode><V_elem_l>" and this pattern is called ;; by
> +define_expand in vec-common.md file.
> +(define_insn "neon_vec_extract<mode><V_elem_l>"
>    [(set (match_operand:<V_elem> 0 "nonimmediate_operand" "=Um,r")
>  	(vec_select:<V_elem>
>            (match_operand:VQ2 1 "s_register_operand" "w,w") @@ -471,7
> +462,9 @@
>    [(set_attr "type" "neon_store1_one_lane<q>,neon_to_gp<q>")]
>  )
> 
> -(define_insn "vec_extractv2didi"
> +;; This pattern is renamed from "vec_extractv2didi" to
> "neon_vec_extractv2didi"
> +;; and this pattern is called by define_expand in vec-common.md file.
> +(define_insn "neon_vec_extractv2didi"
>    [(set (match_operand:DI 0 "nonimmediate_operand" "=Um,r")
>  	(vec_select:DI
>            (match_operand:V2DI 1 "s_register_operand" "w,w") diff --git
> a/gcc/config/arm/vec-common.md b/gcc/config/arm/vec-common.md index
> 786daa628510a5def50530c5b459bece45a0007c..b7e3619caf461063876654c5
> 4393d305147f7bf7 100644
> --- a/gcc/config/arm/vec-common.md
> +++ b/gcc/config/arm/vec-common.md
> @@ -190,3 +190,36 @@
>    arm_expand_vec_perm (operands[0], operands[1], operands[2],
> operands[3]);
>    DONE;
>  })
> +
> +(define_expand "vec_extract<mode><V_elem_l>"
> + [(match_operand:<V_elem> 0 "nonimmediate_operand")
> +  (match_operand:VQX_NOBF 1 "s_register_operand")
> +  (match_operand:SI 2 "immediate_operand")]  "TARGET_NEON ||
> +TARGET_HAVE_MVE"
> +{
> +  if (TARGET_NEON)
> +    emit_insn (gen_neon_vec_extract<mode><V_elem_l> (operands[0],
> operands[1],
> +						     operands[2]));
> +  else if (TARGET_HAVE_MVE)
> +    emit_insn (gen_mve_vec_extract<mode><V_elem_l> (operands[0],
> operands[1],
> +						     operands[2]));
> +  else
> +    gcc_unreachable ();
> +  DONE;
> +})
> +
> +(define_expand "vec_set<mode>"
> +  [(match_operand:VQX_NOBF 0 "s_register_operand" "")
> +   (match_operand:<V_elem> 1 "s_register_operand" "")
> +   (match_operand:SI 2 "immediate_operand" "")]
> +  "TARGET_NEON || TARGET_HAVE_MVE"
> +{
> +  HOST_WIDE_INT elem = HOST_WIDE_INT_1 << INTVAL (operands[2]);
> +  if (TARGET_NEON)
> +    emit_insn (gen_vec_set<mode>_internal (operands[0], operands[1],
> +					   GEN_INT (elem), operands[0]));
> +  else
> +    emit_insn (gen_mve_vec_set<mode>_internal (operands[0], operands[1],
> +					       GEN_INT (elem), operands[0]));
> +  DONE;
> +})
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..2a5aa63f4572a666e50d782
> 5c8820d49eb9cd70e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f16.c
> @@ -0,0 +1,22 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +float16_t
> +foo (float16x8_t a)
> +{
> +  return vgetq_lane_f16 (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.u16"  }  } */
> +
> +float16_t
> +foo1 (float16x8_t a)
> +{
> +  return vgetq_lane (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.u16"  }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..f1839cccffe1c34478f2372cd
> 20b47761357b142
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_f32.c
> @@ -0,0 +1,22 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +float32_t
> +foo (float32x4_t a)
> +{
> +  return vgetq_lane_f32 (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.32"  }  } */
> +
> +float32_t
> +foo1 (float32x4_t a)
> +{
> +  return vgetq_lane (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.32"  }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..ed1c2178839568dcc3eea33
> 42606ba8eff57ea72
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s16.c
> @@ -0,0 +1,22 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +int16_t
> +foo (int16x8_t a)
> +{
> +  return vgetq_lane_s16 (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.s16"  }  } */
> +
> +int16_t
> +foo1 (int16x8_t a)
> +{
> +  return vgetq_lane (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.s16"  }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..c87ed93e70def5bbf6b1055
> d99656f7386f97ea8
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s32.c
> @@ -0,0 +1,22 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +int32_t
> +foo (int32x4_t a)
> +{
> +  return vgetq_lane_s32 (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.32"  }  } */
> +
> +int32_t
> +foo1 (int32x4_t a)
> +{
> +  return vgetq_lane (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.32"  }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..a7457f86320b6277aba2623
> 6715a69bd05b60d89
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s64.c
> @@ -0,0 +1,22 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +int64_t
> +foo (int64x2_t a)
> +{
> +  return vgetq_lane_s64 (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
> +
> +int64_t
> +foo1 (int64x2_t a)
> +{
> +  return vgetq_lane (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..11242ff3bc090a11bf7f8f16
> 3f0348824158bed7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_s8.c
> @@ -0,0 +1,22 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +int8_t
> +foo (int8x16_t a)
> +{
> +  return vgetq_lane_s8 (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.s8"  }  } */
> +
> +int8_t
> +foo1 (int8x16_t a)
> +{
> +  return vgetq_lane (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.s8"  }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..2788b585535c46a3271be65
> 849b1ba058df1adcf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u16.c
> @@ -0,0 +1,22 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +uint16_t
> +foo (uint16x8_t a)
> +{
> +  return vgetq_lane_u16 (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.u16"  }  } */
> +
> +uint16_t
> +foo1 (uint16x8_t a)
> +{
> +  return vgetq_lane (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.u16"  }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..721c5a5ffd77cd1ad038d44f
> 32fa197fe2687311
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u32.c
> @@ -0,0 +1,22 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +uint32_t
> +foo (uint32x4_t a)
> +{
> +  return vgetq_lane_u32 (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.32"  }  } */
> +
> +uint32_t
> +foo1 (uint32x4_t a)
> +{
> +  return vgetq_lane (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.32"  }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..3cbbef520aee0731277883a
> e2449e9d2968c8683
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u64.c
> @@ -0,0 +1,22 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +uint64_t
> +foo (uint64x2_t a)
> +{
> +  return vgetq_lane_u64 (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
> +
> +uint64_t
> +foo1 (uint64x2_t a)
> +{
> +  return vgetq_lane (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler {vmov\tr0, r1, d0}  }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..2bcaeac3fe1f5775f448d7f7
> 02ea139726fadcc3
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vgetq_lane_u8.c
> @@ -0,0 +1,22 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +uint8_t
> +foo (uint8x16_t a)
> +{
> +  return vgetq_lane_u8 (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.u8"  }  } */
> +
> +uint8_t
> +foo1 (uint8x16_t a)
> +{
> +  return vgetq_lane (a, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.u8"  }  } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..e03e9620528b02d4e59d63
> 65f0484c2478d70883
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
> @@ -0,0 +1,15 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +float16x8_t
> +foo (float16_t a, float16x8_t b)
> +{
> +    return vsetq_lane_f16 (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.16"  }  } */
> +
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..2b9f1a7e6272629ef631070
> 4a4769c478c7695fa
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c
> @@ -0,0 +1,15 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-add-options arm_v8_1m_mve_fp } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +float32x4_t
> +foo (float32_t a, float32x4_t b)
> +{
> +    return vsetq_lane_f32 (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.32"  }  } */
> +
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..92ad0dd16a85d7b80645d9
> f54341dafbc760740b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c
> @@ -0,0 +1,15 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +int16x8_t
> +foo (int16_t a, int16x8_t b)
> +{
> +    return vsetq_lane_s16 (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.16"  }  } */
> +
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..e60c8f26700be36d299e2a2
> fd44a6224c39f02a0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c
> @@ -0,0 +1,15 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +int32x4_t
> +foo (int32_t a, int32x4_t b)
> +{
> +    return vsetq_lane_s32 (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.32"  }  } */
> +
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..e487b73d417a2af5a35560f
> da19f0c40d05a4315
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c
> @@ -0,0 +1,15 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +int64x2_t
> +foo (int64_t a, int64x2_t b)
> +{
> +    return vsetq_lane_s64 (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler {vmov\td0, r[1-9]*[0-9], r[1-9]*[0-9]}
> +}  } */
> +
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..d8ccbb524fd0bc2ffb6bd2fd
> e3c27583fd0f4542
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c
> @@ -0,0 +1,15 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo (int8_t a, int8x16_t b)
> +{
> +    return vsetq_lane_s8 (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.8"  }  } */
> +
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..156a5d1de1b51332b30cd8
> 18cabae6f89011cc12
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c
> @@ -0,0 +1,15 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +uint16x8_t
> +foo (uint16_t a, uint16x8_t b)
> +{
> +    return vsetq_lane_u16 (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.16"  }  } */
> +
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..e9575483cc9b278268aa872
> 38f27a8d743bb6398
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c
> @@ -0,0 +1,15 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +uint32x4_t
> +foo (uint32_t a, uint32x4_t b)
> +{
> +    return vsetq_lane_u32 (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.32"  }  } */
> +
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..ae57b9c947c3e7ff878c9d6c
> 36880dd42ebbe88d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c
> @@ -0,0 +1,15 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +uint64x2_t
> +foo (uint64_t a, uint64x2_t b)
> +{
> +    return vsetq_lane_u64 (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler {vmov\td0, r[1-9]*[0-9], r[1-9]*[0-9]}
> +}  } */
> +
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..668b3fea953f8144f7978953
> 76e3bb8a7a3e64d4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c
> @@ -0,0 +1,15 @@
> +/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft"
> +} {""} } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-add-options arm_v8_1m_mve } */
> +/* { dg-additional-options "-O2" } */
> +
> +#include "arm_mve.h"
> +
> +uint8x16_t
> +foo (uint8_t a, uint8x16_t b)
> +{
> +    return vsetq_lane_u8 (a, b, 0);
> +}
> +
> +/* { dg-final { scan-assembler "vmov.8"  }  } */
> +


^ permalink raw reply	[flat|nested] 2+ messages in thread

end of thread, other threads:[~2020-03-23 18:14 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-23 17:42 [PATCH v2][ARM][GCC][12x]: MVE ACLE intrinsics to set and get vector lane Srinath Parvathaneni
2020-03-23 18:13 ` Kyrylo Tkachov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).