From: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>
To: Srinath Parvathaneni <Srinath.Parvathaneni@arm.com>,
"gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>
Cc: Richard Earnshaw <Richard.Earnshaw@arm.com>
Subject: RE: [GCC][PATCH][ARM]: Fix for MVE ACLE intrinsics with writeback (PR94317).
Date: Thu, 2 Apr 2020 09:58:12 +0000
Message-ID: <DB7PR08MB3002BCC7F4771B4CBC76D56293C60@DB7PR08MB3002.eurprd08.prod.outlook.com>
In-Reply-To: <AM0PR08MB538007365371A43722463E699BC80@AM0PR08MB5380.eurprd08.prod.outlook.com>
Hi Srinath,
> -----Original Message-----
> From: Srinath Parvathaneni <Srinath.Parvathaneni@arm.com>
> Sent: 31 March 2020 17:13
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov <Kyrylo.Tkachov@arm.com>; Richard Earnshaw
> <Richard.Earnshaw@arm.com>
> Subject: [GCC][PATCH][ARM]: Fix for MVE ACLE intrinsics with writeback
> (PR94317).
>
> Hello,
>
> The following MVE ACLE intrinsics have an issue with writeback to the base
> address.
>
> vldrdq_gather_base_wb_s64, vldrdq_gather_base_wb_u64,
> vldrdq_gather_base_wb_z_s64, vldrdq_gather_base_wb_z_u64,
> vldrwq_gather_base_wb_s32, vldrwq_gather_base_wb_u32,
> vldrwq_gather_base_wb_z_s32, vldrwq_gather_base_wb_z_u32,
> vldrwq_gather_base_wb_f32, vldrwq_gather_base_wb_z_f32.
>
> This patch fixes the bug reported in PR94317 by adding separate builtin calls
> to update the result and writeback to base address for the above intrinsics.
>
> Please refer to M-profile Vector Extension (MVE) intrinsics [1] for more
> details.
> [1] https://developer.arm.com/architectures/instruction-sets/simd-isas/helium/mve-intrinsics
>
> Regression tested on arm-none-eabi and found no regressions.
>
> Ok for trunk?
Thanks, I've pushed this patch to master.
Kyrill
>
> Thanks,
> Srinath.
>
> gcc/ChangeLog:
>
> 2020-03-31 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
>
> PR target/94317
> * config/arm/arm-builtins.c (LDRGBWBXU_QUALIFIERS): Define.
> (LDRGBWBXU_Z_QUALIFIERS): Likewise.
> * config/arm/arm_mve.h (__arm_vldrdq_gather_base_wb_s64):
> Modify
> intrinsic definition by adding a new builtin call to writeback into base
> address.
> (__arm_vldrdq_gather_base_wb_u64): Likewise.
> (__arm_vldrdq_gather_base_wb_z_s64): Likewise.
> (__arm_vldrdq_gather_base_wb_z_u64): Likewise.
> (__arm_vldrwq_gather_base_wb_s32): Likewise.
> (__arm_vldrwq_gather_base_wb_u32): Likewise.
> (__arm_vldrwq_gather_base_wb_z_s32): Likewise.
> (__arm_vldrwq_gather_base_wb_z_u32): Likewise.
> (__arm_vldrwq_gather_base_wb_f32): Likewise.
> (__arm_vldrwq_gather_base_wb_z_f32): Likewise.
> * config/arm/arm_mve_builtins.def (vldrwq_gather_base_wb_z_u):
> Modify
> builtin's qualifier.
> (vldrdq_gather_base_wb_z_u): Likewise.
> (vldrwq_gather_base_wb_u): Likewise.
> (vldrdq_gather_base_wb_u): Likewise.
> (vldrwq_gather_base_wb_z_s): Likewise.
> (vldrwq_gather_base_wb_z_f): Likewise.
> (vldrdq_gather_base_wb_z_s): Likewise.
> (vldrwq_gather_base_wb_s): Likewise.
> (vldrwq_gather_base_wb_f): Likewise.
> (vldrdq_gather_base_wb_s): Likewise.
> (vldrwq_gather_base_nowb_z_u): Define builtin.
> (vldrdq_gather_base_nowb_z_u): Likewise.
> (vldrwq_gather_base_nowb_u): Likewise.
> (vldrdq_gather_base_nowb_u): Likewise.
> (vldrwq_gather_base_nowb_z_s): Likewise.
> (vldrwq_gather_base_nowb_z_f): Likewise.
> (vldrdq_gather_base_nowb_z_s): Likewise.
> (vldrwq_gather_base_nowb_s): Likewise.
> (vldrwq_gather_base_nowb_f): Likewise.
> (vldrdq_gather_base_nowb_s): Likewise.
> * config/arm/mve.md (mve_vldrwq_gather_base_nowb_<supf>v4si):
> Define RTL
> pattern.
> (mve_vldrwq_gather_base_wb_<supf>v4si): Modify RTL pattern.
> (mve_vldrwq_gather_base_nowb_z_<supf>v4si): Define RTL pattern.
> (mve_vldrwq_gather_base_wb_z_<supf>v4si): Modify RTL pattern.
> (mve_vldrwq_gather_base_wb_fv4sf): Modify RTL pattern.
> (mve_vldrwq_gather_base_nowb_fv4sf): Define RTL pattern.
> (mve_vldrwq_gather_base_wb_z_fv4sf): Modify RTL pattern.
> (mve_vldrwq_gather_base_nowb_z_fv4sf): Define RTL pattern.
> (mve_vldrdq_gather_base_nowb_<supf>v2di): Define RTL pattern.
> (mve_vldrdq_gather_base_wb_<supf>v2di): Modify RTL pattern.
> (mve_vldrdq_gather_base_nowb_z_<supf>v2di): Define RTL pattern.
> (mve_vldrdq_gather_base_wb_z_<supf>v2di): Modify RTL pattern.
>
> gcc/testsuite/ChangeLog:
>
> 2020-03-31 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
>
> PR target/94317
> * gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c:
> Modify
> * gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c:
> Likewise.
> * gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c:
> Likewise.
> * gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c:
> Likewise.
> * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c:
> Likewise.
> * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c:
> Likewise.
> * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c:
> Likewise.
> * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c:
> Likewise.
> * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c:
> Likewise.
> * gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c:
> Likewise.
>
>
>
> ############### Attachment also inlined for ease of reply
> ###############
>
>
> diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
> index 56f0db21ea95dcd738877daba27f1cb60f0d5a32..832b9107424fd9a4a0ee272b773b3d0929172370 100644
> --- a/gcc/config/arm/arm-builtins.c
> +++ b/gcc/config/arm/arm-builtins.c
> @@ -719,6 +719,17 @@ arm_quinop_unone_unone_unone_unone_imm_unone_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> (arm_quinop_unone_unone_unone_unone_imm_unone_qualifiers)
>
> static enum arm_type_qualifiers
> +arm_ldrgbwbxu_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> + = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate};
> +#define LDRGBWBXU_QUALIFIERS (arm_ldrgbwbxu_qualifiers)
> +
> +static enum arm_type_qualifiers
> +arm_ldrgbwbxu_z_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> + = { qualifier_unsigned, qualifier_unsigned, qualifier_immediate,
> + qualifier_unsigned};
> +#define LDRGBWBXU_Z_QUALIFIERS (arm_ldrgbwbxu_z_qualifiers)
> +
> +static enum arm_type_qualifiers
> arm_ldrgbwbs_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> = { qualifier_none, qualifier_unsigned, qualifier_immediate};
> #define LDRGBWBS_QUALIFIERS (arm_ldrgbwbs_qualifiers)
> diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
> index f1dcdc2153217e796c58526ba0e5be11be642234..47a6268e0800958f49d46238fe34ec749d243929 100644
> --- a/gcc/config/arm/arm_mve.h
> +++ b/gcc/config/arm/arm_mve.h
> @@ -13903,8 +13903,8 @@ __attribute__ ((__always_inline__,
> __gnu_inline__, __artificial__))
> __arm_vldrdq_gather_base_wb_s64 (uint64x2_t * __addr, const int __offset)
> {
> int64x2_t
> - result = __builtin_mve_vldrdq_gather_base_wb_sv2di (*__addr, __offset);
> - __addr += __offset;
> + result = __builtin_mve_vldrdq_gather_base_nowb_sv2di (*__addr, __offset);
> + *__addr = __builtin_mve_vldrdq_gather_base_wb_sv2di (*__addr, __offset);
> return result;
> }
>
> @@ -13913,8 +13913,8 @@ __attribute__ ((__always_inline__,
> __gnu_inline__, __artificial__))
> __arm_vldrdq_gather_base_wb_u64 (uint64x2_t * __addr, const int
> __offset) {
> uint64x2_t
> - result = __builtin_mve_vldrdq_gather_base_wb_uv2di (*__addr, __offset);
> - __addr += __offset;
> + result = __builtin_mve_vldrdq_gather_base_nowb_uv2di (*__addr, __offset);
> + *__addr = __builtin_mve_vldrdq_gather_base_wb_uv2di (*__addr, __offset);
> return result;
> }
>
> @@ -13923,8 +13923,8 @@ __attribute__ ((__always_inline__,
> __gnu_inline__, __artificial__))
> __arm_vldrdq_gather_base_wb_z_s64 (uint64x2_t * __addr, const int
> __offset, mve_pred16_t __p) {
> int64x2_t
> - result = __builtin_mve_vldrdq_gather_base_wb_z_sv2di (*__addr, __offset,
> __p);
> - __addr += __offset;
> + result = __builtin_mve_vldrdq_gather_base_nowb_z_sv2di (*__addr, __offset, __p);
> + *__addr = __builtin_mve_vldrdq_gather_base_wb_z_sv2di (*__addr, __offset, __p);
> return result;
> }
>
> @@ -13933,8 +13933,8 @@ __attribute__ ((__always_inline__,
> __gnu_inline__, __artificial__))
> __arm_vldrdq_gather_base_wb_z_u64 (uint64x2_t * __addr, const int
> __offset, mve_pred16_t __p) {
> uint64x2_t
> - result = __builtin_mve_vldrdq_gather_base_wb_z_uv2di (*__addr,
> __offset, __p);
> - __addr += __offset;
> + result = __builtin_mve_vldrdq_gather_base_nowb_z_uv2di (*__addr, __offset, __p);
> + *__addr = __builtin_mve_vldrdq_gather_base_wb_z_uv2di (*__addr, __offset, __p);
> return result;
> }
>
> @@ -13943,8 +13943,8 @@ __attribute__ ((__always_inline__,
> __gnu_inline__, __artificial__))
> __arm_vldrwq_gather_base_wb_s32 (uint32x4_t * __addr, const int
> __offset) {
> int32x4_t
> - result = __builtin_mve_vldrwq_gather_base_wb_sv4si (*__addr, __offset);
> - __addr += __offset;
> + result = __builtin_mve_vldrwq_gather_base_nowb_sv4si (*__addr, __offset);
> + *__addr = __builtin_mve_vldrwq_gather_base_wb_sv4si (*__addr, __offset);
> return result;
> }
>
> @@ -13953,8 +13953,8 @@ __attribute__ ((__always_inline__,
> __gnu_inline__, __artificial__))
> __arm_vldrwq_gather_base_wb_u32 (uint32x4_t * __addr, const int
> __offset) {
> uint32x4_t
> - result = __builtin_mve_vldrwq_gather_base_wb_uv4si (*__addr, __offset);
> - __addr += __offset;
> + result = __builtin_mve_vldrwq_gather_base_nowb_uv4si (*__addr, __offset);
> + *__addr = __builtin_mve_vldrwq_gather_base_wb_uv4si (*__addr, __offset);
> return result;
> }
>
> @@ -13963,8 +13963,8 @@ __attribute__ ((__always_inline__,
> __gnu_inline__, __artificial__))
> __arm_vldrwq_gather_base_wb_z_s32 (uint32x4_t * __addr, const int
> __offset, mve_pred16_t __p) {
> int32x4_t
> - result = __builtin_mve_vldrwq_gather_base_wb_z_sv4si (*__addr,
> __offset, __p);
> - __addr += __offset;
> + result = __builtin_mve_vldrwq_gather_base_nowb_z_sv4si (*__addr, __offset, __p);
> + *__addr = __builtin_mve_vldrwq_gather_base_wb_z_sv4si (*__addr, __offset, __p);
> return result;
> }
>
> @@ -13973,8 +13973,8 @@ __attribute__ ((__always_inline__,
> __gnu_inline__, __artificial__))
> __arm_vldrwq_gather_base_wb_z_u32 (uint32x4_t * __addr, const int
> __offset, mve_pred16_t __p) {
> uint32x4_t
> - result = __builtin_mve_vldrwq_gather_base_wb_z_uv4si (*__addr,
> __offset, __p);
> - __addr += __offset;
> + result = __builtin_mve_vldrwq_gather_base_nowb_z_uv4si (*__addr, __offset, __p);
> + *__addr = __builtin_mve_vldrwq_gather_base_wb_z_uv4si (*__addr, __offset, __p);
> return result;
> }
>
> @@ -19372,8 +19372,8 @@ __attribute__ ((__always_inline__,
> __gnu_inline__, __artificial__))
> __arm_vldrwq_gather_base_wb_f32 (uint32x4_t * __addr, const int __offset)
> {
> float32x4_t
> - result = __builtin_mve_vldrwq_gather_base_wb_fv4sf (*__addr, __offset);
> - __addr += __offset;
> + result = __builtin_mve_vldrwq_gather_base_nowb_fv4sf (*__addr, __offset);
> + *__addr = __builtin_mve_vldrwq_gather_base_wb_fv4sf (*__addr, __offset);
> return result;
> }
>
> @@ -19382,8 +19382,8 @@ __attribute__ ((__always_inline__,
> __gnu_inline__, __artificial__))
> __arm_vldrwq_gather_base_wb_z_f32 (uint32x4_t * __addr, const int
> __offset, mve_pred16_t __p) {
> float32x4_t
> - result = __builtin_mve_vldrwq_gather_base_wb_z_fv4sf (*__addr,
> __offset, __p);
> - __addr += __offset;
> + result = __builtin_mve_vldrwq_gather_base_nowb_z_fv4sf (*__addr, __offset, __p);
> + *__addr = __builtin_mve_vldrwq_gather_base_wb_z_fv4sf (*__addr, __offset, __p);
> return result;
> }
>
> diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def
> index 2fb975944b9fdac9de4b5a1bec3962be410637f1..753e40a951d071c1ab77476a1cc4779e91689178 100644
> --- a/gcc/config/arm/arm_mve_builtins.def
> +++ b/gcc/config/arm/arm_mve_builtins.def
> @@ -847,16 +847,26 @@ VAR1 (STRSBWBS, vstrdq_scatter_base_wb_s, v2di)
> VAR1 (STRSBWBS_P, vstrwq_scatter_base_wb_p_s, v4si)
> VAR1 (STRSBWBS_P, vstrwq_scatter_base_wb_p_f, v4sf)
> VAR1 (STRSBWBS_P, vstrdq_scatter_base_wb_p_s, v2di)
> -VAR1 (LDRGBWBU_Z, vldrwq_gather_base_wb_z_u, v4si)
> -VAR1 (LDRGBWBU_Z, vldrdq_gather_base_wb_z_u, v2di)
> -VAR1 (LDRGBWBU, vldrwq_gather_base_wb_u, v4si)
> -VAR1 (LDRGBWBU, vldrdq_gather_base_wb_u, v2di)
> -VAR1 (LDRGBWBS_Z, vldrwq_gather_base_wb_z_s, v4si)
> -VAR1 (LDRGBWBS_Z, vldrwq_gather_base_wb_z_f, v4sf)
> -VAR1 (LDRGBWBS_Z, vldrdq_gather_base_wb_z_s, v2di)
> -VAR1 (LDRGBWBS, vldrwq_gather_base_wb_s, v4si)
> -VAR1 (LDRGBWBS, vldrwq_gather_base_wb_f, v4sf)
> -VAR1 (LDRGBWBS, vldrdq_gather_base_wb_s, v2di)
> +VAR1 (LDRGBWBU_Z, vldrwq_gather_base_nowb_z_u, v4si)
> +VAR1 (LDRGBWBU_Z, vldrdq_gather_base_nowb_z_u, v2di)
> +VAR1 (LDRGBWBU, vldrwq_gather_base_nowb_u, v4si)
> +VAR1 (LDRGBWBU, vldrdq_gather_base_nowb_u, v2di)
> +VAR1 (LDRGBWBS_Z, vldrwq_gather_base_nowb_z_s, v4si)
> +VAR1 (LDRGBWBS_Z, vldrwq_gather_base_nowb_z_f, v4sf)
> +VAR1 (LDRGBWBS_Z, vldrdq_gather_base_nowb_z_s, v2di)
> +VAR1 (LDRGBWBS, vldrwq_gather_base_nowb_s, v4si)
> +VAR1 (LDRGBWBS, vldrwq_gather_base_nowb_f, v4sf)
> +VAR1 (LDRGBWBS, vldrdq_gather_base_nowb_s, v2di)
> +VAR1 (LDRGBWBXU_Z, vldrdq_gather_base_wb_z_s, v2di)
> +VAR1 (LDRGBWBXU_Z, vldrdq_gather_base_wb_z_u, v2di)
> +VAR1 (LDRGBWBXU, vldrdq_gather_base_wb_s, v2di)
> +VAR1 (LDRGBWBXU, vldrdq_gather_base_wb_u, v2di)
> +VAR1 (LDRGBWBXU_Z, vldrwq_gather_base_wb_z_s, v4si)
> +VAR1 (LDRGBWBXU_Z, vldrwq_gather_base_wb_z_f, v4sf)
> +VAR1 (LDRGBWBXU_Z, vldrwq_gather_base_wb_z_u, v4si)
> +VAR1 (LDRGBWBXU, vldrwq_gather_base_wb_s, v4si)
> +VAR1 (LDRGBWBXU, vldrwq_gather_base_wb_f, v4sf)
> +VAR1 (LDRGBWBXU, vldrwq_gather_base_wb_u, v4si)
> VAR1 (BINOP_NONE_NONE_NONE, vadciq_s, v4si)
> VAR1 (BINOP_UNONE_UNONE_UNONE, vadciq_u, v4si)
> VAR1 (BINOP_NONE_NONE_NONE, vadcq_s, v4si)
> diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
> index df602b07840bb4ccb9aa2a9b10992ba7078452ba..d1028f4542b4972b4080e46544c86d625d77383a 100644
> --- a/gcc/config/arm/mve.md
> +++ b/gcc/config/arm/mve.md
> @@ -10420,6 +10420,20 @@
> (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)]
> "TARGET_HAVE_MVE"
> {
> + rtx ignore_result = gen_reg_rtx (V4SImode);
> + emit_insn (
> + gen_mve_vldrwq_gather_base_wb_<supf>v4si_insn (ignore_result,
> operands[0],
> + operands[1], operands[2]));
> + DONE;
> +})
> +
> +(define_expand "mve_vldrwq_gather_base_nowb_<supf>v4si"
> + [(match_operand:V4SI 0 "s_register_operand")
> + (match_operand:V4SI 1 "s_register_operand")
> + (match_operand:SI 2 "mve_vldrd_immediate")
> + (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)]
> + "TARGET_HAVE_MVE"
> +{
> rtx ignore_wb = gen_reg_rtx (V4SImode);
> emit_insn (
> gen_mve_vldrwq_gather_base_wb_<supf>v4si_insn (operands[0],
> ignore_wb,
> @@ -10459,6 +10473,21 @@
> (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)]
> "TARGET_HAVE_MVE"
> {
> + rtx ignore_result = gen_reg_rtx (V4SImode);
> + emit_insn (
> + gen_mve_vldrwq_gather_base_wb_z_<supf>v4si_insn (ignore_result,
> operands[0],
> + operands[1], operands[2],
> + operands[3]));
> + DONE;
> +})
> +(define_expand "mve_vldrwq_gather_base_nowb_z_<supf>v4si"
> + [(match_operand:V4SI 0 "s_register_operand")
> + (match_operand:V4SI 1 "s_register_operand")
> + (match_operand:SI 2 "mve_vldrd_immediate")
> + (match_operand:HI 3 "vpr_register_operand")
> + (unspec:V4SI [(const_int 0)] VLDRWGBWBQ)]
> + "TARGET_HAVE_MVE"
> +{
> rtx ignore_wb = gen_reg_rtx (V4SImode);
> emit_insn (
> gen_mve_vldrwq_gather_base_wb_z_<supf>v4si_insn (operands[0],
> ignore_wb,
> @@ -10487,12 +10516,26 @@
> ops[0] = operands[0];
> ops[1] = operands[2];
> ops[2] = operands[3];
> - output_asm_insn ("vpst\;\tvldrwt.u32\t%q0, [%q1, %2]!",ops);
> + output_asm_insn ("vpst\;vldrwt.u32\t%q0, [%q1, %2]!",ops);
> return "";
> }
> [(set_attr "length" "8")])
>
> (define_expand "mve_vldrwq_gather_base_wb_fv4sf"
> + [(match_operand:V4SI 0 "s_register_operand")
> + (match_operand:V4SI 1 "s_register_operand")
> + (match_operand:SI 2 "mve_vldrd_immediate")
> + (unspec:V4SI [(const_int 0)] VLDRWQGBWB_F)]
> + "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
> +{
> + rtx ignore_result = gen_reg_rtx (V4SFmode);
> + emit_insn (
> + gen_mve_vldrwq_gather_base_wb_fv4sf_insn (ignore_result, operands[0],
> + operands[1], operands[2]));
> + DONE;
> +})
> +
> +(define_expand "mve_vldrwq_gather_base_nowb_fv4sf"
> [(match_operand:V4SF 0 "s_register_operand")
> (match_operand:V4SI 1 "s_register_operand")
> (match_operand:SI 2 "mve_vldrd_immediate")
> @@ -10531,6 +10574,22 @@
> [(set_attr "length" "4")])
>
> (define_expand "mve_vldrwq_gather_base_wb_z_fv4sf"
> + [(match_operand:V4SI 0 "s_register_operand")
> + (match_operand:V4SI 1 "s_register_operand")
> + (match_operand:SI 2 "mve_vldrd_immediate")
> + (match_operand:HI 3 "vpr_register_operand")
> + (unspec:V4SI [(const_int 0)] VLDRWQGBWB_F)]
> + "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
> +{
> + rtx ignore_result = gen_reg_rtx (V4SFmode);
> + emit_insn (
> + gen_mve_vldrwq_gather_base_wb_z_fv4sf_insn (ignore_result,
> operands[0],
> + operands[1], operands[2],
> + operands[3]));
> + DONE;
> +})
> +
> +(define_expand "mve_vldrwq_gather_base_nowb_z_fv4sf"
> [(match_operand:V4SF 0 "s_register_operand")
> (match_operand:V4SI 1 "s_register_operand")
> (match_operand:SI 2 "mve_vldrd_immediate")
> @@ -10566,7 +10625,7 @@
> ops[0] = operands[0];
> ops[1] = operands[2];
> ops[2] = operands[3];
> - output_asm_insn ("vpst\;\tvldrwt.u32\t%q0, [%q1, %2]!",ops);
> + output_asm_insn ("vpst\;vldrwt.u32\t%q0, [%q1, %2]!",ops);
> return "";
> }
> [(set_attr "length" "8")])
> @@ -10578,6 +10637,20 @@
> (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)]
> "TARGET_HAVE_MVE"
> {
> + rtx ignore_result = gen_reg_rtx (V2DImode);
> + emit_insn (
> + gen_mve_vldrdq_gather_base_wb_<supf>v2di_insn (ignore_result,
> operands[0],
> + operands[1], operands[2]));
> + DONE;
> +})
> +
> +(define_expand "mve_vldrdq_gather_base_nowb_<supf>v2di"
> + [(match_operand:V2DI 0 "s_register_operand")
> + (match_operand:V2DI 1 "s_register_operand")
> + (match_operand:SI 2 "mve_vldrd_immediate")
> + (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)]
> + "TARGET_HAVE_MVE"
> +{
> rtx ignore_wb = gen_reg_rtx (V2DImode);
> emit_insn (
> gen_mve_vldrdq_gather_base_wb_<supf>v2di_insn (operands[0],
> ignore_wb,
> @@ -10585,6 +10658,7 @@
> DONE;
> })
>
> +
> ;;
> ;; [vldrdq_gather_base_wb_s vldrdq_gather_base_wb_u]
> ;;
> @@ -10617,6 +10691,22 @@
> (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)]
> "TARGET_HAVE_MVE"
> {
> + rtx ignore_result = gen_reg_rtx (V2DImode);
> + emit_insn (
> + gen_mve_vldrdq_gather_base_wb_z_<supf>v2di_insn (ignore_result,
> operands[0],
> + operands[1], operands[2],
> + operands[3]));
> + DONE;
> +})
> +
> +(define_expand "mve_vldrdq_gather_base_nowb_z_<supf>v2di"
> + [(match_operand:V2DI 0 "s_register_operand")
> + (match_operand:V2DI 1 "s_register_operand")
> + (match_operand:SI 2 "mve_vldrd_immediate")
> + (match_operand:HI 3 "vpr_register_operand")
> + (unspec:V2DI [(const_int 0)] VLDRDGBWBQ)]
> + "TARGET_HAVE_MVE"
> +{
> rtx ignore_wb = gen_reg_rtx (V2DImode);
> emit_insn (
> gen_mve_vldrdq_gather_base_wb_z_<supf>v2di_insn (operands[0],
> ignore_wb,
> @@ -10660,7 +10750,7 @@
> ops[0] = operands[0];
> ops[1] = operands[2];
> ops[2] = operands[3];
> - output_asm_insn ("vpst\;\tvldrdt.u64\t%q0, [%q1, %2]!",ops);
> + output_asm_insn ("vpst\;vldrdt.u64\t%q0, [%q1, %2]!",ops);
> return "";
> }
> [(set_attr "length" "8")])
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
> index
> a5c5a61345cb0a46abc7796ceff195698cabe804..0d1ee769ec64b55c7559ce9d
> c14f8a6ae2e43e34 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_
> +++ s64.c
> @@ -10,4 +10,6 @@ foo (uint64x2_t * addr)
> return vldrdq_gather_base_wb_s64 (addr, 8); }
>
> -/* { dg-final { scan-assembler "vldrd.64" } } */
> +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> +/* { dg-final { scan-assembler "vldrd.64\tq\[0-9\]+, \\\[q\[0-9\]+,
> +#\[0-9\]+\\\]!" } } */
> +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.
> c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.
> c
> index
> 442bca92a43c05124717bf6ea0c44672941091f0..cb2a41bdcd32b553a93d3bcc
> 4787d506f1b54f74 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.
> c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_
> +++ u64.c
> @@ -10,4 +10,6 @@ foo (uint64x2_t * addr)
> return vldrdq_gather_base_wb_u64 (addr, 8); }
>
> -/* { dg-final { scan-assembler "vldrd.64" } } */
> +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> +/* { dg-final { scan-assembler "vldrd.64\tq\[0-9\]+, \\\[q\[0-9\]+,
> +#\[0-9\]+\\\]!" } } */
> +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s6
> 4.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s6
> 4.c
> index
> 1863d0835e12328b7b7bb824f59e3d441042f56d..243fbeacc3429025202da2ff
> 157ade38a472e123 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s6
> 4.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_
> +++ z_s64.c
> @@ -8,4 +8,8 @@ int64x2_t foo (uint64x2_t * addr, mve_pred16_t p)
> return vldrdq_gather_base_wb_z_s64 (addr, 1016, p); }
>
> -/* { dg-final { scan-assembler "vldrdt.u64" } } */
> +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> +/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*$" } } */
> +/* { dg-final { scan-assembler "vpst" } } */
> +/* { dg-final { scan-assembler "vldrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+,
> +#\[0-9\]+\\\]!" } } */
> +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u6
> 4.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u6
> 4.c
> index
> 7ba272a112607b0e57a3d4659e5b4033044af83c..10ba42405fe8fde9d4f8993
> b20e41a59c7bb2e77 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u6
> 4.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_
> +++ z_u64.c
> @@ -8,4 +8,8 @@ uint64x2_t foo (uint64x2_t * addr, mve_pred16_t p)
> return vldrdq_gather_base_wb_z_u64 (addr, 8, p); }
>
> -/* { dg-final { scan-assembler "vldrdt.u64" } } */
> +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> +/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*" } } */
> +/* { dg-final { scan-assembler "vpst" } } */
> +/* { dg-final { scan-assembler "vldrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+,
> +#\[0-9\]+\\\]!" } } */
> +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.
> c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.
> c
> index
> 6b496873f173e30414ffcddf50513758bc8ca770..db8108e37325c4e1fafd2293d
> 48eba0c33309073 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.
> c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_
> +++ f32.c
> @@ -10,4 +10,6 @@ foo (uint32x4_t * addr)
> return vldrwq_gather_base_wb_f32 (addr, 8); }
>
> -/* { dg-final { scan-assembler "vldrw.u32" } } */
> +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> +/* { dg-final { scan-assembler "vldrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> +#\[0-9\]+\\\]!" } } */
> +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.
> c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.
> c
> index
> 9bbbd0d701546b5ec224129aef49e632addea550..3da64e218e2c0789e996be
> 551650033567eba4e5 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.
> c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_
> +++ s32.c
> @@ -10,4 +10,6 @@ foo (uint32x4_t * addr)
> return vldrwq_gather_base_wb_s32 (addr, 8); }
>
> -/* { dg-final { scan-assembler "vldrw.u32" } } */
> +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> +/* { dg-final { scan-assembler "vldrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> +#\[0-9\]+\\\]!" } } */
> +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.
> c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.
> c
> index
> 774230b290367a7d28f0c8579be26fc9c75db1cb..2597ee11608bfe21d697f225
> 0bee7e69c0cc7aec 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.
> c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_
> +++ u32.c
> @@ -10,4 +10,6 @@ foo (uint32x4_t * addr)
> return vldrwq_gather_base_wb_u32 (addr, 8); }
>
> -/* { dg-final { scan-assembler "vldrw.u32" } } */
> +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> +/* { dg-final { scan-assembler "vldrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> +#\[0-9\]+\\\]!" } } */
> +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f3
> 2.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f3
> 2.c
> index
> 6400f014a88ccf34fef15effff65f9b1267dbd5f..f1ba63855be254d96806c16317
> 7e32856294c106 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f3
> 2.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_
> +++ z_f32.c
> @@ -10,4 +10,8 @@ foo (uint32x4_t * addr, mve_pred16_t p)
> return vldrwq_gather_base_wb_z_f32 (addr, 8, p); }
>
> -/* { dg-final { scan-assembler "vldrwt.u32" } } */
> +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> +/* { dg-final { scan-assembler "vmsr\tP0, r\[0-9\]+.*" } } */
> +/* { dg-final { scan-assembler "vpst" } } */
> +/* { dg-final { scan-assembler "vldrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> +#\[0-9\]+\\\]!" } } */
> +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s3
> 2.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s3
> 2.c
> index
> de7006c51f17665b80b83fd5ea034477b7a7e778..56da5a46c64d2946ceade86
> 89105048e19efdc6a 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s3
> 2.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_
> +++ z_s32.c
> @@ -10,4 +10,8 @@ foo (uint32x4_t * addr, mve_pred16_t p)
> return vldrwq_gather_base_wb_z_s32 (addr, 8, p); }
>
> -/* { dg-final { scan-assembler "vldrwt.u32" } } */
> +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> +/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*" } } */
> +/* { dg-final { scan-assembler "vpst" } } */
> +/* { dg-final { scan-assembler "vldrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> +#\[0-9\]+\\\]!" } } */
> +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> diff --git
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u3
> 2.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u
> 32.c
> index
> 6c9608f07ba966876804f56403a4352a51a0e0c4..63165d97c1a7b4120be0363
> 48a09b73afddd36d1 100644
> ---
> a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u3
> 2.c
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_
> +++ z_u32.c
> @@ -10,4 +10,8 @@ foo (uint32x4_t * addr, mve_pred16_t p)
> return vldrwq_gather_base_wb_z_u32 (addr, 8, p); }
>
> -/* { dg-final { scan-assembler "vldrwt.u32" } } */
> +/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */
> +/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*" } } */
> +/* { dg-final { scan-assembler "vpst" } } */
> +/* { dg-final { scan-assembler "vldrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+,
> +#\[0-9\]+\\\]!" } } */
> +/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" }
> +} */