public inbox for gcc-cvs@sourceware.org
help / color / mirror / Atom feed
* [gcc/devel/autopar_devel] [ARM]: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959).
@ 2020-08-22 21:25 Giuliano Belinassi
0 siblings, 0 replies; only message in thread
From: Giuliano Belinassi @ 2020-08-22 21:25 UTC (permalink / raw)
To: gcc-cvs
https://gcc.gnu.org/g:96c20ee68aecd3d04f3848ba6e3527fc1333931a
commit 96c20ee68aecd3d04f3848ba6e3527fc1333931a
Author: Srinath Parvathaneni <srinath.parvathaneni@arm.com>
Date: Wed May 20 10:17:22 2020 +0100
[ARM]: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959).
Few MVE intrinsics like vldrbq_s32, vldrhq_s32 etc., the assembler instructions
generated by current compiler are wrong.
eg: vldrbq_s32 generates an assembly instructions `vldrb.s32 q0,[ip]`.
But as per Arm-arm second argument in above instructions must also be a low
register (<= r7). This patch fixes this issue by creating a new predicate
"mve_memory_operand" and constraint "Ux" which allows low registers as arguments
to the generated instructions depending on the mode of the argument. A new constraint
"Ul" is created to handle loading to PC-relative addressing modes for vector
store/load intrinsiscs.
All the corresponding MVE intrinsic generating wrong code-gen as vldrbq_s32
are modified in this patch.
gcc/ChangeLog:
2020-05-20 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
Andre Vieira <andre.simoesdiasvieira@arm.com>
PR target/94959
* config/arm/arm-protos.h (arm_mode_base_reg_class): Function
declaration.
(mve_vector_mem_operand): Likewise.
* config/arm/arm.c (thumb2_legitimate_address_p): For MVE target check
the load from memory to a core register is legitimate for give mode.
(mve_vector_mem_operand): Define function.
(arm_print_operand): Modify comment.
(arm_mode_base_reg_class): Define.
* config/arm/arm.h (MODE_BASE_REG_CLASS): Modify to add check for
TARGET_HAVE_MVE and expand to arm_mode_base_reg_class on TRUE.
* config/arm/constraints.md (Ux): Likewise.
(Ul): Likewise.
* config/arm/mve.md (mve_mov): Replace constraint Us with Ux and also
add support for missing Vector Store Register and Vector Load Register.
Add a new alternative to support load from memory to PC (or label) in
vector store/load.
(mve_vstrbq_<supf><mode>): Modify constraint Us to Ux.
(mve_vldrbq_<supf><mode>): Modify constriant Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vldrbq_z_<supf><mode>): Modify constraint Us to Ux.
(mve_vldrhq_fv8hf): Modify constriant Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vldrhq_<supf><mode>): Modify constriant Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vldrhq_z_fv8hf): Likewise.
(mve_vldrhq_z_<supf><mode>): Likewise.
(mve_vldrwq_fv4sf): Likewise.
(mve_vldrwq_<supf>v4si): Likewise.
(mve_vldrwq_z_fv4sf): Likewise.
(mve_vldrwq_z_<supf>v4si): Likewise.
(mve_vld1q_f<mode>): Modify constriant Us to Ux.
(mve_vld1q_<supf><mode>): Likewise.
(mve_vstrhq_fv8hf): Modify constriant Us to Ux, predicate to
mve_memory_operand.
(mve_vstrhq_p_fv8hf): Modify constriant Us to Ux, predicate to
mve_memory_operand and also modify the MVE instructions to emit.
(mve_vstrhq_p_<supf><mode>): Likewise.
(mve_vstrhq_<supf><mode>): Modify constriant Us to Ux, predicate to
mve_memory_operand.
(mve_vstrwq_fv4sf): Modify constriant Us to Ux.
(mve_vstrwq_p_fv4sf): Modify constriant Us to Ux and also modify the MVE
instructions to emit.
(mve_vstrwq_p_<supf>v4si): Likewise.
(mve_vstrwq_<supf>v4si): Likewise.Modify constriant Us to Ux.
* config/arm/predicates.md (mve_memory_operand): Define.
gcc/testsuite/ChangeLog:
2020-05-20 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
PR target/94959
* gcc.target/arm/mve/intrinsics/mve_vector_float2.c: Modify.
* gcc.target/arm/mve/intrinsics/mve_vldr.c: New test.
* gcc.target/arm/mve/intrinsics/mve_vldr_z.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vstr.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_vstr_p.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_f16.c: Modify.
* gcc.target/arm/mve/intrinsics/vld1q_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld1q_z_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_z_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrbq_z_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrhq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrhq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrhq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrhq_z_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrhq_z_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrhq_z_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrhq_z_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrhq_z_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_z_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_z_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vldrwq_z_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vuninitializedq_float.c: Likewise.
* gcc.target/arm/mve/intrinsics/vuninitializedq_float1.c: Likewise.
* gcc.target/arm/mve/intrinsics/vuninitializedq_int.c: Likewise.
* gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c: Likewise.
Diff:
---
gcc/ChangeLog | 52 +++++++
gcc/config/arm/arm-protos.h | 3 +
gcc/config/arm/arm.c | 100 +++++++++++--
gcc/config/arm/arm.h | 8 +-
gcc/config/arm/constraints.md | 23 ++-
gcc/config/arm/mve.md | 156 ++++++++++++++-------
gcc/config/arm/predicates.md | 6 +
.../arm/mve/intrinsics/mve_vector_float2.c | 13 +-
.../gcc.target/arm/mve/intrinsics/mve_vldr.c | 61 ++++++++
.../gcc.target/arm/mve/intrinsics/mve_vldr_z.c | 73 ++++++++++
.../gcc.target/arm/mve/intrinsics/mve_vstr.c | 43 ++++++
.../gcc.target/arm/mve/intrinsics/mve_vstr_p.c | 42 ++++++
.../gcc.target/arm/mve/intrinsics/vld1q_f16.c | 5 +-
.../gcc.target/arm/mve/intrinsics/vld1q_f32.c | 5 +-
.../gcc.target/arm/mve/intrinsics/vld1q_s16.c | 5 +-
.../gcc.target/arm/mve/intrinsics/vld1q_s32.c | 5 +-
.../gcc.target/arm/mve/intrinsics/vld1q_s8.c | 5 +-
.../gcc.target/arm/mve/intrinsics/vld1q_u16.c | 5 +-
.../gcc.target/arm/mve/intrinsics/vld1q_u32.c | 5 +-
.../gcc.target/arm/mve/intrinsics/vld1q_u8.c | 5 +-
.../gcc.target/arm/mve/intrinsics/vld1q_z_f16.c | 6 +-
.../gcc.target/arm/mve/intrinsics/vld1q_z_f32.c | 6 +-
.../gcc.target/arm/mve/intrinsics/vld1q_z_s16.c | 6 +-
.../gcc.target/arm/mve/intrinsics/vld1q_z_s32.c | 6 +-
.../gcc.target/arm/mve/intrinsics/vld1q_z_s8.c | 6 +-
.../gcc.target/arm/mve/intrinsics/vld1q_z_u16.c | 6 +-
.../gcc.target/arm/mve/intrinsics/vld1q_z_u32.c | 6 +-
.../gcc.target/arm/mve/intrinsics/vld1q_z_u8.c | 6 +-
.../gcc.target/arm/mve/intrinsics/vldrbq_s8.c | 3 +-
.../gcc.target/arm/mve/intrinsics/vldrbq_u8.c | 3 +-
.../gcc.target/arm/mve/intrinsics/vldrbq_z_s8.c | 4 +-
.../gcc.target/arm/mve/intrinsics/vldrbq_z_u8.c | 4 +-
.../arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c | 5 +-
.../arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c | 5 +-
.../mve/intrinsics/vldrdq_gather_base_wb_z_s64.c | 6 +-
.../mve/intrinsics/vldrdq_gather_base_wb_z_u64.c | 6 +-
.../gcc.target/arm/mve/intrinsics/vldrhq_f16.c | 3 +-
.../gcc.target/arm/mve/intrinsics/vldrhq_s16.c | 3 +-
.../gcc.target/arm/mve/intrinsics/vldrhq_s32.c | 3 +-
.../gcc.target/arm/mve/intrinsics/vldrhq_u16.c | 3 +-
.../gcc.target/arm/mve/intrinsics/vldrhq_u32.c | 3 +-
.../gcc.target/arm/mve/intrinsics/vldrhq_z_f16.c | 4 +-
.../gcc.target/arm/mve/intrinsics/vldrhq_z_s16.c | 4 +-
.../gcc.target/arm/mve/intrinsics/vldrhq_z_s32.c | 4 +-
.../gcc.target/arm/mve/intrinsics/vldrhq_z_u16.c | 4 +-
.../gcc.target/arm/mve/intrinsics/vldrhq_z_u32.c | 4 +-
.../gcc.target/arm/mve/intrinsics/vldrwq_f32.c | 3 +-
.../arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c | 5 +-
.../arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c | 5 +-
.../arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c | 5 +-
.../mve/intrinsics/vldrwq_gather_base_wb_z_f32.c | 5 +-
.../mve/intrinsics/vldrwq_gather_base_wb_z_s32.c | 5 +-
.../mve/intrinsics/vldrwq_gather_base_wb_z_u32.c | 5 +-
.../gcc.target/arm/mve/intrinsics/vldrwq_s32.c | 3 +-
.../gcc.target/arm/mve/intrinsics/vldrwq_u32.c | 3 +-
.../gcc.target/arm/mve/intrinsics/vldrwq_z_f32.c | 4 +-
.../gcc.target/arm/mve/intrinsics/vldrwq_z_s32.c | 4 +-
.../gcc.target/arm/mve/intrinsics/vldrwq_z_u32.c | 4 +-
.../arm/mve/intrinsics/vuninitializedq_float.c | 6 +-
.../arm/mve/intrinsics/vuninitializedq_float1.c | 6 +-
.../arm/mve/intrinsics/vuninitializedq_int.c | 8 +-
.../arm/mve/intrinsics/vuninitializedq_int1.c | 8 +-
62 files changed, 645 insertions(+), 173 deletions(-)
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3133b3c4ef1..21070e05743 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,55 @@
+gcc/ChangeLog:
+
+2020-05-20 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
+ Andre Vieira <andre.simoesdiasvieira@arm.com>
+
+ PR target/94959
+ * config/arm/arm-protos.h (arm_mode_base_reg_class): Function
+ declaration.
+ (mve_vector_mem_operand): Likewise.
+ * config/arm/arm.c (thumb2_legitimate_address_p): For MVE target check
+ the load from memory to a core register is legitimate for give mode.
+ (mve_vector_mem_operand): Define function.
+ (arm_print_operand): Modify comment.
+ (arm_mode_base_reg_class): Define.
+ * config/arm/arm.h (MODE_BASE_REG_CLASS): Modify to add check for
+ TARGET_HAVE_MVE and expand to arm_mode_base_reg_class on TRUE.
+ * config/arm/constraints.md (Ux): Likewise.
+ (Ul): Likewise.
+ * config/arm/mve.md (mve_mov): Replace constraint Us with Ux and also
+ add support for missing Vector Store Register and Vector Load Register.
+ Add a new alternative to support load from memory to PC (or label) in
+ vector store/load.
+ (mve_vstrbq_<supf><mode>): Modify constraint Us to Ux.
+ (mve_vldrbq_<supf><mode>): Modify constriant Us to Ux, predicate to
+ mve_memory_operand and also modify the MVE instructions to emit.
+ (mve_vldrbq_z_<supf><mode>): Modify constraint Us to Ux.
+ (mve_vldrhq_fv8hf): Modify constriant Us to Ux, predicate to
+ mve_memory_operand and also modify the MVE instructions to emit.
+ (mve_vldrhq_<supf><mode>): Modify constriant Us to Ux, predicate to
+ mve_memory_operand and also modify the MVE instructions to emit.
+ (mve_vldrhq_z_fv8hf): Likewise.
+ (mve_vldrhq_z_<supf><mode>): Likewise.
+ (mve_vldrwq_fv4sf): Likewise.
+ (mve_vldrwq_<supf>v4si): Likewise.
+ (mve_vldrwq_z_fv4sf): Likewise.
+ (mve_vldrwq_z_<supf>v4si): Likewise.
+ (mve_vld1q_f<mode>): Modify constriant Us to Ux.
+ (mve_vld1q_<supf><mode>): Likewise.
+ (mve_vstrhq_fv8hf): Modify constriant Us to Ux, predicate to
+ mve_memory_operand.
+ (mve_vstrhq_p_fv8hf): Modify constriant Us to Ux, predicate to
+ mve_memory_operand and also modify the MVE instructions to emit.
+ (mve_vstrhq_p_<supf><mode>): Likewise.
+ (mve_vstrhq_<supf><mode>): Modify constriant Us to Ux, predicate to
+ mve_memory_operand.
+ (mve_vstrwq_fv4sf): Modify constriant Us to Ux.
+ (mve_vstrwq_p_fv4sf): Modify constriant Us to Ux and also modify the MVE
+ instructions to emit.
+ (mve_vstrwq_p_<supf>v4si): Likewise.
+ (mve_vstrwq_<supf>v4si): Likewise.Modify constriant Us to Ux.
+ * config/arm/predicates.md (mve_memory_operand): Define.
+
2020-05-30 Richard Biener <rguenther@suse.de>
PR c/95141
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 9571b60f84f..33d162c3e00 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -64,6 +64,8 @@ extern bool arm_q_bit_access (void);
extern bool arm_ge_bits_access (void);
#ifdef RTX_CODE
+enum reg_class
+arm_mode_base_reg_class (machine_mode);
extern void arm_gen_unlikely_cbranch (enum rtx_code, machine_mode cc_mode,
rtx label_ref);
extern bool arm_vector_mode_supported_p (machine_mode);
@@ -114,6 +116,7 @@ extern bool arm_tls_referenced_p (rtx);
extern int arm_coproc_mem_operand (rtx, bool);
extern int neon_vector_mem_operand (rtx, int, bool);
+extern int mve_vector_mem_operand (machine_mode, rtx, bool);
extern int neon_struct_mem_operand (rtx);
extern rtx *neon_vcmla_lane_prepare_operands (rtx *);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 635e7adac45..c396b5b28e3 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -8468,6 +8468,10 @@ thumb2_legitimate_address_p (machine_mode mode, rtx x, int strict_p)
bool use_ldrd;
enum rtx_code code = GET_CODE (x);
+ if (TARGET_HAVE_MVE
+ && (mode == V8QImode || mode == E_V4QImode || mode == V4HImode))
+ return mve_vector_mem_operand (mode, x, strict_p);
+
if (arm_address_register_rtx_p (x, strict_p))
return 1;
@@ -13283,6 +13287,79 @@ arm_coproc_mem_operand (rtx op, bool wb)
return FALSE;
}
+/* This function returns TRUE on matching mode and op.
+1. For given modes, check for [Rn], return TRUE for Rn <= LO_REGS.
+2. For other modes, check for [Rn], return TRUE for Rn < R15 (expect R13). */
+int
+mve_vector_mem_operand (machine_mode mode, rtx op, bool strict)
+{
+ enum rtx_code code;
+ int val, reg_no;
+
+ /* Match: (mem (reg)). */
+ if (REG_P (op))
+ {
+ int reg_no = REGNO (op);
+ return (((mode == E_V8QImode || mode == E_V4QImode || mode == E_V4HImode)
+ ? reg_no <= LAST_LO_REGNUM
+ :(reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM))
+ || (!strict && reg_no >= FIRST_PSEUDO_REGISTER));
+ }
+ code = GET_CODE (op);
+
+ if (code == POST_INC || code == PRE_DEC
+ || code == PRE_INC || code == POST_DEC)
+ {
+ reg_no = REGNO (XEXP (op, 0));
+ return (((mode == E_V8QImode || mode == E_V4QImode || mode == E_V4HImode)
+ ? reg_no <= LAST_LO_REGNUM
+ :(reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM))
+ || (!strict && reg_no >= FIRST_PSEUDO_REGISTER));
+ }
+ else if ((code == POST_MODIFY || code == PRE_MODIFY)
+ && GET_CODE (XEXP (op, 1)) == PLUS && REG_P (XEXP (XEXP (op, 1), 1)))
+ {
+ reg_no = REGNO (XEXP (op, 0));
+ val = INTVAL (XEXP ( XEXP (op, 1), 1));
+ switch (mode)
+ {
+ case E_V16QImode:
+ if (abs (val) <= 127)
+ return ((reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM)
+ || (!strict && reg_no >= FIRST_PSEUDO_REGISTER));
+ case E_V8HImode:
+ case E_V8HFmode:
+ if (abs (val) <= 255)
+ return ((reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM)
+ || (!strict && reg_no >= FIRST_PSEUDO_REGISTER));
+ case E_V8QImode:
+ case E_V4QImode:
+ if (abs (val) <= 127)
+ return (reg_no <= LAST_LO_REGNUM
+ || (!strict && reg_no >= FIRST_PSEUDO_REGISTER));
+ case E_V4HImode:
+ case E_V4HFmode:
+ if (val % 2 == 0 && abs (val) <= 254)
+ return (reg_no <= LAST_LO_REGNUM
+ || (!strict && reg_no >= FIRST_PSEUDO_REGISTER));
+ case E_V4SImode:
+ case E_V4SFmode:
+ if (val % 4 == 0 && abs (val) <= 508)
+ return ((reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM)
+ || (!strict && reg_no >= FIRST_PSEUDO_REGISTER));
+ case E_V2DImode:
+ case E_V2DFmode:
+ case E_TImode:
+ if (val % 4 == 0 && val >= 0 && val <= 1020)
+ return ((reg_no < LAST_ARM_REGNUM && reg_no != SP_REGNUM)
+ || (!strict && reg_no >= FIRST_PSEUDO_REGISTER));
+ default:
+ return FALSE;
+ }
+ }
+ return FALSE;
+}
+
/* Return TRUE if OP is a memory operand which we can load or store a vector
to/from. TYPE is one of the following values:
0 - Vector load/stor (vldr)
@@ -13350,15 +13427,6 @@ neon_vector_mem_operand (rtx op, int type, bool strict)
&& (INTVAL (XEXP (ind, 1)) & 3) == 0)
return TRUE;
- if (type == 1 && TARGET_HAVE_MVE
- && (GET_CODE (ind) == POST_INC || GET_CODE (ind) == PRE_DEC))
- {
- rtx ind1 = XEXP (ind, 0);
- if (!REG_P (ind1))
- return 0;
- return VFP_REGNO_OK_FOR_SINGLE (REGNO (ind1));
- }
-
return FALSE;
}
@@ -24042,7 +24110,7 @@ arm_print_operand (FILE *stream, rtx x, int code)
}
return;
- /* To print the memory operand with "Us" constraint. Based on the rtx_code
+ /* To print the memory operand with "Ux" constraint. Based on the rtx_code
the memory operands output looks like following.
1. [Rn], #+/-<imm>
2. [Rn, #+/-<imm>]!
@@ -33408,6 +33476,18 @@ arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
return "";
}
+/* If given mode matches, load from memory to LO_REGS.
+ (i.e [Rn], Rn <= LO_REGS). */
+enum reg_class
+arm_mode_base_reg_class (machine_mode mode)
+{
+ if (TARGET_HAVE_MVE
+ && (mode == E_V8QImode || mode == E_V4QImode || mode == E_V4HImode))
+ return LO_REGS;
+
+ return MODE_BASE_REG_REG_CLASS (mode);
+}
+
struct gcc_target targetm = TARGET_INITIALIZER;
#include "gt-arm.h"
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 0126f390abb..30e1d6dc994 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -1292,11 +1292,13 @@ extern const char *fp_sysreg_names[NB_FP_SYSREGS];
/* For the Thumb the high registers cannot be used as base registers
when addressing quantities in QI or HI mode; if we don't know the
- mode, then we must be conservative. */
+ mode, then we must be conservative. For MVE we need to load from
+ memory to low regs based on given modes i.e [Rn], Rn <= LO_REGS. */
#define MODE_BASE_REG_CLASS(MODE) \
- (TARGET_32BIT ? CORE_REGS \
+ (TARGET_HAVE_MVE ? arm_mode_base_reg_class (MODE) \
+ :(TARGET_32BIT ? CORE_REGS \
: GET_MODE_SIZE (MODE) >= 4 ? BASE_REGS \
- : LO_REGS)
+ : LO_REGS))
/* For Thumb we cannot support SP+reg addressing, so we return LO_REGS
instead of BASE_REGS. */
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index fed6c7c8403..011badc9957 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -39,7 +39,7 @@
;; in all states: Pf, Pg
;; The following memory constraints have been used:
-;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up, Uf
+;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up, Uf, Ux, Ul
;; in ARM state: Uq
;; in Thumb state: Uu, Uw
;; in all states: Q
@@ -47,6 +47,18 @@
(define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS"
"MVE VPR register")
+(define_memory_constraint "Ul"
+ "@internal
+ In ARM/Thumb-2 state a valid address for load instruction with XEXP (op, 0)
+ being label of the literal data item to be loaded."
+ (and (match_code "mem")
+ (match_test "TARGET_HAVE_MVE && reload_completed
+ && (GET_CODE (XEXP (op, 0)) == LABEL_REF
+ || (GET_CODE (XEXP (op, 0)) == CONST
+ && GET_CODE (XEXP (XEXP (op, 0), 0)) == PLUS
+ && GET_CODE (XEXP (XEXP (XEXP (op, 0), 0), 0)) == LABEL_REF
+ && CONST_INT_P (XEXP (XEXP (XEXP (op, 0), 0), 1))))")))
+
(define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS"
"MVE FPCCR register")
@@ -467,6 +479,15 @@
(and (match_code "mem")
(match_test "TARGET_32BIT && neon_vector_mem_operand (op, 1, true)")))
+(define_memory_constraint "Ux"
+ "@internal
+ In ARM/Thumb-2 state a valid address and load into CORE regs or only to
+ LO_REGS based on mode of op."
+ (and (match_code "mem")
+ (match_test "(TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT)
+ && mve_vector_mem_operand (GET_MODE (op),
+ XEXP (op, 0), true)")))
+
(define_memory_constraint "Uq"
"@internal
In ARM state an address valid in ldrsb instructions."
diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index f43dabbfd4f..986fbfe2aba 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -666,8 +666,8 @@
(define_int_iterator VSHLCQ_M [VSHLCQ_M_S VSHLCQ_M_U])
(define_insn "*mve_mov<mode>"
- [(set (match_operand:MVE_types 0 "nonimmediate_operand" "=w,w,r,w,w,r,w,Us")
- (match_operand:MVE_types 1 "general_operand" "w,r,w,Dn,Usi,r,Dm,w"))]
+ [(set (match_operand:MVE_types 0 "nonimmediate_operand" "=w,w,r,w,w,r,w,Ux,w")
+ (match_operand:MVE_types 1 "general_operand" "w,r,w,Dn,Uxi,r,Dm,w,Ul"))]
"TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT"
{
if (which_alternative == 3 || which_alternative == 6)
@@ -686,6 +686,50 @@
sprintf (templ, "vmov.i%d\t%%q0, %%x1 @ <mode>", width);
return templ;
}
+
+ if (which_alternative == 4 || which_alternative == 7)
+ {
+ rtx ops[2];
+ int regno = (which_alternative == 7)
+ ? REGNO (operands[1]) : REGNO (operands[0]);
+
+ ops[0] = operands[0];
+ ops[1] = operands[1];
+ if (<MODE>mode == V2DFmode || <MODE>mode == V2DImode)
+ {
+ if (which_alternative == 7)
+ {
+ ops[1] = gen_rtx_REG (DImode, regno);
+ output_asm_insn ("vstr.64\t%P1, %E0",ops);
+ }
+ else
+ {
+ ops[0] = gen_rtx_REG (DImode, regno);
+ output_asm_insn ("vldr.64\t%P0, %E1",ops);
+ }
+ }
+ else if (<MODE>mode == TImode)
+ {
+ if (which_alternative == 7)
+ output_asm_insn ("vstr.64\t%q1, %E0",ops);
+ else
+ output_asm_insn ("vldr.64\t%q0, %E1",ops);
+ }
+ else
+ {
+ if (which_alternative == 7)
+ {
+ ops[1] = gen_rtx_REG (TImode, regno);
+ output_asm_insn ("vstr<V_sz_elem1>.<V_sz_elem>\t%q1, %E0",ops);
+ }
+ else
+ {
+ ops[0] = gen_rtx_REG (TImode, regno);
+ output_asm_insn ("vldr<V_sz_elem1>.<V_sz_elem>\t%q0, %E1",ops);
+ }
+ }
+ return "";
+ }
switch (which_alternative)
{
case 0:
@@ -694,26 +738,19 @@
return "vmov\t%e0, %Q1, %R1 @ <mode>\;vmov\t%f0, %J1, %K1";
case 2:
return "vmov\t%Q0, %R0, %e1 @ <mode>\;vmov\t%J0, %K0, %f1";
- case 4:
- if (MEM_P (operands[1])
- && (GET_CODE (XEXP (operands[1], 0)) == LABEL_REF
- || GET_CODE (XEXP (operands[1], 0)) == CONST))
- return output_move_neon (operands);
- else
- return "vldrb.8 %q0, %E1";
case 5:
return output_move_quad (operands);
- case 7:
- return "vstrb.8 %q1, %E0";
+ case 8:
+ return output_move_neon (operands);
default:
gcc_unreachable ();
return "";
}
}
- [(set_attr "type" "mve_move,mve_move,mve_move,mve_move,mve_load,multiple,mve_move,mve_store")
- (set_attr "length" "4,8,8,4,8,8,4,4")
- (set_attr "thumb2_pool_range" "*,*,*,*,1018,*,*,*")
- (set_attr "neg_pool_range" "*,*,*,*,996,*,*,*")])
+ [(set_attr "type" "mve_move,mve_move,mve_move,mve_move,mve_load,multiple,mve_move,mve_store,mve_load")
+ (set_attr "length" "4,8,8,4,8,8,4,4,4")
+ (set_attr "thumb2_pool_range" "*,*,*,*,1018,*,*,*,*")
+ (set_attr "neg_pool_range" "*,*,*,*,996,*,*,*,*")])
(define_insn "*mve_mov<mode>"
[(set (match_operand:MVE_types 0 "s_register_operand" "=w,w")
@@ -8047,7 +8084,7 @@
;; [vstrbq_s vstrbq_u]
;;
(define_insn "mve_vstrbq_<supf><mode>"
- [(set (match_operand:<MVE_B_ELEM> 0 "memory_operand" "=Us")
+ [(set (match_operand:<MVE_B_ELEM> 0 "mve_memory_operand" "=Ux")
(unspec:<MVE_B_ELEM> [(match_operand:MVE_2 1 "s_register_operand" "w")]
VSTRBQ))
]
@@ -8133,7 +8170,7 @@
;;
(define_insn "mve_vldrbq_<supf><mode>"
[(set (match_operand:MVE_2 0 "s_register_operand" "=w")
- (unspec:MVE_2 [(match_operand:<MVE_B_ELEM> 1 "memory_operand" "Us")]
+ (unspec:MVE_2 [(match_operand:<MVE_B_ELEM> 1 "mve_memory_operand" "Ux")]
VLDRBQ))
]
"TARGET_HAVE_MVE"
@@ -8142,7 +8179,10 @@
int regno = REGNO (operands[0]);
ops[0] = gen_rtx_REG (TImode, regno);
ops[1] = operands[1];
- output_asm_insn ("vldrb.<supf><V_sz_elem>\t%q0, %E1",ops);
+ if (<V_sz_elem> == 8)
+ output_asm_insn ("vldrb.<V_sz_elem>\t%q0, %E1",ops);
+ else
+ output_asm_insn ("vldrb.<supf><V_sz_elem>\t%q0, %E1",ops);
return "";
}
[(set_attr "length" "4")])
@@ -8216,7 +8256,7 @@
;; [vstrbq_p_s vstrbq_p_u]
;;
(define_insn "mve_vstrbq_p_<supf><mode>"
- [(set (match_operand:<MVE_B_ELEM> 0 "memory_operand" "=Us")
+ [(set (match_operand:<MVE_B_ELEM> 0 "mve_memory_operand" "=Ux")
(unspec:<MVE_B_ELEM> [(match_operand:MVE_2 1 "s_register_operand" "w")
(match_operand:HI 2 "vpr_register_operand" "Up")]
VSTRBQ))
@@ -8227,7 +8267,7 @@
int regno = REGNO (operands[1]);
ops[1] = gen_rtx_REG (TImode, regno);
ops[0] = operands[0];
- output_asm_insn ("vpst\n\tvstrbt.<V_sz_elem>\t%q1, %E0",ops);
+ output_asm_insn ("vpst\;vstrbt.<V_sz_elem>\t%q1, %E0",ops);
return "";
}
[(set_attr "length" "8")])
@@ -8262,7 +8302,7 @@
;;
(define_insn "mve_vldrbq_z_<supf><mode>"
[(set (match_operand:MVE_2 0 "s_register_operand" "=w")
- (unspec:MVE_2 [(match_operand:<MVE_B_ELEM> 1 "memory_operand" "Us")
+ (unspec:MVE_2 [(match_operand:<MVE_B_ELEM> 1 "mve_memory_operand" "Ux")
(match_operand:HI 2 "vpr_register_operand" "Up")]
VLDRBQ))
]
@@ -8272,7 +8312,10 @@
int regno = REGNO (operands[0]);
ops[0] = gen_rtx_REG (TImode, regno);
ops[1] = operands[1];
- output_asm_insn ("vpst\n\tvldrbt.<supf><V_sz_elem>\t%q0, %E1",ops);
+ if (<V_sz_elem> == 8)
+ output_asm_insn ("vpst\;vldrbt.<V_sz_elem>\t%q0, %E1",ops);
+ else
+ output_asm_insn ("vpst\;vldrbt.<supf><V_sz_elem>\t%q0, %E1",ops);
return "";
}
[(set_attr "length" "8")])
@@ -8303,7 +8346,7 @@
;;
(define_insn "mve_vldrhq_fv8hf"
[(set (match_operand:V8HF 0 "s_register_operand" "=w")
- (unspec:V8HF [(match_operand:V8HI 1 "memory_operand" "Us")]
+ (unspec:V8HF [(match_operand:V8HI 1 "mve_memory_operand" "Ux")]
VLDRHQ_F))
]
"TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
@@ -8312,7 +8355,7 @@
int regno = REGNO (operands[0]);
ops[0] = gen_rtx_REG (TImode, regno);
ops[1] = operands[1];
- output_asm_insn ("vldrh.f16\t%q0, %E1",ops);
+ output_asm_insn ("vldrh.16\t%q0, %E1",ops);
return "";
}
[(set_attr "length" "4")])
@@ -8413,13 +8456,12 @@
}
[(set_attr "length" "8")])
-;;
;;
;; [vldrhq_s, vldrhq_u]
;;
(define_insn "mve_vldrhq_<supf><mode>"
[(set (match_operand:MVE_6 0 "s_register_operand" "=w")
- (unspec:MVE_6 [(match_operand:<MVE_H_ELEM> 1 "memory_operand" "Us")]
+ (unspec:MVE_6 [(match_operand:<MVE_H_ELEM> 1 "mve_memory_operand" "Ux")]
VLDRHQ))
]
"TARGET_HAVE_MVE"
@@ -8428,7 +8470,10 @@
int regno = REGNO (operands[0]);
ops[0] = gen_rtx_REG (TImode, regno);
ops[1] = operands[1];
- output_asm_insn ("vldrh.<supf><V_sz_elem>\t%q0, %E1",ops);
+ if (<V_sz_elem> == 16)
+ output_asm_insn ("vldrh.16\t%q0, %E1",ops);
+ else
+ output_asm_insn ("vldrh.<supf><V_sz_elem>\t%q0, %E1",ops);
return "";
}
[(set_attr "length" "4")])
@@ -8438,7 +8483,7 @@
;;
(define_insn "mve_vldrhq_z_fv8hf"
[(set (match_operand:V8HF 0 "s_register_operand" "=w")
- (unspec:V8HF [(match_operand:V8HI 1 "memory_operand" "Us")
+ (unspec:V8HF [(match_operand:V8HI 1 "mve_memory_operand" "Ux")
(match_operand:HI 2 "vpr_register_operand" "Up")]
VLDRHQ_F))
]
@@ -8448,7 +8493,7 @@
int regno = REGNO (operands[0]);
ops[0] = gen_rtx_REG (TImode, regno);
ops[1] = operands[1];
- output_asm_insn ("vpst\n\tvldrht.f16\t%q0, %E1",ops);
+ output_asm_insn ("vpst\;vldrht.16\t%q0, %E1",ops);
return "";
}
[(set_attr "length" "8")])
@@ -8458,7 +8503,7 @@
;;
(define_insn "mve_vldrhq_z_<supf><mode>"
[(set (match_operand:MVE_6 0 "s_register_operand" "=w")
- (unspec:MVE_6 [(match_operand:<MVE_H_ELEM> 1 "memory_operand" "Us")
+ (unspec:MVE_6 [(match_operand:<MVE_H_ELEM> 1 "mve_memory_operand" "Ux")
(match_operand:HI 2 "vpr_register_operand" "Up")]
VLDRHQ))
]
@@ -8468,7 +8513,10 @@
int regno = REGNO (operands[0]);
ops[0] = gen_rtx_REG (TImode, regno);
ops[1] = operands[1];
- output_asm_insn ("vpst\n\tvldrht.<supf><V_sz_elem>\t%q0, %E1",ops);
+ if (<V_sz_elem> == 16)
+ output_asm_insn ("vpst\;vldrht.16\t%q0, %E1",ops);
+ else
+ output_asm_insn ("vpst\;vldrht.<supf><V_sz_elem>\t%q0, %E1",ops);
return "";
}
[(set_attr "length" "8")])
@@ -8478,7 +8526,7 @@
;;
(define_insn "mve_vldrwq_fv4sf"
[(set (match_operand:V4SF 0 "s_register_operand" "=w")
- (unspec:V4SF [(match_operand:V4SI 1 "memory_operand" "Us")]
+ (unspec:V4SF [(match_operand:V4SI 1 "memory_operand" "Ux")]
VLDRWQ_F))
]
"TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
@@ -8487,7 +8535,7 @@
int regno = REGNO (operands[0]);
ops[0] = gen_rtx_REG (TImode, regno);
ops[1] = operands[1];
- output_asm_insn ("vldrw.f32\t%q0, %E1",ops);
+ output_asm_insn ("vldrw.32\t%q0, %E1",ops);
return "";
}
[(set_attr "length" "4")])
@@ -8497,7 +8545,7 @@
;;
(define_insn "mve_vldrwq_<supf>v4si"
[(set (match_operand:V4SI 0 "s_register_operand" "=w")
- (unspec:V4SI [(match_operand:V4SI 1 "memory_operand" "Us")]
+ (unspec:V4SI [(match_operand:V4SI 1 "memory_operand" "Ux")]
VLDRWQ))
]
"TARGET_HAVE_MVE"
@@ -8506,7 +8554,7 @@
int regno = REGNO (operands[0]);
ops[0] = gen_rtx_REG (TImode, regno);
ops[1] = operands[1];
- output_asm_insn ("vldrw.<supf>32\t%q0, %E1",ops);
+ output_asm_insn ("vldrw.32\t%q0, %E1",ops);
return "";
}
[(set_attr "length" "4")])
@@ -8516,7 +8564,7 @@
;;
(define_insn "mve_vldrwq_z_fv4sf"
[(set (match_operand:V4SF 0 "s_register_operand" "=w")
- (unspec:V4SF [(match_operand:V4SI 1 "memory_operand" "Us")
+ (unspec:V4SF [(match_operand:V4SI 1 "memory_operand" "Ux")
(match_operand:HI 2 "vpr_register_operand" "Up")]
VLDRWQ_F))
]
@@ -8526,7 +8574,7 @@
int regno = REGNO (operands[0]);
ops[0] = gen_rtx_REG (TImode, regno);
ops[1] = operands[1];
- output_asm_insn ("vpst\n\tvldrwt.f32\t%q0, %E1",ops);
+ output_asm_insn ("vpst\;vldrwt.32\t%q0, %E1",ops);
return "";
}
[(set_attr "length" "8")])
@@ -8536,7 +8584,7 @@
;;
(define_insn "mve_vldrwq_z_<supf>v4si"
[(set (match_operand:V4SI 0 "s_register_operand" "=w")
- (unspec:V4SI [(match_operand:V4SI 1 "memory_operand" "Us")
+ (unspec:V4SI [(match_operand:V4SI 1 "memory_operand" "Ux")
(match_operand:HI 2 "vpr_register_operand" "Up")]
VLDRWQ))
]
@@ -8546,14 +8594,14 @@
int regno = REGNO (operands[0]);
ops[0] = gen_rtx_REG (TImode, regno);
ops[1] = operands[1];
- output_asm_insn ("vpst\n\tvldrwt.<supf>32\t%q0, %E1",ops);
+ output_asm_insn ("vpst\;vldrwt.32\t%q0, %E1",ops);
return "";
}
[(set_attr "length" "8")])
(define_expand "mve_vld1q_f<mode>"
[(match_operand:MVE_0 0 "s_register_operand")
- (unspec:MVE_0 [(match_operand:<MVE_CNVT> 1 "memory_operand")] VLD1Q_F)
+ (unspec:MVE_0 [(match_operand:<MVE_CNVT> 1 "mve_memory_operand")] VLD1Q_F)
]
"TARGET_HAVE_MVE || TARGET_HAVE_MVE_FLOAT"
{
@@ -8563,7 +8611,7 @@
(define_expand "mve_vld1q_<supf><mode>"
[(match_operand:MVE_2 0 "s_register_operand")
- (unspec:MVE_2 [(match_operand:MVE_2 1 "memory_operand")] VLD1Q)
+ (unspec:MVE_2 [(match_operand:MVE_2 1 "mve_memory_operand")] VLD1Q)
]
"TARGET_HAVE_MVE"
{
@@ -8991,7 +9039,7 @@
;; [vstrhq_f]
;;
(define_insn "mve_vstrhq_fv8hf"
- [(set (match_operand:V8HI 0 "memory_operand" "=Us")
+ [(set (match_operand:V8HI 0 "mve_memory_operand" "=Ux")
(unspec:V8HI [(match_operand:V8HF 1 "s_register_operand" "w")]
VSTRHQ_F))
]
@@ -9010,7 +9058,7 @@
;; [vstrhq_p_f]
;;
(define_insn "mve_vstrhq_p_fv8hf"
- [(set (match_operand:V8HI 0 "memory_operand" "=Us")
+ [(set (match_operand:V8HI 0 "mve_memory_operand" "=Ux")
(unspec:V8HI [(match_operand:V8HF 1 "s_register_operand" "w")
(match_operand:HI 2 "vpr_register_operand" "Up")]
VSTRHQ_F))
@@ -9021,7 +9069,7 @@
int regno = REGNO (operands[1]);
ops[1] = gen_rtx_REG (TImode, regno);
ops[0] = operands[0];
- output_asm_insn ("vpst\n\tvstrht.16\t%q1, %E0",ops);
+ output_asm_insn ("vpst\;vstrht.16\t%q1, %E0",ops);
return "";
}
[(set_attr "length" "8")])
@@ -9030,7 +9078,7 @@
;; [vstrhq_p_s vstrhq_p_u]
;;
(define_insn "mve_vstrhq_p_<supf><mode>"
- [(set (match_operand:<MVE_H_ELEM> 0 "memory_operand" "=Us")
+ [(set (match_operand:<MVE_H_ELEM> 0 "mve_memory_operand" "=Ux")
(unspec:<MVE_H_ELEM> [(match_operand:MVE_6 1 "s_register_operand" "w")
(match_operand:HI 2 "vpr_register_operand" "Up")]
VSTRHQ))
@@ -9041,7 +9089,7 @@
int regno = REGNO (operands[1]);
ops[1] = gen_rtx_REG (TImode, regno);
ops[0] = operands[0];
- output_asm_insn ("vpst\n\tvstrht.<V_sz_elem>\t%q1, %E0",ops);
+ output_asm_insn ("vpst\;vstrht.<V_sz_elem>\t%q1, %E0",ops);
return "";
}
[(set_attr "length" "8")])
@@ -9093,7 +9141,7 @@
;; [vstrhq_scatter_shifted_offset_p_s vstrhq_scatter_shifted_offset_p_u]
;;
(define_insn "mve_vstrhq_scatter_shifted_offset_p_<supf><mode>"
- [(set (match_operand:<MVE_H_ELEM> 0 "memory_operand" "=Us")
+ [(set (match_operand:<MVE_H_ELEM> 0 "memory_operand" "=Ux")
(unspec:<MVE_H_ELEM>
[(match_operand:MVE_6 1 "s_register_operand" "w")
(match_operand:MVE_6 2 "s_register_operand" "w")
@@ -9136,7 +9184,7 @@
;; [vstrhq_s, vstrhq_u]
;;
(define_insn "mve_vstrhq_<supf><mode>"
- [(set (match_operand:<MVE_H_ELEM> 0 "memory_operand" "=Us")
+ [(set (match_operand:<MVE_H_ELEM> 0 "mve_memory_operand" "=Ux")
(unspec:<MVE_H_ELEM> [(match_operand:MVE_6 1 "s_register_operand" "w")]
VSTRHQ))
]
@@ -9155,7 +9203,7 @@
;; [vstrwq_f]
;;
(define_insn "mve_vstrwq_fv4sf"
- [(set (match_operand:V4SI 0 "memory_operand" "=Us")
+ [(set (match_operand:V4SI 0 "memory_operand" "=Ux")
(unspec:V4SI [(match_operand:V4SF 1 "s_register_operand" "w")]
VSTRWQ_F))
]
@@ -9174,7 +9222,7 @@
;; [vstrwq_p_f]
;;
(define_insn "mve_vstrwq_p_fv4sf"
- [(set (match_operand:V4SI 0 "memory_operand" "=Us")
+ [(set (match_operand:V4SI 0 "memory_operand" "=Ux")
(unspec:V4SI [(match_operand:V4SF 1 "s_register_operand" "w")
(match_operand:HI 2 "vpr_register_operand" "Up")]
VSTRWQ_F))
@@ -9185,7 +9233,7 @@
int regno = REGNO (operands[1]);
ops[1] = gen_rtx_REG (TImode, regno);
ops[0] = operands[0];
- output_asm_insn ("vpst\n\tvstrwt.32\t%q1, %E0",ops);
+ output_asm_insn ("vpst\;vstrwt.32\t%q1, %E0",ops);
return "";
}
[(set_attr "length" "8")])
@@ -9194,7 +9242,7 @@
;; [vstrwq_p_s vstrwq_p_u]
;;
(define_insn "mve_vstrwq_p_<supf>v4si"
- [(set (match_operand:V4SI 0 "memory_operand" "=Us")
+ [(set (match_operand:V4SI 0 "memory_operand" "=Ux")
(unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w")
(match_operand:HI 2 "vpr_register_operand" "Up")]
VSTRWQ))
@@ -9205,7 +9253,7 @@
int regno = REGNO (operands[1]);
ops[1] = gen_rtx_REG (TImode, regno);
ops[0] = operands[0];
- output_asm_insn ("vpst\n\tvstrwt.32\t%q1, %E0",ops);
+ output_asm_insn ("vpst\;vstrwt.32\t%q1, %E0",ops);
return "";
}
[(set_attr "length" "8")])
@@ -9214,7 +9262,7 @@
;; [vstrwq_s vstrwq_u]
;;
(define_insn "mve_vstrwq_<supf>v4si"
- [(set (match_operand:V4SI 0 "memory_operand" "=Us")
+ [(set (match_operand:V4SI 0 "memory_operand" "=Ux")
(unspec:V4SI [(match_operand:V4SI 1 "s_register_operand" "w")]
VSTRWQ))
]
diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md
index 009862e012c..c57ad73577e 100644
--- a/gcc/config/arm/predicates.md
+++ b/gcc/config/arm/predicates.md
@@ -31,6 +31,12 @@
|| REGNO_REG_CLASS (REGNO (op)) != NO_REGS));
})
+(define_predicate "mve_memory_operand"
+ (and (match_code "mem")
+ (match_test "TARGET_32BIT
+ && mve_vector_mem_operand (GET_MODE (op), XEXP (op, 0),
+ false)")))
+
;; True for immediates in the range of 1 to 16 for MVE.
(define_predicate "mve_imm_16"
(match_test "satisfies_constraint_Rd (op)"))
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float2.c
index e3cf8f8207d..35f83c6b298 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float2.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vector_float2.c
@@ -11,10 +11,6 @@ foo32 ()
return b;
}
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
-/* { dg-final { scan-assembler "vstrb.*" } } */
-/* { dg-final { scan-assembler "vldr.64*" } } */
-
float16x8_t
foo16 ()
{
@@ -22,6 +18,9 @@ foo16 ()
return b;
}
-/* { dg-final { scan-assembler "vmov\\tq\[0-7\], q\[0-7\]" } } */
-/* { dg-final { scan-assembler "vstrb.*" } } */
-/* { dg-final { scan-assembler "vldr.64.*" } } */
+/* { dg-final { scan-assembler-times "vmov\\tq\[0-7\], q\[0-7\]" 2 } } */
+/* { dg-final { scan-assembler-times "vstrw.32*" 1 } } */
+/* { dg-final { scan-assembler-times "vstrh.16*" 1 } } */
+/* { dg-final { scan-assembler-times "vldrw.32*" 1 } } */
+/* { dg-final { scan-assembler-times "vldrh.16*" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vldr.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vldr.c
new file mode 100644
index 00000000000..15656ed8c3c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vldr.c
@@ -0,0 +1,61 @@
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+void
+foo (uint16_t row_x_col, int8_t *out)
+{
+ for (;;)
+ {
+ int32x4_t out_3;
+ int8_t *rhs_0;
+ int8_t *lhs_3;
+ int i_row_x_col;
+ for (;i_row_x_col < row_x_col; i_row_x_col++)
+ {
+ int32x4_t ker_0 = vldrbq_s32(rhs_0);
+ int32x4_t ip_3 = vldrbq_s32(lhs_3);
+ out_3 = vmulq_s32(ip_3, ker_0);
+ }
+ vstrbq_s32(out, out_3);
+ }
+}
+
+void
+foo1 (uint16_t row_x_col, int8_t *out)
+{
+ for (;;)
+ {
+ int16x8_t out_3;
+ int8_t *rhs_0;
+ int8_t *lhs_3;
+ int i_row_x_col;
+ for (; i_row_x_col < row_x_col; i_row_x_col++)
+ {
+ int16x8_t ker_0 = vldrbq_s16(rhs_0);
+ int16x8_t ip_3 = vldrbq_s16(lhs_3);
+ out_3 = vmulq_s16(ip_3, ker_0);
+ }
+ vstrbq_s16(out, out_3);
+ }
+}
+
+void
+foo2 (uint16_t row_x_col, int16_t *out)
+{
+ for (;;)
+ {
+ int32x4_t out_3;
+ int16_t *rhs_0;
+ int16_t *lhs_3;
+ int i_row_x_col;
+ for (; i_row_x_col < row_x_col; i_row_x_col++)
+ {
+ int32x4_t ker_0 = vldrhq_s32(rhs_0);
+ int32x4_t ip_3 = vldrhq_s32(lhs_3);
+ out_3 = vmulq_s32(ip_3, ker_0);
+ }
+ vstrhq_s32(out, out_3);
+ }
+}
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vldr_z.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vldr_z.c
new file mode 100644
index 00000000000..ae640837d14
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vldr_z.c
@@ -0,0 +1,73 @@
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+void
+foo (uint16_t row_len, const int32_t *bias, int8_t *out)
+{
+ int i_out_ch;
+ for (;;)
+ {
+ int8_t *ip_c3;
+ int32_t acc_3;
+ int32_t row_loop_cnt = row_len;
+ int32x4_t res = {acc_3};
+ uint32x4_t scatter_offset;
+ int i_row_loop;
+ for (; i_row_loop < row_loop_cnt; i_row_loop++)
+ {
+ mve_pred16_t p;
+ int16x8_t r0;
+ int16x8_t c3 = vldrbq_z_s16(ip_c3, p);
+ acc_3 = vmladavaq_p_s16(acc_3, r0, c3, p);
+ }
+ vstrbq_scatter_offset_s32(&out[i_out_ch], scatter_offset, res);
+ }
+}
+
+void
+foo1 (uint16_t row_len, const int32_t *bias, int8_t *out)
+{
+ int i_out_ch;
+ for (;;)
+ {
+ int8_t *ip_c3;
+ int32_t acc_3;
+ int32_t row_loop_cnt = row_len;
+ int i_row_loop;
+ int32x4_t res = {acc_3};
+ uint32x4_t scatter_offset;
+ for (; i_row_loop < row_loop_cnt; i_row_loop++)
+ {
+ mve_pred16_t p;
+ int32x4_t r0;
+ int32x4_t c3 = vldrbq_z_s32(ip_c3, p);
+ acc_3 = vmladavaq_p_s32(acc_3, r0, c3, p);
+ }
+ vstrbq_scatter_offset_s32(&out[i_out_ch], scatter_offset, res);
+ }
+}
+
+void
+foo2 (uint16_t row_len, const int32_t *bias, int8_t *out)
+{
+ int i_out_ch;
+ for (;;)
+ {
+ int16_t *ip_c3;
+ int32_t acc_3;
+ int32_t row_loop_cnt = row_len;
+ int i_row_loop;
+ int32x4_t res = {acc_3};
+ uint32x4_t scatter_offset;
+ for (; i_row_loop < row_loop_cnt; i_row_loop++)
+ {
+ mve_pred16_t p;
+ int32x4_t r0;
+ int32x4_t c3 = vldrhq_z_s32(ip_c3, p);
+ acc_3 = vmladavaq_p_s32(acc_3, r0, c3, p);
+ }
+ vstrbq_scatter_offset_s32(&out[i_out_ch], scatter_offset, res);
+ }
+}
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vstr.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vstr.c
new file mode 100644
index 00000000000..dd785f28bc0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vstr.c
@@ -0,0 +1,43 @@
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+void
+foo (const int32_t *output_bias, int8_t *out, uint16_t num_ch)
+{
+ int32_t loop_count = num_ch;
+ const int32_t *bias = output_bias;
+ int i_loop_cnt;
+ for (; i_loop_cnt < loop_count; out += 4, i_loop_cnt++)
+ {
+ int32x4_t out_0 = vldrwq_s32(bias);
+ vstrbq_s32(out, out_0);
+ }
+}
+
+void
+foo1 (const int16_t *output_bias, int8_t *out, uint16_t num_ch)
+{
+ int32_t loop_count = num_ch;
+ const int16_t *bias = output_bias;
+ int i_loop_cnt;
+ for (; i_loop_cnt < loop_count; out += 4, i_loop_cnt++)
+ {
+ int16x8_t out_0 = vldrhq_s16(bias);
+ vstrbq_s16(out, out_0);
+ }
+}
+
+void
+foo2 (const int32_t *output_bias, int16_t *out, uint16_t num_ch)
+{
+ int32_t loop_count = num_ch;
+ const int32_t *bias = output_bias;
+ int i_loop_cnt;
+ for (; i_loop_cnt < loop_count; out += 4, i_loop_cnt++)
+ {
+ int32x4_t out_0 = vldrwq_s32(bias);
+ vstrhq_s32(out, out_0);
+ }
+}
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vstr_p.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vstr_p.c
new file mode 100644
index 00000000000..8b222f1be0a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_vstr_p.c
@@ -0,0 +1,42 @@
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-add-options arm_v8_1m_mve } */
+/* { dg-additional-options "-O2" } */
+
+#include "arm_mve.h"
+void
+foo1 (int8_t *x, int32_t * i1)
+{
+ mve_pred16_t p;
+ int32x4_t x_0;
+ int32_t * bias1 = i1;
+ for (;; x++)
+ {
+ x_0 = vldrwq_s32(bias1);
+ vstrbq_p_s32(x, x_0, p);
+ }
+}
+void
+foo2 (int8_t *x, int16_t * i1)
+{
+ mve_pred16_t p;
+ int16x8_t x_0;
+ int16_t * bias1 = i1;
+ for (;; x++)
+ {
+ x_0 = vldrhq_s16(bias1);
+ vstrbq_p_s16(x, x_0, p);
+ }
+}
+
+void
+foo3 (int16_t *x, int32_t * i1)
+{
+ mve_pred16_t p;
+ int32x4_t x_0;
+ int32_t * bias1 = i1;
+ for (;; x++)
+ {
+ x_0 = vldrwq_s32(bias1);
+ vstrhq_p_s32(x, x_0, p);
+ }
+}
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_f16.c
index 5e42f634412..699e40d0e3b 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_f16.c
@@ -10,12 +10,11 @@ foo (float16_t const * base)
return vld1q_f16 (base);
}
-/* { dg-final { scan-assembler "vldrh.f16" } } */
-
float16x8_t
foo1 (float16_t const * base)
{
return vld1q (base);
}
-/* { dg-final { scan-assembler "vldrh.f16" } } */
+/* { dg-final { scan-assembler-times "vldrh.16" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_f32.c
index 99d1a7a9c5e..86592303362 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_f32.c
@@ -10,12 +10,11 @@ foo (float32_t const * base)
return vld1q_f32 (base);
}
-/* { dg-final { scan-assembler "vldrw.f32" } } */
-
float32x4_t
foo1 (float32_t const * base)
{
return vld1q (base);
}
-/* { dg-final { scan-assembler "vldrw.f32" } } */
+/* { dg-final { scan-assembler-times "vldrw.32" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s16.c
index d77f98ea889..f4f04f534db 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s16.c
@@ -10,12 +10,11 @@ foo (int16_t const * base)
return vld1q_s16 (base);
}
-/* { dg-final { scan-assembler "vldrh.s16" } } */
-
int16x8_t
foo1 (int16_t const * base)
{
return vld1q (base);
}
-/* { dg-final { scan-assembler "vldrh.s16" } } */
+/* { dg-final { scan-assembler-times "vldrh.16" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s32.c
index 9a7f024f735..e0f66166751 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s32.c
@@ -10,12 +10,11 @@ foo (int32_t const * base)
return vld1q_s32 (base);
}
-/* { dg-final { scan-assembler "vldrw.s32" } } */
-
int32x4_t
foo1 (int32_t const * base)
{
return vld1q (base);
}
-/* { dg-final { scan-assembler "vldrw.s32" } } */
+/* { dg-final { scan-assembler-times "vldrw.32" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s8.c
index 9c67bb60110..1b7edead6b1 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_s8.c
@@ -10,12 +10,11 @@ foo (int8_t const * base)
return vld1q_s8 (base);
}
-/* { dg-final { scan-assembler "vldrb.s8" } } */
-
int8x16_t
foo1 (int8_t const * base)
{
return vld1q (base);
}
-/* { dg-final { scan-assembler "vldrb.s8" } } */
+/* { dg-final { scan-assembler-times "vldrb.8" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u16.c
index 2bef21a5a1d..50e1f5cedcb 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u16.c
@@ -10,12 +10,11 @@ foo (uint16_t const * base)
return vld1q_u16 (base);
}
-/* { dg-final { scan-assembler "vldrh.u16" } } */
-
uint16x8_t
foo1 (uint16_t const * base)
{
return vld1q (base);
}
-/* { dg-final { scan-assembler "vldrh.u16" } } */
+/* { dg-final { scan-assembler-times "vldrh.16" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u32.c
index 01a1dd611ed..a13fe824382 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u32.c
@@ -10,12 +10,11 @@ foo (uint32_t const * base)
return vld1q_u32 (base);
}
-/* { dg-final { scan-assembler "vldrw.u32" } } */
-
uint32x4_t
foo1 (uint32_t const * base)
{
return vld1q (base);
}
-/* { dg-final { scan-assembler "vldrw.u32" } } */
+/* { dg-final { scan-assembler-times "vldrw.32" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u8.c
index 997bc1b212d..dfd1deb93f0 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_u8.c
@@ -10,12 +10,11 @@ foo (uint8_t const * base)
return vld1q_u8 (base);
}
-/* { dg-final { scan-assembler "vldrb.u8" } } */
-
uint8x16_t
foo1 (uint8_t const * base)
{
return vld1q (base);
}
-/* { dg-final { scan-assembler "vldrb.u8" } } */
+/* { dg-final { scan-assembler-times "vldrb.8" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_f16.c
index ea5593a9dd1..3c32e408e42 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_f16.c
@@ -10,12 +10,12 @@ foo (float16_t const * base, mve_pred16_t p)
return vld1q_z_f16 (base, p);
}
-/* { dg-final { scan-assembler "vldrht.f16" } } */
-
float16x8_t
foo1 (float16_t const * base, mve_pred16_t p)
{
return vld1q_z (base, p);
}
-/* { dg-final { scan-assembler "vldrht.f16" } } */
+/* { dg-final { scan-assembler-times "vpst" 2 } } */
+/* { dg-final { scan-assembler-times "vldrht.16" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_f32.c
index 28937cd18aa..3fc935c889b 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_f32.c
@@ -10,12 +10,12 @@ foo (float32_t const * base, mve_pred16_t p)
return vld1q_z_f32 (base, p);
}
-/* { dg-final { scan-assembler "vldrwt.f32" } } */
-
float32x4_t
foo1 (float32_t const * base, mve_pred16_t p)
{
return vld1q_z (base, p);
}
-/* { dg-final { scan-assembler "vldrwt.f32" } } */
+/* { dg-final { scan-assembler-times "vpst" 2 } } */
+/* { dg-final { scan-assembler-times "vldrwt.32" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s16.c
index 81a1c439d6e..49cc81092f3 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s16.c
@@ -10,12 +10,12 @@ foo (int16_t const * base, mve_pred16_t p)
return vld1q_z_s16 (base, p);
}
-/* { dg-final { scan-assembler "vldrht.s16" } } */
-
int16x8_t
foo1 (int16_t const * base, mve_pred16_t p)
{
return vld1q_z (base, p);
}
-/* { dg-final { scan-assembler "vldrht.s16" } } */
+/* { dg-final { scan-assembler-times "vpst" 2 } } */
+/* { dg-final { scan-assembler-times "vldrht.16" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s32.c
index d03ab345f19..ec317cd70e8 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s32.c
@@ -10,12 +10,12 @@ foo (int32_t const * base, mve_pred16_t p)
return vld1q_z_s32 (base, p);
}
-/* { dg-final { scan-assembler "vldrwt.s32" } } */
-
int32x4_t
foo1 (int32_t const * base, mve_pred16_t p)
{
return vld1q_z (base, p);
}
-/* { dg-final { scan-assembler "vldrwt.s32" } } */
+/* { dg-final { scan-assembler-times "vpst" 2 } } */
+/* { dg-final { scan-assembler-times "vldrwt.32" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s8.c
index e535662c7d0..538c140e78e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_s8.c
@@ -10,12 +10,12 @@ foo (int8_t const * base, mve_pred16_t p)
return vld1q_z_s8 (base, p);
}
-/* { dg-final { scan-assembler "vldrbt.s8" } } */
-
int8x16_t
foo1 (int8_t const * base, mve_pred16_t p)
{
return vld1q_z (base, p);
}
-/* { dg-final { scan-assembler "vldrbt.s8" } } */
+/* { dg-final { scan-assembler-times "vpst" 2 } } */
+/* { dg-final { scan-assembler-times "vldrbt.8" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u16.c
index 3f20f4ed9ca..e5e588a187e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u16.c
@@ -10,12 +10,12 @@ foo (uint16_t const * base, mve_pred16_t p)
return vld1q_z_u16 (base, p);
}
-/* { dg-final { scan-assembler "vldrht.u16" } } */
-
uint16x8_t
foo1 (uint16_t const * base, mve_pred16_t p)
{
return vld1q_z (base, p);
}
-/* { dg-final { scan-assembler "vldrht.u16" } } */
+/* { dg-final { scan-assembler-times "vpst" 2 } } */
+/* { dg-final { scan-assembler-times "vldrht.16" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u32.c
index 1d3b53e38e8..999beefa7e8 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u32.c
@@ -10,12 +10,12 @@ foo (uint32_t const * base, mve_pred16_t p)
return vld1q_z_u32 (base, p);
}
-/* { dg-final { scan-assembler "vldrwt.u32" } } */
-
uint32x4_t
foo1 (uint32_t const * base, mve_pred16_t p)
{
return vld1q_z (base, p);
}
-/* { dg-final { scan-assembler "vldrwt.u32" } } */
+/* { dg-final { scan-assembler-times "vpst" 2 } } */
+/* { dg-final { scan-assembler-times "vldrwt.32" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u8.c
index 47d3f6fa4c7..172053c7142 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld1q_z_u8.c
@@ -10,12 +10,12 @@ foo (uint8_t const * base, mve_pred16_t p)
return vld1q_z_u8 (base, p);
}
-/* { dg-final { scan-assembler "vldrbt.u8" } } */
-
uint8x16_t
foo1 (uint8_t const * base, mve_pred16_t p)
{
return vld1q_z (base, p);
}
-/* { dg-final { scan-assembler "vldrbt.u8" } } */
+/* { dg-final { scan-assembler-times "vpst" 2 } } */
+/* { dg-final { scan-assembler-times "vldrbt.8" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_s8.c
index 886491f0052..ec2f2176ccf 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_s8.c
@@ -10,4 +10,5 @@ foo (int8_t const * base)
return vldrbq_s8 (base);
}
-/* { dg-final { scan-assembler "vldrb.s8" } } */
+/* { dg-final { scan-assembler-times "vldrb.8" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_u8.c
index e58120a2b64..d07b472a4ff 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_u8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_u8.c
@@ -10,4 +10,5 @@ foo (uint8_t const * base)
return vldrbq_u8 (base);
}
-/* { dg-final { scan-assembler "vldrb.u8" } } */
+/* { dg-final { scan-assembler-times "vldrb.8" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_z_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_z_s8.c
index 7d66c704516..aed3c910063 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_z_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_z_s8.c
@@ -10,4 +10,6 @@ foo (int8_t const * base, mve_pred16_t p)
return vldrbq_z_s8 (base, p);
}
-/* { dg-final { scan-assembler "vldrbt.s8" } } */
+/* { dg-final { scan-assembler-times "vpst" 1 } } */
+/* { dg-final { scan-assembler-times "vldrbt.8" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_z_u8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_z_u8.c
index 05ae2628d56..54c61e74454 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_z_u8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrbq_z_u8.c
@@ -10,4 +10,6 @@ foo (uint8_t const * base, mve_pred16_t p)
return vldrbq_z_u8 (base, p);
}
-/* { dg-final { scan-assembler "vldrbt.u8" } } */
+/* { dg-final { scan-assembler-times "vpst" 1 } } */
+/* { dg-final { scan-assembler-times "vldrbt.8" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
index 0d1ee769ec6..7420d0198e7 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_s64.c
@@ -10,6 +10,7 @@ foo (uint64x2_t * addr)
return vldrdq_gather_base_wb_s64 (addr, 8);
}
-/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
/* { dg-final { scan-assembler "vldrd.64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-times "vldr.64" 1 } } */
+/* { dg-final { scan-assembler-times "vstr.64" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c
index cb2a41bdcd3..ebe5b2fd70c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_u64.c
@@ -10,6 +10,7 @@ foo (uint64x2_t * addr)
return vldrdq_gather_base_wb_u64 (addr, 8);
}
-/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
/* { dg-final { scan-assembler "vldrd.64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-times "vldr.64" 1 } } */
+/* { dg-final { scan-assembler-times "vstr.64" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c
index 243fbeacc34..231a24a1e55 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_s64.c
@@ -8,8 +8,8 @@ int64x2_t foo (uint64x2_t * addr, mve_pred16_t p)
return vldrdq_gather_base_wb_z_s64 (addr, 1016, p);
}
-/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
-/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*$" } } */
/* { dg-final { scan-assembler "vpst" } } */
/* { dg-final { scan-assembler "vldrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-times "vldr.64" 1 } } */
+/* { dg-final { scan-assembler-times "vstr.64" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c
index 10ba42405fe..b8d9b5c1391 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrdq_gather_base_wb_z_u64.c
@@ -8,8 +8,8 @@ uint64x2_t foo (uint64x2_t * addr, mve_pred16_t p)
return vldrdq_gather_base_wb_z_u64 (addr, 8, p);
}
-/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
-/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*" } } */
/* { dg-final { scan-assembler "vpst" } } */
/* { dg-final { scan-assembler "vldrdt.u64\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-times "vldr.64" 1 } } */
+/* { dg-final { scan-assembler-times "vstr.64" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_f16.c
index b79c0e9bfe4..05bef418d82 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_f16.c
@@ -10,4 +10,5 @@ foo (float16_t const * base)
return vldrhq_f16 (base);
}
-/* { dg-final { scan-assembler "vldrh.f16" } } */
+/* { dg-final { scan-assembler-times "vldrh.16" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_s16.c
index 4872eb555f3..7c977b6a699 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_s16.c
@@ -10,4 +10,5 @@ foo (int16_t const * base)
return vldrhq_s16 (base);
}
-/* { dg-final { scan-assembler "vldrh.s16" } } */
+/* { dg-final { scan-assembler-times "vldrh.16" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_s32.c
index e73e208c26a..229b52163fa 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_s32.c
@@ -10,4 +10,5 @@ foo (int16_t const * base)
return vldrhq_s32 (base);
}
-/* { dg-final { scan-assembler "vldrh.s32" } } */
+/* { dg-final { scan-assembler-times "vldrh.s32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_u16.c
index 6b285d45aaa..07f6d9e3944 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_u16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_u16.c
@@ -10,4 +10,5 @@ foo (uint16_t const * base)
return vldrhq_u16 (base);
}
-/* { dg-final { scan-assembler "vldrh.u16" } } */
+/* { dg-final { scan-assembler-times "vldrh.16" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_u32.c
index 994cd4a20ba..cd24f01831f 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_u32.c
@@ -10,4 +10,5 @@ foo (uint16_t const * base)
return vldrhq_u32 (base);
}
-/* { dg-final { scan-assembler "vldrh.u32" } } */
+/* { dg-final { scan-assembler-times "vldrh.u32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_f16.c
index 2b866a99dd4..dd0fc9c7b73 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_f16.c
@@ -10,4 +10,6 @@ foo (float16_t const * base, mve_pred16_t p)
return vldrhq_z_f16 (base, p);
}
-/* { dg-final { scan-assembler "vldrht.f16" } } */
+/* { dg-final { scan-assembler-times "vpst" 1 } } */
+/* { dg-final { scan-assembler-times "vldrht.16" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_s16.c
index 6c92c50ba12..36d3458d95c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_s16.c
@@ -10,4 +10,6 @@ foo (int16_t const * base, mve_pred16_t p)
return vldrhq_z_s16 (base, p);
}
-/* { dg-final { scan-assembler "vldrht.s16" } } */
+/* { dg-final { scan-assembler-times "vpst" 1 } } */
+/* { dg-final { scan-assembler-times "vldrht.16" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_s32.c
index 4cd97ba5743..9c67b479be7 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_s32.c
@@ -10,4 +10,6 @@ foo (int16_t const * base, mve_pred16_t p)
return vldrhq_z_s32 (base, p);
}
-/* { dg-final { scan-assembler "vldrht.s32" } } */
+/* { dg-final { scan-assembler-times "vpst" 1 } } */
+/* { dg-final { scan-assembler-times "vldrht.s32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_u16.c
index 80ae0e5cd17..26354b5971a 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_u16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_u16.c
@@ -10,4 +10,6 @@ foo (uint16_t const * base, mve_pred16_t p)
return vldrhq_z_u16 (base, p);
}
-/* { dg-final { scan-assembler "vldrht.u16" } } */
+/* { dg-final { scan-assembler-times "vpst" 1 } } */
+/* { dg-final { scan-assembler-times "vldrht.16" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_u32.c
index 1a8590116eb..948fe5ee5b4 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrhq_z_u32.c
@@ -10,4 +10,6 @@ foo (uint16_t const * base, mve_pred16_t p)
return vldrhq_z_u32 (base, p);
}
-/* { dg-final { scan-assembler "vldrht.u32" } } */
+/* { dg-final { scan-assembler-times "vpst" 1 } } */
+/* { dg-final { scan-assembler-times "vldrht.u32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_f32.c
index 2c834ae53df..143079aa23f 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_f32.c
@@ -10,4 +10,5 @@ foo (float32_t const * base)
return vldrwq_f32 (base);
}
-/* { dg-final { scan-assembler "vldrw.f32" } } */
+/* { dg-final { scan-assembler-times "vldrw.32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c
index db8108e3732..8e2994f75d7 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_f32.c
@@ -10,6 +10,7 @@ foo (uint32x4_t * addr)
return vldrwq_gather_base_wb_f32 (addr, 8);
}
-/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vldrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
/* { dg-final { scan-assembler "vldrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vstrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c
index 3da64e218e2..e5054738b75 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_s32.c
@@ -10,6 +10,7 @@ foo (uint32x4_t * addr)
return vldrwq_gather_base_wb_s32 (addr, 8);
}
-/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vldrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
/* { dg-final { scan-assembler "vldrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vstrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c
index 2597ee11608..7f39414143b 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_u32.c
@@ -10,6 +10,7 @@ foo (uint32x4_t * addr)
return vldrwq_gather_base_wb_u32 (addr, 8);
}
-/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vldrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
/* { dg-final { scan-assembler "vldrw.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vstrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c
index 9fb47daf486..f3219e2e825 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_f32.c
@@ -10,8 +10,9 @@ foo (uint32x4_t * addr, mve_pred16_t p)
return vldrwq_gather_base_wb_z_f32 (addr, 8, p);
}
-/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vldrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*" } } */
/* { dg-final { scan-assembler "vpst" } } */
/* { dg-final { scan-assembler "vldrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vstrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c
index 56da5a46c64..4d093d243fe 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_s32.c
@@ -10,8 +10,9 @@ foo (uint32x4_t * addr, mve_pred16_t p)
return vldrwq_gather_base_wb_z_s32 (addr, 8, p);
}
-/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vldrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*" } } */
/* { dg-final { scan-assembler "vpst" } } */
/* { dg-final { scan-assembler "vldrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vstrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c
index 63165d97c1a..e796522a49c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_gather_base_wb_z_u32.c
@@ -10,8 +10,9 @@ foo (uint32x4_t * addr, mve_pred16_t p)
return vldrwq_gather_base_wb_z_u32 (addr, 8, p);
}
-/* { dg-final { scan-assembler "vldrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vldrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
/* { dg-final { scan-assembler "vmsr\t P0, r\[0-9\]+.*" } } */
/* { dg-final { scan-assembler "vpst" } } */
/* { dg-final { scan-assembler "vldrwt.u32\tq\[0-9\]+, \\\[q\[0-9\]+, #\[0-9\]+\\\]!" } } */
-/* { dg-final { scan-assembler "vstrb.8 q\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler "vstrw.32\tq\[0-9\]+, \\\[r\[0-9\]+\\\]" } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_s32.c
index f48c29f8bff..860dd324d25 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_s32.c
@@ -10,4 +10,5 @@ foo (int32_t const * base)
return vldrwq_s32 (base);
}
-/* { dg-final { scan-assembler "vldrw.s32" } } */
+/* { dg-final { scan-assembler-times "vldrw.32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_u32.c
index 7c722200ecc..513ed49fb6e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_u32.c
@@ -10,4 +10,5 @@ foo (uint32_t const * base)
return vldrwq_u32 (base);
}
-/* { dg-final { scan-assembler "vldrw.u32" } } */
+/* { dg-final { scan-assembler-times "vldrw.32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_f32.c
index bcdcecab468..3e0a6a60bcf 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_f32.c
@@ -10,4 +10,6 @@ foo (float32_t const * base, mve_pred16_t p)
return vldrwq_z_f32 (base, p);
}
-/* { dg-final { scan-assembler "vldrwt.f32" } } */
+/* { dg-final { scan-assembler-times "vpst" 1 } } */
+/* { dg-final { scan-assembler-times "vldrwt.32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_s32.c
index fd32b305656..82b914885b5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_s32.c
@@ -10,4 +10,6 @@ foo (int32_t const * base, mve_pred16_t p)
return vldrwq_z_s32 (base, p);
}
-/* { dg-final { scan-assembler "vldrwt.s32" } } */
+/* { dg-final { scan-assembler-times "vpst" 1 } } */
+/* { dg-final { scan-assembler-times "vldrwt.32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_u32.c
index f4944043834..6a66e167881 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vldrwq_z_u32.c
@@ -10,4 +10,6 @@ foo (uint32_t const * base, mve_pred16_t p)
return vldrwq_z_u32 (base, p);
}
-/* { dg-final { scan-assembler "vldrwt.u32" } } */
+/* { dg-final { scan-assembler-times "vpst" 1 } } */
+/* { dg-final { scan-assembler-times "vldrwt.32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float.c
index 52bad05b621..739f282c476 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float.c
@@ -1,6 +1,6 @@
/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
/* { dg-add-options arm_v8_1m_mve_fp } */
-/* { dg-additional-options "-O0" } */
+/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
@@ -14,4 +14,6 @@ foo ()
fb = vuninitializedq_f32 ();
}
-/* { dg-final { scan-assembler-times "vstrb.8" 4 } } */
+/* { dg-final { scan-assembler-times "vstrh.16" 1 } } */
+/* { dg-final { scan-assembler-times "vstrw.32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float1.c
index c6724a52074..a9130607f26 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_float1.c
@@ -1,6 +1,6 @@
/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
/* { dg-add-options arm_v8_1m_mve_fp } */
-/* { dg-additional-options "-O0" } */
+/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
@@ -14,4 +14,6 @@ foo ()
fb = vuninitializedq (fbb);
}
-/* { dg-final { scan-assembler-times "vstrb.8" 6 } } */
+/* { dg-final { scan-assembler-times "vstrh.16" 1 } } */
+/* { dg-final { scan-assembler-times "vstrw.32" 1 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c
index 13a0109a9b5..bf6692fe573 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int.c
@@ -1,6 +1,6 @@
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-add-options arm_v8_1m_mve } */
-/* { dg-additional-options "-O0" } */
+/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
int8x16_t a;
@@ -25,4 +25,8 @@ foo ()
ud = vuninitializedq_u64 ();
}
-/* { dg-final { scan-assembler-times "vstrb.8" 16 } } */
+/* { dg-final { scan-assembler-times "vstrb.8" 2 } } */
+/* { dg-final { scan-assembler-times "vstrh.16" 2 } } */
+/* { dg-final { scan-assembler-times "vstrw.32" 2 } } */
+/* { dg-final { scan-assembler-times "vstr.64" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c
index a321398709e..4f66a07ac29 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vuninitializedq_int1.c
@@ -1,6 +1,6 @@
/* { dg-require-effective-target arm_v8_1m_mve_ok } */
/* { dg-add-options arm_v8_1m_mve } */
-/* { dg-additional-options "-O0" } */
+/* { dg-additional-options "-O2" } */
#include "arm_mve.h"
@@ -26,4 +26,8 @@ foo ()
ud = vuninitializedq (udd);
}
-/* { dg-final { scan-assembler-times "vstrb.8" 24 } } */
+/* { dg-final { scan-assembler-times "vstrb.8" 2 } } */
+/* { dg-final { scan-assembler-times "vstrh.16" 2 } } */
+/* { dg-final { scan-assembler-times "vstrw.32" 2 } } */
+/* { dg-final { scan-assembler-times "vstr.64" 2 } } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
^ permalink raw reply [flat|nested] only message in thread
only message in thread, other threads:[~2020-08-22 21:25 UTC | newest]
Thread overview: (only message) (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-22 21:25 [gcc/devel/autopar_devel] [ARM]: Fix the wrong code-gen generated by MVE vector load/store intrinsics (PR94959) Giuliano Belinassi
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).