* [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
@ 2020-03-10 18:19 Srinath Parvathaneni
From: Srinath Parvathaneni @ 2020-03-10 18:19 UTC
To: gcc-patches
Hello Kyrill,
This patch addresses all the comments in patch version v2.
(version v2) https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540416.html
####
Hello,
This patch is part of the MVE ACLE intrinsics framework.
It adds support for updating (reading/writing) the APSR (Application Program Status Register)
and the FPSCR (Floating-point Status and Control Register) for MVE.
This patch also enables the thumb2 mov RTL patterns for MVE.
A new feature bit, vfp_base, is added. This bit is enabled for all VFP, MVE and MVE with floating point
extensions, and it is used to enable the macro TARGET_VFP_BASE. Until now, all the VFP instructions, RTL
patterns, and status and control registers were guarded by TARGET_HARD_FLOAT. This patch changes that: the
instructions, RTL patterns, and status and control registers common to MVE and VFP are now guarded by the
TARGET_VFP_BASE macro.
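To make the guard easier to reason about during review, here is a minimal standalone sketch of the predicate. It only mirrors the shape of the TARGET_VFP_BASE definition in arm.h below; the struct, the enum and the function name are hypothetical stand-ins for GCC's internal state (arm_float_abi, the isa bitmap and TARGET_GENERAL_REGS_ONLY), not code from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for arm_float_abi.  */
enum float_abi { FLOAT_ABI_SOFT, FLOAT_ABI_SOFTFP, FLOAT_ABI_HARD };

/* Hypothetical model of the target state the macro inspects.  */
struct target_model
{
  enum float_abi float_abi;  /* models arm_float_abi */
  bool isa_vfp_base;         /* models isa_bit_vfp_base in the active ISA */
  bool general_regs_only;    /* models TARGET_GENERAL_REGS_ONLY */
};

/* Same three-way conjunction as TARGET_VFP_BASE: a non-soft float ABI,
   the vfp_base feature bit set, and general-regs-only not requested.  */
bool
target_vfp_base (const struct target_model *t)
{
  assert (t != NULL);
  return t->float_abi != FLOAT_ABI_SOFT
         && t->isa_vfp_base
         && !t->general_regs_only;
}
```

Under this model, an integer-only MVE target with -mfloat-abi=hard or softfp satisfies the guard even though it has no VFP arithmetic, which is the case TARGET_HARD_FLOAT alone could not express.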
The RTL patterns set_fpscr and get_fpscr are updated to use VFPCC_REGNUM because a few MVE intrinsics
set/get the carry bit of the FPSCR register.
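For readers less familiar with the register layout, the sketch below shows the kind of read-modify-write on the carry flag that such intrinsics imply. It works on a plain 32-bit value rather than the real register, the helper names are hypothetical, and the bit position assumes the architectural FPSCR layout in which the N, Z, C and V flags occupy bits 31 to 28 (so C is bit 29).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed position of the carry flag (C) in FPSCR: bit 29.  */
#define FPSCR_C_SHIFT 29u
#define FPSCR_C_MASK  (UINT32_C (1) << FPSCR_C_SHIFT)

/* Extract the carry flag from an FPSCR image.  */
unsigned
fpscr_get_carry (uint32_t fpscr)
{
  return (unsigned) ((fpscr >> FPSCR_C_SHIFT) & 1u);
}

/* Return a copy of the FPSCR image with the carry flag set to CARRY,
   leaving every other bit untouched.  */
uint32_t
fpscr_set_carry (uint32_t fpscr, unsigned carry)
{
  assert (carry <= 1u);
  return (fpscr & (uint32_t) ~FPSCR_C_MASK)
         | ((uint32_t) carry << FPSCR_C_SHIFT);
}
```

Because the write depends on the value previously read, the compiler has to see both accesses as touching the same register, which is presumably why the patterns now mention VFPCC_REGNUM explicitly.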
Please refer to the Arm reference manual [1] for more details.
[1] https://developer.arm.com/docs/ddi0553/latest
Regression tested on targets arm-none-eabi and armeb-none-eabi; no regressions were found.
Ok for trunk?
Thanks,
Srinath
gcc/ChangeLog:
2020-03-06 Andre Vieira <andre.simoesdiasvieira@arm.com>
Mihail Ionescu <mihail.ionescu@arm.com>
Srinath Parvathaneni <srinath.parvathaneni@arm.com>
* common/config/arm/arm-common.c (arm_asm_auto_mfpu): When vfp_base
feature bit is on and -mfpu=auto is passed as compiler option, do not
generate an error on not finding any matching fpu, because in this case
an fpu is not required.
* config/arm/arm-cpus.in (vfp_base): Define feature bit, this bit is
enabled for MVE and also for all VFP extensions.
(VFPv2): Modify fgroup to enable vfp_base feature bit whenever VFPv2
is enabled.
(MVE): Define fgroup to enable feature bits mve, vfp_base and armv7em.
(MVE_FP): Define fgroup to enable the feature bits in fgroups MVE and
FPv5 along with feature bit mve_float.
(mve): Modify add options in armv8.1-m.main arch for MVE.
(mve.fp): Modify add options in armv8.1-m.main arch for MVE with
floating point.
* config/arm/arm.c (use_return_insn): Replace the
"TARGET_HARD_FLOAT || TARGET_HAVE_MVE" check with TARGET_VFP_BASE.
(thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with
TARGET_VFP_BASE.
(arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE"
with TARGET_VFP_BASE, to allow cost calculations for copies in MVE as
well.
(arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with
TARGET_VFP_BASE, to allow space calculation for VFP registers in MVE
as well.
(arm_compute_frame_layout): Likewise.
(arm_save_coproc_regs): Likewise.
(arm_fixed_condition_code_regs): Modify to enable using VFPCC_REGNUM
in MVE as well.
(arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT || TARGET_HAVE_MVE"
with equivalent macro TARGET_VFP_BASE.
(arm_expand_epilogue_apcs_frame): Likewise.
(arm_expand_epilogue): Likewise.
(arm_conditional_register_usage): Likewise.
(arm_declare_function_name): Add check to skip printing .fpu directive
in assembly file when TARGET_VFP_BASE is enabled and fpu_to_print is
"softvfp".
* config/arm/arm.h (TARGET_VFP_BASE): Define.
* config/arm/arm.md (arch): Add "mve" to arch.
(eq_attr "arch" "mve"): Enable when TARGET_HAVE_MVE is true.
(vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT
|| TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE.
* config/arm/constraints.md (Uf): Define to allow modification to FPCCR
in MVE.
* config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify target guard
to not allow for MVE.
* config/arm/unspecs.md (UNSPEC_GET_FPSCR): Define in the unspecs enum.
(VUNSPEC_GET_FPSCR): Remove from the volatile unspecs enum.
* config/arm/vfp.md (thumb2_movhi_vfp): Add support for VMSR and VMRS
instructions which move to general-purpose Register from Floating-point
Special register and vice-versa.
(thumb2_movhi_fp16): Likewise.
(thumb2_movsi_vfp): Add support for VMSR and VMRS instructions along
with MCR and MRC instructions which set and get Floating-point Status
and Control Register (FPSCR).
(movdi_vfp): Modify pattern to enable Single-precision scalar float move
in MVE.
(thumb2_movdf_vfp): Modify pattern to enable Double-precision scalar
float move patterns in MVE.
(thumb2_movsfcc_vfp): Modify pattern to enable single float conditional
code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check.
(thumb2_movdfcc_vfp): Modify pattern to enable double float conditional
code move patterns of VFP also in MVE by adding TARGET_VFP_BASE check.
(push_multi_vfp): Add support to use VFP VPUSH pattern for MVE by adding
TARGET_VFP_BASE check.
(set_fpscr): Add support to set FPSCR register for MVE. Modify pattern
to use VFPCC_REGNUM, as a few MVE intrinsics use the carry bit of the
FPSCR register.
(get_fpscr): Add support to get FPSCR register for MVE. Modify pattern
to use VFPCC_REGNUM, as a few MVE intrinsics use the carry bit of the
FPSCR register.
gcc/testsuite/ChangeLog:
2020-03-06 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
* gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: New test.
* gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_fpu1.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_fpu2.c: Likewise.
* gcc.target/arm/mve/intrinsics/mve_fpu3.c: Likewise.
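The five new tests exercise the .fpu directive logic changed in arm_declare_function_name. As a rough standalone model of that decision (the function name and parameters are hypothetical, and the comparison against the last printed string is simplified to a plain strcmp), the condition can be sketched as:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Decide whether a ".fpu <name>" directive should be emitted.
   FPU_TO_PRINT is the candidate name, LAST_PRINTED is the name emitted
   most recently ("" if none), and VFP_BASE models TARGET_VFP_BASE.  */
bool
should_print_fpu_directive (const char *fpu_to_print,
                            const char *last_printed,
                            bool vfp_base)
{
  assert (fpu_to_print != NULL && last_printed != NULL);

  /* "softvfp" is suppressed when the VFP base register file is in use,
     as for integer-only MVE with a hard or softfp float ABI.  */
  if (strcmp (fpu_to_print, "softvfp") == 0 && vfp_base)
    return false;

  /* Otherwise print only when the name differs from the last one.  */
  return strcmp (fpu_to_print, last_printed) != 0;
}
```

Read against the tests, mve_fpu1.c and mve_fpu2.c correspond to the suppressed "softvfp" case, mve_fpu3.c (-mfloat-abi=soft, so the guard is false) still expects ".fpu softvfp", and the mve_fp_fpu tests expect a real FPU name to be printed.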
############### Attachment also inlined for ease of reply ###############
diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
index 30a2a1deb864ee22d48cebb08247176640524955..83cc68009ac16a89ab5515f19d4eb84f595e33f1 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -1009,7 +1009,8 @@ arm_asm_auto_mfpu (int argc, const char **argv)
}
}
- gcc_assert (i != TARGET_FPU_auto);
+ gcc_assert (i != TARGET_FPU_auto
+ || bitmap_bit_p (arm_active_target.isa, isa_bit_vfp_base));
}
auto_fpu = (char *) xmalloc (strlen (fpuname) + sizeof ("-mfpu="));
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 96f584da325172bd1460251e2de0ad679589d312..77b43090d69a599d8806cfcc02037e1bbed6e7a1 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -135,6 +135,10 @@ define feature armv8_1m_main
# Floating point and Neon extensions.
# VFPv1 is not supported in GCC.
+# This feature bit is enabled for all VFP, MVE and
+# MVE with floating point extensions.
+define feature vfp_base
+
# Vector floating point v2.
define feature vfpv2
@@ -234,7 +238,7 @@ define fgroup ALL_SIMD ALL_SIMD_INTERNAL ALL_SIMD_EXTERNAL
# List of all FPU bits to strip out if -mfpu is used to override the
# default. fp16 is deliberately missing from this list.
-define fgroup ALL_FPU_INTERNAL vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
+define fgroup ALL_FPU_INTERNAL vfp_base vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
# Similarly, but including fp16 and other extensions that aren't part of
# -mfpu support.
define fgroup ALL_FPU_EXTERNAL fp16 bf16
@@ -279,10 +283,12 @@ define fgroup ARMv8r ARMv8a
define fgroup ARMv8_1m_main ARMv8m_main armv8_1m_main
# Useful combinations.
-define fgroup VFPv2 vfpv2
+define fgroup VFPv2 vfp_base vfpv2
define fgroup VFPv3 VFPv2 vfpv3
define fgroup VFPv4 VFPv3 vfpv4 fp16conv
define fgroup FPv5 VFPv4 fpv5
+define fgroup MVE mve vfp_base armv7em
+define fgroup MVE_FP MVE FPv5 fp16 mve_float
define fgroup FP_DBL fp_dbl
define fgroup FP_D32 FP_DBL fp_d32
@@ -699,8 +705,8 @@ begin arch armv8.1-m.main
option fp add FPv5 fp16
option fp.dp add FPv5 FP_DBL fp16
option nofp remove ALL_FP
- option mve add mve armv7em
- option mve.fp add mve FPv5 fp16 mve_float armv7em
+ option mve add MVE
+ option mve.fp add MVE_FP
end arch armv8.1-m.main
begin arch iwmmxt
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index a0283ed62c8047fe1ccbbb9b639ad34771fe46c2..c7453412959f23bf25c2052b4e0bb6a95faf3163 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -334,6 +334,19 @@ emission of floating point pcs attributes. */
isa_bit_mve_float) \
&& !TARGET_GENERAL_REGS_ONLY)
+/* MVE have few common instructions as VFP, like VLDM alias VPOP, VLDR, VSTM
+ alia VPUSH, VSTR and VMOV, VMSR and VMRS. In the same manner it updates few
+ registers such as FPCAR, FPCCR, FPDSCR, FPSCR, MVFR0, MVFR1 and MVFR2. All
+ the VFP instructions, RTL patterns and register are guarded by
+ TARGET_HARD_FLOAT. But the common instructions, RTL pattern and registers
+ between MVE and VFP will be guarded by the following macro TARGET_VFP_BASE
+ hereafter. */
+
+#define TARGET_VFP_BASE (arm_float_abi != ARM_FLOAT_ABI_SOFT \
+ && bitmap_bit_p (arm_active_target.isa, \
+ isa_bit_vfp_base) \
+ && !TARGET_GENERAL_REGS_ONLY)
+
/* Nonzero if integer division instructions supported. */
#define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
|| (TARGET_THUMB && arm_arch_thumb_hwdiv))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c769104a93746cd7c02b46b82f1a8f8057b9ae62..b40904a40e0979af4285fdbd85bfae55abea25dd 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -4295,7 +4295,7 @@ use_return_insn (int iscond, rtx sibling)
/* Can't be done if any of the VFP regs are pushed,
since this also requires an insn. */
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++)
if (df_regs_ever_live_p (regno) && !call_used_or_fixed_reg_p (regno))
return 0;
@@ -6289,7 +6289,7 @@ use_vfp_abi (enum arm_pcs pcs_variant, bool is_double)
return false;
return (TARGET_32BIT && TARGET_HARD_FLOAT &&
- (TARGET_VFP_DOUBLE || !is_double));
+ (TARGET_VFP_DOUBLE || !is_double));
}
/* Return true if an argument whose type is TYPE, or mode is MODE, is
@@ -8512,7 +8512,7 @@ thumb2_legitimate_index_p (machine_mode mode, rtx index, int strict_p)
/* ??? Combine arm and thumb2 coprocessor addressing modes. */
/* Standard coprocessor addressing modes. */
- if (TARGET_HARD_FLOAT
+ if (TARGET_VFP_BASE
&& (mode == SFmode || mode == DFmode))
return (code == CONST_INT && INTVAL (index) < 1024
/* Thumb-2 allows only > -256 index range for it's core register
@@ -9905,7 +9905,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
/* Assume that most copies can be done with a single insn,
unless we don't have HW FP, in which case everything
larger than word mode will require two insns. */
- *cost = COSTS_N_INSNS (((!(TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ *cost = COSTS_N_INSNS (((!TARGET_VFP_BASE
&& GET_MODE_SIZE (mode) > 4)
|| mode == DImode)
? 2 : 1);
@@ -20821,7 +20821,7 @@ arm_get_vfp_saved_size (void)
saved = 0;
/* Space for saved VFP registers. */
- if (TARGET_HARD_FLOAT)
+ if (TARGET_VFP_BASE)
{
count = 0;
for (regno = FIRST_VFP_REGNUM;
@@ -22364,7 +22364,7 @@ arm_compute_frame_layout (void)
func_type = arm_current_func_type ();
/* Space for saved VFP registers. */
if (! IS_VOLATILE (func_type)
- && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
+ && TARGET_VFP_BASE)
saved += arm_get_vfp_saved_size ();
/* Allocate space for saving/restoring FPCXTNS in Armv8.1-M Mainline
@@ -22588,7 +22588,7 @@ arm_save_coproc_regs(void)
saved_size += 8;
}
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
{
start_reg = FIRST_VFP_REGNUM;
@@ -24546,7 +24546,7 @@ arm_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2)
return false;
*p1 = CC_REGNUM;
- *p2 = TARGET_HARD_FLOAT ? VFPCC_REGNUM : INVALID_REGNUM;
+ *p2 = TARGET_VFP_BASE ? VFPCC_REGNUM : INVALID_REGNUM;
return true;
}
@@ -24965,7 +24965,7 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
{
if (GET_MODE_CLASS (mode) == MODE_CC)
return (regno == CC_REGNUM
- || ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ || (TARGET_VFP_BASE
&& regno == VFPCC_REGNUM));
if (regno == CC_REGNUM && GET_MODE_CLASS (mode) != MODE_CC)
@@ -24982,7 +24982,7 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
start of an even numbered register pair. */
return (ARM_NUM_REGS (mode) < 2) || (regno < LAST_LO_REGNUM);
- if ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) && IS_VFP_REGNUM (regno))
+ if (TARGET_VFP_BASE && IS_VFP_REGNUM (regno))
{
if (mode == DFmode)
return VFP_REGNO_OK_FOR_DOUBLE (regno);
@@ -26933,7 +26933,7 @@ arm_expand_epilogue_apcs_frame (bool really_return)
floats_from_frame += 4;
}
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
{
int start_reg;
rtx ip_rtx = gen_rtx_REG (SImode, IP_REGNUM);
@@ -27179,7 +27179,7 @@ arm_expand_epilogue (bool really_return)
}
}
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
{
/* Generate VFP register multi-pop. */
int end_reg = LAST_VFP_REGNUM + 1;
@@ -29699,7 +29699,7 @@ arm_conditional_register_usage (void)
if (TARGET_THUMB1)
fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1;
- if (TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
+ if (TARGET_32BIT && TARGET_VFP_BASE)
{
/* VFPv3 registers are disabled when earlier VFP
versions are selected due to the definition of
@@ -32478,7 +32478,8 @@ arm_declare_function_name (FILE *stream, const char *name, tree decl)
= TARGET_SOFT_FLOAT
? "softvfp" : arm_identify_fpu_from_isa (arm_active_target.isa);
- if (fpu_to_print != arm_last_printed_arch_string)
+ if (!(!strcmp (fpu_to_print.c_str (), "softvfp") && TARGET_VFP_BASE)
+ && (fpu_to_print != arm_last_printed_arch_string))
{
asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_to_print.c_str ());
arm_last_printed_fpu_string = fpu_to_print;
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 8f8c91d5fe146ed64cd4eb5450f04b3cf0c0ed18..5387f972f5a864a153873f21b9423d28446daefc 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -134,7 +134,7 @@
; arm_arch6. "v6t2" for Thumb-2 with arm_arch6 and "v8mb" for ARMv8-M
; Baseline. This attribute is used to compute attribute "enabled",
; use type "any" to enable an alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon,mve"
(const_string "any"))
(define_attr "arch_enabled" "no,yes"
@@ -188,6 +188,10 @@
(and (eq_attr "arch" "neon")
(match_test "TARGET_NEON"))
(const_string "yes")
+
+ (and (eq_attr "arch" "mve")
+ (match_test "TARGET_HAVE_MVE"))
+ (const_string "yes")
]
(const_string "no")))
@@ -11758,7 +11762,7 @@
(match_operand:SI 2 "const_int_I_operand" "I")))
(set (match_operand:DF 3 "vfp_hard_register_operand" "")
(mem:DF (match_dup 1)))])]
- "TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)"
+ "TARGET_32BIT && TARGET_VFP_BASE"
"*
{
int num_regs = XVECLEN (operands[0], 0);
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index a12de97cdaab589e0c8704b408ac4c329def416d..bf8f4ff1e5d2d6132d0afdd05255cc697c54159d 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -38,7 +38,7 @@
;; in all states: Pf, Pg
;; The following memory constraints have been used:
-;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up
+;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up, Uf
;; in ARM state: Uq
;; in Thumb state: Uu, Uw
;; in all states: Q
@@ -46,6 +46,9 @@
(define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS"
"MVE VPR register")
+(define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS"
+ "MVE FPCCR register")
+
(define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
"The VFP registers @code{s0}-@code{s31}.")
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index b0d3bd1cf1c484927e6ac6522bc30f0f089291c7..793f67068687a60abf94c230e5485a1eb2eca6a0 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -517,7 +517,7 @@
[(match_operand 4 "cc_register" "") (const_int 0)])
(match_operand:SF 1 "s_register_operand" "0,r")
(match_operand:SF 2 "s_register_operand" "r,0")))]
- "TARGET_THUMB2 && TARGET_SOFT_FLOAT"
+ "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE"
"@
it\\t%D3\;mov%D3\\t%0, %2
it\\t%d3\;mov%d3\\t%0, %1"
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index f0b1f465de4b63d624510783576700519044717d..e76609f79418af38b70746336dd43592a1dc8713 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -170,6 +170,7 @@
UNSPEC_TORC ; Used by the intrinsic form of the iWMMXt TORC instruction.
UNSPEC_TORVSC ; Used by the intrinsic form of the iWMMXt TORVSC instruction.
UNSPEC_TEXTRC ; Used by the intrinsic form of the iWMMXt TEXTRC instruction.
+ UNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
])
@@ -216,7 +217,6 @@
VUNSPEC_SLX ; Represent a store-register-release-exclusive.
VUNSPEC_LDA ; Represent a store-register-acquire.
VUNSPEC_STL ; Represent a store-register-release.
- VUNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
VUNSPEC_SET_FPSCR ; Represent assign of FPSCR content.
VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
VUNSPEC_CDP ; Represent the coprocessor cdp instruction.
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index ab16a6b0eac822b4e1a1ae4dcbe39491a82cc9fe..eb6ae7bea7927c666f36219797d54c0127001bc1 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -74,10 +74,10 @@
(define_insn "*thumb2_movhi_vfp"
[(set
(match_operand:HI 0 "nonimmediate_operand"
- "=rk, r, l, r, m, r, *t, r, *t")
+ "=rk, r, l, r, m, r, *t, r, *t, Up, r")
(match_operand:HI 1 "general_operand"
- "rk, I, Py, n, r, m, r, *t, *t"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& !TARGET_VFP_FP16INST
&& (register_operand (operands[0], HImode)
|| register_operand (operands[1], HImode))"
@@ -99,20 +99,24 @@
return "vmov%?\t%0, %1\t%@ int";
case 8:
return "vmov%?.f32\t%0, %1\t%@ int";
+ case 9:
+ return "vmsr%?\t P0, %1\t@ movhi";
+ case 10:
+ return "vmrs%?\t %0, P0\t@ movhi";
default:
gcc_unreachable ();
}
}
[(set_attr "predicable" "yes")
(set_attr "predicable_short_it"
- "yes, no, yes, no, no, no, no, no, no")
+ "yes, no, yes, no, no, no, no, no, no, no, no")
(set_attr "type"
"mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
- f_mcr, f_mrc, fmov")
- (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
- (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
- (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
- (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
+ f_mcr, f_mrc, fmov, mve_move, mve_move")
+ (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
+ (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
+ (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
+ (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
)
;; Patterns for HI moves which provide more data transfer instructions when FP16
@@ -170,10 +174,10 @@
(define_insn "*thumb2_movhi_fp16"
[(set
(match_operand:HI 0 "nonimmediate_operand"
- "=rk, r, l, r, m, r, *t, r, *t")
+ "=rk, r, l, r, m, r, *t, r, *t, Up, r")
(match_operand:HI 1 "general_operand"
- "rk, I, Py, n, r, m, r, *t, *t"))]
- "TARGET_THUMB2 && TARGET_VFP_FP16INST
+ "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
+ "TARGET_THUMB2 && (TARGET_VFP_FP16INST || TARGET_HAVE_MVE)
&& (register_operand (operands[0], HImode)
|| register_operand (operands[1], HImode))"
{
@@ -194,21 +198,25 @@
return "vmov.f16\t%0, %1\t%@ int";
case 8:
return "vmov%?.f32\t%0, %1\t%@ int";
+ case 9:
+ return "vmsr%?\tP0, %1\t%@ movhi";
+ case 10:
+ return "vmrs%?\t%0, P0\t%@ movhi";
default:
gcc_unreachable ();
}
}
[(set_attr "predicable"
- "yes, yes, yes, yes, yes, yes, no, no, yes")
+ "yes, yes, yes, yes, yes, yes, no, no, yes, yes, yes")
(set_attr "predicable_short_it"
- "yes, no, yes, no, no, no, no, no, no")
+ "yes, no, yes, no, no, no, no, no, no, no, no")
(set_attr "type"
"mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
- f_mcr, f_mrc, fmov")
- (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
- (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
- (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
- (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
+ f_mcr, f_mrc, fmov, mve_move, mve_move")
+ (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
+ (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
+ (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
+ (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
)
;; SImode moves
@@ -258,9 +266,11 @@
;; is chosen with length 2 when the instruction is predicated for
;; arm_restrict_it.
(define_insn "*thumb2_movsi_vfp"
- [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r,lk*r,m,*t, r,*t,*t, *Uv")
- (match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,lk*r, r,*t,*t,*UvTu,*t"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r,l,*hk,m,*m,*t,\
+ r,*t,*t,*Uv, Up, r,Uf,r")
+ (match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,*mi,l,*hk,r,*t,\
+ *t,*UvTu,*t, r, Up,r,Uf"))]
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& ( s_register_operand (operands[0], SImode)
|| s_register_operand (operands[1], SImode))"
"*
@@ -275,30 +285,44 @@
case 4:
return \"movw%?\\t%0, %1\";
case 5:
+ case 6:
/* Cannot load it directly, split to load it via MOV / MOVT. */
if (!MEM_P (operands[1]) && arm_disable_literal_pool)
return \"#\";
return \"ldr%?\\t%0, %1\";
- case 6:
- return \"str%?\\t%1, %0\";
case 7:
- return \"vmov%?\\t%0, %1\\t%@ int\";
case 8:
- return \"vmov%?\\t%0, %1\\t%@ int\";
+ return \"str%?\\t%1, %0\";
case 9:
+ return \"vmov%?\\t%0, %1\\t%@ int\";
+ case 10:
+ return \"vmov%?\\t%0, %1\\t%@ int\";
+ case 11:
return \"vmov%?.f32\\t%0, %1\\t%@ int\";
- case 10: case 11:
+ case 12: case 13:
return output_move_vfp (operands);
+ case 14:
+ return \"vmsr\\t P0, %1\";
+ case 15:
+ return \"vmrs\\t %0, P0\";
+ case 16:
+ return \"mcr\\tp10, 7, %1, cr1, cr0, 0\\t @SET_FPSCR\";
+ case 17:
+ return \"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR\";
default:
gcc_unreachable ();
}
"
[(set_attr "predicable" "yes")
- (set_attr "predicable_short_it" "yes,no,yes,no,no,no,no,no,no,no,no,no")
- (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores")
- (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4")
- (set_attr "pool_range" "*,*,*,*,*,1018,*,*,*,*,1018,*")
- (set_attr "neg_pool_range" "*,*,*,*,*, 0,*,*,*,*,1008,*")]
+ (set_attr "predicable_short_it" "yes,no,yes,no,no,no,no,no,no,no,no,no,no,\
+ no,no,no,no,no")
+ (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,load_4,\
+ store_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores,mve_move,\
+ mve_move,mrs,mrs")
+ (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4")
+ (set_attr "pool_range" "*,*,*,*,*,1018,4094,*,*,*,*,*,1018,*,*,*,*,*")
+ (set_attr "arch" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,mve,mve,mve,mve")
+ (set_attr "neg_pool_range" "*,*,*,*,*, 0, 0,*,*,*,*,*,1008,*,*,*,*,*")]
)
@@ -306,12 +330,12 @@
(define_insn "*movdi_vfp"
[(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,r,r,m,w,!r,w,w, Uv")
- (match_operand:DI 1 "di_operand" "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
- "TARGET_32BIT && TARGET_HARD_FLOAT
+ (match_operand:DI 1 "di_operand" "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
+ "TARGET_32BIT && TARGET_VFP_BASE
&& ( register_operand (operands[0], DImode)
|| register_operand (operands[1], DImode))
- && !(TARGET_NEON && CONST_INT_P (operands[1])
- && simd_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
+ && !((TARGET_NEON || TARGET_HAVE_MVE) && CONST_INT_P (operands[1])
+ && simd_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
"*
switch (which_alternative)
{
@@ -333,7 +357,7 @@
case 8:
return \"vmov%?\\t%Q0, %R0, %P1\\t%@ int\";
case 9:
- if (TARGET_VFP_SINGLE)
+ if (TARGET_VFP_SINGLE || TARGET_HAVE_MVE)
return \"vmov%?.f32\\t%0, %1\\t%@ int\;vmov%?.f32\\t%p0, %p1\\t%@ int\";
else
return \"vmov%?.f64\\t%P0, %P1\\t%@ int\";
@@ -390,9 +414,15 @@
case 6: /* S register from immediate. */
return \"vmov.f16\\t%0, %1\t%@ __<fporbf>\";
case 7: /* S register from memory. */
- return \"vld1.16\\t{%z0}, %A1\";
+ if (TARGET_HAVE_MVE)
+ return \"vldr.16\\t%0, %A1\";
+ else
+ return \"vld1.16\\t{%z0}, %A1\";
case 8: /* Memory from S register. */
- return \"vst1.16\\t{%z1}, %A0\";
+ if (TARGET_HAVE_MVE)
+ return \"vstr.16\\t%1, %A0\";
+ else
+ return \"vst1.16\\t{%z1}, %A0\";
case 9: /* ARM register from constant. */
{
long bits;
@@ -593,7 +623,7 @@
(define_insn "*thumb2_movsf_vfp"
[(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t ,Uv,r ,m,t,r")
(match_operand:SF 1 "hard_sf_operand" " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& ( s_register_operand (operands[0], SFmode)
|| s_register_operand (operands[1], SFmode))"
"*
@@ -682,7 +712,7 @@
(define_insn "*thumb2_movdf_vfp"
[(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w ,Uv,r ,m,w,r")
(match_operand:DF 1 "hard_df_operand" " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& ( register_operand (operands[0], DFmode)
|| register_operand (operands[1], DFmode))"
"*
@@ -760,7 +790,7 @@
[(match_operand 4 "cc_register" "") (const_int 0)])
(match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t")
(match_operand:SF 2 "s_register_operand" "t,0,t,?r,0,?r,t,0,t")))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT && !arm_restrict_it"
+ "TARGET_THUMB2 && TARGET_VFP_BASE && !arm_restrict_it"
"@
it\\t%D3\;vmov%D3.f32\\t%0, %2
it\\t%d3\;vmov%d3.f32\\t%0, %1
@@ -806,7 +836,8 @@
[(match_operand 4 "cc_register" "") (const_int 0)])
(match_operand:DF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w")
(match_operand:DF 2 "s_register_operand" "w,0,w,?r,0,?r,w,0,w")))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE && !arm_restrict_it"
+ "TARGET_THUMB2 && TARGET_VFP_BASE && TARGET_VFP_DOUBLE
+ && !arm_restrict_it"
"@
it\\t%D3\;vmov%D3.f64\\t%P0, %P2
it\\t%d3\;vmov%d3.f64\\t%P0, %P1
@@ -1977,7 +2008,7 @@
[(set (match_operand:BLK 0 "memory_operand" "=m")
(unspec:BLK [(match_operand:DF 1 "vfp_register_operand" "")]
UNSPEC_PUSH_MULT))])]
- "TARGET_32BIT && TARGET_HARD_FLOAT"
+ "TARGET_32BIT && TARGET_VFP_BASE"
"* return vfp_output_vstmd (operands);"
[(set_attr "type" "f_stored")]
)
@@ -2065,16 +2096,18 @@
;; Write Floating-point Status and Control Register.
(define_insn "set_fpscr"
- [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR)]
- "TARGET_HARD_FLOAT"
+ [(set (reg:SI VFPCC_REGNUM)
+ (unspec_volatile:SI
+ [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR))]
+ "TARGET_VFP_BASE"
"mcr\\tp10, 7, %0, cr1, cr0, 0\\t @SET_FPSCR"
[(set_attr "type" "mrs")])
;; Read Floating-point Status and Control Register.
(define_insn "get_fpscr"
[(set (match_operand:SI 0 "register_operand" "=r")
- (unspec_volatile:SI [(const_int 0)] VUNSPEC_GET_FPSCR))]
- "TARGET_HARD_FLOAT"
+ (unspec:SI [(reg:SI VFPCC_REGNUM)] UNSPEC_GET_FPSCR))]
+ "TARGET_VFP_BASE"
"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR"
[(set_attr "type" "mrs")])
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
new file mode 100644
index 0000000000000000000000000000000000000000..17ba616c041378b88463cb7ef150b70b2e7b95ad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=hard -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+ int8x16_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
new file mode 100644
index 0000000000000000000000000000000000000000..7b877c4a90c506343d6b4edb750ba06ce3d7a68d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=softfp -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+ int8x16_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
new file mode 100644
index 0000000000000000000000000000000000000000..85fbb5767edc3c25ceb4d6da780d47afa1ee416c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+ int8x16_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
new file mode 100644
index 0000000000000000000000000000000000000000..23b3683ae559b3f7bf6c3ad11c4070ad2ddb9387
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=softfp -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+ int8x16_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
new file mode 100644
index 0000000000000000000000000000000000000000..8f7fa348d130e8456d5300ac25821fd96f9d5a97
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=soft -mthumb" } */
+
+int
+foo1 (int value)
+{
+ int b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu softvfp" } } */
[-- Attachment #2: 12776.patch --]
[-- Type: text/plain, Size: 26416 bytes --]
diff --git a/gcc/common/config/arm/arm-common.c b/gcc/common/config/arm/arm-common.c
index 30a2a1deb864ee22d48cebb08247176640524955..83cc68009ac16a89ab5515f19d4eb84f595e33f1 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -1009,7 +1009,8 @@ arm_asm_auto_mfpu (int argc, const char **argv)
}
}
- gcc_assert (i != TARGET_FPU_auto);
+ gcc_assert (i != TARGET_FPU_auto
+ || bitmap_bit_p (arm_active_target.isa, isa_bit_vfp_base));
}
auto_fpu = (char *) xmalloc (strlen (fpuname) + sizeof ("-mfpu="));
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 96f584da325172bd1460251e2de0ad679589d312..77b43090d69a599d8806cfcc02037e1bbed6e7a1 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -135,6 +135,10 @@ define feature armv8_1m_main
# Floating point and Neon extensions.
# VFPv1 is not supported in GCC.
+# This feature bit is enabled for all VFP, MVE and
+# MVE with floating point extensions.
+define feature vfp_base
+
# Vector floating point v2.
define feature vfpv2
@@ -234,7 +238,7 @@ define fgroup ALL_SIMD ALL_SIMD_INTERNAL ALL_SIMD_EXTERNAL
# List of all FPU bits to strip out if -mfpu is used to override the
# default. fp16 is deliberately missing from this list.
-define fgroup ALL_FPU_INTERNAL vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
+define fgroup ALL_FPU_INTERNAL vfp_base vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl ALL_SIMD_INTERNAL
# Similarly, but including fp16 and other extensions that aren't part of
# -mfpu support.
define fgroup ALL_FPU_EXTERNAL fp16 bf16
@@ -279,10 +283,12 @@ define fgroup ARMv8r ARMv8a
define fgroup ARMv8_1m_main ARMv8m_main armv8_1m_main
# Useful combinations.
-define fgroup VFPv2 vfpv2
+define fgroup VFPv2 vfp_base vfpv2
define fgroup VFPv3 VFPv2 vfpv3
define fgroup VFPv4 VFPv3 vfpv4 fp16conv
define fgroup FPv5 VFPv4 fpv5
+define fgroup MVE mve vfp_base armv7em
+define fgroup MVE_FP MVE FPv5 fp16 mve_float
define fgroup FP_DBL fp_dbl
define fgroup FP_D32 FP_DBL fp_d32
@@ -699,8 +705,8 @@ begin arch armv8.1-m.main
option fp add FPv5 fp16
option fp.dp add FPv5 FP_DBL fp16
option nofp remove ALL_FP
- option mve add mve armv7em
- option mve.fp add mve FPv5 fp16 mve_float armv7em
+ option mve add MVE
+ option mve.fp add MVE_FP
end arch armv8.1-m.main
begin arch iwmmxt
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index a0283ed62c8047fe1ccbbb9b639ad34771fe46c2..c7453412959f23bf25c2052b4e0bb6a95faf3163 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -334,6 +334,19 @@ emission of floating point pcs attributes. */
isa_bit_mve_float) \
&& !TARGET_GENERAL_REGS_ONLY)
+/* MVE shares a few instructions with VFP, such as VLDM (alias VPOP), VLDR,
+ VSTM (alias VPUSH), VSTR, VMOV, VMSR and VMRS. Likewise it updates a few
+ registers such as FPCAR, FPCCR, FPDSCR, FPSCR, MVFR0, MVFR1 and MVFR2. All
+ the VFP instructions, RTL patterns and registers are guarded by
+ TARGET_HARD_FLOAT. The instructions, RTL patterns and registers common to
+ MVE and VFP are guarded by the following macro, TARGET_VFP_BASE,
+ hereafter. */
+
+#define TARGET_VFP_BASE (arm_float_abi != ARM_FLOAT_ABI_SOFT \
+ && bitmap_bit_p (arm_active_target.isa, \
+ isa_bit_vfp_base) \
+ && !TARGET_GENERAL_REGS_ONLY)
+
/* Nonzero if integer division instructions supported. */
#define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
|| (TARGET_THUMB && arm_arch_thumb_hwdiv))
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index c769104a93746cd7c02b46b82f1a8f8057b9ae62..b40904a40e0979af4285fdbd85bfae55abea25dd 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -4295,7 +4295,7 @@ use_return_insn (int iscond, rtx sibling)
/* Can't be done if any of the VFP regs are pushed,
since this also requires an insn. */
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++)
if (df_regs_ever_live_p (regno) && !call_used_or_fixed_reg_p (regno))
return 0;
@@ -6289,7 +6289,7 @@ use_vfp_abi (enum arm_pcs pcs_variant, bool is_double)
return false;
return (TARGET_32BIT && TARGET_HARD_FLOAT &&
- (TARGET_VFP_DOUBLE || !is_double));
+ (TARGET_VFP_DOUBLE || !is_double));
}
/* Return true if an argument whose type is TYPE, or mode is MODE, is
@@ -8512,7 +8512,7 @@ thumb2_legitimate_index_p (machine_mode mode, rtx index, int strict_p)
/* ??? Combine arm and thumb2 coprocessor addressing modes. */
/* Standard coprocessor addressing modes. */
- if (TARGET_HARD_FLOAT
+ if (TARGET_VFP_BASE
&& (mode == SFmode || mode == DFmode))
return (code == CONST_INT && INTVAL (index) < 1024
/* Thumb-2 allows only > -256 index range for it's core register
@@ -9905,7 +9905,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code code, enum rtx_code outer_code,
/* Assume that most copies can be done with a single insn,
unless we don't have HW FP, in which case everything
larger than word mode will require two insns. */
- *cost = COSTS_N_INSNS (((!(TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ *cost = COSTS_N_INSNS (((!TARGET_VFP_BASE
&& GET_MODE_SIZE (mode) > 4)
|| mode == DImode)
? 2 : 1);
@@ -20821,7 +20821,7 @@ arm_get_vfp_saved_size (void)
saved = 0;
/* Space for saved VFP registers. */
- if (TARGET_HARD_FLOAT)
+ if (TARGET_VFP_BASE)
{
count = 0;
for (regno = FIRST_VFP_REGNUM;
@@ -22364,7 +22364,7 @@ arm_compute_frame_layout (void)
func_type = arm_current_func_type ();
/* Space for saved VFP registers. */
if (! IS_VOLATILE (func_type)
- && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
+ && TARGET_VFP_BASE)
saved += arm_get_vfp_saved_size ();
/* Allocate space for saving/restoring FPCXTNS in Armv8.1-M Mainline
@@ -22588,7 +22588,7 @@ arm_save_coproc_regs(void)
saved_size += 8;
}
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
{
start_reg = FIRST_VFP_REGNUM;
@@ -24546,7 +24546,7 @@ arm_fixed_condition_code_regs (unsigned int *p1, unsigned int *p2)
return false;
*p1 = CC_REGNUM;
- *p2 = TARGET_HARD_FLOAT ? VFPCC_REGNUM : INVALID_REGNUM;
+ *p2 = TARGET_VFP_BASE ? VFPCC_REGNUM : INVALID_REGNUM;
return true;
}
@@ -24965,7 +24965,7 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
{
if (GET_MODE_CLASS (mode) == MODE_CC)
return (regno == CC_REGNUM
- || ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ || (TARGET_VFP_BASE
&& regno == VFPCC_REGNUM));
if (regno == CC_REGNUM && GET_MODE_CLASS (mode) != MODE_CC)
@@ -24982,7 +24982,7 @@ arm_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
start of an even numbered register pair. */
return (ARM_NUM_REGS (mode) < 2) || (regno < LAST_LO_REGNUM);
- if ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) && IS_VFP_REGNUM (regno))
+ if (TARGET_VFP_BASE && IS_VFP_REGNUM (regno))
{
if (mode == DFmode)
return VFP_REGNO_OK_FOR_DOUBLE (regno);
@@ -26933,7 +26933,7 @@ arm_expand_epilogue_apcs_frame (bool really_return)
floats_from_frame += 4;
}
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
{
int start_reg;
rtx ip_rtx = gen_rtx_REG (SImode, IP_REGNUM);
@@ -27179,7 +27179,7 @@ arm_expand_epilogue (bool really_return)
}
}
- if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
+ if (TARGET_VFP_BASE)
{
/* Generate VFP register multi-pop. */
int end_reg = LAST_VFP_REGNUM + 1;
@@ -29699,7 +29699,7 @@ arm_conditional_register_usage (void)
if (TARGET_THUMB1)
fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1;
- if (TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
+ if (TARGET_32BIT && TARGET_VFP_BASE)
{
/* VFPv3 registers are disabled when earlier VFP
versions are selected due to the definition of
@@ -32478,7 +32478,8 @@ arm_declare_function_name (FILE *stream, const char *name, tree decl)
= TARGET_SOFT_FLOAT
? "softvfp" : arm_identify_fpu_from_isa (arm_active_target.isa);
- if (fpu_to_print != arm_last_printed_arch_string)
+ if (!(!strcmp (fpu_to_print.c_str (), "softvfp") && TARGET_VFP_BASE)
+ && (fpu_to_print != arm_last_printed_arch_string))
{
asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_to_print.c_str ());
arm_last_printed_fpu_string = fpu_to_print;
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 8f8c91d5fe146ed64cd4eb5450f04b3cf0c0ed18..5387f972f5a864a153873f21b9423d28446daefc 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -134,7 +134,7 @@
; arm_arch6. "v6t2" for Thumb-2 with arm_arch6 and "v8mb" for ARMv8-M
; Baseline. This attribute is used to compute attribute "enabled",
; use type "any" to enable an alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon,mve"
(const_string "any"))
(define_attr "arch_enabled" "no,yes"
@@ -188,6 +188,10 @@
(and (eq_attr "arch" "neon")
(match_test "TARGET_NEON"))
(const_string "yes")
+
+ (and (eq_attr "arch" "mve")
+ (match_test "TARGET_HAVE_MVE"))
+ (const_string "yes")
]
(const_string "no")))
@@ -11758,7 +11762,7 @@
(match_operand:SI 2 "const_int_I_operand" "I")))
(set (match_operand:DF 3 "vfp_hard_register_operand" "")
(mem:DF (match_dup 1)))])]
- "TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)"
+ "TARGET_32BIT && TARGET_VFP_BASE"
"*
{
int num_regs = XVECLEN (operands[0], 0);
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index a12de97cdaab589e0c8704b408ac4c329def416d..bf8f4ff1e5d2d6132d0afdd05255cc697c54159d 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -38,7 +38,7 @@
;; in all states: Pf, Pg
;; The following memory constraints have been used:
-;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up
+;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up, Uf
;; in ARM state: Uq
;; in Thumb state: Uu, Uw
;; in all states: Q
@@ -46,6 +46,9 @@
(define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS"
"MVE VPR register")
+(define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS"
+ "MVE FPCCR register")
+
(define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
"The VFP registers @code{s0}-@code{s31}.")
diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
index b0d3bd1cf1c484927e6ac6522bc30f0f089291c7..793f67068687a60abf94c230e5485a1eb2eca6a0 100644
--- a/gcc/config/arm/thumb2.md
+++ b/gcc/config/arm/thumb2.md
@@ -517,7 +517,7 @@
[(match_operand 4 "cc_register" "") (const_int 0)])
(match_operand:SF 1 "s_register_operand" "0,r")
(match_operand:SF 2 "s_register_operand" "r,0")))]
- "TARGET_THUMB2 && TARGET_SOFT_FLOAT"
+ "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE"
"@
it\\t%D3\;mov%D3\\t%0, %2
it\\t%d3\;mov%d3\\t%0, %1"
diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
index f0b1f465de4b63d624510783576700519044717d..e76609f79418af38b70746336dd43592a1dc8713 100644
--- a/gcc/config/arm/unspecs.md
+++ b/gcc/config/arm/unspecs.md
@@ -170,6 +170,7 @@
UNSPEC_TORC ; Used by the intrinsic form of the iWMMXt TORC instruction.
UNSPEC_TORVSC ; Used by the intrinsic form of the iWMMXt TORVSC instruction.
UNSPEC_TEXTRC ; Used by the intrinsic form of the iWMMXt TEXTRC instruction.
+ UNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
])
@@ -216,7 +217,6 @@
VUNSPEC_SLX ; Represent a store-register-release-exclusive.
VUNSPEC_LDA ; Represent a store-register-acquire.
VUNSPEC_STL ; Represent a store-register-release.
- VUNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
VUNSPEC_SET_FPSCR ; Represent assign of FPSCR content.
VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
VUNSPEC_CDP ; Represent the coprocessor cdp instruction.
diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
index ab16a6b0eac822b4e1a1ae4dcbe39491a82cc9fe..eb6ae7bea7927c666f36219797d54c0127001bc1 100644
--- a/gcc/config/arm/vfp.md
+++ b/gcc/config/arm/vfp.md
@@ -74,10 +74,10 @@
(define_insn "*thumb2_movhi_vfp"
[(set
(match_operand:HI 0 "nonimmediate_operand"
- "=rk, r, l, r, m, r, *t, r, *t")
+ "=rk, r, l, r, m, r, *t, r, *t, Up, r")
(match_operand:HI 1 "general_operand"
- "rk, I, Py, n, r, m, r, *t, *t"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& !TARGET_VFP_FP16INST
&& (register_operand (operands[0], HImode)
|| register_operand (operands[1], HImode))"
@@ -99,20 +99,24 @@
return "vmov%?\t%0, %1\t%@ int";
case 8:
return "vmov%?.f32\t%0, %1\t%@ int";
+ case 9:
+ return "vmsr%?\t P0, %1\t@ movhi";
+ case 10:
+ return "vmrs%?\t %0, P0\t@ movhi";
default:
gcc_unreachable ();
}
}
[(set_attr "predicable" "yes")
(set_attr "predicable_short_it"
- "yes, no, yes, no, no, no, no, no, no")
+ "yes, no, yes, no, no, no, no, no, no, no, no")
(set_attr "type"
"mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
- f_mcr, f_mrc, fmov")
- (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
- (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
- (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
- (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
+ f_mcr, f_mrc, fmov, mve_move, mve_move")
+ (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
+ (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
+ (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
+ (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
)
;; Patterns for HI moves which provide more data transfer instructions when FP16
@@ -170,10 +174,10 @@
(define_insn "*thumb2_movhi_fp16"
[(set
(match_operand:HI 0 "nonimmediate_operand"
- "=rk, r, l, r, m, r, *t, r, *t")
+ "=rk, r, l, r, m, r, *t, r, *t, Up, r")
(match_operand:HI 1 "general_operand"
- "rk, I, Py, n, r, m, r, *t, *t"))]
- "TARGET_THUMB2 && TARGET_VFP_FP16INST
+ "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
+ "TARGET_THUMB2 && (TARGET_VFP_FP16INST || TARGET_HAVE_MVE)
&& (register_operand (operands[0], HImode)
|| register_operand (operands[1], HImode))"
{
@@ -194,21 +198,25 @@
return "vmov.f16\t%0, %1\t%@ int";
case 8:
return "vmov%?.f32\t%0, %1\t%@ int";
+ case 9:
+ return "vmsr%?\tP0, %1\t%@ movhi";
+ case 10:
+ return "vmrs%?\t%0, P0\t%@ movhi";
default:
gcc_unreachable ();
}
}
[(set_attr "predicable"
- "yes, yes, yes, yes, yes, yes, no, no, yes")
+ "yes, yes, yes, yes, yes, yes, no, no, yes, yes, yes")
(set_attr "predicable_short_it"
- "yes, no, yes, no, no, no, no, no, no")
+ "yes, no, yes, no, no, no, no, no, no, no, no")
(set_attr "type"
"mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
- f_mcr, f_mrc, fmov")
- (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
- (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
- (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
- (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
+ f_mcr, f_mrc, fmov, mve_move, mve_move")
+ (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
+ (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
+ (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
+ (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
)
;; SImode moves
@@ -258,9 +266,11 @@
;; is chosen with length 2 when the instruction is predicated for
;; arm_restrict_it.
(define_insn "*thumb2_movsi_vfp"
- [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r,lk*r,m,*t, r,*t,*t, *Uv")
- (match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,lk*r, r,*t,*t,*UvTu,*t"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,l,r,r,l,*hk,m,*m,*t,\
+ r,*t,*t,*Uv, Up, r,Uf,r")
+ (match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,*mi,l,*hk,r,*t,\
+ *t,*UvTu,*t, r, Up,r,Uf"))]
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& ( s_register_operand (operands[0], SImode)
|| s_register_operand (operands[1], SImode))"
"*
@@ -275,30 +285,44 @@
case 4:
return \"movw%?\\t%0, %1\";
case 5:
+ case 6:
/* Cannot load it directly, split to load it via MOV / MOVT. */
if (!MEM_P (operands[1]) && arm_disable_literal_pool)
return \"#\";
return \"ldr%?\\t%0, %1\";
- case 6:
- return \"str%?\\t%1, %0\";
case 7:
- return \"vmov%?\\t%0, %1\\t%@ int\";
case 8:
- return \"vmov%?\\t%0, %1\\t%@ int\";
+ return \"str%?\\t%1, %0\";
case 9:
+ return \"vmov%?\\t%0, %1\\t%@ int\";
+ case 10:
+ return \"vmov%?\\t%0, %1\\t%@ int\";
+ case 11:
return \"vmov%?.f32\\t%0, %1\\t%@ int\";
- case 10: case 11:
+ case 12: case 13:
return output_move_vfp (operands);
+ case 14:
+ return \"vmsr\\t P0, %1\";
+ case 15:
+ return \"vmrs\\t %0, P0\";
+ case 16:
+ return \"mcr\\tp10, 7, %1, cr1, cr0, 0\\t @SET_FPSCR\";
+ case 17:
+ return \"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR\";
default:
gcc_unreachable ();
}
"
[(set_attr "predicable" "yes")
- (set_attr "predicable_short_it" "yes,no,yes,no,no,no,no,no,no,no,no,no")
- (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores")
- (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4")
- (set_attr "pool_range" "*,*,*,*,*,1018,*,*,*,*,1018,*")
- (set_attr "neg_pool_range" "*,*,*,*,*, 0,*,*,*,*,1008,*")]
+ (set_attr "predicable_short_it" "yes,no,yes,no,no,no,no,no,no,no,no,no,no,\
+ no,no,no,no,no")
+ (set_attr "type" "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,load_4,\
+ store_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores,mve_move,\
+ mve_move,mrs,mrs")
+ (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4")
+ (set_attr "pool_range" "*,*,*,*,*,1018,4094,*,*,*,*,*,1018,*,*,*,*,*")
+ (set_attr "arch" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,mve,mve,mve,mve")
+ (set_attr "neg_pool_range" "*,*,*,*,*, 0, 0,*,*,*,*,*,1008,*,*,*,*,*")]
)
@@ -306,12 +330,12 @@
(define_insn "*movdi_vfp"
[(set (match_operand:DI 0 "nonimmediate_di_operand" "=r,r,r,r,r,r,m,w,!r,w,w, Uv")
- (match_operand:DI 1 "di_operand" "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
- "TARGET_32BIT && TARGET_HARD_FLOAT
+ (match_operand:DI 1 "di_operand" "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
+ "TARGET_32BIT && TARGET_VFP_BASE
&& ( register_operand (operands[0], DImode)
|| register_operand (operands[1], DImode))
- && !(TARGET_NEON && CONST_INT_P (operands[1])
- && simd_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
+ && !((TARGET_NEON || TARGET_HAVE_MVE) && CONST_INT_P (operands[1])
+ && simd_immediate_valid_for_move (operands[1], DImode, NULL, NULL))"
"*
switch (which_alternative)
{
@@ -333,7 +357,7 @@
case 8:
return \"vmov%?\\t%Q0, %R0, %P1\\t%@ int\";
case 9:
- if (TARGET_VFP_SINGLE)
+ if (TARGET_VFP_SINGLE || TARGET_HAVE_MVE)
return \"vmov%?.f32\\t%0, %1\\t%@ int\;vmov%?.f32\\t%p0, %p1\\t%@ int\";
else
return \"vmov%?.f64\\t%P0, %P1\\t%@ int\";
@@ -390,9 +414,15 @@
case 6: /* S register from immediate. */
return \"vmov.f16\\t%0, %1\t%@ __<fporbf>\";
case 7: /* S register from memory. */
- return \"vld1.16\\t{%z0}, %A1\";
+ if (TARGET_HAVE_MVE)
+ return \"vldr.16\\t%0, %A1\";
+ else
+ return \"vld1.16\\t{%z0}, %A1\";
case 8: /* Memory from S register. */
- return \"vst1.16\\t{%z1}, %A0\";
+ if (TARGET_HAVE_MVE)
+ return \"vstr.16\\t%1, %A0\";
+ else
+ return \"vst1.16\\t{%z1}, %A0\";
case 9: /* ARM register from constant. */
{
long bits;
@@ -593,7 +623,7 @@
(define_insn "*thumb2_movsf_vfp"
[(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t ,Uv,r ,m,t,r")
(match_operand:SF 1 "hard_sf_operand" " ?r,t,Dv,UvHa,t, mHa,r,t,r"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& ( s_register_operand (operands[0], SFmode)
|| s_register_operand (operands[1], SFmode))"
"*
@@ -682,7 +712,7 @@
(define_insn "*thumb2_movdf_vfp"
[(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w ,Uv,r ,m,w,r")
(match_operand:DF 1 "hard_df_operand" " ?r,w,Dy,G,UvHa,w, mHa,r, w,r"))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT
+ "TARGET_THUMB2 && TARGET_VFP_BASE
&& ( register_operand (operands[0], DFmode)
|| register_operand (operands[1], DFmode))"
"*
@@ -760,7 +790,7 @@
[(match_operand 4 "cc_register" "") (const_int 0)])
(match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t")
(match_operand:SF 2 "s_register_operand" "t,0,t,?r,0,?r,t,0,t")))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT && !arm_restrict_it"
+ "TARGET_THUMB2 && TARGET_VFP_BASE && !arm_restrict_it"
"@
it\\t%D3\;vmov%D3.f32\\t%0, %2
it\\t%d3\;vmov%d3.f32\\t%0, %1
@@ -806,7 +836,8 @@
[(match_operand 4 "cc_register" "") (const_int 0)])
(match_operand:DF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w")
(match_operand:DF 2 "s_register_operand" "w,0,w,?r,0,?r,w,0,w")))]
- "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE && !arm_restrict_it"
+ "TARGET_THUMB2 && TARGET_VFP_BASE && TARGET_VFP_DOUBLE
+ && !arm_restrict_it"
"@
it\\t%D3\;vmov%D3.f64\\t%P0, %P2
it\\t%d3\;vmov%d3.f64\\t%P0, %P1
@@ -1977,7 +2008,7 @@
[(set (match_operand:BLK 0 "memory_operand" "=m")
(unspec:BLK [(match_operand:DF 1 "vfp_register_operand" "")]
UNSPEC_PUSH_MULT))])]
- "TARGET_32BIT && TARGET_HARD_FLOAT"
+ "TARGET_32BIT && TARGET_VFP_BASE"
"* return vfp_output_vstmd (operands);"
[(set_attr "type" "f_stored")]
)
@@ -2065,16 +2096,18 @@
;; Write Floating-point Status and Control Register.
(define_insn "set_fpscr"
- [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR)]
- "TARGET_HARD_FLOAT"
+ [(set (reg:SI VFPCC_REGNUM)
+ (unspec_volatile:SI
+ [(match_operand:SI 0 "register_operand" "r")] VUNSPEC_SET_FPSCR))]
+ "TARGET_VFP_BASE"
"mcr\\tp10, 7, %0, cr1, cr0, 0\\t @SET_FPSCR"
[(set_attr "type" "mrs")])
;; Read Floating-point Status and Control Register.
(define_insn "get_fpscr"
[(set (match_operand:SI 0 "register_operand" "=r")
- (unspec_volatile:SI [(const_int 0)] VUNSPEC_GET_FPSCR))]
- "TARGET_HARD_FLOAT"
+ (unspec:SI [(reg:SI VFPCC_REGNUM)] UNSPEC_GET_FPSCR))]
+ "TARGET_VFP_BASE"
"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR"
[(set_attr "type" "mrs")])
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
new file mode 100644
index 0000000000000000000000000000000000000000..17ba616c041378b88463cb7ef150b70b2e7b95ad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=hard -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+ int8x16_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
new file mode 100644
index 0000000000000000000000000000000000000000..7b877c4a90c506343d6b4edb750ba06ce3d7a68d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=softfp -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+ int8x16_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
new file mode 100644
index 0000000000000000000000000000000000000000..85fbb5767edc3c25ceb4d6da780d47afa1ee416c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+ int8x16_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
new file mode 100644
index 0000000000000000000000000000000000000000..23b3683ae559b3f7bf6c3ad11c4070ad2ddb9387
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=softfp -mthumb" } */
+
+#include "arm_mve.h"
+
+int8x16_t
+foo1 (int8x16_t value)
+{
+ int8x16_t b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
new file mode 100644
index 0000000000000000000000000000000000000000..8f7fa348d130e8456d5300ac25821fd96f9d5a97
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target arm_v8_1m_mve_ok } */
+/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=soft -mthumb" } */
+
+int
+foo1 (int value)
+{
+ int b = value;
+ return b;
+}
+
+/* { dg-final { scan-assembler "\.fpu softvfp" } } */
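The five new tests all exercise the changed condition in arm_declare_function_name. The sketch below restates that condition as a standalone predicate so the expected outcomes are easier to follow. It is a sketch under stated assumptions: `should_print_fpu` is a hypothetical helper, the last-printed string is passed as a plain parameter, and the FPU names fed to it for each test are inferred from the dg-final lines rather than taken from compiler output.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper mirroring the patched condition in
   arm_declare_function_name; not a real GCC function.  */
static bool
should_print_fpu (const char *fpu_to_print, bool target_vfp_base,
                  const char *last_printed)
{
  /* "softvfp" is suppressed whenever the VFP base registers are usable.  */
  bool suppressed = strcmp (fpu_to_print, "softvfp") == 0 && target_vfp_base;

  /* Otherwise print only when the name differs from the last one printed.  */
  return !suppressed && strcmp (fpu_to_print, last_printed) != 0;
}

/* Inputs assumed for the new tests, one line per expected outcome.  */
void
fpu_directive_examples (void)
{
  /* mve_fpu1.c and mve_fpu2.c: +mve with hard or softfp ABI.  */
  assert (!should_print_fpu ("softvfp", true, ""));
  /* mve_fpu3.c: +mve with the soft ABI, so TARGET_VFP_BASE is off.  */
  assert (should_print_fpu ("softvfp", false, ""));
  /* mve_fp_fpu1.c and mve_fp_fpu2.c: +mve.fp selects a real FPU name.  */
  assert (should_print_fpu ("fpv5-sp-d16", true, ""));
}
```

The point of the extra check is that an MVE-only target has no VFP FPU to name, yet still uses the shared registers, so emitting `.fpu softvfp` there would mislead the assembler.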
* Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
2020-03-10 18:19 [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch Srinath Parvathaneni
@ 2020-03-12 11:16 ` Kyrill Tkachov
2020-03-16 10:54 ` Srinath Parvathaneni
0 siblings, 1 reply; 4+ messages in thread
From: Kyrill Tkachov @ 2020-03-12 11:16 UTC (permalink / raw)
To: Srinath Parvathaneni, gcc-patches
Hi Srinath,
On 3/10/20 6:19 PM, Srinath Parvathaneni wrote:
> Hello Kyrill,
>
> This patch addresses all the comments in patch version v2.
> (version v2)
> https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540416.html
>
> ####
>
>
> Hello,
>
> This patch is part of MVE ACLE intrinsics framework.
> This patch adds support for updating (reading/writing) the APSR
> (Application Program Status Register) and the FPSCR (Floating-point
> Status and Control Register) for MVE.
> This patch also enables thumb2 mov RTL patterns for MVE.
>
> A new feature bit, vfp_base, is added. This bit is enabled for all VFP,
> MVE and MVE with floating-point extensions, and it is used to enable the
> macro TARGET_VFP_BASE. Until now, all the VFP instructions, RTL patterns,
> and status and control registers were guarded by TARGET_HARD_FLOAT. This
> patch changes that: the instructions, RTL patterns, and status and control
> registers common to MVE and VFP are now guarded by the TARGET_VFP_BASE
> macro.
>
> The RTL patterns set_fpscr and get_fpscr are updated to use
> VFPCC_REGNUM because a few MVE intrinsics
> set/get the carry bit of the FPSCR register.
>
> Please refer to Arm reference manual [1] for more details.
> [1] https://developer.arm.com/docs/ddi0553/latest
>
> Regression tested on target arm-none-eabi and armeb-none-eabi and
> found no regressions.
>
> Ok for trunk?
Ok, but make sure it bootstraps on arm-none-linux-gnueabihf (as with the
other patches in this series)
Thanks,
Kyrill
>
> Thanks,
> Srinath
> gcc/ChangeLog:
>
> 2020-03-06 Andre Vieira <andre.simoesdiasvieira@arm.com>
> Mihail Ionescu <mihail.ionescu@arm.com>
> Srinath Parvathaneni <srinath.parvathaneni@arm.com>
>
> * common/config/arm/arm-common.c (arm_asm_auto_mfpu): When the vfp_base
> feature bit is on and -mfpu=auto is passed as a compiler option, do not
> generate an error when no matching FPU is found, because no FPU is
> required in this case.
> * config/arm/arm-cpus.in (vfp_base): Define feature bit, this
> bit is
> enabled for MVE and also for all VFP extensions.
> (VFPv2): Modify fgroup to enable vfp_base feature bit when
> ever VFPv2
> is enabled.
> (MVE): Define fgroup to enable feature bits mve, vfp_base and
> armv7em.
> (MVE_FP): Define fgroup to enable the feature bits in fgroups MVE
> and FPv5 along with feature bits fp16 and mve_float.
> (mve): Modify add options in armv8.1-m.main arch for MVE.
> (mve.fp): Modify add options in armv8.1-m.main arch for MVE with
> floating point.
> * config/arm/arm.c (use_return_insn): Replace the
> check with TARGET_VFP_BASE.
> (thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with
> TARGET_VFP_BASE.
> (arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT ||
> TARGET_HAVE_MVE"
> with TARGET_VFP_BASE, to allow cost calculations for copies in
> MVE as
> well.
> (arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with
> TARGET_VFP_BASE, to allow space calculation for VFP registers
> in MVE
> as well.
> (arm_compute_frame_layout): Likewise.
> (arm_save_coproc_regs): Likewise.
> (arm_fixed_condition_code_regs): Modify to enable using
> VFPCC_REGNUM
> in MVE as well.
> (arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT ||
> TARGET_HAVE_MVE"
> with equivalent macro TARGET_VFP_BASE.
> (arm_expand_epilogue_apcs_frame): Likewise.
> (arm_expand_epilogue): Likewise.
> (arm_conditional_register_usage): Likewise.
> (arm_declare_function_name): Add check to skip printing .fpu
> directive
> in assembly file when TARGET_VFP_BASE is enabled and
> fpu_to_print is
> "softvfp".
> * config/arm/arm.h (TARGET_VFP_BASE): Define.
> * config/arm/arm.md (arch): Add "mve" to arch.
> (eq_attr "arch" "mve"): Enable when TARGET_HAVE_MVE is true.
> (vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT
> || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE.
> * config/arm/constraints.md (Uf): Define to allow modification
> to FPCCR
> in MVE.
> * config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify
> target guard
> to not allow for MVE.
> * config/arm/unspecs.md (UNSPEC_GET_FPSCR): Define in the unspec enum.
> (VUNSPEC_GET_FPSCR): Remove from the volatile unspecs enum.
> * config/arm/vfp.md (thumb2_movhi_vfp): Add support for VMSR
> and VMRS
> instructions which move to general-purpose Register from
> Floating-point
> Special register and vice-versa.
> (thumb2_movhi_fp16): Likewise.
> (thumb2_movsi_vfp): Add support for VMSR and VMRS instructions
> along
> with MCR and MRC instructions which set and get Floating-point
> Status
> and Control Register (FPSCR).
> (movdi_vfp): Modify pattern to enable Single-precision scalar
> float move
> in MVE.
> (thumb2_movdf_vfp): Modify pattern to enable Double-precision
> scalar
> float move patterns in MVE.
> (thumb2_movsfcc_vfp): Modify pattern to enable single float
> conditional
> code move patterns of VFP also in MVE by adding
> TARGET_VFP_BASE check.
> (thumb2_movdfcc_vfp): Modify pattern to enable double float
> conditional
> code move patterns of VFP also in MVE by adding
> TARGET_VFP_BASE check.
> (push_multi_vfp): Add support to use VFP VPUSH pattern for MVE
> by adding
> TARGET_VFP_BASE check.
> (set_fpscr): Add support to set FPSCR register for MVE. Modify
> pattern
> using VFPCC_REGNUM as few MVE intrinsics use carry bit of FPSCR
> register.
> (get_fpscr): Add support to get FPSCR register for MVE. Modify
> pattern
> using VFPCC_REGNUM as few MVE intrinsics use carry bit of FPSCR
> register.
>
> gcc/testsuite/ChangeLog:
>
> 2020-03-06 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
>
> * gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: New test.
> * gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
> * gcc.target/arm/mve/intrinsics/mve_fpu1.c: Likewise.
> * gcc.target/arm/mve/intrinsics/mve_fpu2.c: Likewise.
> * gcc.target/arm/mve/intrinsics/mve_fpu3.c: Likewise.
>
>
> ############### Attachment also inlined for ease of reply
> ###############
>
>
> diff --git a/gcc/common/config/arm/arm-common.c
> b/gcc/common/config/arm/arm-common.c
> index
> 30a2a1deb864ee22d48cebb08247176640524955..83cc68009ac16a89ab5515f19d4eb84f595e33f1
> 100644
> --- a/gcc/common/config/arm/arm-common.c
> +++ b/gcc/common/config/arm/arm-common.c
> @@ -1009,7 +1009,8 @@ arm_asm_auto_mfpu (int argc, const char **argv)
> }
> }
>
> - gcc_assert (i != TARGET_FPU_auto);
> + gcc_assert (i != TARGET_FPU_auto
> + || bitmap_bit_p (arm_active_target.isa,
> isa_bit_vfp_base));
> }
>
> auto_fpu = (char *) xmalloc (strlen (fpuname) + sizeof ("-mfpu="));
> diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
> index
> 96f584da325172bd1460251e2de0ad679589d312..77b43090d69a599d8806cfcc02037e1bbed6e7a1
> 100644
> --- a/gcc/config/arm/arm-cpus.in
> +++ b/gcc/config/arm/arm-cpus.in
> @@ -135,6 +135,10 @@ define feature armv8_1m_main
> # Floating point and Neon extensions.
> # VFPv1 is not supported in GCC.
>
> +# This feature bit is enabled for all VFP, MVE and
> +# MVE with floating point extensions.
> +define feature vfp_base
> +
> # Vector floating point v2.
> define feature vfpv2
>
> @@ -234,7 +238,7 @@ define fgroup ALL_SIMD ALL_SIMD_INTERNAL
> ALL_SIMD_EXTERNAL
>
> # List of all FPU bits to strip out if -mfpu is used to override the
> # default. fp16 is deliberately missing from this list.
> -define fgroup ALL_FPU_INTERNAL vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl
> ALL_SIMD_INTERNAL
> +define fgroup ALL_FPU_INTERNAL vfp_base vfpv2 vfpv3 vfpv4 fpv5
> fp16conv fp_dbl ALL_SIMD_INTERNAL
> # Similarly, but including fp16 and other extensions that aren't part of
> # -mfpu support.
> define fgroup ALL_FPU_EXTERNAL fp16 bf16
> @@ -279,10 +283,12 @@ define fgroup ARMv8r ARMv8a
> define fgroup ARMv8_1m_main ARMv8m_main armv8_1m_main
>
> # Useful combinations.
> -define fgroup VFPv2 vfpv2
> +define fgroup VFPv2 vfp_base vfpv2
> define fgroup VFPv3 VFPv2 vfpv3
> define fgroup VFPv4 VFPv3 vfpv4 fp16conv
> define fgroup FPv5 VFPv4 fpv5
> +define fgroup MVE mve vfp_base armv7em
> +define fgroup MVE_FP MVE FPv5 fp16 mve_float
>
> define fgroup FP_DBL fp_dbl
> define fgroup FP_D32 FP_DBL fp_d32
> @@ -699,8 +705,8 @@ begin arch armv8.1-m.main
> option fp add FPv5 fp16
> option fp.dp add FPv5 FP_DBL fp16
> option nofp remove ALL_FP
> - option mve add mve armv7em
> - option mve.fp add mve FPv5 fp16 mve_float armv7em
> + option mve add MVE
> + option mve.fp add MVE_FP
> end arch armv8.1-m.main
>
> begin arch iwmmxt
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index
> a0283ed62c8047fe1ccbbb9b639ad34771fe46c2..c7453412959f23bf25c2052b4e0bb6a95faf3163
> 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -334,6 +334,19 @@ emission of floating point pcs attributes. */
> isa_bit_mve_float) \
> && !TARGET_GENERAL_REGS_ONLY)
>
> +/* MVE have few common instructions as VFP, like VLDM alias VPOP,
> VLDR, VSTM
> + alia VPUSH, VSTR and VMOV, VMSR and VMRS. In the same manner it
> updates few
> + registers such as FPCAR, FPCCR, FPDSCR, FPSCR, MVFR0, MVFR1 and
> MVFR2. All
> + the VFP instructions, RTL patterns and register are guarded by
> + TARGET_HARD_FLOAT. But the common instructions, RTL pattern and
> registers
> + between MVE and VFP will be guarded by the following macro
> TARGET_VFP_BASE
> + hereafter. */
> +
> +#define TARGET_VFP_BASE (arm_float_abi != ARM_FLOAT_ABI_SOFT \
> + && bitmap_bit_p (arm_active_target.isa, \
> + isa_bit_vfp_base) \
> + && !TARGET_GENERAL_REGS_ONLY)
> +
> /* Nonzero if integer division instructions supported. */
> #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
> || (TARGET_THUMB && arm_arch_thumb_hwdiv))
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index
> c769104a93746cd7c02b46b82f1a8f8057b9ae62..b40904a40e0979af4285fdbd85bfae55abea25dd
> 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -4295,7 +4295,7 @@ use_return_insn (int iscond, rtx sibling)
>
> /* Can't be done if any of the VFP regs are pushed,
> since this also requires an insn. */
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++)
> if (df_regs_ever_live_p (regno) && !call_used_or_fixed_reg_p
> (regno))
> return 0;
> @@ -6289,7 +6289,7 @@ use_vfp_abi (enum arm_pcs pcs_variant, bool
> is_double)
> return false;
>
> return (TARGET_32BIT && TARGET_HARD_FLOAT &&
> - (TARGET_VFP_DOUBLE || !is_double));
> + (TARGET_VFP_DOUBLE || !is_double));
> }
>
> /* Return true if an argument whose type is TYPE, or mode is MODE, is
> @@ -8512,7 +8512,7 @@ thumb2_legitimate_index_p (machine_mode mode,
> rtx index, int strict_p)
>
> /* ??? Combine arm and thumb2 coprocessor addressing modes. */
> /* Standard coprocessor addressing modes. */
> - if (TARGET_HARD_FLOAT
> + if (TARGET_VFP_BASE
> && (mode == SFmode || mode == DFmode))
> return (code == CONST_INT && INTVAL (index) < 1024
> /* Thumb-2 allows only > -256 index range for it's core
> register
> @@ -9905,7 +9905,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code
> code, enum rtx_code outer_code,
> /* Assume that most copies can be done with a single insn,
> unless we don't have HW FP, in which case everything
> larger than word mode will require two insns. */
> - *cost = COSTS_N_INSNS (((!(TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + *cost = COSTS_N_INSNS (((!TARGET_VFP_BASE
> && GET_MODE_SIZE (mode) > 4)
> || mode == DImode)
> ? 2 : 1);
> @@ -20821,7 +20821,7 @@ arm_get_vfp_saved_size (void)
>
> saved = 0;
> /* Space for saved VFP registers. */
> - if (TARGET_HARD_FLOAT)
> + if (TARGET_VFP_BASE)
> {
> count = 0;
> for (regno = FIRST_VFP_REGNUM;
> @@ -22364,7 +22364,7 @@ arm_compute_frame_layout (void)
> func_type = arm_current_func_type ();
> /* Space for saved VFP registers. */
> if (! IS_VOLATILE (func_type)
> - && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
> + && TARGET_VFP_BASE)
> saved += arm_get_vfp_saved_size ();
>
> /* Allocate space for saving/restoring FPCXTNS in Armv8.1-M
> Mainline
> @@ -22588,7 +22588,7 @@ arm_save_coproc_regs(void)
> saved_size += 8;
> }
>
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> {
> start_reg = FIRST_VFP_REGNUM;
>
> @@ -24546,7 +24546,7 @@ arm_fixed_condition_code_regs (unsigned int
> *p1, unsigned int *p2)
> return false;
>
> *p1 = CC_REGNUM;
> - *p2 = TARGET_HARD_FLOAT ? VFPCC_REGNUM : INVALID_REGNUM;
> + *p2 = TARGET_VFP_BASE ? VFPCC_REGNUM : INVALID_REGNUM;
> return true;
> }
>
> @@ -24965,7 +24965,7 @@ arm_hard_regno_mode_ok (unsigned int regno,
> machine_mode mode)
> {
> if (GET_MODE_CLASS (mode) == MODE_CC)
> return (regno == CC_REGNUM
> - || ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + || (TARGET_VFP_BASE
> && regno == VFPCC_REGNUM));
>
> if (regno == CC_REGNUM && GET_MODE_CLASS (mode) != MODE_CC)
> @@ -24982,7 +24982,7 @@ arm_hard_regno_mode_ok (unsigned int regno,
> machine_mode mode)
> start of an even numbered register pair. */
> return (ARM_NUM_REGS (mode) < 2) || (regno < LAST_LO_REGNUM);
>
> - if ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) && IS_VFP_REGNUM (regno))
> + if (TARGET_VFP_BASE && IS_VFP_REGNUM (regno))
> {
> if (mode == DFmode)
> return VFP_REGNO_OK_FOR_DOUBLE (regno);
> @@ -26933,7 +26933,7 @@ arm_expand_epilogue_apcs_frame (bool
> really_return)
> floats_from_frame += 4;
> }
>
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> {
> int start_reg;
> rtx ip_rtx = gen_rtx_REG (SImode, IP_REGNUM);
> @@ -27179,7 +27179,7 @@ arm_expand_epilogue (bool really_return)
> }
> }
>
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> {
> /* Generate VFP register multi-pop. */
> int end_reg = LAST_VFP_REGNUM + 1;
> @@ -29699,7 +29699,7 @@ arm_conditional_register_usage (void)
> if (TARGET_THUMB1)
> fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1;
>
> - if (TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
> + if (TARGET_32BIT && TARGET_VFP_BASE)
> {
> /* VFPv3 registers are disabled when earlier VFP
> versions are selected due to the definition of
> @@ -32478,7 +32478,8 @@ arm_declare_function_name (FILE *stream, const
> char *name, tree decl)
> = TARGET_SOFT_FLOAT
> ? "softvfp" : arm_identify_fpu_from_isa (arm_active_target.isa);
>
> - if (fpu_to_print != arm_last_printed_arch_string)
> + if (!(!strcmp (fpu_to_print.c_str (), "softvfp") && TARGET_VFP_BASE)
> + && (fpu_to_print != arm_last_printed_arch_string))
> {
> asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_to_print.c_str ());
> arm_last_printed_fpu_string = fpu_to_print;
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index
> 8f8c91d5fe146ed64cd4eb5450f04b3cf0c0ed18..5387f972f5a864a153873f21b9423d28446daefc
> 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -134,7 +134,7 @@
> ; arm_arch6. "v6t2" for Thumb-2 with arm_arch6 and "v8mb" for ARMv8-M
> ; Baseline. This attribute is used to compute attribute "enabled",
> ; use type "any" to enable an alternative in all cases.
> -(define_attr "arch"
> "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
> +(define_attr "arch"
> "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon,mve"
> (const_string "any"))
>
> (define_attr "arch_enabled" "no,yes"
> @@ -188,6 +188,10 @@
> (and (eq_attr "arch" "neon")
> (match_test "TARGET_NEON"))
> (const_string "yes")
> +
> + (and (eq_attr "arch" "mve")
> + (match_test "TARGET_HAVE_MVE"))
> + (const_string "yes")
> ]
>
> (const_string "no")))
> @@ -11758,7 +11762,7 @@
> (match_operand:SI 2 "const_int_I_operand" "I")))
> (set (match_operand:DF 3 "vfp_hard_register_operand" "")
> (mem:DF (match_dup 1)))])]
> - "TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)"
> + "TARGET_32BIT && TARGET_VFP_BASE"
> "*
> {
> int num_regs = XVECLEN (operands[0], 0);
> diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
> index
> a12de97cdaab589e0c8704b408ac4c329def416d..bf8f4ff1e5d2d6132d0afdd05255cc697c54159d
> 100644
> --- a/gcc/config/arm/constraints.md
> +++ b/gcc/config/arm/constraints.md
> @@ -38,7 +38,7 @@
> ;; in all states: Pf, Pg
>
> ;; The following memory constraints have been used:
> -;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up
> +;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up, Uf
> ;; in ARM state: Uq
> ;; in Thumb state: Uu, Uw
> ;; in all states: Q
> @@ -46,6 +46,9 @@
> (define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS"
> "MVE VPR register")
>
> +(define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS"
> + "MVE FPCCR register")
> +
> (define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
> "The VFP registers @code{s0}-@code{s31}.")
>
> diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
> index
> b0d3bd1cf1c484927e6ac6522bc30f0f089291c7..793f67068687a60abf94c230e5485a1eb2eca6a0
> 100644
> --- a/gcc/config/arm/thumb2.md
> +++ b/gcc/config/arm/thumb2.md
> @@ -517,7 +517,7 @@
> [(match_operand 4 "cc_register" "")
> (const_int 0)])
> (match_operand:SF 1 "s_register_operand" "0,r")
> (match_operand:SF 2 "s_register_operand"
> "r,0")))]
> - "TARGET_THUMB2 && TARGET_SOFT_FLOAT"
> + "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE"
> "@
> it\\t%D3\;mov%D3\\t%0, %2
> it\\t%d3\;mov%d3\\t%0, %1"
> diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
> index
> f0b1f465de4b63d624510783576700519044717d..e76609f79418af38b70746336dd43592a1dc8713
> 100644
> --- a/gcc/config/arm/unspecs.md
> +++ b/gcc/config/arm/unspecs.md
> @@ -170,6 +170,7 @@
> UNSPEC_TORC ; Used by the intrinsic form of the iWMMXt
> TORC instruction.
> UNSPEC_TORVSC ; Used by the intrinsic form of the
> iWMMXt TORVSC instruction.
> UNSPEC_TEXTRC ; Used by the intrinsic form of the
> iWMMXt TEXTRC instruction.
> + UNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
> ])
>
>
> @@ -216,7 +217,6 @@
> VUNSPEC_SLX ; Represent a store-register-release-exclusive.
> VUNSPEC_LDA ; Represent a store-register-acquire.
> VUNSPEC_STL ; Represent a store-register-release.
> - VUNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
> VUNSPEC_SET_FPSCR ; Represent assign of FPSCR content.
> VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
> VUNSPEC_CDP ; Represent the coprocessor cdp instruction.
> diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
> index
> ab16a6b0eac822b4e1a1ae4dcbe39491a82cc9fe..eb6ae7bea7927c666f36219797d54c0127001bc1
> 100644
> --- a/gcc/config/arm/vfp.md
> +++ b/gcc/config/arm/vfp.md
> @@ -74,10 +74,10 @@
> (define_insn "*thumb2_movhi_vfp"
> [(set
> (match_operand:HI 0 "nonimmediate_operand"
> - "=rk, r, l, r, m, r, *t, r, *t")
> + "=rk, r, l, r, m, r, *t, r, *t, Up, r")
> (match_operand:HI 1 "general_operand"
> - "rk, I, Py, n, r, m, r, *t, *t"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && !TARGET_VFP_FP16INST
> && (register_operand (operands[0], HImode)
> || register_operand (operands[1], HImode))"
> @@ -99,20 +99,24 @@
> return "vmov%?\t%0, %1\t%@ int";
> case 8:
> return "vmov%?.f32\t%0, %1\t%@ int";
> + case 9:
> + return "vmsr%?\t P0, %1\t@ movhi";
> + case 10:
> + return "vmrs%?\t %0, P0\t@ movhi";
> default:
> gcc_unreachable ();
> }
> }
> [(set_attr "predicable" "yes")
> (set_attr "predicable_short_it"
> - "yes, no, yes, no, no, no, no, no, no")
> + "yes, no, yes, no, no, no, no, no, no, no, no")
> (set_attr "type"
> "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
> - f_mcr, f_mrc, fmov")
> - (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
> - (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
> - (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
> - (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
> + f_mcr, f_mrc, fmov, mve_move, mve_move")
> + (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
> + (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
> + (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
> + (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
> )
>
> ;; Patterns for HI moves which provide more data transfer
> instructions when FP16
> @@ -170,10 +174,10 @@
> (define_insn "*thumb2_movhi_fp16"
> [(set
> (match_operand:HI 0 "nonimmediate_operand"
> - "=rk, r, l, r, m, r, *t, r, *t")
> + "=rk, r, l, r, m, r, *t, r, *t, Up, r")
> (match_operand:HI 1 "general_operand"
> - "rk, I, Py, n, r, m, r, *t, *t"))]
> - "TARGET_THUMB2 && TARGET_VFP_FP16INST
> + "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
> + "TARGET_THUMB2 && (TARGET_VFP_FP16INST || TARGET_HAVE_MVE)
> && (register_operand (operands[0], HImode)
> || register_operand (operands[1], HImode))"
> {
> @@ -194,21 +198,25 @@
> return "vmov.f16\t%0, %1\t%@ int";
> case 8:
> return "vmov%?.f32\t%0, %1\t%@ int";
> + case 9:
> + return "vmsr%?\tP0, %1\t%@ movhi";
> + case 10:
> + return "vmrs%?\t%0, P0\t%@ movhi";
> default:
> gcc_unreachable ();
> }
> }
> [(set_attr "predicable"
> - "yes, yes, yes, yes, yes, yes, no, no, yes")
> + "yes, yes, yes, yes, yes, yes, no, no, yes, yes, yes")
> (set_attr "predicable_short_it"
> - "yes, no, yes, no, no, no, no, no, no")
> + "yes, no, yes, no, no, no, no, no, no, no, no")
> (set_attr "type"
> "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
> - f_mcr, f_mrc, fmov")
> - (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
> - (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
> - (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
> - (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
> + f_mcr, f_mrc, fmov, mve_move, mve_move")
> + (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
> + (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
> + (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
> + (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
> )
>
> ;; SImode moves
> @@ -258,9 +266,11 @@
> ;; is chosen with length 2 when the instruction is predicated for
> ;; arm_restrict_it.
> (define_insn "*thumb2_movsi_vfp"
> - [(set (match_operand:SI 0 "nonimmediate_operand"
> "=rk,r,l,r,r,lk*r,m,*t, r,*t,*t, *Uv")
> - (match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,lk*r,
> r,*t,*t,*UvTu,*t"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + [(set (match_operand:SI 0 "nonimmediate_operand"
> "=rk,r,l,r,r,l,*hk,m,*m,*t,\
> + r,*t,*t,*Uv, Up, r,Uf,r")
> + (match_operand:SI 1 "general_operand"
> "rk,I,Py,K,j,mi,*mi,l,*hk,r,*t,\
> + *t,*UvTu,*t, r, Up,r,Uf"))]
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && ( s_register_operand (operands[0], SImode)
> || s_register_operand (operands[1], SImode))"
> "*
> @@ -275,30 +285,44 @@
> case 4:
> return \"movw%?\\t%0, %1\";
> case 5:
> + case 6:
> /* Cannot load it directly, split to load it via MOV / MOVT. */
> if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> return \"#\";
> return \"ldr%?\\t%0, %1\";
> - case 6:
> - return \"str%?\\t%1, %0\";
> case 7:
> - return \"vmov%?\\t%0, %1\\t%@ int\";
> case 8:
> - return \"vmov%?\\t%0, %1\\t%@ int\";
> + return \"str%?\\t%1, %0\";
> case 9:
> + return \"vmov%?\\t%0, %1\\t%@ int\";
> + case 10:
> + return \"vmov%?\\t%0, %1\\t%@ int\";
> + case 11:
> return \"vmov%?.f32\\t%0, %1\\t%@ int\";
> - case 10: case 11:
> + case 12: case 13:
> return output_move_vfp (operands);
> + case 14:
> + return \"vmsr\\t P0, %1\";
> + case 15:
> + return \"vmrs\\t %0, P0\";
> + case 16:
> + return \"mcr\\tp10, 7, %1, cr1, cr0, 0\\t @SET_FPSCR\";
> + case 17:
> + return \"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR\";
> default:
> gcc_unreachable ();
> }
> "
> [(set_attr "predicable" "yes")
> - (set_attr "predicable_short_it"
> "yes,no,yes,no,no,no,no,no,no,no,no,no")
> - (set_attr "type"
> "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores")
> - (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4")
> - (set_attr "pool_range" "*,*,*,*,*,1018,*,*,*,*,1018,*")
> - (set_attr "neg_pool_range" "*,*,*,*,*, 0,*,*,*,*,1008,*")]
> + (set_attr "predicable_short_it"
> "yes,no,yes,no,no,no,no,no,no,no,no,no,no,\
> + no,no,no,no,no")
> + (set_attr "type"
> "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,load_4,\
> + store_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores,mve_move,\
> + mve_move,mrs,mrs")
> + (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4")
> + (set_attr "pool_range" "*,*,*,*,*,1018,4094,*,*,*,*,*,1018,*,*,*,*,*")
> + (set_attr "arch" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,mve,mve,mve,mve")
> + (set_attr "neg_pool_range" "*,*,*,*,*, 0,
> 0,*,*,*,*,*,1008,*,*,*,*,*")]
> )
>
>
> @@ -306,12 +330,12 @@
>
> (define_insn "*movdi_vfp"
> [(set (match_operand:DI 0 "nonimmediate_di_operand"
> "=r,r,r,r,r,r,m,w,!r,w,w, Uv")
> - (match_operand:DI 1 "di_operand"
> "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
> - "TARGET_32BIT && TARGET_HARD_FLOAT
> + (match_operand:DI 1 "di_operand"
> "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
> + "TARGET_32BIT && TARGET_VFP_BASE
> && ( register_operand (operands[0], DImode)
> || register_operand (operands[1], DImode))
> - && !(TARGET_NEON && CONST_INT_P (operands[1])
> - && simd_immediate_valid_for_move (operands[1], DImode, NULL,
> NULL))"
> + && !((TARGET_NEON || TARGET_HAVE_MVE) && CONST_INT_P (operands[1])
> + && simd_immediate_valid_for_move (operands[1], DImode, NULL,
> NULL))"
> "*
> switch (which_alternative)
> {
> @@ -333,7 +357,7 @@
> case 8:
> return \"vmov%?\\t%Q0, %R0, %P1\\t%@ int\";
> case 9:
> - if (TARGET_VFP_SINGLE)
> + if (TARGET_VFP_SINGLE || TARGET_HAVE_MVE)
> return \"vmov%?.f32\\t%0, %1\\t%@ int\;vmov%?.f32\\t%p0,
> %p1\\t%@ int\";
> else
> return \"vmov%?.f64\\t%P0, %P1\\t%@ int\";
> @@ -390,9 +414,15 @@
> case 6: /* S register from immediate. */
> return \"vmov.f16\\t%0, %1\t%@ __<fporbf>\";
> case 7: /* S register from memory. */
> - return \"vld1.16\\t{%z0}, %A1\";
> + if (TARGET_HAVE_MVE)
> + return \"vldr.16\\t%0, %A1\";
> + else
> + return \"vld1.16\\t{%z0}, %A1\";
> case 8: /* Memory from S register. */
> - return \"vst1.16\\t{%z1}, %A0\";
> + if (TARGET_HAVE_MVE)
> + return \"vstr.16\\t%1, %A0\";
> + else
> + return \"vst1.16\\t{%z1}, %A0\";
> case 9: /* ARM register from constant. */
> {
> long bits;
> @@ -593,7 +623,7 @@
> (define_insn "*thumb2_movsf_vfp"
> [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t ,Uv,r
> ,m,t,r")
> (match_operand:SF 1 "hard_sf_operand" " ?r,t,Dv,UvHa,t,
> mHa,r,t,r"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && ( s_register_operand (operands[0], SFmode)
> || s_register_operand (operands[1], SFmode))"
> "*
> @@ -682,7 +712,7 @@
> (define_insn "*thumb2_movdf_vfp"
> [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w
> ,w,w ,Uv,r ,m,w,r")
> (match_operand:DF 1 "hard_df_operand" " ?r,w,Dy,G,UvHa,w,
> mHa,r, w,r"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && ( register_operand (operands[0], DFmode)
> || register_operand (operands[1], DFmode))"
> "*
> @@ -760,7 +790,7 @@
> [(match_operand 4 "cc_register" "") (const_int 0)])
> (match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t")
> (match_operand:SF 2 "s_register_operand"
> "t,0,t,?r,0,?r,t,0,t")))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT && !arm_restrict_it"
> + "TARGET_THUMB2 && TARGET_VFP_BASE && !arm_restrict_it"
> "@
> it\\t%D3\;vmov%D3.f32\\t%0, %2
> it\\t%d3\;vmov%d3.f32\\t%0, %1
> @@ -806,7 +836,8 @@
> [(match_operand 4 "cc_register" "") (const_int 0)])
> (match_operand:DF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w")
> (match_operand:DF 2 "s_register_operand"
> "w,0,w,?r,0,?r,w,0,w")))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE &&
> !arm_restrict_it"
> + "TARGET_THUMB2 && TARGET_VFP_BASE && TARGET_VFP_DOUBLE
> + && !arm_restrict_it"
> "@
> it\\t%D3\;vmov%D3.f64\\t%P0, %P2
> it\\t%d3\;vmov%d3.f64\\t%P0, %P1
> @@ -1977,7 +2008,7 @@
> [(set (match_operand:BLK 0 "memory_operand" "=m")
> (unspec:BLK [(match_operand:DF 1 "vfp_register_operand" "")]
> UNSPEC_PUSH_MULT))])]
> - "TARGET_32BIT && TARGET_HARD_FLOAT"
> + "TARGET_32BIT && TARGET_VFP_BASE"
> "* return vfp_output_vstmd (operands);"
> [(set_attr "type" "f_stored")]
> )
> @@ -2065,16 +2096,18 @@
>
> ;; Write Floating-point Status and Control Register.
> (define_insn "set_fpscr"
> - [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")]
> VUNSPEC_SET_FPSCR)]
> - "TARGET_HARD_FLOAT"
> + [(set (reg:SI VFPCC_REGNUM)
> + (unspec_volatile:SI
> + [(match_operand:SI 0 "register_operand" "r")]
> VUNSPEC_SET_FPSCR))]
> + "TARGET_VFP_BASE"
> "mcr\\tp10, 7, %0, cr1, cr0, 0\\t @SET_FPSCR"
> [(set_attr "type" "mrs")])
>
> ;; Read Floating-point Status and Control Register.
> (define_insn "get_fpscr"
> [(set (match_operand:SI 0 "register_operand" "=r")
> - (unspec_volatile:SI [(const_int 0)] VUNSPEC_GET_FPSCR))]
> - "TARGET_HARD_FLOAT"
> + (unspec:SI [(reg:SI VFPCC_REGNUM)] UNSPEC_GET_FPSCR))]
> + "TARGET_VFP_BASE"
> "mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR"
> [(set_attr "type" "mrs")])
>
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..17ba616c041378b88463cb7ef150b70b2e7b95ad
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve.fp
> -mfloat-abi=hard -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..7b877c4a90c506343d6b4edb750ba06ce3d7a68d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve.fp
> -mfloat-abi=softfp -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..85fbb5767edc3c25ceb4d6da780d47afa1ee416c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve
> -mfloat-abi=hard -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..23b3683ae559b3f7bf6c3ad11c4070ad2ddb9387
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve
> -mfloat-abi=softfp -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..8f7fa348d130e8456d5300ac25821fd96f9d5a97
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve
> -mfloat-abi=soft -mthumb" } */
> +
> +int
> +foo1 (int value)
> +{
> + int b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu softvfp" } } */
>
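For readers following the arm_declare_function_name hunk, the decision it makes about the ".fpu" directive can be modelled with a small standalone C sketch. This is an illustration only, not code from the patch: should_print_fpu and its parameters are hypothetical names, vfp_base stands in for the value of TARGET_VFP_BASE, and comparing against the last printed FPU name is an assumption about the intent of the check.

```c
#include <stdbool.h>
#include <string.h>

/* Illustrative model only (hypothetical helper, not part of GCC).
   Decide whether a ".fpu" directive should be emitted.
   FPU_TO_PRINT is the FPU name chosen for the current function,
   LAST_PRINTED is the name emitted most recently (assumed), and
   VFP_BASE stands in for TARGET_VFP_BASE.  */
static bool
should_print_fpu (const char *fpu_to_print, const char *last_printed,
                  bool vfp_base)
{
  /* "softvfp" is skipped when the base VFP/MVE register file is
     usable, since the directive would then be misleading.  */
  if (strcmp (fpu_to_print, "softvfp") == 0 && vfp_base)
    return false;

  /* Otherwise emit the directive only when the name has changed.  */
  return strcmp (fpu_to_print, last_printed) != 0;
}
```

Read against the new tests: with -march=armv8.1-m.main+mve and -mfloat-abi=hard or softfp the name is "softvfp" while vfp_base is true, so nothing is printed (mve_fpu1.c, mve_fpu2.c). With -mfloat-abi=soft, vfp_base is false and ".fpu softvfp" is still printed (mve_fpu3.c).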
* Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
2020-03-12 11:16 ` Kyrill Tkachov
@ 2020-03-16 10:54 ` Srinath Parvathaneni
2020-03-16 12:13 ` Srinath Parvathaneni
0 siblings, 1 reply; 4+ messages in thread
From: Srinath Parvathaneni @ 2020-03-16 10:54 UTC (permalink / raw)
To: Kyrill Tkachov, gcc-patches
Hi Kyrill,
> Ok, but make sure it bootstraps on arm-none-linux-gnueabihf (as with the
> other patches in this series)
I have bootstrapped this patch on arm-none-linux-gnueabihf and found no issues.
There is a problem with my git commit rights; could you please commit this patch on my behalf?
Regards
SRI.
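
As an aside for readers of the thread, the TARGET_VFP_BASE guard and the new MVE fgroups from the patch can be sketched in standalone C. This is a minimal model under stated assumptions, not GCC code: the BIT_* and GROUP_* names, struct target, and target_vfp_base are all hypothetical, and a plain bitmask stands in for GCC's ISA bitmap.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the ISA feature bits named in arm-cpus.in.  */
enum
{
  BIT_VFP_BASE  = 1 << 0,
  BIT_VFPV2     = 1 << 1,
  BIT_VFPV3     = 1 << 2,
  BIT_VFPV4     = 1 << 3,
  BIT_FP16CONV  = 1 << 4,
  BIT_FPV5      = 1 << 5,
  BIT_FP16      = 1 << 6,
  BIT_MVE       = 1 << 7,
  BIT_MVE_FLOAT = 1 << 8,
  BIT_ARMV7EM   = 1 << 9
};

/* Feature groups, mirroring the "define fgroup" lines in the patch.  */
#define GROUP_VFPV2  (BIT_VFP_BASE | BIT_VFPV2)
#define GROUP_VFPV3  (GROUP_VFPV2 | BIT_VFPV3)
#define GROUP_VFPV4  (GROUP_VFPV3 | BIT_VFPV4 | BIT_FP16CONV)
#define GROUP_FPV5   (GROUP_VFPV4 | BIT_FPV5)
#define GROUP_MVE    (BIT_MVE | BIT_VFP_BASE | BIT_ARMV7EM)
#define GROUP_MVE_FP (GROUP_MVE | GROUP_FPV5 | BIT_FP16 | BIT_MVE_FLOAT)

enum float_abi { FLOAT_ABI_SOFT, FLOAT_ABI_SOFTFP, FLOAT_ABI_HARD };

/* Hypothetical summary of the active target.  */
struct target
{
  unsigned isa;             /* Set of enabled feature bits.  */
  enum float_abi float_abi; /* Value of -mfloat-abi.  */
  bool general_regs_only;   /* Value of -mgeneral-regs-only.  */
};

/* Model of TARGET_VFP_BASE: true when the float ABI is not soft, the
   vfp_base bit is set, and general-regs-only is not in effect.  */
static bool
target_vfp_base (const struct target *t)
{
  return t->float_abi != FLOAT_ABI_SOFT
         && (t->isa & BIT_VFP_BASE) != 0
         && !t->general_regs_only;
}
```

Because both GROUP_VFPV2 and GROUP_MVE contain the vfp_base bit, a single test covers what was previously written as "TARGET_HARD_FLOAT || TARGET_HAVE_MVE" in several places.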
________________________________
From: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
Sent: 12 March 2020 11:16
To: Srinath Parvathaneni <Srinath.Parvathaneni@arm.com>; gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
Hi Srinath,
On 3/10/20 6:19 PM, Srinath Parvathaneni wrote:
> Hello Kyrill,
>
> This patch addresses all the comments in patch version v2.
> (version v2)
> https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540416.html
>
> ####
>
>
> Hello,
>
> This patch is part of MVE ACLE intrinsics framework.
> This patch adds support for updating (reading/writing) the APSR
> (Application Program Status Register) and the FPSCR (Floating-point
> Status and Control Register) for MVE.
> This patch also enables thumb2 mov RTL patterns for MVE.
>
> A new feature bit vfp_base is added. This bit is enabled for all VFP,
> MVE and MVE with floating point extensions, and it is used to enable
> the macro TARGET_VFP_BASE. Until now, all the VFP instructions, RTL
> patterns, and status and control registers have been guarded by
> TARGET_HARD_FLOAT. This patch changes that: the instructions, RTL
> patterns, and status and control registers common to MVE and VFP are
> now guarded by the TARGET_VFP_BASE macro.
>
> The RTL patterns set_fpscr and get_fpscr are updated to use
> VFPCC_REGNUM because a few MVE intrinsics set or get the carry bit of
> the FPSCR register.
>
> Please refer to Arm reference manual [1] for more details.
> [1] https://developer.arm.com/docs/ddi0553/latest
>
> Regression tested on target arm-none-eabi and armeb-none-eabi and
> found no regressions.
>
> Ok for trunk?
Ok, but make sure it bootstraps on arm-none-linux-gnueabihf (as with the
other patches in this series)
Thanks,
Kyrill
>
> Thanks,
> Srinath
> gcc/ChangeLog:
>
> 2020-03-06 Andre Vieira <andre.simoesdiasvieira@arm.com>
> Mihail Ionescu <mihail.ionescu@arm.com>
> Srinath Parvathaneni <srinath.parvathaneni@arm.com>
>
> * common/config/arm/arm-common.c (arm_asm_auto_mfpu): When
> vfp_base
> feature bit is on and -mfpu=auto is passed as compiler option,
> do not
> generate error on not finding any match fpu. Because in this
> case fpu
> is not required.
> * config/arm/arm-cpus.in (vfp_base): Define feature bit, this
> bit is
> enabled for MVE and also for all VFP extensions.
> (VFPv2): Modify fgroup to enable vfp_base feature bit whenever VFPv2
> is enabled.
> (MVE): Define fgroup to enable feature bits mve, vfp_base and
> armv7em.
> (MVE_FP): Define fgroup to enable the feature bits in fgroups MVE
> and FPv5 along with feature bits fp16 and mve_float.
> (mve): Modify add options in armv8.1-m.main arch for MVE.
> (mve.fp): Modify add options in armv8.1-m.main arch for MVE with
> floating point.
> * config/arm/arm.c (use_return_insn): Replace the
> check with TARGET_VFP_BASE.
> (thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with
> TARGET_VFP_BASE.
> (arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT ||
> TARGET_HAVE_MVE"
> with TARGET_VFP_BASE, to allow cost calculations for copies in
> MVE as
> well.
> (arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with
> TARGET_VFP_BASE, to allow space calculation for VFP registers
> in MVE
> as well.
> (arm_compute_frame_layout): Likewise.
> (arm_save_coproc_regs): Likewise.
> (arm_fixed_condition_code_regs): Modify to enable using
> VFPCC_REGNUM
> in MVE as well.
> (arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT ||
> TARGET_HAVE_MVE"
> with equivalent macro TARGET_VFP_BASE.
> (arm_expand_epilogue_apcs_frame): Likewise.
> (arm_expand_epilogue): Likewise.
> (arm_conditional_register_usage): Likewise.
> (arm_declare_function_name): Add check to skip printing .fpu
> directive
> in assembly file when TARGET_VFP_BASE is enabled and
> fpu_to_print is
> "softvfp".
> * config/arm/arm.h (TARGET_VFP_BASE): Define.
> * config/arm/arm.md (arch): Add "mve" to arch.
> (eq_attr "arch" "mve"): Enable when TARGET_HAVE_MVE is true.
> (vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT
> || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE.
> * config/arm/constraints.md (Uf): Define to allow modification
> to FPCCR
> in MVE.
> * config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify
> target guard
> to not allow for MVE.
> * config/arm/unspecs.md (UNSPEC_GET_FPSCR): Define in the unspecs
> enum.
> (VUNSPEC_GET_FPSCR): Remove from the volatile unspecs enum.
> * config/arm/vfp.md (thumb2_movhi_vfp): Add support for VMSR
> and VMRS
> instructions which move to general-purpose Register from
> Floating-point
> Special register and vice-versa.
> (thumb2_movhi_fp16): Likewise.
> (thumb2_movsi_vfp): Add support for VMSR and VMRS instructions
> along
> with MCR and MRC instructions which set and get Floating-point
> Status
> and Control Register (FPSCR).
> (movdi_vfp): Modify pattern to enable Single-precision scalar
> float move
> in MVE.
> (thumb2_movdf_vfp): Modify pattern to enable Double-precision
> scalar
> float move patterns in MVE.
> (thumb2_movsfcc_vfp): Modify pattern to enable single float
> conditional
> code move patterns of VFP also in MVE by adding
> TARGET_VFP_BASE check.
> (thumb2_movdfcc_vfp): Modify pattern to enable double float
> conditional
> code move patterns of VFP also in MVE by adding
> TARGET_VFP_BASE check.
> (push_multi_vfp): Add support to use VFP VPUSH pattern for MVE
> by adding
> TARGET_VFP_BASE check.
> (set_fpscr): Add support to set FPSCR register for MVE. Modify
> pattern
> using VFPCC_REGNUM as few MVE intrinsics use carry bit of FPSCR
> register.
> (get_fpscr): Add support to get FPSCR register for MVE. Modify
> pattern
> using VFPCC_REGNUM as few MVE intrinsics use carry bit of FPSCR
> register.
>
> gcc/testsuite/ChangeLog:
>
> 2020-03-06 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
>
> * gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: New test.
> * gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
> * gcc.target/arm/mve/intrinsics/mve_fpu1.c: Likewise.
> * gcc.target/arm/mve/intrinsics/mve_fpu2.c: Likewise.
> * gcc.target/arm/mve/intrinsics/mve_fpu3.c: Likewise.
>
>
> ############### Attachment also inlined for ease of reply
> ###############
>
>
> diff --git a/gcc/common/config/arm/arm-common.c
> b/gcc/common/config/arm/arm-common.c
> index
> 30a2a1deb864ee22d48cebb08247176640524955..83cc68009ac16a89ab5515f19d4eb84f595e33f1
> 100644
> --- a/gcc/common/config/arm/arm-common.c
> +++ b/gcc/common/config/arm/arm-common.c
> @@ -1009,7 +1009,8 @@ arm_asm_auto_mfpu (int argc, const char **argv)
> }
> }
>
> - gcc_assert (i != TARGET_FPU_auto);
> + gcc_assert (i != TARGET_FPU_auto
> + || bitmap_bit_p (arm_active_target.isa,
> isa_bit_vfp_base));
> }
>
> auto_fpu = (char *) xmalloc (strlen (fpuname) + sizeof ("-mfpu="));
> diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
> index
> 96f584da325172bd1460251e2de0ad679589d312..77b43090d69a599d8806cfcc02037e1bbed6e7a1
> 100644
> --- a/gcc/config/arm/arm-cpus.in
> +++ b/gcc/config/arm/arm-cpus.in
> @@ -135,6 +135,10 @@ define feature armv8_1m_main
> # Floating point and Neon extensions.
> # VFPv1 is not supported in GCC.
>
> +# This feature bit is enabled for all VFP, MVE and
> +# MVE with floating point extensions.
> +define feature vfp_base
> +
> # Vector floating point v2.
> define feature vfpv2
>
> @@ -234,7 +238,7 @@ define fgroup ALL_SIMD ALL_SIMD_INTERNAL
> ALL_SIMD_EXTERNAL
>
> # List of all FPU bits to strip out if -mfpu is used to override the
> # default. fp16 is deliberately missing from this list.
> -define fgroup ALL_FPU_INTERNAL vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl
> ALL_SIMD_INTERNAL
> +define fgroup ALL_FPU_INTERNAL vfp_base vfpv2 vfpv3 vfpv4 fpv5
> fp16conv fp_dbl ALL_SIMD_INTERNAL
> # Similarly, but including fp16 and other extensions that aren't part of
> # -mfpu support.
> define fgroup ALL_FPU_EXTERNAL fp16 bf16
> @@ -279,10 +283,12 @@ define fgroup ARMv8r ARMv8a
> define fgroup ARMv8_1m_main ARMv8m_main armv8_1m_main
>
> # Useful combinations.
> -define fgroup VFPv2 vfpv2
> +define fgroup VFPv2 vfp_base vfpv2
> define fgroup VFPv3 VFPv2 vfpv3
> define fgroup VFPv4 VFPv3 vfpv4 fp16conv
> define fgroup FPv5 VFPv4 fpv5
> +define fgroup MVE mve vfp_base armv7em
> +define fgroup MVE_FP MVE FPv5 fp16 mve_float
>
> define fgroup FP_DBL fp_dbl
> define fgroup FP_D32 FP_DBL fp_d32
> @@ -699,8 +705,8 @@ begin arch armv8.1-m.main
> option fp add FPv5 fp16
> option fp.dp add FPv5 FP_DBL fp16
> option nofp remove ALL_FP
> - option mve add mve armv7em
> - option mve.fp add mve FPv5 fp16 mve_float armv7em
> + option mve add MVE
> + option mve.fp add MVE_FP
> end arch armv8.1-m.main
>
> begin arch iwmmxt
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index
> a0283ed62c8047fe1ccbbb9b639ad34771fe46c2..c7453412959f23bf25c2052b4e0bb6a95faf3163
> 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -334,6 +334,19 @@ emission of floating point pcs attributes. */
> isa_bit_mve_float) \
> && !TARGET_GENERAL_REGS_ONLY)
>
> +/* MVE have few common instructions as VFP, like VLDM alias VPOP,
> VLDR, VSTM
> + alias VPUSH, VSTR and VMOV, VMSR and VMRS. In the same manner it
> updates few
> + registers such as FPCAR, FPCCR, FPDSCR, FPSCR, MVFR0, MVFR1 and
> MVFR2. All
> + the VFP instructions, RTL patterns and register are guarded by
> + TARGET_HARD_FLOAT. But the common instructions, RTL pattern and
> registers
> + between MVE and VFP will be guarded by the following macro
> TARGET_VFP_BASE
> + hereafter. */
> +
> +#define TARGET_VFP_BASE (arm_float_abi != ARM_FLOAT_ABI_SOFT \
> + && bitmap_bit_p (arm_active_target.isa, \
> + isa_bit_vfp_base) \
> + && !TARGET_GENERAL_REGS_ONLY)
> +
> /* Nonzero if integer division instructions supported. */
> #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
> || (TARGET_THUMB && arm_arch_thumb_hwdiv))
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index
> c769104a93746cd7c02b46b82f1a8f8057b9ae62..b40904a40e0979af4285fdbd85bfae55abea25dd
> 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -4295,7 +4295,7 @@ use_return_insn (int iscond, rtx sibling)
>
> /* Can't be done if any of the VFP regs are pushed,
> since this also requires an insn. */
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++)
> if (df_regs_ever_live_p (regno) && !call_used_or_fixed_reg_p
> (regno))
> return 0;
> @@ -6289,7 +6289,7 @@ use_vfp_abi (enum arm_pcs pcs_variant, bool
> is_double)
> return false;
>
> return (TARGET_32BIT && TARGET_HARD_FLOAT &&
> - (TARGET_VFP_DOUBLE || !is_double));
> + (TARGET_VFP_DOUBLE || !is_double));
> }
>
> /* Return true if an argument whose type is TYPE, or mode is MODE, is
> @@ -8512,7 +8512,7 @@ thumb2_legitimate_index_p (machine_mode mode,
> rtx index, int strict_p)
>
> /* ??? Combine arm and thumb2 coprocessor addressing modes. */
> /* Standard coprocessor addressing modes. */
> - if (TARGET_HARD_FLOAT
> + if (TARGET_VFP_BASE
> && (mode == SFmode || mode == DFmode))
> return (code == CONST_INT && INTVAL (index) < 1024
> /* Thumb-2 allows only > -256 index range for it's core
> register
> @@ -9905,7 +9905,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code
> code, enum rtx_code outer_code,
> /* Assume that most copies can be done with a single insn,
> unless we don't have HW FP, in which case everything
> larger than word mode will require two insns. */
> - *cost = COSTS_N_INSNS (((!(TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + *cost = COSTS_N_INSNS (((!TARGET_VFP_BASE
> && GET_MODE_SIZE (mode) > 4)
> || mode == DImode)
> ? 2 : 1);
> @@ -20821,7 +20821,7 @@ arm_get_vfp_saved_size (void)
>
> saved = 0;
> /* Space for saved VFP registers. */
> - if (TARGET_HARD_FLOAT)
> + if (TARGET_VFP_BASE)
> {
> count = 0;
> for (regno = FIRST_VFP_REGNUM;
> @@ -22364,7 +22364,7 @@ arm_compute_frame_layout (void)
> func_type = arm_current_func_type ();
> /* Space for saved VFP registers. */
> if (! IS_VOLATILE (func_type)
> - && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
> + && TARGET_VFP_BASE)
> saved += arm_get_vfp_saved_size ();
>
> /* Allocate space for saving/restoring FPCXTNS in Armv8.1-M
> Mainline
> @@ -22588,7 +22588,7 @@ arm_save_coproc_regs(void)
> saved_size += 8;
> }
>
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> {
> start_reg = FIRST_VFP_REGNUM;
>
> @@ -24546,7 +24546,7 @@ arm_fixed_condition_code_regs (unsigned int
> *p1, unsigned int *p2)
> return false;
>
> *p1 = CC_REGNUM;
> - *p2 = TARGET_HARD_FLOAT ? VFPCC_REGNUM : INVALID_REGNUM;
> + *p2 = TARGET_VFP_BASE ? VFPCC_REGNUM : INVALID_REGNUM;
> return true;
> }
>
> @@ -24965,7 +24965,7 @@ arm_hard_regno_mode_ok (unsigned int regno,
> machine_mode mode)
> {
> if (GET_MODE_CLASS (mode) == MODE_CC)
> return (regno == CC_REGNUM
> - || ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + || (TARGET_VFP_BASE
> && regno == VFPCC_REGNUM));
>
> if (regno == CC_REGNUM && GET_MODE_CLASS (mode) != MODE_CC)
> @@ -24982,7 +24982,7 @@ arm_hard_regno_mode_ok (unsigned int regno,
> machine_mode mode)
> start of an even numbered register pair. */
> return (ARM_NUM_REGS (mode) < 2) || (regno < LAST_LO_REGNUM);
>
> - if ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) && IS_VFP_REGNUM (regno))
> + if (TARGET_VFP_BASE && IS_VFP_REGNUM (regno))
> {
> if (mode == DFmode)
> return VFP_REGNO_OK_FOR_DOUBLE (regno);
> @@ -26933,7 +26933,7 @@ arm_expand_epilogue_apcs_frame (bool
> really_return)
> floats_from_frame += 4;
> }
>
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> {
> int start_reg;
> rtx ip_rtx = gen_rtx_REG (SImode, IP_REGNUM);
> @@ -27179,7 +27179,7 @@ arm_expand_epilogue (bool really_return)
> }
> }
>
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> {
> /* Generate VFP register multi-pop. */
> int end_reg = LAST_VFP_REGNUM + 1;
> @@ -29699,7 +29699,7 @@ arm_conditional_register_usage (void)
> if (TARGET_THUMB1)
> fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1;
>
> - if (TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
> + if (TARGET_32BIT && TARGET_VFP_BASE)
> {
> /* VFPv3 registers are disabled when earlier VFP
> versions are selected due to the definition of
> @@ -32478,7 +32478,8 @@ arm_declare_function_name (FILE *stream, const
> char *name, tree decl)
> = TARGET_SOFT_FLOAT
> ? "softvfp" : arm_identify_fpu_from_isa (arm_active_target.isa);
>
> - if (fpu_to_print != arm_last_printed_arch_string)
> + if (!(!strcmp (fpu_to_print.c_str (), "softvfp") && TARGET_VFP_BASE)
> + && (fpu_to_print != arm_last_printed_arch_string))
> {
> asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_to_print.c_str ());
> arm_last_printed_fpu_string = fpu_to_print;
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index
> 8f8c91d5fe146ed64cd4eb5450f04b3cf0c0ed18..5387f972f5a864a153873f21b9423d28446daefc
> 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -134,7 +134,7 @@
> ; arm_arch6. "v6t2" for Thumb-2 with arm_arch6 and "v8mb" for ARMv8-M
> ; Baseline. This attribute is used to compute attribute "enabled",
> ; use type "any" to enable an alternative in all cases.
> -(define_attr "arch"
> "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
> +(define_attr "arch"
> "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon,mve"
> (const_string "any"))
>
> (define_attr "arch_enabled" "no,yes"
> @@ -188,6 +188,10 @@
> (and (eq_attr "arch" "neon")
> (match_test "TARGET_NEON"))
> (const_string "yes")
> +
> + (and (eq_attr "arch" "mve")
> + (match_test "TARGET_HAVE_MVE"))
> + (const_string "yes")
> ]
>
> (const_string "no")))
> @@ -11758,7 +11762,7 @@
> (match_operand:SI 2 "const_int_I_operand" "I")))
> (set (match_operand:DF 3 "vfp_hard_register_operand" "")
> (mem:DF (match_dup 1)))])]
> - "TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)"
> + "TARGET_32BIT && TARGET_VFP_BASE"
> "*
> {
> int num_regs = XVECLEN (operands[0], 0);
> diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
> index
> a12de97cdaab589e0c8704b408ac4c329def416d..bf8f4ff1e5d2d6132d0afdd05255cc697c54159d
> 100644
> --- a/gcc/config/arm/constraints.md
> +++ b/gcc/config/arm/constraints.md
> @@ -38,7 +38,7 @@
> ;; in all states: Pf, Pg
>
> ;; The following memory constraints have been used:
> -;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up
> +;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up, Uf
> ;; in ARM state: Uq
> ;; in Thumb state: Uu, Uw
> ;; in all states: Q
> @@ -46,6 +46,9 @@
> (define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS"
> "MVE VPR register")
>
> +(define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS"
> + "MVE FPCCR register")
> +
> (define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
> "The VFP registers @code{s0}-@code{s31}.")
>
> diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
> index
> b0d3bd1cf1c484927e6ac6522bc30f0f089291c7..793f67068687a60abf94c230e5485a1eb2eca6a0
> 100644
> --- a/gcc/config/arm/thumb2.md
> +++ b/gcc/config/arm/thumb2.md
> @@ -517,7 +517,7 @@
> [(match_operand 4 "cc_register" "")
> (const_int 0)])
> (match_operand:SF 1 "s_register_operand" "0,r")
> (match_operand:SF 2 "s_register_operand"
> "r,0")))]
> - "TARGET_THUMB2 && TARGET_SOFT_FLOAT"
> + "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE"
> "@
> it\\t%D3\;mov%D3\\t%0, %2
> it\\t%d3\;mov%d3\\t%0, %1"
> diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
> index
> f0b1f465de4b63d624510783576700519044717d..e76609f79418af38b70746336dd43592a1dc8713
> 100644
> --- a/gcc/config/arm/unspecs.md
> +++ b/gcc/config/arm/unspecs.md
> @@ -170,6 +170,7 @@
> UNSPEC_TORC ; Used by the intrinsic form of the iWMMXt
> TORC instruction.
> UNSPEC_TORVSC ; Used by the intrinsic form of the
> iWMMXt TORVSC instruction.
> UNSPEC_TEXTRC ; Used by the intrinsic form of the
> iWMMXt TEXTRC instruction.
> + UNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
> ])
>
>
> @@ -216,7 +217,6 @@
> VUNSPEC_SLX ; Represent a store-register-release-exclusive.
> VUNSPEC_LDA ; Represent a store-register-acquire.
> VUNSPEC_STL ; Represent a store-register-release.
> - VUNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
> VUNSPEC_SET_FPSCR ; Represent assign of FPSCR content.
> VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
> VUNSPEC_CDP ; Represent the coprocessor cdp instruction.
> diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
> index
> ab16a6b0eac822b4e1a1ae4dcbe39491a82cc9fe..eb6ae7bea7927c666f36219797d54c0127001bc1
> 100644
> --- a/gcc/config/arm/vfp.md
> +++ b/gcc/config/arm/vfp.md
> @@ -74,10 +74,10 @@
> (define_insn "*thumb2_movhi_vfp"
> [(set
> (match_operand:HI 0 "nonimmediate_operand"
> - "=rk, r, l, r, m, r, *t, r, *t")
> + "=rk, r, l, r, m, r, *t, r, *t, Up, r")
> (match_operand:HI 1 "general_operand"
> - "rk, I, Py, n, r, m, r, *t, *t"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && !TARGET_VFP_FP16INST
> && (register_operand (operands[0], HImode)
> || register_operand (operands[1], HImode))"
> @@ -99,20 +99,24 @@
> return "vmov%?\t%0, %1\t%@ int";
> case 8:
> return "vmov%?.f32\t%0, %1\t%@ int";
> + case 9:
> + return "vmsr%?\t P0, %1\t@ movhi";
> + case 10:
> + return "vmrs%?\t %0, P0\t@ movhi";
> default:
> gcc_unreachable ();
> }
> }
> [(set_attr "predicable" "yes")
> (set_attr "predicable_short_it"
> - "yes, no, yes, no, no, no, no, no, no")
> + "yes, no, yes, no, no, no, no, no, no, no, no")
> (set_attr "type"
> "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
> - f_mcr, f_mrc, fmov")
> - (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
> - (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
> - (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
> - (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
> + f_mcr, f_mrc, fmov, mve_move, mve_move")
> + (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
> + (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
> + (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
> + (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
> )
>
> ;; Patterns for HI moves which provide more data transfer
> instructions when FP16
> @@ -170,10 +174,10 @@
> (define_insn "*thumb2_movhi_fp16"
> [(set
> (match_operand:HI 0 "nonimmediate_operand"
> - "=rk, r, l, r, m, r, *t, r, *t")
> + "=rk, r, l, r, m, r, *t, r, *t, Up, r")
> (match_operand:HI 1 "general_operand"
> - "rk, I, Py, n, r, m, r, *t, *t"))]
> - "TARGET_THUMB2 && TARGET_VFP_FP16INST
> + "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
> + "TARGET_THUMB2 && (TARGET_VFP_FP16INST || TARGET_HAVE_MVE)
> && (register_operand (operands[0], HImode)
> || register_operand (operands[1], HImode))"
> {
> @@ -194,21 +198,25 @@
> return "vmov.f16\t%0, %1\t%@ int";
> case 8:
> return "vmov%?.f32\t%0, %1\t%@ int";
> + case 9:
> + return "vmsr%?\tP0, %1\t%@ movhi";
> + case 10:
> + return "vmrs%?\t%0, P0\t%@ movhi";
> default:
> gcc_unreachable ();
> }
> }
> [(set_attr "predicable"
> - "yes, yes, yes, yes, yes, yes, no, no, yes")
> + "yes, yes, yes, yes, yes, yes, no, no, yes, yes, yes")
> (set_attr "predicable_short_it"
> - "yes, no, yes, no, no, no, no, no, no")
> + "yes, no, yes, no, no, no, no, no, no, no, no")
> (set_attr "type"
> "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
> - f_mcr, f_mrc, fmov")
> - (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
> - (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
> - (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
> - (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
> + f_mcr, f_mrc, fmov, mve_move, mve_move")
> + (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
> + (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
> + (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
> + (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
> )
>
> ;; SImode moves
> @@ -258,9 +266,11 @@
> ;; is chosen with length 2 when the instruction is predicated for
> ;; arm_restrict_it.
> (define_insn "*thumb2_movsi_vfp"
> - [(set (match_operand:SI 0 "nonimmediate_operand"
> "=rk,r,l,r,r,lk*r,m,*t, r,*t,*t, *Uv")
> - (match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,lk*r,
> r,*t,*t,*UvTu,*t"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + [(set (match_operand:SI 0 "nonimmediate_operand"
> "=rk,r,l,r,r,l,*hk,m,*m,*t,\
> + r,*t,*t,*Uv, Up, r,Uf,r")
> + (match_operand:SI 1 "general_operand"
> "rk,I,Py,K,j,mi,*mi,l,*hk,r,*t,\
> + *t,*UvTu,*t, r, Up,r,Uf"))]
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && ( s_register_operand (operands[0], SImode)
> || s_register_operand (operands[1], SImode))"
> "*
> @@ -275,30 +285,44 @@
> case 4:
> return \"movw%?\\t%0, %1\";
> case 5:
> + case 6:
> /* Cannot load it directly, split to load it via MOV / MOVT. */
> if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> return \"#\";
> return \"ldr%?\\t%0, %1\";
> - case 6:
> - return \"str%?\\t%1, %0\";
> case 7:
> - return \"vmov%?\\t%0, %1\\t%@ int\";
> case 8:
> - return \"vmov%?\\t%0, %1\\t%@ int\";
> + return \"str%?\\t%1, %0\";
> case 9:
> + return \"vmov%?\\t%0, %1\\t%@ int\";
> + case 10:
> + return \"vmov%?\\t%0, %1\\t%@ int\";
> + case 11:
> return \"vmov%?.f32\\t%0, %1\\t%@ int\";
> - case 10: case 11:
> + case 12: case 13:
> return output_move_vfp (operands);
> + case 14:
> + return \"vmsr\\t P0, %1\";
> + case 15:
> + return \"vmrs\\t %0, P0\";
> + case 16:
> + return \"mcr\\tp10, 7, %1, cr1, cr0, 0\\t @SET_FPSCR\";
> + case 17:
> + return \"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR\";
> default:
> gcc_unreachable ();
> }
> "
> [(set_attr "predicable" "yes")
> - (set_attr "predicable_short_it"
> "yes,no,yes,no,no,no,no,no,no,no,no,no")
> - (set_attr "type"
> "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores")
> - (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4")
> - (set_attr "pool_range" "*,*,*,*,*,1018,*,*,*,*,1018,*")
> - (set_attr "neg_pool_range" "*,*,*,*,*, 0,*,*,*,*,1008,*")]
> + (set_attr "predicable_short_it"
> "yes,no,yes,no,no,no,no,no,no,no,no,no,no,\
> + no,no,no,no,no")
> + (set_attr "type"
> "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,load_4,\
> + store_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores,mve_move,\
> + mve_move,mrs,mrs")
> + (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4")
> + (set_attr "pool_range" "*,*,*,*,*,1018,4094,*,*,*,*,*,1018,*,*,*,*,*")
> + (set_attr "arch" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,mve,mve,mve,mve")
> + (set_attr "neg_pool_range" "*,*,*,*,*, 0,
> 0,*,*,*,*,*,1008,*,*,*,*,*")]
> )
>
>
> @@ -306,12 +330,12 @@
>
> (define_insn "*movdi_vfp"
> [(set (match_operand:DI 0 "nonimmediate_di_operand"
> "=r,r,r,r,r,r,m,w,!r,w,w, Uv")
> - (match_operand:DI 1 "di_operand"
> "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
> - "TARGET_32BIT && TARGET_HARD_FLOAT
> + (match_operand:DI 1 "di_operand"
> "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
> + "TARGET_32BIT && TARGET_VFP_BASE
> && ( register_operand (operands[0], DImode)
> || register_operand (operands[1], DImode))
> - && !(TARGET_NEON && CONST_INT_P (operands[1])
> - && simd_immediate_valid_for_move (operands[1], DImode, NULL,
> NULL))"
> + && !((TARGET_NEON || TARGET_HAVE_MVE) && CONST_INT_P (operands[1])
> + && simd_immediate_valid_for_move (operands[1], DImode, NULL,
> NULL))"
> "*
> switch (which_alternative)
> {
> @@ -333,7 +357,7 @@
> case 8:
> return \"vmov%?\\t%Q0, %R0, %P1\\t%@ int\";
> case 9:
> - if (TARGET_VFP_SINGLE)
> + if (TARGET_VFP_SINGLE || TARGET_HAVE_MVE)
> return \"vmov%?.f32\\t%0, %1\\t%@ int\;vmov%?.f32\\t%p0,
> %p1\\t%@ int\";
> else
> return \"vmov%?.f64\\t%P0, %P1\\t%@ int\";
> @@ -390,9 +414,15 @@
> case 6: /* S register from immediate. */
> return \"vmov.f16\\t%0, %1\t%@ __<fporbf>\";
> case 7: /* S register from memory. */
> - return \"vld1.16\\t{%z0}, %A1\";
> + if (TARGET_HAVE_MVE)
> + return \"vldr.16\\t%0, %A1\";
> + else
> + return \"vld1.16\\t{%z0}, %A1\";
> case 8: /* Memory from S register. */
> - return \"vst1.16\\t{%z1}, %A0\";
> + if (TARGET_HAVE_MVE)
> + return \"vstr.16\\t%1, %A0\";
> + else
> + return \"vst1.16\\t{%z1}, %A0\";
> case 9: /* ARM register from constant. */
> {
> long bits;
> @@ -593,7 +623,7 @@
> (define_insn "*thumb2_movsf_vfp"
> [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t ,Uv,r
> ,m,t,r")
> (match_operand:SF 1 "hard_sf_operand" " ?r,t,Dv,UvHa,t,
> mHa,r,t,r"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && ( s_register_operand (operands[0], SFmode)
> || s_register_operand (operands[1], SFmode))"
> "*
> @@ -682,7 +712,7 @@
> (define_insn "*thumb2_movdf_vfp"
> [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w
> ,w,w ,Uv,r ,m,w,r")
> (match_operand:DF 1 "hard_df_operand" " ?r,w,Dy,G,UvHa,w,
> mHa,r, w,r"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && ( register_operand (operands[0], DFmode)
> || register_operand (operands[1], DFmode))"
> "*
> @@ -760,7 +790,7 @@
> [(match_operand 4 "cc_register" "") (const_int 0)])
> (match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t")
> (match_operand:SF 2 "s_register_operand"
> "t,0,t,?r,0,?r,t,0,t")))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT && !arm_restrict_it"
> + "TARGET_THUMB2 && TARGET_VFP_BASE && !arm_restrict_it"
> "@
> it\\t%D3\;vmov%D3.f32\\t%0, %2
> it\\t%d3\;vmov%d3.f32\\t%0, %1
> @@ -806,7 +836,8 @@
> [(match_operand 4 "cc_register" "") (const_int 0)])
> (match_operand:DF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w")
> (match_operand:DF 2 "s_register_operand"
> "w,0,w,?r,0,?r,w,0,w")))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE &&
> !arm_restrict_it"
> + "TARGET_THUMB2 && TARGET_VFP_BASE && TARGET_VFP_DOUBLE
> + && !arm_restrict_it"
> "@
> it\\t%D3\;vmov%D3.f64\\t%P0, %P2
> it\\t%d3\;vmov%d3.f64\\t%P0, %P1
> @@ -1977,7 +2008,7 @@
> [(set (match_operand:BLK 0 "memory_operand" "=m")
> (unspec:BLK [(match_operand:DF 1 "vfp_register_operand" "")]
> UNSPEC_PUSH_MULT))])]
> - "TARGET_32BIT && TARGET_HARD_FLOAT"
> + "TARGET_32BIT && TARGET_VFP_BASE"
> "* return vfp_output_vstmd (operands);"
> [(set_attr "type" "f_stored")]
> )
> @@ -2065,16 +2096,18 @@
>
> ;; Write Floating-point Status and Control Register.
> (define_insn "set_fpscr"
> - [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")]
> VUNSPEC_SET_FPSCR)]
> - "TARGET_HARD_FLOAT"
> + [(set (reg:SI VFPCC_REGNUM)
> + (unspec_volatile:SI
> + [(match_operand:SI 0 "register_operand" "r")]
> VUNSPEC_SET_FPSCR))]
> + "TARGET_VFP_BASE"
> "mcr\\tp10, 7, %0, cr1, cr0, 0\\t @SET_FPSCR"
> [(set_attr "type" "mrs")])
>
> ;; Read Floating-point Status and Control Register.
> (define_insn "get_fpscr"
> [(set (match_operand:SI 0 "register_operand" "=r")
> - (unspec_volatile:SI [(const_int 0)] VUNSPEC_GET_FPSCR))]
> - "TARGET_HARD_FLOAT"
> + (unspec:SI [(reg:SI VFPCC_REGNUM)] UNSPEC_GET_FPSCR))]
> + "TARGET_VFP_BASE"
> "mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR"
> [(set_attr "type" "mrs")])
>
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..17ba616c041378b88463cb7ef150b70b2e7b95ad
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve.fp
> -mfloat-abi=hard -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..7b877c4a90c506343d6b4edb750ba06ce3d7a68d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve.fp
> -mfloat-abi=softfp -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..85fbb5767edc3c25ceb4d6da780d47afa1ee416c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve
> -mfloat-abi=hard -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..23b3683ae559b3f7bf6c3ad11c4070ad2ddb9387
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve
> -mfloat-abi=softfp -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..8f7fa348d130e8456d5300ac25821fd96f9d5a97
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve
> -mfloat-abi=soft -mthumb" } */
> +
> +int
> +foo1 (int value)
> +{
> + int b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu softvfp" } } */
>
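The five new tests pin down when the .fpu directive reaches the assembly output. The following is a minimal C model of that decision, offered as a sketch rather than GCC's actual code. The names struct target, target_hard_float, target_vfp_base and fpu_directive are hypothetical. The fpu_name field stands in for whatever arm_identify_fpu_from_isa would return. TARGET_HARD_FLOAT is assumed to require a non-soft float ABI plus the vfpv2 feature bit. The comparison against the last printed string is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for GCC's target state; not the real data
   structures.  */
enum float_abi { ABI_SOFT, ABI_SOFTFP, ABI_HARD };

struct target
{
  enum float_abi abi;
  bool has_vfpv2;          /* models isa_bit_vfpv2 */
  bool has_vfp_base;       /* models isa_bit_vfp_base */
  bool general_regs_only;  /* models TARGET_GENERAL_REGS_ONLY */
  const char *fpu_name;    /* assumed result of arm_identify_fpu_from_isa */
};

/* Assumption: TARGET_HARD_FLOAT needs a non-soft ABI and the vfpv2 bit.  */
static bool
target_hard_float (const struct target *t)
{
  return t->abi != ABI_SOFT && t->has_vfpv2;
}

/* Mirrors the TARGET_VFP_BASE definition the patch adds to arm.h.  */
static bool
target_vfp_base (const struct target *t)
{
  return t->abi != ABI_SOFT && t->has_vfp_base && !t->general_regs_only;
}

/* Return the name to print after ".fpu", or NULL when the directive is
   skipped because it would say "softvfp" while TARGET_VFP_BASE holds.  */
static const char *
fpu_directive (const struct target *t)
{
  assert (t != NULL);
  const char *fpu = target_hard_float (t) ? t->fpu_name : "softvfp";
  if (strcmp (fpu, "softvfp") == 0 && target_vfp_base (t))
    return NULL;
  return fpu;
}
```

Under these assumptions, +mve with -mfloat-abi=hard (vfp_base set, vfpv2 clear) yields NULL, which matches the scan-assembler-not check in mve_fpu1.c. With -mfloat-abi=soft the model returns "softvfp", as mve_fpu3.c expects.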
* Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
2020-03-16 10:54 ` Srinath Parvathaneni
@ 2020-03-16 12:13 ` Srinath Parvathaneni
0 siblings, 0 replies; 4+ messages in thread
From: Srinath Parvathaneni @ 2020-03-16 12:13 UTC (permalink / raw)
To: Kyrill Tkachov, gcc-patches
Hi Kyrill,
I have rebased this patch; please commit the following version on my behalf.
https://gcc.gnu.org/pipermail/gcc-patches/2020-March/541826.html
Regards,
SRI.
________________________________
From: Gcc-patches <gcc-patches-bounces@gcc.gnu.org> on behalf of Srinath Parvathaneni <Srinath.Parvathaneni@arm.com>
Sent: 16 March 2020 10:54
To: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>; gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
Hi Kyrill,
> Ok, but make sure it bootstraps on arm-none-linux-gnueabihf (as with the
> other patches in this series)
I have bootstrapped this patch on arm-none-linux-gnueabihf and found no issues.
There is a problem with my git commit rights; could you commit this patch on my behalf?
Regards
SRI.
________________________________
From: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
Sent: 12 March 2020 11:16
To: Srinath Parvathaneni <Srinath.Parvathaneni@arm.com>; gcc-patches@gcc.gnu.org <gcc-patches@gcc.gnu.org>
Subject: Re: [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch.
Hi Srinath,
On 3/10/20 6:19 PM, Srinath Parvathaneni wrote:
> Hello Kyrill,
>
> This patch addresses all the comments in patch version v2.
> (version v2)
> https://gcc.gnu.org/pipermail/gcc-patches/2020-February/540416.html
>
> ####
>
>
> Hello,
>
> This patch is part of MVE ACLE intrinsics framework.
> This patch adds support for reading and writing the APSR (Application
> Program Status Register) and the FPSCR (Floating-point Status and
> Control Register) for MVE.
> This patch also enables thumb2 mov RTL patterns for MVE.
>
> A new feature bit vfp_base is added. This bit is enabled for all VFP,
> MVE and MVE with floating point extensions, and it is used to enable
> the macro TARGET_VFP_BASE. Until now, all the VFP instructions, RTL
> patterns, and status and control registers were guarded by
> TARGET_HARD_FLOAT. This patch changes that: the instructions, RTL
> patterns, and status and control registers common to MVE and VFP are
> now guarded by the TARGET_VFP_BASE macro.
>
> The RTL patterns set_fpscr and get_fpscr are updated to use
> VFPCC_REGNUM because a few MVE intrinsics set/get the carry bit of
> the FPSCR register.
>
> Please refer to Arm reference manual [1] for more details.
> [1] https://developer.arm.com/docs/ddi0553/latest
>
> Regression tested on target arm-none-eabi and armeb-none-eabi and
> found no regressions.
>
> Ok for trunk?
Ok, but make sure it bootstraps on arm-none-linux-gnueabihf (as with the
other patches in this series)
Thanks,
Kyrill
>
> Thanks,
> Srinath
> gcc/ChangeLog:
>
> 2020-03-06 Andre Vieira <andre.simoesdiasvieira@arm.com>
> Mihail Ionescu <mihail.ionescu@arm.com>
> Srinath Parvathaneni <srinath.parvathaneni@arm.com>
>
> * common/config/arm/arm-common.c (arm_asm_auto_mfpu): When the
> vfp_base feature bit is on and -mfpu=auto is passed as a compiler
> option, do not generate an error if no matching FPU is found, because
> in this case an FPU is not required.
> * config/arm/arm-cpus.in (vfp_base): Define feature bit, this
> bit is
> enabled for MVE and also for all VFP extensions.
> (VFPv2): Modify fgroup to enable the vfp_base feature bit whenever
> VFPv2 is enabled.
> (MVE): Define fgroup to enable feature bits mve, vfp_base and
> armv7em.
> (MVE_FP): Define fgroup to enable the feature bits of fgroups MVE
> and FPv5 along with feature bits fp16 and mve_float.
> (mve): Modify add options in armv8.1-m.main arch for MVE.
> (mve.fp): Modify add options in armv8.1-m.main arch for MVE with
> floating point.
> * config/arm/arm.c (use_return_insn): Replace the
> check with TARGET_VFP_BASE.
> (thumb2_legitimate_index_p): Replace TARGET_HARD_FLOAT with
> TARGET_VFP_BASE.
> (arm_rtx_costs_internal): Replace "TARGET_HARD_FLOAT ||
> TARGET_HAVE_MVE"
> with TARGET_VFP_BASE, to allow cost calculations for copies in
> MVE as
> well.
> (arm_get_vfp_saved_size): Replace TARGET_HARD_FLOAT with
> TARGET_VFP_BASE, to allow space calculation for VFP registers
> in MVE
> as well.
> (arm_compute_frame_layout): Likewise.
> (arm_save_coproc_regs): Likewise.
> (arm_fixed_condition_code_regs): Modify to enable using
> VFPCC_REGNUM
> in MVE as well.
> (arm_hard_regno_mode_ok): Replace "TARGET_HARD_FLOAT ||
> TARGET_HAVE_MVE"
> with equivalent macro TARGET_VFP_BASE.
> (arm_expand_epilogue_apcs_frame): Likewise.
> (arm_expand_epilogue): Likewise.
> (arm_conditional_register_usage): Likewise.
> (arm_declare_function_name): Add check to skip printing .fpu
> directive
> in assembly file when TARGET_VFP_BASE is enabled and
> fpu_to_print is
> "softvfp".
> * config/arm/arm.h (TARGET_VFP_BASE): Define.
> * config/arm/arm.md (arch): Add "mve" to arch.
> (eq_attr "arch" "mve"): Enable when TARGET_HAVE_MVE is true.
> (vfp_pop_multiple_with_writeback): Replace "TARGET_HARD_FLOAT
> || TARGET_HAVE_MVE" with equivalent macro TARGET_VFP_BASE.
> * config/arm/constraints.md (Uf): Define to allow modification
> to FPCCR
> in MVE.
> * config/arm/thumb2.md (thumb2_movsfcc_soft_insn): Modify
> target guard
> to not allow for MVE.
> * config/arm/unspecs.md (UNSPEC_GET_FPSCR): Define in the
> non-volatile unspecs enum.
> (VUNSPEC_GET_FPSCR): Remove from the volatile unspecs enum.
> * config/arm/vfp.md (thumb2_movhi_vfp): Add support for VMSR
> and VMRS
> instructions which move to general-purpose Register from
> Floating-point
> Special register and vice-versa.
> (thumb2_movhi_fp16): Likewise.
> (thumb2_movsi_vfp): Add support for VMSR and VMRS instructions
> along
> with MCR and MRC instructions which set and get Floating-point
> Status
> and Control Register (FPSCR).
> (movdi_vfp): Modify pattern to enable Single-precision scalar
> float move
> in MVE.
> (thumb2_movdf_vfp): Modify pattern to enable Double-precision
> scalar
> float move patterns in MVE.
> (thumb2_movsfcc_vfp): Modify pattern to enable single float
> conditional
> code move patterns of VFP also in MVE by adding
> TARGET_VFP_BASE check.
> (thumb2_movdfcc_vfp): Modify pattern to enable double float
> conditional
> code move patterns of VFP also in MVE by adding
> TARGET_VFP_BASE check.
> (push_multi_vfp): Add support to use VFP VPUSH pattern for MVE
> by adding
> TARGET_VFP_BASE check.
> (set_fpscr): Add support to set the FPSCR register for MVE. Modify
> the pattern using VFPCC_REGNUM, as a few MVE intrinsics use the carry
> bit of the FPSCR register.
> (get_fpscr): Add support to get the FPSCR register for MVE. Modify
> the pattern using VFPCC_REGNUM, as a few MVE intrinsics use the carry
> bit of the FPSCR register.
>
> gcc/testsuite/ChangeLog:
>
> 2020-03-06 Srinath Parvathaneni <srinath.parvathaneni@arm.com>
>
> * gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c: New test.
> * gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c: Likewise.
> * gcc.target/arm/mve/intrinsics/mve_fpu1.c: Likewise.
> * gcc.target/arm/mve/intrinsics/mve_fpu2.c: Likewise.
> * gcc.target/arm/mve/intrinsics/mve_fpu3.c: Likewise.
>
>
> ############### Attachment also inlined for ease of reply
> ###############
>
>
> diff --git a/gcc/common/config/arm/arm-common.c
> b/gcc/common/config/arm/arm-common.c
> index
> 30a2a1deb864ee22d48cebb08247176640524955..83cc68009ac16a89ab5515f19d4eb84f595e33f1
> 100644
> --- a/gcc/common/config/arm/arm-common.c
> +++ b/gcc/common/config/arm/arm-common.c
> @@ -1009,7 +1009,8 @@ arm_asm_auto_mfpu (int argc, const char **argv)
> }
> }
>
> - gcc_assert (i != TARGET_FPU_auto);
> + gcc_assert (i != TARGET_FPU_auto
> + || bitmap_bit_p (arm_active_target.isa,
> isa_bit_vfp_base));
> }
>
> auto_fpu = (char *) xmalloc (strlen (fpuname) + sizeof ("-mfpu="));
> diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
> index
> 96f584da325172bd1460251e2de0ad679589d312..77b43090d69a599d8806cfcc02037e1bbed6e7a1
> 100644
> --- a/gcc/config/arm/arm-cpus.in
> +++ b/gcc/config/arm/arm-cpus.in
> @@ -135,6 +135,10 @@ define feature armv8_1m_main
> # Floating point and Neon extensions.
> # VFPv1 is not supported in GCC.
>
> +# This feature bit is enabled for all VFP, MVE and
> +# MVE with floating point extensions.
> +define feature vfp_base
> +
> # Vector floating point v2.
> define feature vfpv2
>
> @@ -234,7 +238,7 @@ define fgroup ALL_SIMD ALL_SIMD_INTERNAL
> ALL_SIMD_EXTERNAL
>
> # List of all FPU bits to strip out if -mfpu is used to override the
> # default. fp16 is deliberately missing from this list.
> -define fgroup ALL_FPU_INTERNAL vfpv2 vfpv3 vfpv4 fpv5 fp16conv fp_dbl
> ALL_SIMD_INTERNAL
> +define fgroup ALL_FPU_INTERNAL vfp_base vfpv2 vfpv3 vfpv4 fpv5
> fp16conv fp_dbl ALL_SIMD_INTERNAL
> # Similarly, but including fp16 and other extensions that aren't part of
> # -mfpu support.
> define fgroup ALL_FPU_EXTERNAL fp16 bf16
> @@ -279,10 +283,12 @@ define fgroup ARMv8r ARMv8a
> define fgroup ARMv8_1m_main ARMv8m_main armv8_1m_main
>
> # Useful combinations.
> -define fgroup VFPv2 vfpv2
> +define fgroup VFPv2 vfp_base vfpv2
> define fgroup VFPv3 VFPv2 vfpv3
> define fgroup VFPv4 VFPv3 vfpv4 fp16conv
> define fgroup FPv5 VFPv4 fpv5
> +define fgroup MVE mve vfp_base armv7em
> +define fgroup MVE_FP MVE FPv5 fp16 mve_float
>
> define fgroup FP_DBL fp_dbl
> define fgroup FP_D32 FP_DBL fp_d32
> @@ -699,8 +705,8 @@ begin arch armv8.1-m.main
> option fp add FPv5 fp16
> option fp.dp add FPv5 FP_DBL fp16
> option nofp remove ALL_FP
> - option mve add mve armv7em
> - option mve.fp add mve FPv5 fp16 mve_float armv7em
> + option mve add MVE
> + option mve.fp add MVE_FP
> end arch armv8.1-m.main
>
> begin arch iwmmxt
> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
> index
> a0283ed62c8047fe1ccbbb9b639ad34771fe46c2..c7453412959f23bf25c2052b4e0bb6a95faf3163
> 100644
> --- a/gcc/config/arm/arm.h
> +++ b/gcc/config/arm/arm.h
> @@ -334,6 +334,19 @@ emission of floating point pcs attributes. */
> isa_bit_mve_float) \
> && !TARGET_GENERAL_REGS_ONLY)
>
> +/* MVE shares a few instructions with VFP, such as VLDM (alias VPOP), VLDR,
> + VSTM (alias VPUSH), VSTR, VMOV, VMSR and VMRS. Likewise, it updates a few
> + registers such as FPCAR, FPCCR, FPDSCR, FPSCR, MVFR0, MVFR1 and MVFR2. All
> + the VFP instructions, RTL patterns and registers are guarded by
> + TARGET_HARD_FLOAT, but the instructions, RTL patterns and registers
> + common to MVE and VFP are guarded by the following macro, TARGET_VFP_BASE,
> + hereafter. */
> +
> +#define TARGET_VFP_BASE (arm_float_abi != ARM_FLOAT_ABI_SOFT \
> + && bitmap_bit_p (arm_active_target.isa, \
> + isa_bit_vfp_base) \
> + && !TARGET_GENERAL_REGS_ONLY)
> +
> /* Nonzero if integer division instructions supported. */
> #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
> || (TARGET_THUMB && arm_arch_thumb_hwdiv))
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index
> c769104a93746cd7c02b46b82f1a8f8057b9ae62..b40904a40e0979af4285fdbd85bfae55abea25dd
> 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -4295,7 +4295,7 @@ use_return_insn (int iscond, rtx sibling)
>
> /* Can't be done if any of the VFP regs are pushed,
> since this also requires an insn. */
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> for (regno = FIRST_VFP_REGNUM; regno <= LAST_VFP_REGNUM; regno++)
> if (df_regs_ever_live_p (regno) && !call_used_or_fixed_reg_p
> (regno))
> return 0;
> @@ -6289,7 +6289,7 @@ use_vfp_abi (enum arm_pcs pcs_variant, bool
> is_double)
> return false;
>
> return (TARGET_32BIT && TARGET_HARD_FLOAT &&
> - (TARGET_VFP_DOUBLE || !is_double));
> + (TARGET_VFP_DOUBLE || !is_double));
> }
>
> /* Return true if an argument whose type is TYPE, or mode is MODE, is
> @@ -8512,7 +8512,7 @@ thumb2_legitimate_index_p (machine_mode mode,
> rtx index, int strict_p)
>
> /* ??? Combine arm and thumb2 coprocessor addressing modes. */
> /* Standard coprocessor addressing modes. */
> - if (TARGET_HARD_FLOAT
> + if (TARGET_VFP_BASE
> && (mode == SFmode || mode == DFmode))
> return (code == CONST_INT && INTVAL (index) < 1024
> /* Thumb-2 allows only > -256 index range for it's core
> register
> @@ -9905,7 +9905,7 @@ arm_rtx_costs_internal (rtx x, enum rtx_code
> code, enum rtx_code outer_code,
> /* Assume that most copies can be done with a single insn,
> unless we don't have HW FP, in which case everything
> larger than word mode will require two insns. */
> - *cost = COSTS_N_INSNS (((!(TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + *cost = COSTS_N_INSNS (((!TARGET_VFP_BASE
> && GET_MODE_SIZE (mode) > 4)
> || mode == DImode)
> ? 2 : 1);
> @@ -20821,7 +20821,7 @@ arm_get_vfp_saved_size (void)
>
> saved = 0;
> /* Space for saved VFP registers. */
> - if (TARGET_HARD_FLOAT)
> + if (TARGET_VFP_BASE)
> {
> count = 0;
> for (regno = FIRST_VFP_REGNUM;
> @@ -22364,7 +22364,7 @@ arm_compute_frame_layout (void)
> func_type = arm_current_func_type ();
> /* Space for saved VFP registers. */
> if (! IS_VOLATILE (func_type)
> - && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
> + && TARGET_VFP_BASE)
> saved += arm_get_vfp_saved_size ();
>
> /* Allocate space for saving/restoring FPCXTNS in Armv8.1-M
> Mainline
> @@ -22588,7 +22588,7 @@ arm_save_coproc_regs(void)
> saved_size += 8;
> }
>
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> {
> start_reg = FIRST_VFP_REGNUM;
>
> @@ -24546,7 +24546,7 @@ arm_fixed_condition_code_regs (unsigned int
> *p1, unsigned int *p2)
> return false;
>
> *p1 = CC_REGNUM;
> - *p2 = TARGET_HARD_FLOAT ? VFPCC_REGNUM : INVALID_REGNUM;
> + *p2 = TARGET_VFP_BASE ? VFPCC_REGNUM : INVALID_REGNUM;
> return true;
> }
>
> @@ -24965,7 +24965,7 @@ arm_hard_regno_mode_ok (unsigned int regno,
> machine_mode mode)
> {
> if (GET_MODE_CLASS (mode) == MODE_CC)
> return (regno == CC_REGNUM
> - || ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + || (TARGET_VFP_BASE
> && regno == VFPCC_REGNUM));
>
> if (regno == CC_REGNUM && GET_MODE_CLASS (mode) != MODE_CC)
> @@ -24982,7 +24982,7 @@ arm_hard_regno_mode_ok (unsigned int regno,
> machine_mode mode)
> start of an even numbered register pair. */
> return (ARM_NUM_REGS (mode) < 2) || (regno < LAST_LO_REGNUM);
>
> - if ((TARGET_HARD_FLOAT || TARGET_HAVE_MVE) && IS_VFP_REGNUM (regno))
> + if (TARGET_VFP_BASE && IS_VFP_REGNUM (regno))
> {
> if (mode == DFmode)
> return VFP_REGNO_OK_FOR_DOUBLE (regno);
> @@ -26933,7 +26933,7 @@ arm_expand_epilogue_apcs_frame (bool
> really_return)
> floats_from_frame += 4;
> }
>
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> {
> int start_reg;
> rtx ip_rtx = gen_rtx_REG (SImode, IP_REGNUM);
> @@ -27179,7 +27179,7 @@ arm_expand_epilogue (bool really_return)
> }
> }
>
> - if (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)
> + if (TARGET_VFP_BASE)
> {
> /* Generate VFP register multi-pop. */
> int end_reg = LAST_VFP_REGNUM + 1;
> @@ -29699,7 +29699,7 @@ arm_conditional_register_usage (void)
> if (TARGET_THUMB1)
> fixed_regs[LR_REGNUM] = call_used_regs[LR_REGNUM] = 1;
>
> - if (TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE))
> + if (TARGET_32BIT && TARGET_VFP_BASE)
> {
> /* VFPv3 registers are disabled when earlier VFP
> versions are selected due to the definition of
> @@ -32478,7 +32478,8 @@ arm_declare_function_name (FILE *stream, const
> char *name, tree decl)
> = TARGET_SOFT_FLOAT
> ? "softvfp" : arm_identify_fpu_from_isa (arm_active_target.isa);
>
> - if (fpu_to_print != arm_last_printed_arch_string)
> + if (!(!strcmp (fpu_to_print.c_str (), "softvfp") && TARGET_VFP_BASE)
> + && (fpu_to_print != arm_last_printed_arch_string))
> {
> asm_fprintf (asm_out_file, "\t.fpu %s\n", fpu_to_print.c_str ());
> arm_last_printed_fpu_string = fpu_to_print;
> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> index
> 8f8c91d5fe146ed64cd4eb5450f04b3cf0c0ed18..5387f972f5a864a153873f21b9423d28446daefc
> 100644
> --- a/gcc/config/arm/arm.md
> +++ b/gcc/config/arm/arm.md
> @@ -134,7 +134,7 @@
> ; arm_arch6. "v6t2" for Thumb-2 with arm_arch6 and "v8mb" for ARMv8-M
> ; Baseline. This attribute is used to compute attribute "enabled",
> ; use type "any" to enable an alternative in all cases.
> -(define_attr "arch"
> "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
> +(define_attr "arch"
> "any,a,t,32,t1,t2,v6,nov6,v6t2,v8mb,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon,mve"
> (const_string "any"))
>
> (define_attr "arch_enabled" "no,yes"
> @@ -188,6 +188,10 @@
> (and (eq_attr "arch" "neon")
> (match_test "TARGET_NEON"))
> (const_string "yes")
> +
> + (and (eq_attr "arch" "mve")
> + (match_test "TARGET_HAVE_MVE"))
> + (const_string "yes")
> ]
>
> (const_string "no")))
> @@ -11758,7 +11762,7 @@
> (match_operand:SI 2 "const_int_I_operand" "I")))
> (set (match_operand:DF 3 "vfp_hard_register_operand" "")
> (mem:DF (match_dup 1)))])]
> - "TARGET_32BIT && (TARGET_HARD_FLOAT || TARGET_HAVE_MVE)"
> + "TARGET_32BIT && TARGET_VFP_BASE"
> "*
> {
> int num_regs = XVECLEN (operands[0], 0);
> diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
> index
> a12de97cdaab589e0c8704b408ac4c329def416d..bf8f4ff1e5d2d6132d0afdd05255cc697c54159d
> 100644
> --- a/gcc/config/arm/constraints.md
> +++ b/gcc/config/arm/constraints.md
> @@ -38,7 +38,7 @@
> ;; in all states: Pf, Pg
>
> ;; The following memory constraints have been used:
> -;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up
> +;; in ARM/Thumb-2 state: Uh, Ut, Uv, Uy, Un, Um, Us, Up, Uf
> ;; in ARM state: Uq
> ;; in Thumb state: Uu, Uw
> ;; in all states: Q
> @@ -46,6 +46,9 @@
> (define_register_constraint "Up" "TARGET_HAVE_MVE ? VPR_REG : NO_REGS"
> "MVE VPR register")
>
> +(define_register_constraint "Uf" "TARGET_HAVE_MVE ? VFPCC_REG : NO_REGS"
> + "MVE FPCCR register")
> +
> (define_register_constraint "t" "TARGET_32BIT ? VFP_LO_REGS : NO_REGS"
> "The VFP registers @code{s0}-@code{s31}.")
>
> diff --git a/gcc/config/arm/thumb2.md b/gcc/config/arm/thumb2.md
> index
> b0d3bd1cf1c484927e6ac6522bc30f0f089291c7..793f67068687a60abf94c230e5485a1eb2eca6a0
> 100644
> --- a/gcc/config/arm/thumb2.md
> +++ b/gcc/config/arm/thumb2.md
> @@ -517,7 +517,7 @@
> [(match_operand 4 "cc_register" "")
> (const_int 0)])
> (match_operand:SF 1 "s_register_operand" "0,r")
> (match_operand:SF 2 "s_register_operand"
> "r,0")))]
> - "TARGET_THUMB2 && TARGET_SOFT_FLOAT"
> + "TARGET_THUMB2 && TARGET_SOFT_FLOAT && !TARGET_HAVE_MVE"
> "@
> it\\t%D3\;mov%D3\\t%0, %2
> it\\t%d3\;mov%d3\\t%0, %1"
> diff --git a/gcc/config/arm/unspecs.md b/gcc/config/arm/unspecs.md
> index
> f0b1f465de4b63d624510783576700519044717d..e76609f79418af38b70746336dd43592a1dc8713
> 100644
> --- a/gcc/config/arm/unspecs.md
> +++ b/gcc/config/arm/unspecs.md
> @@ -170,6 +170,7 @@
> UNSPEC_TORC ; Used by the intrinsic form of the iWMMXt
> TORC instruction.
> UNSPEC_TORVSC ; Used by the intrinsic form of the
> iWMMXt TORVSC instruction.
> UNSPEC_TEXTRC ; Used by the intrinsic form of the
> iWMMXt TEXTRC instruction.
> + UNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
> ])
>
>
> @@ -216,7 +217,6 @@
> VUNSPEC_SLX ; Represent a store-register-release-exclusive.
> VUNSPEC_LDA ; Represent a store-register-acquire.
> VUNSPEC_STL ; Represent a store-register-release.
> - VUNSPEC_GET_FPSCR ; Represent fetch of FPSCR content.
> VUNSPEC_SET_FPSCR ; Represent assign of FPSCR content.
> VUNSPEC_PROBE_STACK_RANGE ; Represent stack range probing.
> VUNSPEC_CDP ; Represent the coprocessor cdp instruction.
> diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md
> index
> ab16a6b0eac822b4e1a1ae4dcbe39491a82cc9fe..eb6ae7bea7927c666f36219797d54c0127001bc1
> 100644
> --- a/gcc/config/arm/vfp.md
> +++ b/gcc/config/arm/vfp.md
> @@ -74,10 +74,10 @@
> (define_insn "*thumb2_movhi_vfp"
> [(set
> (match_operand:HI 0 "nonimmediate_operand"
> - "=rk, r, l, r, m, r, *t, r, *t")
> + "=rk, r, l, r, m, r, *t, r, *t, Up, r")
> (match_operand:HI 1 "general_operand"
> - "rk, I, Py, n, r, m, r, *t, *t"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && !TARGET_VFP_FP16INST
> && (register_operand (operands[0], HImode)
> || register_operand (operands[1], HImode))"
> @@ -99,20 +99,24 @@
> return "vmov%?\t%0, %1\t%@ int";
> case 8:
> return "vmov%?.f32\t%0, %1\t%@ int";
> + case 9:
> + return "vmsr%?\t P0, %1\t@ movhi";
> + case 10:
> + return "vmrs%?\t %0, P0\t@ movhi";
> default:
> gcc_unreachable ();
> }
> }
> [(set_attr "predicable" "yes")
> (set_attr "predicable_short_it"
> - "yes, no, yes, no, no, no, no, no, no")
> + "yes, no, yes, no, no, no, no, no, no, no, no")
> (set_attr "type"
> "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
> - f_mcr, f_mrc, fmov")
> - (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
> - (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
> - (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
> - (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
> + f_mcr, f_mrc, fmov, mve_move, mve_move")
> + (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
> + (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
> + (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
> + (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
> )
>
> ;; Patterns for HI moves which provide more data transfer
> instructions when FP16
> @@ -170,10 +174,10 @@
> (define_insn "*thumb2_movhi_fp16"
> [(set
> (match_operand:HI 0 "nonimmediate_operand"
> - "=rk, r, l, r, m, r, *t, r, *t")
> + "=rk, r, l, r, m, r, *t, r, *t, Up, r")
> (match_operand:HI 1 "general_operand"
> - "rk, I, Py, n, r, m, r, *t, *t"))]
> - "TARGET_THUMB2 && TARGET_VFP_FP16INST
> + "rk, I, Py, n, r, m, r, *t, *t, r, Up"))]
> + "TARGET_THUMB2 && (TARGET_VFP_FP16INST || TARGET_HAVE_MVE)
> && (register_operand (operands[0], HImode)
> || register_operand (operands[1], HImode))"
> {
> @@ -194,21 +198,25 @@
> return "vmov.f16\t%0, %1\t%@ int";
> case 8:
> return "vmov%?.f32\t%0, %1\t%@ int";
> + case 9:
> + return "vmsr%?\tP0, %1\t%@ movhi";
> + case 10:
> + return "vmrs%?\t%0, P0\t%@ movhi";
> default:
> gcc_unreachable ();
> }
> }
> [(set_attr "predicable"
> - "yes, yes, yes, yes, yes, yes, no, no, yes")
> + "yes, yes, yes, yes, yes, yes, no, no, yes, yes, yes")
> (set_attr "predicable_short_it"
> - "yes, no, yes, no, no, no, no, no, no")
> + "yes, no, yes, no, no, no, no, no, no, no, no")
> (set_attr "type"
> "mov_reg, mov_imm, mov_imm, mov_imm, store_4, load_4,\
> - f_mcr, f_mrc, fmov")
> - (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *")
> - (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *")
> - (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *")
> - (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4")]
> + f_mcr, f_mrc, fmov, mve_move, mve_move")
> + (set_attr "arch" "*, *, *, v6t2, *, *, *, *, *, mve, mve")
> + (set_attr "pool_range" "*, *, *, *, *, 4094, *, *, *, *, *")
> + (set_attr "neg_pool_range" "*, *, *, *, *, 250, *, *, *, *, *")
> + (set_attr "length" "2, 4, 2, 4, 4, 4, 4, 4, 4, 4, 4")]
> )
>
> ;; SImode moves
> @@ -258,9 +266,11 @@
> ;; is chosen with length 2 when the instruction is predicated for
> ;; arm_restrict_it.
> (define_insn "*thumb2_movsi_vfp"
> - [(set (match_operand:SI 0 "nonimmediate_operand"
> "=rk,r,l,r,r,lk*r,m,*t, r,*t,*t, *Uv")
> - (match_operand:SI 1 "general_operand" "rk,I,Py,K,j,mi,lk*r,
> r,*t,*t,*UvTu,*t"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + [(set (match_operand:SI 0 "nonimmediate_operand"
> "=rk,r,l,r,r,l,*hk,m,*m,*t,\
> + r,*t,*t,*Uv, Up, r,Uf,r")
> + (match_operand:SI 1 "general_operand"
> "rk,I,Py,K,j,mi,*mi,l,*hk,r,*t,\
> + *t,*UvTu,*t, r, Up,r,Uf"))]
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && ( s_register_operand (operands[0], SImode)
> || s_register_operand (operands[1], SImode))"
> "*
> @@ -275,30 +285,44 @@
> case 4:
> return \"movw%?\\t%0, %1\";
> case 5:
> + case 6:
> /* Cannot load it directly, split to load it via MOV / MOVT. */
> if (!MEM_P (operands[1]) && arm_disable_literal_pool)
> return \"#\";
> return \"ldr%?\\t%0, %1\";
> - case 6:
> - return \"str%?\\t%1, %0\";
> case 7:
> - return \"vmov%?\\t%0, %1\\t%@ int\";
> case 8:
> - return \"vmov%?\\t%0, %1\\t%@ int\";
> + return \"str%?\\t%1, %0\";
> case 9:
> + return \"vmov%?\\t%0, %1\\t%@ int\";
> + case 10:
> + return \"vmov%?\\t%0, %1\\t%@ int\";
> + case 11:
> return \"vmov%?.f32\\t%0, %1\\t%@ int\";
> - case 10: case 11:
> + case 12: case 13:
> return output_move_vfp (operands);
> + case 14:
> + return \"vmsr\\t P0, %1\";
> + case 15:
> + return \"vmrs\\t %0, P0\";
> + case 16:
> + return \"mcr\\tp10, 7, %1, cr1, cr0, 0\\t @SET_FPSCR\";
> + case 17:
> + return \"mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR\";
> default:
> gcc_unreachable ();
> }
> "
> [(set_attr "predicable" "yes")
> - (set_attr "predicable_short_it"
> "yes,no,yes,no,no,no,no,no,no,no,no,no")
> - (set_attr "type"
> "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores")
> - (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4")
> - (set_attr "pool_range" "*,*,*,*,*,1018,*,*,*,*,1018,*")
> - (set_attr "neg_pool_range" "*,*,*,*,*, 0,*,*,*,*,1008,*")]
> + (set_attr "predicable_short_it"
> "yes,no,yes,no,no,no,no,no,no,no,no,no,no,\
> + no,no,no,no,no")
> + (set_attr "type"
> "mov_reg,mov_reg,mov_reg,mvn_reg,mov_imm,load_4,load_4,\
> + store_4,store_4,f_mcr,f_mrc,fmov,f_loads,f_stores,mve_move,\
> + mve_move,mrs,mrs")
> + (set_attr "length" "2,4,2,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4")
> + (set_attr "pool_range" "*,*,*,*,*,1018,4094,*,*,*,*,*,1018,*,*,*,*,*")
> + (set_attr "arch" "*,*,*,*,*,*,*,*,*,*,*,*,*,*,mve,mve,mve,mve")
> + (set_attr "neg_pool_range" "*,*,*,*,*, 0,
> 0,*,*,*,*,*,1008,*,*,*,*,*")]
> )
>
>
> @@ -306,12 +330,12 @@
>
> (define_insn "*movdi_vfp"
> [(set (match_operand:DI 0 "nonimmediate_di_operand"
> "=r,r,r,r,r,r,m,w,!r,w,w, Uv")
> - (match_operand:DI 1 "di_operand"
> "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
> - "TARGET_32BIT && TARGET_HARD_FLOAT
> + (match_operand:DI 1 "di_operand"
> "r,rDa,Db,Dc,mi,mi,r,r,w,w,UvTu,w"))]
> + "TARGET_32BIT && TARGET_VFP_BASE
> && ( register_operand (operands[0], DImode)
> || register_operand (operands[1], DImode))
> - && !(TARGET_NEON && CONST_INT_P (operands[1])
> - && simd_immediate_valid_for_move (operands[1], DImode, NULL,
> NULL))"
> + && !((TARGET_NEON || TARGET_HAVE_MVE) && CONST_INT_P (operands[1])
> + && simd_immediate_valid_for_move (operands[1], DImode, NULL,
> NULL))"
> "*
> switch (which_alternative)
> {
> @@ -333,7 +357,7 @@
> case 8:
> return \"vmov%?\\t%Q0, %R0, %P1\\t%@ int\";
> case 9:
> - if (TARGET_VFP_SINGLE)
> + if (TARGET_VFP_SINGLE || TARGET_HAVE_MVE)
> return \"vmov%?.f32\\t%0, %1\\t%@ int\;vmov%?.f32\\t%p0,
> %p1\\t%@ int\";
> else
> return \"vmov%?.f64\\t%P0, %P1\\t%@ int\";
> @@ -390,9 +414,15 @@
> case 6: /* S register from immediate. */
> return \"vmov.f16\\t%0, %1\t%@ __<fporbf>\";
> case 7: /* S register from memory. */
> - return \"vld1.16\\t{%z0}, %A1\";
> + if (TARGET_HAVE_MVE)
> + return \"vldr.16\\t%0, %A1\";
> + else
> + return \"vld1.16\\t{%z0}, %A1\";
> case 8: /* Memory from S register. */
> - return \"vst1.16\\t{%z1}, %A0\";
> + if (TARGET_HAVE_MVE)
> + return \"vstr.16\\t%1, %A0\";
> + else
> + return \"vst1.16\\t{%z1}, %A0\";
> case 9: /* ARM register from constant. */
> {
> long bits;
> @@ -593,7 +623,7 @@
> (define_insn "*thumb2_movsf_vfp"
> [(set (match_operand:SF 0 "nonimmediate_operand" "=t,?r,t, t ,Uv,r
> ,m,t,r")
> (match_operand:SF 1 "hard_sf_operand" " ?r,t,Dv,UvHa,t,
> mHa,r,t,r"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && ( s_register_operand (operands[0], SFmode)
> || s_register_operand (operands[1], SFmode))"
> "*
> @@ -682,7 +712,7 @@
> (define_insn "*thumb2_movdf_vfp"
> [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w
> ,w,w ,Uv,r ,m,w,r")
> (match_operand:DF 1 "hard_df_operand" " ?r,w,Dy,G,UvHa,w,
> mHa,r, w,r"))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT
> + "TARGET_THUMB2 && TARGET_VFP_BASE
> && ( register_operand (operands[0], DFmode)
> || register_operand (operands[1], DFmode))"
> "*
> @@ -760,7 +790,7 @@
> [(match_operand 4 "cc_register" "") (const_int 0)])
> (match_operand:SF 1 "s_register_operand" "0,t,t,0,?r,?r,0,t,t")
> (match_operand:SF 2 "s_register_operand"
> "t,0,t,?r,0,?r,t,0,t")))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT && !arm_restrict_it"
> + "TARGET_THUMB2 && TARGET_VFP_BASE && !arm_restrict_it"
> "@
> it\\t%D3\;vmov%D3.f32\\t%0, %2
> it\\t%d3\;vmov%d3.f32\\t%0, %1
> @@ -806,7 +836,8 @@
> [(match_operand 4 "cc_register" "") (const_int 0)])
> (match_operand:DF 1 "s_register_operand" "0,w,w,0,?r,?r,0,w,w")
> (match_operand:DF 2 "s_register_operand"
> "w,0,w,?r,0,?r,w,0,w")))]
> - "TARGET_THUMB2 && TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE &&
> !arm_restrict_it"
> + "TARGET_THUMB2 && TARGET_VFP_BASE && TARGET_VFP_DOUBLE
> + && !arm_restrict_it"
> "@
> it\\t%D3\;vmov%D3.f64\\t%P0, %P2
> it\\t%d3\;vmov%d3.f64\\t%P0, %P1
> @@ -1977,7 +2008,7 @@
> [(set (match_operand:BLK 0 "memory_operand" "=m")
> (unspec:BLK [(match_operand:DF 1 "vfp_register_operand" "")]
> UNSPEC_PUSH_MULT))])]
> - "TARGET_32BIT && TARGET_HARD_FLOAT"
> + "TARGET_32BIT && TARGET_VFP_BASE"
> "* return vfp_output_vstmd (operands);"
> [(set_attr "type" "f_stored")]
> )
> @@ -2065,16 +2096,18 @@
>
> ;; Write Floating-point Status and Control Register.
> (define_insn "set_fpscr"
> - [(unspec_volatile [(match_operand:SI 0 "register_operand" "r")]
> VUNSPEC_SET_FPSCR)]
> - "TARGET_HARD_FLOAT"
> + [(set (reg:SI VFPCC_REGNUM)
> + (unspec_volatile:SI
> + [(match_operand:SI 0 "register_operand" "r")]
> VUNSPEC_SET_FPSCR))]
> + "TARGET_VFP_BASE"
> "mcr\\tp10, 7, %0, cr1, cr0, 0\\t @SET_FPSCR"
> [(set_attr "type" "mrs")])
>
> ;; Read Floating-point Status and Control Register.
> (define_insn "get_fpscr"
> [(set (match_operand:SI 0 "register_operand" "=r")
> - (unspec_volatile:SI [(const_int 0)] VUNSPEC_GET_FPSCR))]
> - "TARGET_HARD_FLOAT"
> + (unspec:SI [(reg:SI VFPCC_REGNUM)] UNSPEC_GET_FPSCR))]
> + "TARGET_VFP_BASE"
> "mrc\\tp10, 7, %0, cr1, cr0, 0\\t @GET_FPSCR"
> [(set_attr "type" "mrs")])
>
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..17ba616c041378b88463cb7ef150b70b2e7b95ad
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=hard -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..7b877c4a90c506343d6b4edb750ba06ce3d7a68d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fp_fpu2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve.fp -mfloat-abi=softfp -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu fpv5-sp-d16" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..85fbb5767edc3c25ceb4d6da780d47afa1ee416c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu1.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=hard -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..23b3683ae559b3f7bf6c3ad11c4070ad2ddb9387
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu2.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=softfp -mthumb" } */
> +
> +#include "arm_mve.h"
> +
> +int8x16_t
> +foo1 (int8x16_t value)
> +{
> + int8x16_t b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler-not "\.fpu softvfp" } } */
> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
> b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
> new file mode 100644
> index
> 0000000000000000000000000000000000000000..8f7fa348d130e8456d5300ac25821fd96f9d5a97
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/mve_fpu3.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target arm_v8_1m_mve_ok } */
> +/* { dg-additional-options "-march=armv8.1-m.main+mve -mfloat-abi=soft -mthumb" } */
> +
> +int
> +foo1 (int value)
> +{
> + int b = value;
> + return b;
> +}
> +
> +/* { dg-final { scan-assembler "\.fpu softvfp" } } */
>
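To make the interaction between the new TARGET_VFP_BASE guard and the .fpu directive easier to follow, here is a minimal stand-alone C model of the two conditions, as I read the arm.h and arm_declare_function_name hunks above. The struct, enum and function names are hypothetical stand-ins for GCC's internal state, not real GCC identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical model of the float ABI selected by -mfloat-abi.  */
enum float_abi { ABI_SOFT, ABI_SOFTFP, ABI_HARD };

/* Hypothetical model of the target state that the macro consults.  */
struct target
{
  enum float_abi abi;
  bool vfp_base_bit;       /* Models isa_bit_vfp_base.  */
  bool general_regs_only;  /* Models TARGET_GENERAL_REGS_ONLY.  */
};

/* Same shape as the TARGET_VFP_BASE macro added to arm.h.  */
static bool
target_vfp_base (const struct target *t)
{
  return t->abi != ABI_SOFT && t->vfp_base_bit && !t->general_regs_only;
}

/* Same shape as the new condition in arm_declare_function_name: skip the
   directive when the name is "softvfp" but the VFP base is available,
   and otherwise print it only when it differs from the last one.  */
static bool
should_print_fpu (const struct target *t, const char *fpu, const char *last)
{
  assert (fpu != NULL && last != NULL);
  bool softvfp_with_base = strcmp (fpu, "softvfp") == 0 && target_vfp_base (t);
  return !softvfp_with_base && strcmp (fpu, last) != 0;
}
```

Under this model a target with the vfp_base bit and -mfloat-abi=hard or softfp never emits ".fpu softvfp", while -mfloat-abi=soft still does, which appears to be what mve_fpu1.c, mve_fpu2.c and mve_fpu3.c check.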
end of thread, other threads:[~2020-03-16 12:13 UTC | newest]
Thread overview: 4+ messages
2020-03-10 18:19 [PATCH v3][ARM][GCC][2/x]: MVE ACLE intrinsics framework patch Srinath Parvathaneni
2020-03-12 11:16 ` Kyrill Tkachov
2020-03-16 10:54 ` Srinath Parvathaneni
2020-03-16 12:13 ` Srinath Parvathaneni