public inbox for gcc-patches@gcc.gnu.org
* [PATCH 4/6] aarch64: Add machine modes for Neon vector-tuple types
@ 2021-10-22 14:48 Jonathan Wright
  2021-10-22 15:13 ` Richard Sandiford
  0 siblings, 1 reply; 4+ messages in thread
From: Jonathan Wright @ 2021-10-22 14:48 UTC (permalink / raw)
  To: gcc-patches; +Cc: Richard Sandiford, Kyrylo Tkachov


Hi,

Until now, GCC has used large integer machine modes (OI, CI and XI)
to model Neon vector-tuple types. This is suboptimal for many
reasons, the most notable being:

 1) Large integer modes are opaque, and modifying one vector in the
    tuple requires inefficient set/get gymnastics, resulting in
    superfluous move instructions.
 2) Large integer modes do not map well to types that are tuples of
    64-bit vectors: the additional zero-padding they require again
    results in superfluous move instructions.

This patch adds new machine modes that better model the C-level Neon
vector-tuple types. The approach is somewhat similar to that already
used for SVE vector-tuple types.

All of the AArch64 backend patterns and builtins that manipulate Neon
vector tuples are updated to use the new machine modes. This has the
effect of significantly reducing the amount of boilerplate code in
the arm_neon.h header.

While this patch improves the quality of generated code in many
instances, there is still room for significant improvement, which
subsequent patches will attempt.

Bootstrapped and regression tested on aarch64-none-linux-gnu and
aarch64_be-none-linux-gnu - no issues.

Ok for master?

Thanks,
Jonathan

---

gcc/ChangeLog:

2021-08-09  Jonathan Wright  <jonathan.wright@arm.com>
            Richard Sandiford  <richard.sandiford@arm.com>

	* config/aarch64/aarch64-builtins.c (v2x8qi_UP): Define.
	(v2x4hi_UP): Likewise.
	(v2x4hf_UP): Likewise.
	(v2x4bf_UP): Likewise.
	(v2x2si_UP): Likewise.
	(v2x2sf_UP): Likewise.
	(v2x1di_UP): Likewise.
	(v2x1df_UP): Likewise.
	(v2x16qi_UP): Likewise.
	(v2x8hi_UP): Likewise.
	(v2x8hf_UP): Likewise.
	(v2x8bf_UP): Likewise.
	(v2x4si_UP): Likewise.
	(v2x4sf_UP): Likewise.
	(v2x2di_UP): Likewise.
	(v2x2df_UP): Likewise.
	(v3x8qi_UP): Likewise.
	(v3x4hi_UP): Likewise.
	(v3x4hf_UP): Likewise.
	(v3x4bf_UP): Likewise.
	(v3x2si_UP): Likewise.
	(v3x2sf_UP): Likewise.
	(v3x1di_UP): Likewise.
	(v3x1df_UP): Likewise.
	(v3x16qi_UP): Likewise.
	(v3x8hi_UP): Likewise.
	(v3x8hf_UP): Likewise.
	(v3x8bf_UP): Likewise.
	(v3x4si_UP): Likewise.
	(v3x4sf_UP): Likewise.
	(v3x2di_UP): Likewise.
	(v3x2df_UP): Likewise.
	(v4x8qi_UP): Likewise.
	(v4x4hi_UP): Likewise.
	(v4x4hf_UP): Likewise.
	(v4x4bf_UP): Likewise.
	(v4x2si_UP): Likewise.
	(v4x2sf_UP): Likewise.
	(v4x1di_UP): Likewise.
	(v4x1df_UP): Likewise.
	(v4x16qi_UP): Likewise.
	(v4x8hi_UP): Likewise.
	(v4x8hf_UP): Likewise.
	(v4x8bf_UP): Likewise.
	(v4x4si_UP): Likewise.
	(v4x4sf_UP): Likewise.
	(v4x2di_UP): Likewise.
	(v4x2df_UP): Likewise.
	(TYPES_GETREGP): Delete.
	(TYPES_SETREGP): Likewise.
	(TYPES_LOADSTRUCT_U): Define.
	(TYPES_LOADSTRUCT_P): Likewise.
	(TYPES_LOADSTRUCT_LANE_U): Likewise.
	(TYPES_LOADSTRUCT_LANE_P): Likewise.
	(TYPES_STORE1P): Move for consistency.
	(TYPES_STORESTRUCT_U): Define.
	(TYPES_STORESTRUCT_P): Likewise.
	(TYPES_STORESTRUCT_LANE_U): Likewise.
	(TYPES_STORESTRUCT_LANE_P): Likewise.
	(aarch64_simd_tuple_types): Define.
	(aarch64_lookup_simd_builtin_type): Handle tuple type lookup.
	(aarch64_init_simd_builtin_functions): Update frontend lookup
	for builtin functions after handling arm_neon.h pragma.
	(register_tuple_type): Manually set modes of single-integer
	tuple types. Record tuple types.
	* config/aarch64/aarch64-modes.def
	(ADV_SIMD_D_REG_STRUCT_MODES): Define D-register tuple modes.
	(ADV_SIMD_Q_REG_STRUCT_MODES): Define Q-register tuple modes.
	(SVE_MODES): Give single-vector modes priority over vector-
	tuple modes.
	(VECTOR_MODES_WITH_PREFIX): Set partial-vector mode order to
	be after all single-vector modes.
	* config/aarch64/aarch64-simd-builtins.def: Update builtin
	generator macros to reflect modifications to the backend
	patterns.
	* config/aarch64/aarch64-simd.md (aarch64_simd_ld2<mode>):
	Use vector-tuple mode iterator and rename to...
	(aarch64_simd_ld2<vstruct_elt>): This.
	(aarch64_simd_ld2r<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_simd_ld2r<vstruct_elt>): This.
	(aarch64_vec_load_lanesoi_lane<mode>): Use vector-tuple mode
	iterator and rename to...
	(aarch64_vec_load_lanes<mode>_lane<vstruct_elt>): This.
	(vec_load_lanesoi<mode>): Use vector-tuple mode iterator and
	rename to...
	(vec_load_lanes<mode><vstruct_elt>): This.
	(aarch64_simd_st2<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_simd_st2<vstruct_elt>): This.
	(aarch64_vec_store_lanesoi_lane<mode>): Use vector-tuple mode
	iterator and rename to...
	(aarch64_vec_store_lanes<mode>_lane<vstruct_elt>): This.
	(vec_store_lanesoi<mode>): Use vector-tuple mode iterator and
	rename to...
	(vec_store_lanes<mode><vstruct_elt>): This.
	(aarch64_simd_ld3<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_simd_ld3<vstruct_elt>): This.
	(aarch64_simd_ld3r<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_simd_ld3r<vstruct_elt>): This.
	(aarch64_vec_load_lanesci_lane<mode>): Use vector-tuple mode
	iterator and rename to...
	(aarch64_vec_load_lanes<mode>_lane<vstruct_elt>): This.
	(vec_load_lanesci<mode>): Use vector-tuple mode iterator and
	rename to...
	(vec_load_lanes<mode><vstruct_elt>): This.
	(aarch64_simd_st3<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_simd_st3<vstruct_elt>): This.
	(aarch64_vec_store_lanesci_lane<mode>): Use vector-tuple mode
	iterator and rename to...
	(aarch64_vec_store_lanes<mode>_lane<vstruct_elt>): This.
	(vec_store_lanesci<mode>): Use vector-tuple mode iterator and
	rename to...
	(vec_store_lanes<mode><vstruct_elt>): This.
	(aarch64_simd_ld4<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_simd_ld4<vstruct_elt>): This.
	(aarch64_simd_ld4r<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_simd_ld4r<vstruct_elt>): This.
	(aarch64_vec_load_lanesxi_lane<mode>): Use vector-tuple mode
	iterator and rename to...
	(aarch64_vec_load_lanes<mode>_lane<vstruct_elt>): This.
	(vec_load_lanesxi<mode>): Use vector-tuple mode iterator and
	rename to...
	(vec_load_lanes<mode><vstruct_elt>): This.
	(aarch64_simd_st4<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_simd_st4<vstruct_elt>): This.
	(aarch64_vec_store_lanesxi_lane<mode>): Use vector-tuple mode
	iterator and rename to...
	(aarch64_vec_store_lanes<mode>_lane<vstruct_elt>): This.
	(vec_store_lanesxi<mode>): Use vector-tuple mode iterator and
	rename to...
	(vec_store_lanes<mode><vstruct_elt>): This.
	(mov<mode>): Define for Neon vector-tuple modes.
	(aarch64_ld1x3<VALLDIF:mode>): Use vector-tuple mode iterator
	and rename to...
	(aarch64_ld1x3<vstruct_elt>): This.
	(aarch64_ld1_x3_<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_ld1_x3_<vstruct_elt>): This.
	(aarch64_ld1x4<VALLDIF:mode>): Use vector-tuple mode iterator
	and rename to...
	(aarch64_ld1x4<vstruct_elt>): This.
	(aarch64_ld1_x4_<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_ld1_x4_<vstruct_elt>): This.
	(aarch64_st1x2<VALLDIF:mode>): Use vector-tuple mode iterator
	and rename to...
	(aarch64_st1x2<vstruct_elt>): This.
	(aarch64_st1_x2_<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_st1_x2_<vstruct_elt>): This.
	(aarch64_st1x3<VALLDIF:mode>): Use vector-tuple mode iterator
	and rename to...
	(aarch64_st1x3<vstruct_elt>): This.
	(aarch64_st1_x3_<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_st1_x3_<vstruct_elt>): This.
	(aarch64_st1x4<VALLDIF:mode>): Use vector-tuple mode iterator
	and rename to...
	(aarch64_st1x4<vstruct_elt>): This.
	(aarch64_st1_x4_<mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_st1_x4_<vstruct_elt>): This.
	(*aarch64_mov<mode>): Define for vector-tuple modes.
	(*aarch64_be_mov<mode>): Likewise.
	(aarch64_ld<VSTRUCT:nregs>r<VALLDIF:mode>): Use vector-tuple
	mode iterator and rename to...
	(aarch64_ld<nregs>r<vstruct_elt>): This.
	(aarch64_ld2<mode>_dreg): Use vector-tuple mode iterator and
	rename to...
	(aarch64_ld2<vstruct_elt>_dreg): This.
	(aarch64_ld3<mode>_dreg): Use vector-tuple mode iterator and
	rename to...
	(aarch64_ld3<vstruct_elt>_dreg): This.
	(aarch64_ld4<mode>_dreg): Use vector-tuple mode iterator and
	rename to...
	(aarch64_ld4<vstruct_elt>_dreg): This.
	(aarch64_ld<VSTRUCT:nregs><VDC:mode>): Use vector-tuple mode
	iterator and rename to...
	(aarch64_ld<nregs><vstruct_elt>): This.
	(aarch64_ld<VSTRUCT:nregs><VQ:mode>): Use vector-tuple mode
	iterator and rename to aarch64_ld<nregs><vstruct_elt>.
	(aarch64_ld1x2<VQ:mode>): Delete.
	(aarch64_ld1x2<VDC:mode>): Use vector-tuple mode iterator and
	rename to...
	(aarch64_ld1x2<vstruct_elt>): This.
	(aarch64_ld<VSTRUCT:nregs>_lane<VALLDIF:mode>): Use vector-
	tuple mode iterator and rename to...
	(aarch64_ld<nregs>_lane<vstruct_elt>): This.
	(aarch64_get_dreg<VSTRUCT:mode><VDC:mode>): Delete.
	(aarch64_get_qreg<VSTRUCT:mode><VQ:mode>): Likewise.
	(aarch64_st2<mode>_dreg): Use vector-tuple mode iterator and
	rename to...
	(aarch64_st2<vstruct_elt>_dreg): This.
	(aarch64_st3<mode>_dreg): Use vector-tuple mode iterator and
	rename to...
	(aarch64_st3<vstruct_elt>_dreg): This.
	(aarch64_st4<mode>_dreg): Use vector-tuple mode iterator and
	rename to...
	(aarch64_st4<vstruct_elt>_dreg): This.
	(aarch64_st<VSTRUCT:nregs><VDC:mode>): Use vector-tuple mode
	iterator and rename to...
	(aarch64_st<nregs><vstruct_elt>): This.
	(aarch64_st<VSTRUCT:nregs><VQ:mode>): Use vector-tuple mode
	iterator and rename to aarch64_st<nregs><vstruct_elt>.
	(aarch64_st<VSTRUCT:nregs>_lane<VALLDIF:mode>): Use vector-
	tuple mode iterator and rename to...
	(aarch64_st<nregs>_lane<vstruct_elt>): This.
	(aarch64_set_qreg<VSTRUCT:mode><VQ:mode>): Delete.
	(aarch64_simd_ld1<mode>_x2): Use vector-tuple mode iterator
	and rename to...
	(aarch64_simd_ld1<vstruct_elt>_x2): This.
	* config/aarch64/aarch64.c (aarch64_advsimd_struct_mode_p):
	Refactor to include new vector-tuple modes.
	(aarch64_classify_vector_mode): Add cases for new vector-
	tuple modes.
	(aarch64_advsimd_partial_struct_mode_p): Define.
	(aarch64_advsimd_full_struct_mode_p): Likewise.
	(aarch64_advsimd_vector_array_mode): Likewise.
	(aarch64_sve_data_mode): Change location in file.
	(aarch64_array_mode): Handle case of Neon vector-tuple modes.
	(aarch64_hard_regno_nregs): Handle case of partial Neon
	vector structures.
	(aarch64_classify_address): Refactor to include handling of
	Neon vector-tuple modes.
	(aarch64_print_operand): Print "d" for "%R" for a partial
	Neon vector structure.
	(aarch64_expand_vec_perm_1): Use new vector-tuple mode.
	(aarch64_modes_tieable_p): Prevent tying Neon partial struct
	modes with scalar machine modes larger than 8 bytes.
	(aarch64_can_change_mode_class): Don't allow changes between
	partial and full Neon vector-structure modes.
	* config/aarch64/arm_neon.h (vst2_lane_f16): Use updated
	builtin and remove boiler-plate code for opaque mode.
	(vst2_lane_f32): Likewise.
	(vst2_lane_f64): Likewise.
	(vst2_lane_p8): Likewise.
	(vst2_lane_p16): Likewise.
	(vst2_lane_p64): Likewise.
	(vst2_lane_s8): Likewise.
	(vst2_lane_s16): Likewise.
	(vst2_lane_s32): Likewise.
	(vst2_lane_s64): Likewise.
	(vst2_lane_u8): Likewise.
	(vst2_lane_u16): Likewise.
	(vst2_lane_u32): Likewise.
	(vst2_lane_u64): Likewise.
	(vst2q_lane_f16): Likewise.
	(vst2q_lane_f32): Likewise.
	(vst2q_lane_f64): Likewise.
	(vst2q_lane_p8): Likewise.
	(vst2q_lane_p16): Likewise.
	(vst2q_lane_p64): Likewise.
	(vst2q_lane_s8): Likewise.
	(vst2q_lane_s16): Likewise.
	(vst2q_lane_s32): Likewise.
	(vst2q_lane_s64): Likewise.
	(vst2q_lane_u8): Likewise.
	(vst2q_lane_u16): Likewise.
	(vst2q_lane_u32): Likewise.
	(vst2q_lane_u64): Likewise.
	(vst3_lane_f16): Likewise.
	(vst3_lane_f32): Likewise.
	(vst3_lane_f64): Likewise.
	(vst3_lane_p8): Likewise.
	(vst3_lane_p16): Likewise.
	(vst3_lane_p64): Likewise.
	(vst3_lane_s8): Likewise.
	(vst3_lane_s16): Likewise.
	(vst3_lane_s32): Likewise.
	(vst3_lane_s64): Likewise.
	(vst3_lane_u8): Likewise.
	(vst3_lane_u16): Likewise.
	(vst3_lane_u32): Likewise.
	(vst3_lane_u64): Likewise.
	(vst3q_lane_f16): Likewise.
	(vst3q_lane_f32): Likewise.
	(vst3q_lane_f64): Likewise.
	(vst3q_lane_p8): Likewise.
	(vst3q_lane_p16): Likewise.
	(vst3q_lane_p64): Likewise.
	(vst3q_lane_s8): Likewise.
	(vst3q_lane_s16): Likewise.
	(vst3q_lane_s32): Likewise.
	(vst3q_lane_s64): Likewise.
	(vst3q_lane_u8): Likewise.
	(vst3q_lane_u16): Likewise.
	(vst3q_lane_u32): Likewise.
	(vst3q_lane_u64): Likewise.
	(vst4_lane_f16): Likewise.
	(vst4_lane_f32): Likewise.
	(vst4_lane_f64): Likewise.
	(vst4_lane_p8): Likewise.
	(vst4_lane_p16): Likewise.
	(vst4_lane_p64): Likewise.
	(vst4_lane_s8): Likewise.
	(vst4_lane_s16): Likewise.
	(vst4_lane_s32): Likewise.
	(vst4_lane_s64): Likewise.
	(vst4_lane_u8): Likewise.
	(vst4_lane_u16): Likewise.
	(vst4_lane_u32): Likewise.
	(vst4_lane_u64): Likewise.
	(vst4q_lane_f16): Likewise.
	(vst4q_lane_f32): Likewise.
	(vst4q_lane_f64): Likewise.
	(vst4q_lane_p8): Likewise.
	(vst4q_lane_p16): Likewise.
	(vst4q_lane_p64): Likewise.
	(vst4q_lane_s8): Likewise.
	(vst4q_lane_s16): Likewise.
	(vst4q_lane_s32): Likewise.
	(vst4q_lane_s64): Likewise.
	(vst4q_lane_u8): Likewise.
	(vst4q_lane_u16): Likewise.
	(vst4q_lane_u32): Likewise.
	(vst4q_lane_u64): Likewise.
	(vtbl3_s8): Likewise.
	(vtbl3_u8): Likewise.
	(vtbl3_p8): Likewise.
	(vtbl4_s8): Likewise.
	(vtbl4_u8): Likewise.
	(vtbl4_p8): Likewise.
	(vld1_u8_x3): Likewise.
	(vld1_s8_x3): Likewise.
	(vld1_u16_x3): Likewise.
	(vld1_s16_x3): Likewise.
	(vld1_u32_x3): Likewise.
	(vld1_s32_x3): Likewise.
	(vld1_u64_x3): Likewise.
	(vld1_s64_x3): Likewise.
	(vld1_f16_x3): Likewise.
	(vld1_f32_x3): Likewise.
	(vld1_f64_x3): Likewise.
	(vld1_p8_x3): Likewise.
	(vld1_p16_x3): Likewise.
	(vld1_p64_x3): Likewise.
	(vld1q_u8_x3): Likewise.
	(vld1q_s8_x3): Likewise.
	(vld1q_u16_x3): Likewise.
	(vld1q_s16_x3): Likewise.
	(vld1q_u32_x3): Likewise.
	(vld1q_s32_x3): Likewise.
	(vld1q_u64_x3): Likewise.
	(vld1q_s64_x3): Likewise.
	(vld1q_f16_x3): Likewise.
	(vld1q_f32_x3): Likewise.
	(vld1q_f64_x3): Likewise.
	(vld1q_p8_x3): Likewise.
	(vld1q_p16_x3): Likewise.
	(vld1q_p64_x3): Likewise.
	(vld1_u8_x2): Likewise.
	(vld1_s8_x2): Likewise.
	(vld1_u16_x2): Likewise.
	(vld1_s16_x2): Likewise.
	(vld1_u32_x2): Likewise.
	(vld1_s32_x2): Likewise.
	(vld1_u64_x2): Likewise.
	(vld1_s64_x2): Likewise.
	(vld1_f16_x2): Likewise.
	(vld1_f32_x2): Likewise.
	(vld1_f64_x2): Likewise.
	(vld1_p8_x2): Likewise.
	(vld1_p16_x2): Likewise.
	(vld1_p64_x2): Likewise.
	(vld1q_u8_x2): Likewise.
	(vld1q_s8_x2): Likewise.
	(vld1q_u16_x2): Likewise.
	(vld1q_s16_x2): Likewise.
	(vld1q_u32_x2): Likewise.
	(vld1q_s32_x2): Likewise.
	(vld1q_u64_x2): Likewise.
	(vld1q_s64_x2): Likewise.
	(vld1q_f16_x2): Likewise.
	(vld1q_f32_x2): Likewise.
	(vld1q_f64_x2): Likewise.
	(vld1q_p8_x2): Likewise.
	(vld1q_p16_x2): Likewise.
	(vld1q_p64_x2): Likewise.
	(vld1_s8_x4): Likewise.
	(vld1q_s8_x4): Likewise.
	(vld1_s16_x4): Likewise.
	(vld1q_s16_x4): Likewise.
	(vld1_s32_x4): Likewise.
	(vld1q_s32_x4): Likewise.
	(vld1_u8_x4): Likewise.
	(vld1q_u8_x4): Likewise.
	(vld1_u16_x4): Likewise.
	(vld1q_u16_x4): Likewise.
	(vld1_u32_x4): Likewise.
	(vld1q_u32_x4): Likewise.
	(vld1_f16_x4): Likewise.
	(vld1q_f16_x4): Likewise.
	(vld1_f32_x4): Likewise.
	(vld1q_f32_x4): Likewise.
	(vld1_p8_x4): Likewise.
	(vld1q_p8_x4): Likewise.
	(vld1_p16_x4): Likewise.
	(vld1q_p16_x4): Likewise.
	(vld1_s64_x4): Likewise.
	(vld1_u64_x4): Likewise.
	(vld1_p64_x4): Likewise.
	(vld1q_s64_x4): Likewise.
	(vld1q_u64_x4): Likewise.
	(vld1q_p64_x4): Likewise.
	(vld1_f64_x4): Likewise.
	(vld1q_f64_x4): Likewise.
	(vld2_s64): Likewise.
	(vld2_u64): Likewise.
	(vld2_f64): Likewise.
	(vld2_s8): Likewise.
	(vld2_p8): Likewise.
	(vld2_p64): Likewise.
	(vld2_s16): Likewise.
	(vld2_p16): Likewise.
	(vld2_s32): Likewise.
	(vld2_u8): Likewise.
	(vld2_u16): Likewise.
	(vld2_u32): Likewise.
	(vld2_f16): Likewise.
	(vld2_f32): Likewise.
	(vld2q_s8): Likewise.
	(vld2q_p8): Likewise.
	(vld2q_s16): Likewise.
	(vld2q_p16): Likewise.
	(vld2q_p64): Likewise.
	(vld2q_s32): Likewise.
	(vld2q_s64): Likewise.
	(vld2q_u8): Likewise.
	(vld2q_u16): Likewise.
	(vld2q_u32): Likewise.
	(vld2q_u64): Likewise.
	(vld2q_f16): Likewise.
	(vld2q_f32): Likewise.
	(vld2q_f64): Likewise.
	(vld3_s64): Likewise.
	(vld3_u64): Likewise.
	(vld3_f64): Likewise.
	(vld3_s8): Likewise.
	(vld3_p8): Likewise.
	(vld3_s16): Likewise.
	(vld3_p16): Likewise.
	(vld3_s32): Likewise.
	(vld3_u8): Likewise.
	(vld3_u16): Likewise.
	(vld3_u32): Likewise.
	(vld3_f16): Likewise.
	(vld3_f32): Likewise.
	(vld3_p64): Likewise.
	(vld3q_s8): Likewise.
	(vld3q_p8): Likewise.
	(vld3q_s16): Likewise.
	(vld3q_p16): Likewise.
	(vld3q_s32): Likewise.
	(vld3q_s64): Likewise.
	(vld3q_u8): Likewise.
	(vld3q_u16): Likewise.
	(vld3q_u32): Likewise.
	(vld3q_u64): Likewise.
	(vld3q_f16): Likewise.
	(vld3q_f32): Likewise.
	(vld3q_f64): Likewise.
	(vld3q_p64): Likewise.
	(vld4_s64): Likewise.
	(vld4_u64): Likewise.
	(vld4_f64): Likewise.
	(vld4_s8): Likewise.
	(vld4_p8): Likewise.
	(vld4_s16): Likewise.
	(vld4_p16): Likewise.
	(vld4_s32): Likewise.
	(vld4_u8): Likewise.
	(vld4_u16): Likewise.
	(vld4_u32): Likewise.
	(vld4_f16): Likewise.
	(vld4_f32): Likewise.
	(vld4_p64): Likewise.
	(vld4q_s8): Likewise.
	(vld4q_p8): Likewise.
	(vld4q_s16): Likewise.
	(vld4q_p16): Likewise.
	(vld4q_s32): Likewise.
	(vld4q_s64): Likewise.
	(vld4q_u8): Likewise.
	(vld4q_u16): Likewise.
	(vld4q_u32): Likewise.
	(vld4q_u64): Likewise.
	(vld4q_f16): Likewise.
	(vld4q_f32): Likewise.
	(vld4q_f64): Likewise.
	(vld4q_p64): Likewise.
	(vld2_dup_s8): Likewise.
	(vld2_dup_s16): Likewise.
	(vld2_dup_s32): Likewise.
	(vld2_dup_f16): Likewise.
	(vld2_dup_f32): Likewise.
	(vld2_dup_f64): Likewise.
	(vld2_dup_u8): Likewise.
	(vld2_dup_u16): Likewise.
	(vld2_dup_u32): Likewise.
	(vld2_dup_p8): Likewise.
	(vld2_dup_p16): Likewise.
	(vld2_dup_p64): Likewise.
	(vld2_dup_s64): Likewise.
	(vld2_dup_u64): Likewise.
	(vld2q_dup_s8): Likewise.
	(vld2q_dup_p8): Likewise.
	(vld2q_dup_s16): Likewise.
	(vld2q_dup_p16): Likewise.
	(vld2q_dup_s32): Likewise.
	(vld2q_dup_s64): Likewise.
	(vld2q_dup_u8): Likewise.
	(vld2q_dup_u16): Likewise.
	(vld2q_dup_u32): Likewise.
	(vld2q_dup_u64): Likewise.
	(vld2q_dup_f16): Likewise.
	(vld2q_dup_f32): Likewise.
	(vld2q_dup_f64): Likewise.
	(vld2q_dup_p64): Likewise.
	(vld3_dup_s64): Likewise.
	(vld3_dup_u64): Likewise.
	(vld3_dup_f64): Likewise.
	(vld3_dup_s8): Likewise.
	(vld3_dup_p8): Likewise.
	(vld3_dup_s16): Likewise.
	(vld3_dup_p16): Likewise.
	(vld3_dup_s32): Likewise.
	(vld3_dup_u8): Likewise.
	(vld3_dup_u16): Likewise.
	(vld3_dup_u32): Likewise.
	(vld3_dup_f16): Likewise.
	(vld3_dup_f32): Likewise.
	(vld3_dup_p64): Likewise.
	(vld3q_dup_s8): Likewise.
	(vld3q_dup_p8): Likewise.
	(vld3q_dup_s16): Likewise.
	(vld3q_dup_p16): Likewise.
	(vld3q_dup_s32): Likewise.
	(vld3q_dup_s64): Likewise.
	(vld3q_dup_u8): Likewise.
	(vld3q_dup_u16): Likewise.
	(vld3q_dup_u32): Likewise.
	(vld3q_dup_u64): Likewise.
	(vld3q_dup_f16): Likewise.
	(vld3q_dup_f32): Likewise.
	(vld3q_dup_f64): Likewise.
	(vld3q_dup_p64): Likewise.
	(vld4_dup_s64): Likewise.
	(vld4_dup_u64): Likewise.
	(vld4_dup_f64): Likewise.
	(vld4_dup_s8): Likewise.
	(vld4_dup_p8): Likewise.
	(vld4_dup_s16): Likewise.
	(vld4_dup_p16): Likewise.
	(vld4_dup_s32): Likewise.
	(vld4_dup_u8): Likewise.
	(vld4_dup_u16): Likewise.
	(vld4_dup_u32): Likewise.
	(vld4_dup_f16): Likewise.
	(vld4_dup_f32): Likewise.
	(vld4_dup_p64): Likewise.
	(vld4q_dup_s8): Likewise.
	(vld4q_dup_p8): Likewise.
	(vld4q_dup_s16): Likewise.
	(vld4q_dup_p16): Likewise.
	(vld4q_dup_s32): Likewise.
	(vld4q_dup_s64): Likewise.
	(vld4q_dup_u8): Likewise.
	(vld4q_dup_u16): Likewise.
	(vld4q_dup_u32): Likewise.
	(vld4q_dup_u64): Likewise.
	(vld4q_dup_f16): Likewise.
	(vld4q_dup_f32): Likewise.
	(vld4q_dup_f64): Likewise.
	(vld4q_dup_p64): Likewise.
	(vld2_lane_u8): Likewise.
	(vld2_lane_u16): Likewise.
	(vld2_lane_u32): Likewise.
	(vld2_lane_u64): Likewise.
	(vld2_lane_s8): Likewise.
	(vld2_lane_s16): Likewise.
	(vld2_lane_s32): Likewise.
	(vld2_lane_s64): Likewise.
	(vld2_lane_f16): Likewise.
	(vld2_lane_f32): Likewise.
	(vld2_lane_f64): Likewise.
	(vld2_lane_p8): Likewise.
	(vld2_lane_p16): Likewise.
	(vld2_lane_p64): Likewise.
	(vld2q_lane_u8): Likewise.
	(vld2q_lane_u16): Likewise.
	(vld2q_lane_u32): Likewise.
	(vld2q_lane_u64): Likewise.
	(vld2q_lane_s8): Likewise.
	(vld2q_lane_s16): Likewise.
	(vld2q_lane_s32): Likewise.
	(vld2q_lane_s64): Likewise.
	(vld2q_lane_f16): Likewise.
	(vld2q_lane_f32): Likewise.
	(vld2q_lane_f64): Likewise.
	(vld2q_lane_p8): Likewise.
	(vld2q_lane_p16): Likewise.
	(vld2q_lane_p64): Likewise.
	(vld3_lane_u8): Likewise.
	(vld3_lane_u16): Likewise.
	(vld3_lane_u32): Likewise.
	(vld3_lane_u64): Likewise.
	(vld3_lane_s8): Likewise.
	(vld3_lane_s16): Likewise.
	(vld3_lane_s32): Likewise.
	(vld3_lane_s64): Likewise.
	(vld3_lane_f16): Likewise.
	(vld3_lane_f32): Likewise.
	(vld3_lane_f64): Likewise.
	(vld3_lane_p8): Likewise.
	(vld3_lane_p16): Likewise.
	(vld3_lane_p64): Likewise.
	(vld3q_lane_u8): Likewise.
	(vld3q_lane_u16): Likewise.
	(vld3q_lane_u32): Likewise.
	(vld3q_lane_u64): Likewise.
	(vld3q_lane_s8): Likewise.
	(vld3q_lane_s16): Likewise.
	(vld3q_lane_s32): Likewise.
	(vld3q_lane_s64): Likewise.
	(vld3q_lane_f16): Likewise.
	(vld3q_lane_f32): Likewise.
	(vld3q_lane_f64): Likewise.
	(vld3q_lane_p8): Likewise.
	(vld3q_lane_p16): Likewise.
	(vld3q_lane_p64): Likewise.
	(vld4_lane_u8): Likewise.
	(vld4_lane_u16): Likewise.
	(vld4_lane_u32): Likewise.
	(vld4_lane_u64): Likewise.
	(vld4_lane_s8): Likewise.
	(vld4_lane_s16): Likewise.
	(vld4_lane_s32): Likewise.
	(vld4_lane_s64): Likewise.
	(vld4_lane_f16): Likewise.
	(vld4_lane_f32): Likewise.
	(vld4_lane_f64): Likewise.
	(vld4_lane_p8): Likewise.
	(vld4_lane_p16): Likewise.
	(vld4_lane_p64): Likewise.
	(vld4q_lane_u8): Likewise.
	(vld4q_lane_u16): Likewise.
	(vld4q_lane_u32): Likewise.
	(vld4q_lane_u64): Likewise.
	(vld4q_lane_s8): Likewise.
	(vld4q_lane_s16): Likewise.
	(vld4q_lane_s32): Likewise.
	(vld4q_lane_s64): Likewise.
	(vld4q_lane_f16): Likewise.
	(vld4q_lane_f32): Likewise.
	(vld4q_lane_f64): Likewise.
	(vld4q_lane_p8): Likewise.
	(vld4q_lane_p16): Likewise.
	(vld4q_lane_p64): Likewise.
	(vqtbl2_s8): Likewise.
	(vqtbl2_u8): Likewise.
	(vqtbl2_p8): Likewise.
	(vqtbl2q_s8): Likewise.
	(vqtbl2q_u8): Likewise.
	(vqtbl2q_p8): Likewise.
	(vqtbl3_s8): Likewise.
	(vqtbl3_u8): Likewise.
	(vqtbl3_p8): Likewise.
	(vqtbl3q_s8): Likewise.
	(vqtbl3q_u8): Likewise.
	(vqtbl3q_p8): Likewise.
	(vqtbl4_s8): Likewise.
	(vqtbl4_u8): Likewise.
	(vqtbl4_p8): Likewise.
	(vqtbl4q_s8): Likewise.
	(vqtbl4q_u8): Likewise.
	(vqtbl4q_p8): Likewise.
	(vqtbx2_s8): Likewise.
	(vqtbx2_u8): Likewise.
	(vqtbx2_p8): Likewise.
	(vqtbx2q_s8): Likewise.
	(vqtbx2q_u8): Likewise.
	(vqtbx2q_p8): Likewise.
	(vqtbx3_s8): Likewise.
	(vqtbx3_u8): Likewise.
	(vqtbx3_p8): Likewise.
	(vqtbx3q_s8): Likewise.
	(vqtbx3q_u8): Likewise.
	(vqtbx3q_p8): Likewise.
	(vqtbx4_s8): Likewise.
	(vqtbx4_u8): Likewise.
	(vqtbx4_p8): Likewise.
	(vqtbx4q_s8): Likewise.
	(vqtbx4q_u8): Likewise.
	(vqtbx4q_p8): Likewise.
	(vst1_s64_x2): Likewise.
	(vst1_u64_x2): Likewise.
	(vst1_f64_x2): Likewise.
	(vst1_s8_x2): Likewise.
	(vst1_p8_x2): Likewise.
	(vst1_s16_x2): Likewise.
	(vst1_p16_x2): Likewise.
	(vst1_s32_x2): Likewise.
	(vst1_u8_x2): Likewise.
	(vst1_u16_x2): Likewise.
	(vst1_u32_x2): Likewise.
	(vst1_f16_x2): Likewise.
	(vst1_f32_x2): Likewise.
	(vst1_p64_x2): Likewise.
	(vst1q_s8_x2): Likewise.
	(vst1q_p8_x2): Likewise.
	(vst1q_s16_x2): Likewise.
	(vst1q_p16_x2): Likewise.
	(vst1q_s32_x2): Likewise.
	(vst1q_s64_x2): Likewise.
	(vst1q_u8_x2): Likewise.
	(vst1q_u16_x2): Likewise.
	(vst1q_u32_x2): Likewise.
	(vst1q_u64_x2): Likewise.
	(vst1q_f16_x2): Likewise.
	(vst1q_f32_x2): Likewise.
	(vst1q_f64_x2): Likewise.
	(vst1q_p64_x2): Likewise.
	(vst1_s64_x3): Likewise.
	(vst1_u64_x3): Likewise.
	(vst1_f64_x3): Likewise.
	(vst1_s8_x3): Likewise.
	(vst1_p8_x3): Likewise.
	(vst1_s16_x3): Likewise.
	(vst1_p16_x3): Likewise.
	(vst1_s32_x3): Likewise.
	(vst1_u8_x3): Likewise.
	(vst1_u16_x3): Likewise.
	(vst1_u32_x3): Likewise.
	(vst1_f16_x3): Likewise.
	(vst1_f32_x3): Likewise.
	(vst1_p64_x3): Likewise.
	(vst1q_s8_x3): Likewise.
	(vst1q_p8_x3): Likewise.
	(vst1q_s16_x3): Likewise.
	(vst1q_p16_x3): Likewise.
	(vst1q_s32_x3): Likewise.
	(vst1q_s64_x3): Likewise.
	(vst1q_u8_x3): Likewise.
	(vst1q_u16_x3): Likewise.
	(vst1q_u32_x3): Likewise.
	(vst1q_u64_x3): Likewise.
	(vst1q_f16_x3): Likewise.
	(vst1q_f32_x3): Likewise.
	(vst1q_f64_x3): Likewise.
	(vst1q_p64_x3): Likewise.
	(vst1_s8_x4): Likewise.
	(vst1q_s8_x4): Likewise.
	(vst1_s16_x4): Likewise.
	(vst1q_s16_x4): Likewise.
	(vst1_s32_x4): Likewise.
	(vst1q_s32_x4): Likewise.
	(vst1_u8_x4): Likewise.
	(vst1q_u8_x4): Likewise.
	(vst1_u16_x4): Likewise.
	(vst1q_u16_x4): Likewise.
	(vst1_u32_x4): Likewise.
	(vst1q_u32_x4): Likewise.
	(vst1_f16_x4): Likewise.
	(vst1q_f16_x4): Likewise.
	(vst1_f32_x4): Likewise.
	(vst1q_f32_x4): Likewise.
	(vst1_p8_x4): Likewise.
	(vst1q_p8_x4): Likewise.
	(vst1_p16_x4): Likewise.
	(vst1q_p16_x4): Likewise.
	(vst1_s64_x4): Likewise.
	(vst1_u64_x4): Likewise.
	(vst1_p64_x4): Likewise.
	(vst1q_s64_x4): Likewise.
	(vst1q_u64_x4): Likewise.
	(vst1q_p64_x4): Likewise.
	(vst1_f64_x4): Likewise.
	(vst1q_f64_x4): Likewise.
	(vst2_s64): Likewise.
	(vst2_u64): Likewise.
	(vst2_f64): Likewise.
	(vst2_s8): Likewise.
	(vst2_p8): Likewise.
	(vst2_s16): Likewise.
	(vst2_p16): Likewise.
	(vst2_s32): Likewise.
	(vst2_u8): Likewise.
	(vst2_u16): Likewise.
	(vst2_u32): Likewise.
	(vst2_f16): Likewise.
	(vst2_f32): Likewise.
	(vst2_p64): Likewise.
	(vst2q_s8): Likewise.
	(vst2q_p8): Likewise.
	(vst2q_s16): Likewise.
	(vst2q_p16): Likewise.
	(vst2q_s32): Likewise.
	(vst2q_s64): Likewise.
	(vst2q_u8): Likewise.
	(vst2q_u16): Likewise.
	(vst2q_u32): Likewise.
	(vst2q_u64): Likewise.
	(vst2q_f16): Likewise.
	(vst2q_f32): Likewise.
	(vst2q_f64): Likewise.
	(vst2q_p64): Likewise.
	(vst3_s64): Likewise.
	(vst3_u64): Likewise.
	(vst3_f64): Likewise.
	(vst3_s8): Likewise.
	(vst3_p8): Likewise.
	(vst3_s16): Likewise.
	(vst3_p16): Likewise.
	(vst3_s32): Likewise.
	(vst3_u8): Likewise.
	(vst3_u16): Likewise.
	(vst3_u32): Likewise.
	(vst3_f16): Likewise.
	(vst3_f32): Likewise.
	(vst3_p64): Likewise.
	(vst3q_s8): Likewise.
	(vst3q_p8): Likewise.
	(vst3q_s16): Likewise.
	(vst3q_p16): Likewise.
	(vst3q_s32): Likewise.
	(vst3q_s64): Likewise.
	(vst3q_u8): Likewise.
	(vst3q_u16): Likewise.
	(vst3q_u32): Likewise.
	(vst3q_u64): Likewise.
	(vst3q_f16): Likewise.
	(vst3q_f32): Likewise.
	(vst3q_f64): Likewise.
	(vst3q_p64): Likewise.
	(vst4_s64): Likewise.
	(vst4_u64): Likewise.
	(vst4_f64): Likewise.
	(vst4_s8): Likewise.
	(vst4_p8): Likewise.
	(vst4_s16): Likewise.
	(vst4_p16): Likewise.
	(vst4_s32): Likewise.
	(vst4_u8): Likewise.
	(vst4_u16): Likewise.
	(vst4_u32): Likewise.
	(vst4_f16): Likewise.
	(vst4_f32): Likewise.
	(vst4_p64): Likewise.
	(vst4q_s8): Likewise.
	(vst4q_p8): Likewise.
	(vst4q_s16): Likewise.
	(vst4q_p16): Likewise.
	(vst4q_s32): Likewise.
	(vst4q_s64): Likewise.
	(vst4q_u8): Likewise.
	(vst4q_u16): Likewise.
	(vst4q_u32): Likewise.
	(vst4q_u64): Likewise.
	(vst4q_f16): Likewise.
	(vst4q_f32): Likewise.
	(vst4q_f64): Likewise.
	(vst4q_p64): Likewise.
	(vtbx4_s8): Likewise.
	(vtbx4_u8): Likewise.
	(vtbx4_p8): Likewise.
	(vld1_bf16_x2): Likewise.
	(vld1q_bf16_x2): Likewise.
	(vld1_bf16_x3): Likewise.
	(vld1q_bf16_x3): Likewise.
	(vld1_bf16_x4): Likewise.
	(vld1q_bf16_x4): Likewise.
	(vld2_bf16): Likewise.
	(vld2q_bf16): Likewise.
	(vld2_dup_bf16): Likewise.
	(vld2q_dup_bf16): Likewise.
	(vld3_bf16): Likewise.
	(vld3q_bf16): Likewise.
	(vld3_dup_bf16): Likewise.
	(vld3q_dup_bf16): Likewise.
	(vld4_bf16): Likewise.
	(vld4q_bf16): Likewise.
	(vld4_dup_bf16): Likewise.
	(vld4q_dup_bf16): Likewise.
	(vst1_bf16_x2): Likewise.
	(vst1q_bf16_x2): Likewise.
	(vst1_bf16_x3): Likewise.
	(vst1q_bf16_x3): Likewise.
	(vst1_bf16_x4): Likewise.
	(vst1q_bf16_x4): Likewise.
	(vst2_bf16): Likewise.
	(vst2q_bf16): Likewise.
	(vst3_bf16): Likewise.
	(vst3q_bf16): Likewise.
	(vst4_bf16): Likewise.
	(vst4q_bf16): Likewise.
	(vld2_lane_bf16): Likewise.
	(vld2q_lane_bf16): Likewise.
	(vld3_lane_bf16): Likewise.
	(vld3q_lane_bf16): Likewise.
	(vld4_lane_bf16): Likewise.
	(vld4q_lane_bf16): Likewise.
	(vst2_lane_bf16): Likewise.
	(vst2q_lane_bf16): Likewise.
	(vst3_lane_bf16): Likewise.
	(vst3q_lane_bf16): Likewise.
	(vst4_lane_bf16): Likewise.
	(vst4q_lane_bf16): Likewise.
	* config/aarch64/geniterators.sh: Modify iterator regex to
	match new vector-tuple modes.
	* config/aarch64/iterators.md (insn_count): Extend mode
	attribute with vector-tuple type information.
	(nregs): Likewise.
	(Vendreg): Likewise.
	(Vetype): Likewise.
	(Vtype): Likewise.
	(VSTRUCT_2D): New mode iterator.
	(VSTRUCT_2DNX): Likewise.
	(VSTRUCT_2DX): Likewise.
	(VSTRUCT_2Q): Likewise.
	(VSTRUCT_2QD): Likewise.
	(VSTRUCT_3D): Likewise.
	(VSTRUCT_3DNX): Likewise.
	(VSTRUCT_3DX): Likewise.
	(VSTRUCT_3Q): Likewise.
	(VSTRUCT_3QD): Likewise.
	(VSTRUCT_4D): Likewise.
	(VSTRUCT_4DNX): Likewise.
	(VSTRUCT_4DX): Likewise.
	(VSTRUCT_4Q): Likewise.
	(VSTRUCT_4QD): Likewise.
	(VSTRUCT_D): Likewise.
	(VSTRUCT_Q): Likewise.
	(VSTRUCT_QD): Likewise.
	(VSTRUCT_ELT): New mode attribute.
	(vstruct_elt): Likewise.
	* genmodes.c (VECTOR_MODE): Add default prefix and order
	parameters.
	(VECTOR_MODE_WITH_PREFIX): Define.
	(make_vector_mode): Add mode prefix and order parameters.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/advsimd-intrinsics/bf16_vldN_lane_2.c:
	Relax incorrect register number requirement.
	* gcc.target/aarch64/sve/pcs/struct_3_256.c: Accept
	equivalent codegen with fmov.

[-- Attachment #2: rb14782.patch.zip --]
[-- Type: application/zip, Size: 38522 bytes --]


Thread overview: 4+ messages
2021-10-22 14:48 [PATCH 4/6] aarch64: Add machine modes for Neon vector-tuple types Jonathan Wright
2021-10-22 15:13 ` Richard Sandiford
2021-11-02 11:19   ` [PATCH 4/6 V2] " Jonathan Wright
2021-11-02 12:34     ` Richard Sandiford
